kvm.vger.kernel.org archive mirror
* [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more
@ 2021-04-26 20:00 Jason Gunthorpe
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Jonathan Corbet,
	Daniel Vetter, dri-devel, Eric Farman, Harald Freudenberger,
	Vasily Gorbik, Heiko Carstens, intel-gfx, intel-gvt-dev,
	Jani Nikula, Joonas Lahtinen, kvm, Kirti Wankhede, linux-doc,
	linux-s390, Peter Oberparleiter, Halil Pasic, Pierre Morel,
	Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang, Zhi Wang
  Cc: Raj, Ashok, Dan Williams, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

The mdev bus's core part for managing the lifecycle of devices is mostly
as one would expect for a driver core bus subsystem.

However, instead of having a normal 'struct device_driver' and binding the
actual mdev drivers through the standard driver core mechanisms, it open
codes this with the struct mdev_parent_ops and provides a single driver
that shims between the VFIO core and the actual device driver.

Make every one of the mdev drivers implement an actual struct mdev_driver
and directly call vfio_register_group_dev() in the probe() function for
the mdev.

Squash what is left of the mdev_parent_ops into the mdev_driver and remap
create(), remove() and mdev_attr_groups to their driver core
equivalents. Arrange to bind the created mdev_device to the mdev_driver
that is provided by the end driver.

The actual execution flow doesn't change much, e.g. what was
parent_ops->create is now device_driver->probe and it is called at almost
the exact same time - except under the normal control of the driver core.

This allows deleting the entire mdev_drvdata, and tidying some of the
sysfs. Many places in the drivers start using container_of().

This cleanly splits the mdev sysfs GUID lifecycle management stuff from
the vfio_device implementation part. The only VFIO-specific part of mdev
that remains is the mdev-specific iommu intervention.
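
As a rough illustration of the flow described above, here is a minimal
userspace sketch of "create() becomes probe() under driver core control".
Every toy_* name is hypothetical and exists only for this example; none of
it is kernel API. The comment in toy_end_probe() marks where a converted
driver would call vfio_register_group_dev().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not kernel types. */
struct toy_device;

struct toy_driver {
	const char *name;
	int (*probe)(struct toy_device *dev);
	void (*remove)(struct toy_device *dev);
};

struct toy_device {
	const struct toy_driver *driver;	/* set once probe() succeeds */
	int registered;				/* set by the end driver */
};

/* What used to live in a create() callback now lives in probe(). */
static int toy_end_probe(struct toy_device *dev)
{
	/* A converted driver would call vfio_register_group_dev() here. */
	dev->registered = 1;
	return 0;
}

static void toy_end_remove(struct toy_device *dev)
{
	dev->registered = 0;
}

static const struct toy_driver toy_end_driver = {
	.name = "toy_end",
	.probe = toy_end_probe,
	.remove = toy_end_remove,
};

/* Toy "driver core": binding is just a successful probe(). */
static int toy_bind(struct toy_device *dev, const struct toy_driver *drv)
{
	int ret;

	assert(dev && drv && drv->probe);
	ret = drv->probe(dev);
	if (ret)
		return ret;
	dev->driver = drv;
	return 0;
}

static void toy_unbind(struct toy_device *dev)
{
	if (dev->driver) {
		dev->driver->remove(dev);
		dev->driver = NULL;
	}
}
```

The point of the sketch is only that no separate shim sits between the
core and the end driver: the end driver's probe()/remove() are the
lifecycle hooks.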

v2:
 - Keep && m in samples kconfig
 - Restore accidentally squashed removal of vfio_mdev.c
 - Remove indirections to call bus_register()/bus_unregister()
 - Reflow long doc lines
v1: https://lore.kernel.org/r/0-v1-d88406ed308e+418-vfio3_jgg@nvidia.com

Jason

Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: "Raj, Ashok" <ashok.raj@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Max Gurtovoy <mgurtovoy@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Tarun Gupta <targupta@nvidia.com>
Cc: Daniel Vetter <daniel@ffwll.ch>


Jason Gunthorpe (13):
  vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE
  vfio/mdev: Allow the mdev_parent_ops to specify the device driver to
    bind
  vfio/mtty: Convert to use vfio_register_group_dev()
  vfio/mdpy: Convert to use vfio_register_group_dev()
  vfio/mbochs: Convert to use vfio_register_group_dev()
  vfio/ap_ops: Convert to use vfio_register_group_dev()
  vfio/ccw: Convert to use vfio_register_group_dev()
  vfio/gvt: Convert to use vfio_register_group_dev()
  vfio/mdev: Remove vfio_mdev.c
  vfio/mdev: Remove mdev_parent_ops dev_attr_groups
  vfio/mdev: Remove mdev_parent_ops
  vfio/mdev: Use the driver core to create the 'remove' file
  vfio/mdev: Remove mdev drvdata

 .../driver-api/vfio-mediated-device.rst       |  56 ++---
 Documentation/s390/vfio-ap.rst                |   1 -
 arch/s390/Kconfig                             |   2 +-
 drivers/gpu/drm/i915/Kconfig                  |   2 +-
 drivers/gpu/drm/i915/gvt/kvmgt.c              | 210 +++++++++--------
 drivers/s390/cio/vfio_ccw_drv.c               |  21 +-
 drivers/s390/cio/vfio_ccw_ops.c               | 136 ++++++-----
 drivers/s390/cio/vfio_ccw_private.h           |   5 +
 drivers/s390/crypto/vfio_ap_ops.c             | 138 ++++++-----
 drivers/s390/crypto/vfio_ap_private.h         |   2 +
 drivers/vfio/mdev/Kconfig                     |   7 -
 drivers/vfio/mdev/Makefile                    |   1 -
 drivers/vfio/mdev/mdev_core.c                 |  67 ++++--
 drivers/vfio/mdev/mdev_driver.c               |  20 +-
 drivers/vfio/mdev/mdev_private.h              |   4 +-
 drivers/vfio/mdev/mdev_sysfs.c                |  37 ++-
 drivers/vfio/mdev/vfio_mdev.c                 | 180 ---------------
 drivers/vfio/vfio.c                           |   6 +-
 include/linux/mdev.h                          |  86 +------
 include/linux/vfio.h                          |   4 +
 samples/Kconfig                               |   6 +-
 samples/vfio-mdev/mbochs.c                    | 166 +++++++------
 samples/vfio-mdev/mdpy.c                      | 162 +++++++------
 samples/vfio-mdev/mtty.c                      | 218 +++++++-----------
 24 files changed, 651 insertions(+), 886 deletions(-)
 delete mode 100644 drivers/vfio/mdev/vfio_mdev.c

-- 
2.31.1



* [PATCH v2 01/13] vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Jonathan Corbet,
	Daniel Vetter, dri-devel, Vasily Gorbik, Heiko Carstens,
	intel-gfx, Jani Nikula, Joonas Lahtinen, kvm, Kirti Wankhede,
	linux-doc, linux-s390, Halil Pasic, Pierre Morel, Rodrigo Vivi
  Cc: Raj, Ashok, Dan Williams, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

For some reason the vfio_mdev shim mdev_driver has its own module and
kconfig. As the next patch requires access to it from mdev.ko, merge the
two modules together and remove VFIO_MDEV_DEVICE.

A later patch deletes this driver entirely.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 Documentation/s390/vfio-ap.rst   |  1 -
 arch/s390/Kconfig                |  2 +-
 drivers/gpu/drm/i915/Kconfig     |  2 +-
 drivers/vfio/mdev/Kconfig        |  7 -------
 drivers/vfio/mdev/Makefile       |  3 +--
 drivers/vfio/mdev/mdev_core.c    | 16 ++++++++++++++--
 drivers/vfio/mdev/mdev_private.h |  2 ++
 drivers/vfio/mdev/vfio_mdev.c    | 24 +-----------------------
 samples/Kconfig                  |  6 +++---
 9 files changed, 23 insertions(+), 40 deletions(-)

diff --git a/Documentation/s390/vfio-ap.rst b/Documentation/s390/vfio-ap.rst
index e15436599086b7..f57ae621f33e89 100644
--- a/Documentation/s390/vfio-ap.rst
+++ b/Documentation/s390/vfio-ap.rst
@@ -514,7 +514,6 @@ These are the steps:
    * S390_AP_IOMMU
    * VFIO
    * VFIO_MDEV
-   * VFIO_MDEV_DEVICE
    * KVM
 
    If using make menuconfig select the following to build the vfio_ap module::
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index c1ff874e6c2e63..dc7928e37fa409 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -773,7 +773,7 @@ config VFIO_CCW
 config VFIO_AP
 	def_tristate n
 	prompt "VFIO support for AP devices"
-	depends on S390_AP_IOMMU && VFIO_MDEV_DEVICE && KVM
+	depends on S390_AP_IOMMU && VFIO_MDEV && KVM
 	depends on ZCRYPT
 	help
 		This driver grants access to Adjunct Processor (AP) devices
diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 483e9ff8ca1d23..388bc41aa1a75b 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -125,7 +125,7 @@ config DRM_I915_GVT_KVMGT
 	tristate "Enable KVM/VFIO support for Intel GVT-g"
 	depends on DRM_I915_GVT
 	depends on KVM
-	depends on VFIO_MDEV && VFIO_MDEV_DEVICE
+	depends on VFIO_MDEV
 	default n
 	help
 	  Choose this option if you want to enable KVMGT support for
diff --git a/drivers/vfio/mdev/Kconfig b/drivers/vfio/mdev/Kconfig
index 5da27f2100f9bd..763c877a1318bc 100644
--- a/drivers/vfio/mdev/Kconfig
+++ b/drivers/vfio/mdev/Kconfig
@@ -9,10 +9,3 @@ config VFIO_MDEV
 	  See Documentation/driver-api/vfio-mediated-device.rst for more details.
 
 	  If you don't know what do here, say N.
-
-config VFIO_MDEV_DEVICE
-	tristate "VFIO driver for Mediated devices"
-	depends on VFIO && VFIO_MDEV
-	default n
-	help
-	  VFIO based driver for Mediated devices.
diff --git a/drivers/vfio/mdev/Makefile b/drivers/vfio/mdev/Makefile
index 101516fdf3753e..ff9ecd80212503 100644
--- a/drivers/vfio/mdev/Makefile
+++ b/drivers/vfio/mdev/Makefile
@@ -1,6 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
-mdev-y := mdev_core.o mdev_sysfs.o mdev_driver.o
+mdev-y := mdev_core.o mdev_sysfs.o mdev_driver.o vfio_mdev.o
 
 obj-$(CONFIG_VFIO_MDEV) += mdev.o
-obj-$(CONFIG_VFIO_MDEV_DEVICE) += vfio_mdev.o
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index 2a85d6fcb7ddd0..ff8c1a84516698 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -360,11 +360,24 @@ int mdev_device_remove(struct mdev_device *mdev)
 
 static int __init mdev_init(void)
 {
-	return mdev_bus_register();
+	int rc;
+
+	rc = mdev_bus_register();
+	if (rc)
+		return rc;
+	rc = mdev_register_driver(&vfio_mdev_driver);
+	if (rc)
+		goto err_bus;
+	return 0;
+err_bus:
+	mdev_bus_unregister();
+	return rc;
 }
 
 static void __exit mdev_exit(void)
 {
+	mdev_unregister_driver(&vfio_mdev_driver);
+
 	if (mdev_bus_compat_class)
 		class_compat_unregister(mdev_bus_compat_class);
 
@@ -378,4 +391,3 @@ MODULE_VERSION(DRIVER_VERSION);
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR(DRIVER_AUTHOR);
 MODULE_DESCRIPTION(DRIVER_DESC);
-MODULE_SOFTDEP("post: vfio_mdev");
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index a656cfe0346c33..5461b67582289f 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -37,6 +37,8 @@ struct mdev_type {
 #define to_mdev_type(_kobj)		\
 	container_of(_kobj, struct mdev_type, kobj)
 
+extern struct mdev_driver vfio_mdev_driver;
+
 int  parent_create_sysfs_files(struct mdev_parent *parent);
 void parent_remove_sysfs_files(struct mdev_parent *parent);
 
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 922729071c5a8e..d5b4eede47c1a5 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -17,10 +17,6 @@
 
 #include "mdev_private.h"
 
-#define DRIVER_VERSION  "0.1"
-#define DRIVER_AUTHOR   "NVIDIA Corporation"
-#define DRIVER_DESC     "VFIO based driver for Mediated device"
-
 static int vfio_mdev_open(struct vfio_device *core_vdev)
 {
 	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
@@ -151,7 +147,7 @@ static void vfio_mdev_remove(struct mdev_device *mdev)
 	kfree(vdev);
 }
 
-static struct mdev_driver vfio_mdev_driver = {
+struct mdev_driver vfio_mdev_driver = {
 	.driver = {
 		.name = "vfio_mdev",
 		.owner = THIS_MODULE,
@@ -160,21 +156,3 @@ static struct mdev_driver vfio_mdev_driver = {
 	.probe	= vfio_mdev_probe,
 	.remove	= vfio_mdev_remove,
 };
-
-static int __init vfio_mdev_init(void)
-{
-	return mdev_register_driver(&vfio_mdev_driver);
-}
-
-static void __exit vfio_mdev_exit(void)
-{
-	mdev_unregister_driver(&vfio_mdev_driver);
-}
-
-module_init(vfio_mdev_init)
-module_exit(vfio_mdev_exit)
-
-MODULE_VERSION(DRIVER_VERSION);
-MODULE_LICENSE("GPL v2");
-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
diff --git a/samples/Kconfig b/samples/Kconfig
index e76cdfc50e257d..5708abcc55c4df 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -147,14 +147,14 @@ config SAMPLE_UHID
 
 config SAMPLE_VFIO_MDEV_MTTY
 	tristate "Build VFIO mtty example mediated device sample code -- loadable modules only"
-	depends on VFIO_MDEV_DEVICE && m
+	depends on VFIO_MDEV && m
 	help
 	  Build a virtual tty sample driver for use as a VFIO
 	  mediated device
 
 config SAMPLE_VFIO_MDEV_MDPY
 	tristate "Build VFIO mdpy example mediated device sample code -- loadable modules only"
-	depends on VFIO_MDEV_DEVICE && m
+	depends on VFIO_MDEV && m
 	help
 	  Build a virtual display sample driver for use as a VFIO
 	  mediated device.  It is a simple framebuffer and supports
@@ -171,7 +171,7 @@ config SAMPLE_VFIO_MDEV_MDPY_FB
 
 config SAMPLE_VFIO_MDEV_MBOCHS
 	tristate "Build VFIO mdpy example mediated device sample code -- loadable modules only"
-	depends on VFIO_MDEV_DEVICE && m
+	depends on VFIO_MDEV && m
 	select DMA_SHARED_BUFFER
 	help
 	  Build a virtual display sample driver for use as a VFIO
-- 
2.31.1



* [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

This allows an mdev driver to opt out of using vfio_mdev.c. Instead, the
driver will provide a 'struct mdev_driver' and register directly with the
driver core.

Much of mdev_parent_ops becomes unused in this mode:
- create()/remove() are done via the mdev_driver probe()/remove()
- mdev_attr_groups becomes mdev_driver driver.dev_groups
- Wrapper function callbacks are replaced with the same ones from
  struct vfio_device_ops

Following patches convert all the drivers.
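
The driver selection rule this patch introduces (use the driver named in
the ops, otherwise fall back to the shim) can be sketched in userspace as
below. This is a simplified model under stated assumptions: the toy_*
types and names are hypothetical and only mirror the shape of the
mdev_match() logic in the diff, not its actual types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not kernel types. */
struct toy_driver {
	const char *name;
};

struct toy_parent_ops {
	/* NULL means the parent driver has not been converted yet. */
	const struct toy_driver *device_driver;
};

/* Plays the role of the legacy shim driver. */
static const struct toy_driver toy_shim_driver = { .name = "toy_shim" };

/* Pick the driver the device is allowed to bind to. */
static const struct toy_driver *toy_target(const struct toy_parent_ops *ops)
{
	return ops->device_driver ? ops->device_driver : &toy_shim_driver;
}

/* Bus match callback: nonzero only for the single permitted driver. */
static int toy_match(const struct toy_parent_ops *ops,
		     const struct toy_driver *drv)
{
	assert(ops && drv);
	return drv == toy_target(ops);
}
```

With this rule a converted parent can never be bound to the shim, and an
unconverted parent can never be bound to some other driver on the bus.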

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/mdev/mdev_core.c   | 64 ++++++++++++++++++++++++++++-----
 drivers/vfio/mdev/mdev_driver.c | 17 ++++++++-
 include/linux/mdev.h            |  3 ++
 3 files changed, 75 insertions(+), 9 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index ff8c1a84516698..51b8a9fcf866ad 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -94,9 +94,11 @@ static void mdev_device_remove_common(struct mdev_device *mdev)
 	mdev_remove_sysfs_files(mdev);
 	device_del(&mdev->dev);
 	lockdep_assert_held(&parent->unreg_sem);
-	ret = parent->ops->remove(mdev);
-	if (ret)
-		dev_err(&mdev->dev, "Remove failed: err=%d\n", ret);
+	if (parent->ops->remove) {
+		ret = parent->ops->remove(mdev);
+		if (ret)
+			dev_err(&mdev->dev, "Remove failed: err=%d\n", ret);
+	}
 
 	/* Balances with device_initialize() */
 	put_device(&mdev->dev);
@@ -127,7 +129,9 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
 	char *envp[] = { env_string, NULL };
 
 	/* check for mandatory ops */
-	if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
+	if (!ops || !ops->supported_type_groups)
+		return -EINVAL;
+	if (!ops->device_driver && (!ops->create || !ops->remove))
 		return -EINVAL;
 
 	dev = get_device(dev);
@@ -251,6 +255,43 @@ static void mdev_device_release(struct device *dev)
 	kfree(mdev);
 }
 
+/*
+ * mdev drivers can refuse to bind during probe(), in this case we want to fail
+ * the creation of the mdev all the way back to sysfs. This is a weird model
+ * that doesn't fit in the driver core well, nor does it seem to appear any
+ * place else in the kernel, so use a simple hack.
+ */
+static int mdev_bind_driver(struct mdev_device *mdev)
+{
+	struct mdev_driver *drv = mdev->type->parent->ops->device_driver;
+	int ret;
+
+	if (!drv)
+		drv = &vfio_mdev_driver;
+
+	while (1) {
+		device_lock(&mdev->dev);
+		if (mdev->dev.driver == &drv->driver) {
+			ret = 0;
+			goto out_unlock;
+		}
+		if (mdev->probe_err) {
+			ret = mdev->probe_err;
+			goto out_unlock;
+		}
+		device_unlock(&mdev->dev);
+		ret = device_attach(&mdev->dev);
+		if (ret)
+			return ret;
+		mdev->probe_err = -EINVAL;
+	}
+	return 0;
+
+out_unlock:
+	device_unlock(&mdev->dev);
+	return ret;
+}
+
 int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
 {
 	int ret;
@@ -296,14 +337,20 @@ int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
 		goto out_put_device;
 	}
 
-	ret = parent->ops->create(mdev);
-	if (ret)
-		goto out_unlock;
+	if (parent->ops->create) {
+		ret = parent->ops->create(mdev);
+		if (ret)
+			goto out_unlock;
+	}
 
 	ret = device_add(&mdev->dev);
 	if (ret)
 		goto out_remove;
 
+	ret = mdev_bind_driver(mdev);
+	if (ret)
+		goto out_del;
+
 	ret = mdev_create_sysfs_files(mdev);
 	if (ret)
 		goto out_del;
@@ -317,7 +364,8 @@ int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
 out_del:
 	device_del(&mdev->dev);
 out_remove:
-	parent->ops->remove(mdev);
+	if (parent->ops->remove)
+		parent->ops->remove(mdev);
 out_unlock:
 	up_read(&parent->unreg_sem);
 out_put_device:
diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
index 041699571b7e55..6e96c023d7823d 100644
--- a/drivers/vfio/mdev/mdev_driver.c
+++ b/drivers/vfio/mdev/mdev_driver.c
@@ -49,7 +49,7 @@ static int mdev_probe(struct device *dev)
 		return ret;
 
 	if (drv->probe) {
-		ret = drv->probe(mdev);
+		ret = mdev->probe_err = drv->probe(mdev);
 		if (ret)
 			mdev_detach_iommu(mdev);
 	}
@@ -71,10 +71,25 @@ static int mdev_remove(struct device *dev)
 	return 0;
 }
 
+static int mdev_match(struct device *dev, struct device_driver *drv)
+{
+	struct mdev_device *mdev = to_mdev_device(dev);
+	struct mdev_driver *target = mdev->type->parent->ops->device_driver;
+
+	/*
+	 * The ops specify the device driver to connect, fall back to the old
+	 * shim driver if the driver hasn't been converted.
+	 */
+	if (!target)
+		target = &vfio_mdev_driver;
+	return drv == &target->driver;
+}
+
 struct bus_type mdev_bus_type = {
 	.name		= "mdev",
 	.probe		= mdev_probe,
 	.remove		= mdev_remove,
+	.match		= mdev_match,
 };
 EXPORT_SYMBOL_GPL(mdev_bus_type);
 
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 1fb34ea394ad46..49cc4f65120d57 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -19,6 +19,7 @@ struct mdev_device {
 	struct list_head next;
 	struct mdev_type *type;
 	struct device *iommu_device;
+	int probe_err;
 	bool active;
 };
 
@@ -55,6 +56,7 @@ struct device *mtype_get_parent_dev(struct mdev_type *mtype);
  * register the device to mdev module.
  *
  * @owner:		The module owner.
+ * @device_driver:	Which device driver to probe() on newly created devices
  * @dev_attr_groups:	Attributes of the parent device.
  * @mdev_attr_groups:	Attributes of the mediated device.
  * @supported_type_groups: Attributes to define supported types. It is mandatory
@@ -103,6 +105,7 @@ struct device *mtype_get_parent_dev(struct mdev_type *mtype);
  **/
 struct mdev_parent_ops {
 	struct module   *owner;
+	struct mdev_driver *device_driver;
 	const struct attribute_group **dev_attr_groups;
 	const struct attribute_group **mdev_attr_groups;
 	struct attribute_group **supported_type_groups;
-- 
2.31.1



* [PATCH v2 03/13] vfio/mtty: Convert to use vfio_register_group_dev()
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: kvm, Kirti Wankhede
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

This is a straightforward conversion: the mdev_state is actually serving as
the vfio_device and we can replace all the mdev_get_drvdata()'s and the
wonky dead code with a simple container_of().
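
For readers unfamiliar with the pattern, a minimal userspace sketch of
recovering the per-device state from an embedded member follows. The
toy_container_of() macro and the toy_* structs are hypothetical
simplifications written for this example (the kernel's container_of() adds
type checking); the embedded member is deliberately not first so that the
offset arithmetic matters.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace variant of the kernel's container_of(). */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for illustration only, not kernel types. */
struct toy_vfio_device {
	int open_count;
};

struct toy_state {
	int nr_ports;
	struct toy_vfio_device vdev;	/* embedded, not a pointer */
};

/* A callback that only receives the embedded member. */
static int toy_open(struct toy_vfio_device *vdev)
{
	struct toy_state *state =
		toy_container_of(vdev, struct toy_state, vdev);

	assert(state != NULL);
	vdev->open_count++;
	return state->nr_ports;
}
```

Because the state is derived from the pointer the core already passes in,
there is no separate drvdata lookup that could return NULL, which is why
the old NULL checks in the callbacks become dead code.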

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 samples/vfio-mdev/mtty.c | 185 ++++++++++++++++++---------------------
 1 file changed, 83 insertions(+), 102 deletions(-)

diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index b9b24be4abdab7..d2a168420b775d 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -127,6 +127,7 @@ struct serial_port {
 
 /* State of each mdev device */
 struct mdev_state {
+	struct vfio_device vdev;
 	int irq_fd;
 	struct eventfd_ctx *intx_evtfd;
 	struct eventfd_ctx *msi_evtfd;
@@ -150,6 +151,8 @@ static const struct file_operations vd_fops = {
 	.owner          = THIS_MODULE,
 };
 
+static const struct vfio_device_ops mtty_dev_ops;
+
 /* function prototypes */
 
 static int mtty_trigger_interrupt(struct mdev_state *mdev_state);
@@ -631,22 +634,15 @@ static void mdev_read_base(struct mdev_state *mdev_state)
 	}
 }
 
-static ssize_t mdev_access(struct mdev_device *mdev, u8 *buf, size_t count,
+static ssize_t mdev_access(struct mdev_state *mdev_state, u8 *buf, size_t count,
 			   loff_t pos, bool is_write)
 {
-	struct mdev_state *mdev_state;
 	unsigned int index;
 	loff_t offset;
 	int ret = 0;
 
-	if (!mdev || !buf)
-		return -EINVAL;
-
-	mdev_state = mdev_get_drvdata(mdev);
-	if (!mdev_state) {
-		pr_err("%s mdev_state not found\n", __func__);
+	if (!buf)
 		return -EINVAL;
-	}
 
 	mutex_lock(&mdev_state->ops_lock);
 
@@ -708,15 +704,18 @@ static ssize_t mdev_access(struct mdev_device *mdev, u8 *buf, size_t count,
 	return ret;
 }
 
-static int mtty_create(struct mdev_device *mdev)
+static int mtty_probe(struct mdev_device *mdev)
 {
 	struct mdev_state *mdev_state;
 	int nr_ports = mdev_get_type_group_id(mdev) + 1;
+	int ret;
 
 	mdev_state = kzalloc(sizeof(struct mdev_state), GFP_KERNEL);
 	if (mdev_state == NULL)
 		return -ENOMEM;
 
+	vfio_init_group_dev(&mdev_state->vdev, &mdev->dev, &mtty_dev_ops);
+
 	mdev_state->nr_ports = nr_ports;
 	mdev_state->irq_index = -1;
 	mdev_state->s[0].max_fifo_size = MAX_FIFO_SIZE;
@@ -731,7 +730,6 @@ static int mtty_create(struct mdev_device *mdev)
 
 	mutex_init(&mdev_state->ops_lock);
 	mdev_state->mdev = mdev;
-	mdev_set_drvdata(mdev, mdev_state);
 
 	mtty_create_config_space(mdev_state);
 
@@ -739,50 +737,40 @@ static int mtty_create(struct mdev_device *mdev)
 	list_add(&mdev_state->next, &mdev_devices_list);
 	mutex_unlock(&mdev_list_lock);
 
+	ret = vfio_register_group_dev(&mdev_state->vdev);
+	if (ret) {
+		kfree(mdev_state);
+		return ret;
+	}
+	dev_set_drvdata(&mdev->dev, mdev_state);
 	return 0;
 }
 
-static int mtty_remove(struct mdev_device *mdev)
+static void mtty_remove(struct mdev_device *mdev)
 {
-	struct mdev_state *mds, *tmp_mds;
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
-	int ret = -EINVAL;
+	struct mdev_state *mdev_state = dev_get_drvdata(&mdev->dev);
 
+	vfio_unregister_group_dev(&mdev_state->vdev);
 	mutex_lock(&mdev_list_lock);
-	list_for_each_entry_safe(mds, tmp_mds, &mdev_devices_list, next) {
-		if (mdev_state == mds) {
-			list_del(&mdev_state->next);
-			mdev_set_drvdata(mdev, NULL);
-			kfree(mdev_state->vconfig);
-			kfree(mdev_state);
-			ret = 0;
-			break;
-		}
-	}
+	list_del(&mdev_state->next);
 	mutex_unlock(&mdev_list_lock);
 
-	return ret;
+	kfree(mdev_state->vconfig);
+	kfree(mdev_state);
 }
 
-static int mtty_reset(struct mdev_device *mdev)
+static int mtty_reset(struct mdev_state *mdev_state)
 {
-	struct mdev_state *mdev_state;
-
-	if (!mdev)
-		return -EINVAL;
-
-	mdev_state = mdev_get_drvdata(mdev);
-	if (!mdev_state)
-		return -EINVAL;
-
 	pr_info("%s: called\n", __func__);
 
 	return 0;
 }
 
-static ssize_t mtty_read(struct mdev_device *mdev, char __user *buf,
+static ssize_t mtty_read(struct vfio_device *vdev, char __user *buf,
 			 size_t count, loff_t *ppos)
 {
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 	unsigned int done = 0;
 	int ret;
 
@@ -792,7 +780,7 @@ static ssize_t mtty_read(struct mdev_device *mdev, char __user *buf,
 		if (count >= 4 && !(*ppos % 4)) {
 			u32 val;
 
-			ret =  mdev_access(mdev, (u8 *)&val, sizeof(val),
+			ret =  mdev_access(mdev_state, (u8 *)&val, sizeof(val),
 					   *ppos, false);
 			if (ret <= 0)
 				goto read_err;
@@ -804,7 +792,7 @@ static ssize_t mtty_read(struct mdev_device *mdev, char __user *buf,
 		} else if (count >= 2 && !(*ppos % 2)) {
 			u16 val;
 
-			ret = mdev_access(mdev, (u8 *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (u8 *)&val, sizeof(val),
 					  *ppos, false);
 			if (ret <= 0)
 				goto read_err;
@@ -816,7 +804,7 @@ static ssize_t mtty_read(struct mdev_device *mdev, char __user *buf,
 		} else {
 			u8 val;
 
-			ret = mdev_access(mdev, (u8 *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (u8 *)&val, sizeof(val),
 					  *ppos, false);
 			if (ret <= 0)
 				goto read_err;
@@ -839,9 +827,11 @@ static ssize_t mtty_read(struct mdev_device *mdev, char __user *buf,
 	return -EFAULT;
 }
 
-static ssize_t mtty_write(struct mdev_device *mdev, const char __user *buf,
+static ssize_t mtty_write(struct vfio_device *vdev, const char __user *buf,
 		   size_t count, loff_t *ppos)
 {
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 	unsigned int done = 0;
 	int ret;
 
@@ -854,7 +844,7 @@ static ssize_t mtty_write(struct mdev_device *mdev, const char __user *buf,
 			if (copy_from_user(&val, buf, sizeof(val)))
 				goto write_err;
 
-			ret = mdev_access(mdev, (u8 *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (u8 *)&val, sizeof(val),
 					  *ppos, true);
 			if (ret <= 0)
 				goto write_err;
@@ -866,7 +856,7 @@ static ssize_t mtty_write(struct mdev_device *mdev, const char __user *buf,
 			if (copy_from_user(&val, buf, sizeof(val)))
 				goto write_err;
 
-			ret = mdev_access(mdev, (u8 *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (u8 *)&val, sizeof(val),
 					  *ppos, true);
 			if (ret <= 0)
 				goto write_err;
@@ -878,7 +868,7 @@ static ssize_t mtty_write(struct mdev_device *mdev, const char __user *buf,
 			if (copy_from_user(&val, buf, sizeof(val)))
 				goto write_err;
 
-			ret = mdev_access(mdev, (u8 *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (u8 *)&val, sizeof(val),
 					  *ppos, true);
 			if (ret <= 0)
 				goto write_err;
@@ -896,19 +886,11 @@ static ssize_t mtty_write(struct mdev_device *mdev, const char __user *buf,
 	return -EFAULT;
 }
 
-static int mtty_set_irqs(struct mdev_device *mdev, uint32_t flags,
+static int mtty_set_irqs(struct mdev_state *mdev_state, uint32_t flags,
 			 unsigned int index, unsigned int start,
 			 unsigned int count, void *data)
 {
 	int ret = 0;
-	struct mdev_state *mdev_state;
-
-	if (!mdev)
-		return -EINVAL;
-
-	mdev_state = mdev_get_drvdata(mdev);
-	if (!mdev_state)
-		return -EINVAL;
 
 	mutex_lock(&mdev_state->ops_lock);
 	switch (index) {
@@ -1024,21 +1006,13 @@ static int mtty_trigger_interrupt(struct mdev_state *mdev_state)
 	return ret;
 }
 
-static int mtty_get_region_info(struct mdev_device *mdev,
+static int mtty_get_region_info(struct mdev_state *mdev_state,
 			 struct vfio_region_info *region_info,
 			 u16 *cap_type_id, void **cap_type)
 {
 	unsigned int size = 0;
-	struct mdev_state *mdev_state;
 	u32 bar_index;
 
-	if (!mdev)
-		return -EINVAL;
-
-	mdev_state = mdev_get_drvdata(mdev);
-	if (!mdev_state)
-		return -EINVAL;
-
 	bar_index = region_info->index;
 	if (bar_index >= VFIO_PCI_NUM_REGIONS)
 		return -EINVAL;
@@ -1073,8 +1047,7 @@ static int mtty_get_region_info(struct mdev_device *mdev,
 	return 0;
 }
 
-static int mtty_get_irq_info(struct mdev_device *mdev,
-			     struct vfio_irq_info *irq_info)
+static int mtty_get_irq_info(struct vfio_irq_info *irq_info)
 {
 	switch (irq_info->index) {
 	case VFIO_PCI_INTX_IRQ_INDEX:
@@ -1098,8 +1071,7 @@ static int mtty_get_irq_info(struct mdev_device *mdev,
 	return 0;
 }
 
-static int mtty_get_device_info(struct mdev_device *mdev,
-			 struct vfio_device_info *dev_info)
+static int mtty_get_device_info(struct vfio_device_info *dev_info)
 {
 	dev_info->flags = VFIO_DEVICE_FLAGS_PCI;
 	dev_info->num_regions = VFIO_PCI_NUM_REGIONS;
@@ -1108,19 +1080,13 @@ static int mtty_get_device_info(struct mdev_device *mdev,
 	return 0;
 }
 
-static long mtty_ioctl(struct mdev_device *mdev, unsigned int cmd,
+static long mtty_ioctl(struct vfio_device *vdev, unsigned int cmd,
 			unsigned long arg)
 {
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 	int ret = 0;
 	unsigned long minsz;
-	struct mdev_state *mdev_state;
-
-	if (!mdev)
-		return -EINVAL;
-
-	mdev_state = mdev_get_drvdata(mdev);
-	if (!mdev_state)
-		return -ENODEV;
 
 	switch (cmd) {
 	case VFIO_DEVICE_GET_INFO:
@@ -1135,7 +1101,7 @@ static long mtty_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = mtty_get_device_info(mdev, &info);
+		ret = mtty_get_device_info(&info);
 		if (ret)
 			return ret;
 
@@ -1160,7 +1126,7 @@ static long mtty_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = mtty_get_region_info(mdev, &info, &cap_type_id,
+		ret = mtty_get_region_info(mdev_state, &info, &cap_type_id,
 					   &cap_type);
 		if (ret)
 			return ret;
@@ -1184,7 +1150,7 @@ static long mtty_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		    (info.index >= mdev_state->dev_info.num_irqs))
 			return -EINVAL;
 
-		ret = mtty_get_irq_info(mdev, &info);
+		ret = mtty_get_irq_info(&info);
 		if (ret)
 			return ret;
 
@@ -1218,25 +1184,25 @@ static long mtty_ioctl(struct mdev_device *mdev, unsigned int cmd,
 				return PTR_ERR(data);
 		}
 
-		ret = mtty_set_irqs(mdev, hdr.flags, hdr.index, hdr.start,
+		ret = mtty_set_irqs(mdev_state, hdr.flags, hdr.index, hdr.start,
 				    hdr.count, data);
 
 		kfree(ptr);
 		return ret;
 	}
 	case VFIO_DEVICE_RESET:
-		return mtty_reset(mdev);
+		return mtty_reset(mdev_state);
 	}
 	return -ENOTTY;
 }
 
-static int mtty_open(struct mdev_device *mdev)
+static int mtty_open(struct vfio_device *vdev)
 {
 	pr_info("%s\n", __func__);
 	return 0;
 }
 
-static void mtty_close(struct mdev_device *mdev)
+static void mtty_close(struct vfio_device *vdev)
 {
 	pr_info("%s\n", __func__);
 }
@@ -1351,18 +1317,31 @@ static struct attribute_group *mdev_type_groups[] = {
 	NULL,
 };
 
+static const struct vfio_device_ops mtty_dev_ops = {
+	.name = "vfio-mdev",
+	.open = mtty_open,
+	.release = mtty_close,
+	.read = mtty_read,
+	.write = mtty_write,
+	.ioctl = mtty_ioctl,
+};
+
+static struct mdev_driver mtty_driver = {
+	.driver = {
+		.name = "mtty",
+		.owner = THIS_MODULE,
+		.mod_name = KBUILD_MODNAME,
+		.dev_groups = mdev_dev_groups,
+	},
+	.probe = mtty_probe,
+	.remove	= mtty_remove,
+};
+
 static const struct mdev_parent_ops mdev_fops = {
 	.owner                  = THIS_MODULE,
+	.device_driver		= &mtty_driver,
 	.dev_attr_groups        = mtty_dev_groups,
-	.mdev_attr_groups       = mdev_dev_groups,
 	.supported_type_groups  = mdev_type_groups,
-	.create                 = mtty_create,
-	.remove			= mtty_remove,
-	.open                   = mtty_open,
-	.release                = mtty_close,
-	.read                   = mtty_read,
-	.write                  = mtty_write,
-	.ioctl		        = mtty_ioctl,
 };
 
 static void mtty_device_release(struct device *dev)
@@ -1393,12 +1372,16 @@ static int __init mtty_dev_init(void)
 
 	pr_info("major_number:%d\n", MAJOR(mtty_dev.vd_devt));
 
+	ret = mdev_register_driver(&mtty_driver);
+	if (ret)
+		goto err_cdev;
+
 	mtty_dev.vd_class = class_create(THIS_MODULE, MTTY_CLASS_NAME);
 
 	if (IS_ERR(mtty_dev.vd_class)) {
 		pr_err("Error: failed to register mtty_dev class\n");
 		ret = PTR_ERR(mtty_dev.vd_class);
-		goto failed1;
+		goto err_driver;
 	}
 
 	mtty_dev.dev.class = mtty_dev.vd_class;
@@ -1407,28 +1390,25 @@ static int __init mtty_dev_init(void)
 
 	ret = device_register(&mtty_dev.dev);
 	if (ret)
-		goto failed2;
+		goto err_class;
 
 	ret = mdev_register_device(&mtty_dev.dev, &mdev_fops);
 	if (ret)
-		goto failed3;
+		goto err_device;
 
 	mutex_init(&mdev_list_lock);
 	INIT_LIST_HEAD(&mdev_devices_list);
+	return 0;
 
-	goto all_done;
-
-failed3:
-
+err_device:
 	device_unregister(&mtty_dev.dev);
-failed2:
+err_class:
 	class_destroy(mtty_dev.vd_class);
-
-failed1:
+err_driver:
+	mdev_unregister_driver(&mtty_driver);
+err_cdev:
 	cdev_del(&mtty_dev.vd_cdev);
 	unregister_chrdev_region(mtty_dev.vd_devt, MINORMASK + 1);
-
-all_done:
 	return ret;
 }
 
@@ -1439,6 +1419,7 @@ static void __exit mtty_dev_exit(void)
 
 	device_unregister(&mtty_dev.dev);
 	idr_destroy(&mtty_dev.vd_idr);
+	mdev_unregister_driver(&mtty_driver);
 	cdev_del(&mtty_dev.vd_cdev);
 	unregister_chrdev_region(mtty_dev.vd_devt, MINORMASK + 1);
 	class_destroy(mtty_dev.vd_class);
-- 
2.31.1


* [PATCH v2 04/13] vfio/mdpy: Convert to use vfio_register_group_dev()
  2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2021-04-26 20:00 ` [PATCH v2 03/13] vfio/mtty: Convert to use vfio_register_group_dev() Jason Gunthorpe
@ 2021-04-26 20:00 ` Jason Gunthorpe
  2021-04-26 20:00 ` [PATCH v2 05/13] vfio/mbochs: " Jason Gunthorpe
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: kvm, Kirti Wankhede
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

This is a straightforward conversion: the mdev_state is actually serving
as the vfio_device, so we can replace all the mdev_get_drvdata() calls and
the wonky dead code with a simple container_of().
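
As a rough userspace illustration of the container_of() pattern described
above (not the kernel code itself): the fake_* types and functions below
are hypothetical stand-ins, and the macro is a simplified version of the
kernel's container_of() without its type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins, not the real kernel structures. */
struct fake_vfio_device {
	int id;
};

struct fake_mdev_state {
	struct fake_vfio_device vdev;	/* embedded, as in the patch */
	unsigned int memsize;
};

/*
 * Callback in the style of the converted ops: it receives a pointer to
 * the embedded member and recovers the enclosing state from it.
 */
unsigned int fake_get_memsize(struct fake_vfio_device *vdev)
{
	struct fake_mdev_state *state;

	assert(vdev != NULL);
	state = container_of(vdev, struct fake_mdev_state, vdev);
	return state->memsize;
}

/* Returns 1 when the round trip recovers the original state object. */
int fake_round_trip_ok(void)
{
	struct fake_mdev_state state = { .vdev = { .id = 1 }, .memsize = 4096 };

	return container_of(&state.vdev, struct fake_mdev_state, vdev) == &state &&
	       fake_get_memsize(&state.vdev) == 4096;
}
```

Because the member is embedded rather than pointed to, the lookup is pure
pointer arithmetic and cannot return NULL, which is why NULL checks on the
old drvdata lookup have nothing left to guard.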

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 samples/vfio-mdev/mdpy.c | 159 ++++++++++++++++++++++-----------------
 1 file changed, 88 insertions(+), 71 deletions(-)

diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 885b88ea20e234..82638de333330d 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -85,9 +85,11 @@ static struct class	*mdpy_class;
 static struct cdev	mdpy_cdev;
 static struct device	mdpy_dev;
 static u32		mdpy_count;
+static const struct vfio_device_ops mdpy_dev_ops;
 
 /* State of each mdev device */
 struct mdev_state {
+	struct vfio_device vdev;
 	u8 *vconfig;
 	u32 bar_mask;
 	struct mutex ops_lock;
@@ -162,11 +164,9 @@ static void handle_pci_cfg_write(struct mdev_state *mdev_state, u16 offset,
 	}
 }
 
-static ssize_t mdev_access(struct mdev_device *mdev, char *buf, size_t count,
-			   loff_t pos, bool is_write)
+static ssize_t mdev_access(struct mdev_state *mdev_state, char *buf,
+			   size_t count, loff_t pos, bool is_write)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
-	struct device *dev = mdev_dev(mdev);
 	int ret = 0;
 
 	mutex_lock(&mdev_state->ops_lock);
@@ -187,8 +187,9 @@ static ssize_t mdev_access(struct mdev_device *mdev, char *buf, size_t count,
 			memcpy(buf, mdev_state->memblk, count);
 
 	} else {
-		dev_info(dev, "%s: %s @0x%llx (unhandled)\n",
-			 __func__, is_write ? "WR" : "RD", pos);
+		dev_info(mdev_state->vdev.dev,
+			 "%s: %s @0x%llx (unhandled)\n", __func__,
+			 is_write ? "WR" : "RD", pos);
 		ret = -1;
 		goto accessfailed;
 	}
@@ -202,9 +203,8 @@ static ssize_t mdev_access(struct mdev_device *mdev, char *buf, size_t count,
 	return ret;
 }
 
-static int mdpy_reset(struct mdev_device *mdev)
+static int mdpy_reset(struct mdev_state *mdev_state)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
 	u32 stride, i;
 
 	/* initialize with gray gradient */
@@ -216,13 +216,14 @@ static int mdpy_reset(struct mdev_device *mdev)
 	return 0;
 }
 
-static int mdpy_create(struct mdev_device *mdev)
+static int mdpy_probe(struct mdev_device *mdev)
 {
 	const struct mdpy_type *type =
 		&mdpy_types[mdev_get_type_group_id(mdev)];
 	struct device *dev = mdev_dev(mdev);
 	struct mdev_state *mdev_state;
 	u32 fbsize;
+	int ret;
 
 	if (mdpy_count >= max_devices)
 		return -ENOMEM;
@@ -230,6 +231,7 @@ static int mdpy_create(struct mdev_device *mdev)
 	mdev_state = kzalloc(sizeof(struct mdev_state), GFP_KERNEL);
 	if (mdev_state == NULL)
 		return -ENOMEM;
+	vfio_init_group_dev(&mdev_state->vdev, &mdev->dev, &mdpy_dev_ops);
 
 	mdev_state->vconfig = kzalloc(MDPY_CONFIG_SPACE_SIZE, GFP_KERNEL);
 	if (mdev_state->vconfig == NULL) {
@@ -250,36 +252,41 @@ static int mdpy_create(struct mdev_device *mdev)
 
 	mutex_init(&mdev_state->ops_lock);
 	mdev_state->mdev = mdev;
-	mdev_set_drvdata(mdev, mdev_state);
-
 	mdev_state->type    = type;
 	mdev_state->memsize = fbsize;
 	mdpy_create_config_space(mdev_state);
-	mdpy_reset(mdev);
+	mdpy_reset(mdev_state);
 
 	mdpy_count++;
+
+	ret = vfio_register_group_dev(&mdev_state->vdev);
+	if (ret) {
+		kfree(mdev_state);
+		return ret;
+	}
+	dev_set_drvdata(&mdev->dev, mdev_state);
 	return 0;
 }
 
-static int mdpy_remove(struct mdev_device *mdev)
+static void mdpy_remove(struct mdev_device *mdev)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
-	struct device *dev = mdev_dev(mdev);
+	struct mdev_state *mdev_state = dev_get_drvdata(&mdev->dev);
 
-	dev_info(dev, "%s\n", __func__);
+	dev_info(&mdev->dev, "%s\n", __func__);
 
-	mdev_set_drvdata(mdev, NULL);
+	vfio_unregister_group_dev(&mdev_state->vdev);
 	vfree(mdev_state->memblk);
 	kfree(mdev_state->vconfig);
 	kfree(mdev_state);
 
 	mdpy_count--;
-	return 0;
 }
 
-static ssize_t mdpy_read(struct mdev_device *mdev, char __user *buf,
+static ssize_t mdpy_read(struct vfio_device *vdev, char __user *buf,
 			 size_t count, loff_t *ppos)
 {
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 	unsigned int done = 0;
 	int ret;
 
@@ -289,8 +296,8 @@ static ssize_t mdpy_read(struct mdev_device *mdev, char __user *buf,
 		if (count >= 4 && !(*ppos % 4)) {
 			u32 val;
 
-			ret =  mdev_access(mdev, (char *)&val, sizeof(val),
-					   *ppos, false);
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
+					  *ppos, false);
 			if (ret <= 0)
 				goto read_err;
 
@@ -301,7 +308,7 @@ static ssize_t mdpy_read(struct mdev_device *mdev, char __user *buf,
 		} else if (count >= 2 && !(*ppos % 2)) {
 			u16 val;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, false);
 			if (ret <= 0)
 				goto read_err;
@@ -313,7 +320,7 @@ static ssize_t mdpy_read(struct mdev_device *mdev, char __user *buf,
 		} else {
 			u8 val;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, false);
 			if (ret <= 0)
 				goto read_err;
@@ -336,9 +343,11 @@ static ssize_t mdpy_read(struct mdev_device *mdev, char __user *buf,
 	return -EFAULT;
 }
 
-static ssize_t mdpy_write(struct mdev_device *mdev, const char __user *buf,
+static ssize_t mdpy_write(struct vfio_device *vdev, const char __user *buf,
 			  size_t count, loff_t *ppos)
 {
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 	unsigned int done = 0;
 	int ret;
 
@@ -351,7 +360,7 @@ static ssize_t mdpy_write(struct mdev_device *mdev, const char __user *buf,
 			if (copy_from_user(&val, buf, sizeof(val)))
 				goto write_err;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, true);
 			if (ret <= 0)
 				goto write_err;
@@ -363,7 +372,7 @@ static ssize_t mdpy_write(struct mdev_device *mdev, const char __user *buf,
 			if (copy_from_user(&val, buf, sizeof(val)))
 				goto write_err;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, true);
 			if (ret <= 0)
 				goto write_err;
@@ -375,7 +384,7 @@ static ssize_t mdpy_write(struct mdev_device *mdev, const char __user *buf,
 			if (copy_from_user(&val, buf, sizeof(val)))
 				goto write_err;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, true);
 			if (ret <= 0)
 				goto write_err;
@@ -393,9 +402,10 @@ static ssize_t mdpy_write(struct mdev_device *mdev, const char __user *buf,
 	return -EFAULT;
 }
 
-static int mdpy_mmap(struct mdev_device *mdev, struct vm_area_struct *vma)
+static int mdpy_mmap(struct vfio_device *vdev, struct vm_area_struct *vma)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 
 	if (vma->vm_pgoff != MDPY_MEMORY_BAR_OFFSET >> PAGE_SHIFT)
 		return -EINVAL;
@@ -411,16 +421,10 @@ static int mdpy_mmap(struct mdev_device *mdev, struct vm_area_struct *vma)
 					   vma->vm_end - vma->vm_start);
 }
 
-static int mdpy_get_region_info(struct mdev_device *mdev,
+static int mdpy_get_region_info(struct mdev_state *mdev_state,
 				struct vfio_region_info *region_info,
 				u16 *cap_type_id, void **cap_type)
 {
-	struct mdev_state *mdev_state;
-
-	mdev_state = mdev_get_drvdata(mdev);
-	if (!mdev_state)
-		return -EINVAL;
-
 	if (region_info->index >= VFIO_PCI_NUM_REGIONS &&
 	    region_info->index != MDPY_DISPLAY_REGION)
 		return -EINVAL;
@@ -449,15 +453,13 @@ static int mdpy_get_region_info(struct mdev_device *mdev,
 	return 0;
 }
 
-static int mdpy_get_irq_info(struct mdev_device *mdev,
-			     struct vfio_irq_info *irq_info)
+static int mdpy_get_irq_info(struct vfio_irq_info *irq_info)
 {
 	irq_info->count = 0;
 	return 0;
 }
 
-static int mdpy_get_device_info(struct mdev_device *mdev,
-				struct vfio_device_info *dev_info)
+static int mdpy_get_device_info(struct vfio_device_info *dev_info)
 {
 	dev_info->flags = VFIO_DEVICE_FLAGS_PCI;
 	dev_info->num_regions = VFIO_PCI_NUM_REGIONS;
@@ -465,11 +467,9 @@ static int mdpy_get_device_info(struct mdev_device *mdev,
 	return 0;
 }
 
-static int mdpy_query_gfx_plane(struct mdev_device *mdev,
+static int mdpy_query_gfx_plane(struct mdev_state *mdev_state,
 				struct vfio_device_gfx_plane_info *plane)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
-
 	if (plane->flags & VFIO_GFX_PLANE_TYPE_PROBE) {
 		if (plane->flags == (VFIO_GFX_PLANE_TYPE_PROBE |
 				     VFIO_GFX_PLANE_TYPE_REGION))
@@ -498,14 +498,13 @@ static int mdpy_query_gfx_plane(struct mdev_device *mdev,
 	return 0;
 }
 
-static long mdpy_ioctl(struct mdev_device *mdev, unsigned int cmd,
+static long mdpy_ioctl(struct vfio_device *vdev, unsigned int cmd,
 		       unsigned long arg)
 {
 	int ret = 0;
 	unsigned long minsz;
-	struct mdev_state *mdev_state;
-
-	mdev_state = mdev_get_drvdata(mdev);
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 
 	switch (cmd) {
 	case VFIO_DEVICE_GET_INFO:
@@ -520,7 +519,7 @@ static long mdpy_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = mdpy_get_device_info(mdev, &info);
+		ret = mdpy_get_device_info(&info);
 		if (ret)
 			return ret;
 
@@ -545,7 +544,7 @@ static long mdpy_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = mdpy_get_region_info(mdev, &info, &cap_type_id,
+		ret = mdpy_get_region_info(mdev_state, &info, &cap_type_id,
 					   &cap_type);
 		if (ret)
 			return ret;
@@ -569,7 +568,7 @@ static long mdpy_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		    (info.index >= mdev_state->dev_info.num_irqs))
 			return -EINVAL;
 
-		ret = mdpy_get_irq_info(mdev, &info);
+		ret = mdpy_get_irq_info(&info);
 		if (ret)
 			return ret;
 
@@ -592,7 +591,7 @@ static long mdpy_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		if (plane.argsz < minsz)
 			return -EINVAL;
 
-		ret = mdpy_query_gfx_plane(mdev, &plane);
+		ret = mdpy_query_gfx_plane(mdev_state, &plane);
 		if (ret)
 			return ret;
 
@@ -606,12 +605,12 @@ static long mdpy_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		return -EINVAL;
 
 	case VFIO_DEVICE_RESET:
-		return mdpy_reset(mdev);
+		return mdpy_reset(mdev_state);
 	}
 	return -ENOTTY;
 }
 
-static int mdpy_open(struct mdev_device *mdev)
+static int mdpy_open(struct vfio_device *vdev)
 {
 	if (!try_module_get(THIS_MODULE))
 		return -ENODEV;
@@ -619,7 +618,7 @@ static int mdpy_open(struct mdev_device *mdev)
 	return 0;
 }
 
-static void mdpy_close(struct mdev_device *mdev)
+static void mdpy_close(struct vfio_device *vdev)
 {
 	module_put(THIS_MODULE);
 }
@@ -628,8 +627,7 @@ static ssize_t
 resolution_show(struct device *dev, struct device_attribute *attr,
 		char *buf)
 {
-	struct mdev_device *mdev = mdev_from_dev(dev);
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
+	struct mdev_state *mdev_state = dev_get_drvdata(dev);
 
 	return sprintf(buf, "%dx%d\n",
 		       mdev_state->type->width,
@@ -719,18 +717,30 @@ static struct attribute_group *mdev_type_groups[] = {
 	NULL,
 };
 
+static const struct vfio_device_ops mdpy_dev_ops = {
+	.open = mdpy_open,
+	.release = mdpy_close,
+	.read = mdpy_read,
+	.write = mdpy_write,
+	.ioctl = mdpy_ioctl,
+	.mmap = mdpy_mmap,
+};
+
+static struct mdev_driver mdpy_driver = {
+	.driver = {
+		.name = "mdpy",
+		.owner = THIS_MODULE,
+		.mod_name = KBUILD_MODNAME,
+		.dev_groups = mdev_dev_groups,
+	},
+	.probe = mdpy_probe,
+	.remove	= mdpy_remove,
+};
+
 static const struct mdev_parent_ops mdev_fops = {
 	.owner			= THIS_MODULE,
-	.mdev_attr_groups	= mdev_dev_groups,
+	.device_driver          = &mdpy_driver,
 	.supported_type_groups	= mdev_type_groups,
-	.create			= mdpy_create,
-	.remove			= mdpy_remove,
-	.open			= mdpy_open,
-	.release		= mdpy_close,
-	.read			= mdpy_read,
-	.write			= mdpy_write,
-	.ioctl			= mdpy_ioctl,
-	.mmap			= mdpy_mmap,
 };
 
 static const struct file_operations vd_fops = {
@@ -755,11 +765,15 @@ static int __init mdpy_dev_init(void)
 	cdev_add(&mdpy_cdev, mdpy_devt, MINORMASK + 1);
 	pr_info("%s: major %d\n", __func__, MAJOR(mdpy_devt));
 
+	ret = mdev_register_driver(&mdpy_driver);
+	if (ret)
+		goto err_cdev;
+
 	mdpy_class = class_create(THIS_MODULE, MDPY_CLASS_NAME);
 	if (IS_ERR(mdpy_class)) {
 		pr_err("Error: failed to register mdpy_dev class\n");
 		ret = PTR_ERR(mdpy_class);
-		goto failed1;
+		goto err_driver;
 	}
 	mdpy_dev.class = mdpy_class;
 	mdpy_dev.release = mdpy_device_release;
@@ -767,19 +781,21 @@ static int __init mdpy_dev_init(void)
 
 	ret = device_register(&mdpy_dev);
 	if (ret)
-		goto failed2;
+		goto err_class;
 
 	ret = mdev_register_device(&mdpy_dev, &mdev_fops);
 	if (ret)
-		goto failed3;
+		goto err_device;
 
 	return 0;
 
-failed3:
+err_device:
 	device_unregister(&mdpy_dev);
-failed2:
+err_class:
 	class_destroy(mdpy_class);
-failed1:
+err_driver:
+	mdev_unregister_driver(&mdpy_driver);
+err_cdev:
 	cdev_del(&mdpy_cdev);
 	unregister_chrdev_region(mdpy_devt, MINORMASK + 1);
 	return ret;
@@ -791,6 +807,7 @@ static void __exit mdpy_dev_exit(void)
 	mdev_unregister_device(&mdpy_dev);
 
 	device_unregister(&mdpy_dev);
+	mdev_unregister_driver(&mdpy_driver);
 	cdev_del(&mdpy_cdev);
 	unregister_chrdev_region(mdpy_devt, MINORMASK + 1);
 	class_destroy(mdpy_class);
-- 
2.31.1


* [PATCH v2 05/13] vfio/mbochs: Convert to use vfio_register_group_dev()
  2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2021-04-26 20:00 ` [PATCH v2 04/13] vfio/mdpy: " Jason Gunthorpe
@ 2021-04-26 20:00 ` Jason Gunthorpe
  2021-04-26 20:00 ` [PATCH v2 07/13] vfio/ccw: " Jason Gunthorpe
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: kvm, Kirti Wankhede
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

This is a straightforward conversion: the mdev_state is actually serving
as the vfio_device, so we can replace all the mdev_get_drvdata() calls and
the wonky dead code with a simple container_of().
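
The probe()/remove() shape that this kind of conversion leads to can be
sketched in userspace as follows. This is only a sketch under stated
assumptions: every fake_* name is hypothetical, fake_register() stands in
for the registration step, and the allocation counter exists only so the
unwind can be checked. It shows one way to register last and release
everything in reverse order on failure, not the code in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical per-device state; not the real struct mdev_state. */
struct fake_state {
	unsigned char *vconfig;
	void **pages;
};

static int fake_allocs;	/* outstanding allocations, for the leak check */

static void *fake_zalloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		fake_allocs++;
	return p;
}

static void fake_free(void *p)
{
	if (p) {
		fake_allocs--;
		free(p);
	}
}

/* Hypothetical registration step; fails on request. */
static int fake_register(struct fake_state *s, int should_fail)
{
	(void)s;
	return should_fail ? -EINVAL : 0;
}

int fake_probe(struct fake_state **out, int fail_register)
{
	struct fake_state *s;
	int ret = -ENOMEM;

	s = fake_zalloc(sizeof(*s));
	if (!s)
		return -ENOMEM;
	s->vconfig = fake_zalloc(256);
	if (!s->vconfig)
		goto err_state;
	s->pages = fake_zalloc(16 * sizeof(*s->pages));
	if (!s->pages)
		goto err_vconfig;

	/* Register last, once the state is fully initialized. */
	ret = fake_register(s, fail_register);
	if (ret)
		goto err_pages;

	*out = s;	/* publish only after registration succeeded */
	return 0;

err_pages:
	fake_free(s->pages);
err_vconfig:
	fake_free(s->vconfig);
err_state:
	fake_free(s);
	return ret;
}

void fake_remove(struct fake_state *s)
{
	assert(s != NULL);
	fake_free(s->pages);
	fake_free(s->vconfig);
	fake_free(s);
}

int fake_outstanding(void)
{
	return fake_allocs;
}
```

The labels unwind in the reverse order of acquisition, so a failure at any
step releases exactly what was set up before it.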

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 samples/vfio-mdev/mbochs.c | 163 +++++++++++++++++++++----------------
 1 file changed, 91 insertions(+), 72 deletions(-)

diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index 861c76914e7639..e18821a8a6beb8 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -130,6 +130,7 @@ static struct class	*mbochs_class;
 static struct cdev	mbochs_cdev;
 static struct device	mbochs_dev;
 static int		mbochs_used_mbytes;
+static const struct vfio_device_ops mbochs_dev_ops;
 
 struct vfio_region_info_ext {
 	struct vfio_region_info          base;
@@ -160,6 +161,7 @@ struct mbochs_dmabuf {
 
 /* State of each mdev device */
 struct mdev_state {
+	struct vfio_device vdev;
 	u8 *vconfig;
 	u64 bar_mask[3];
 	u32 memory_bar_mask;
@@ -425,11 +427,9 @@ static void handle_edid_blob(struct mdev_state *mdev_state, u16 offset,
 		memcpy(buf, mdev_state->edid_blob + offset, count);
 }
 
-static ssize_t mdev_access(struct mdev_device *mdev, char *buf, size_t count,
-			   loff_t pos, bool is_write)
+static ssize_t mdev_access(struct mdev_state *mdev_state, char *buf,
+			   size_t count, loff_t pos, bool is_write)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
-	struct device *dev = mdev_dev(mdev);
 	struct page *pg;
 	loff_t poff;
 	char *map;
@@ -478,7 +478,7 @@ static ssize_t mdev_access(struct mdev_device *mdev, char *buf, size_t count,
 		put_page(pg);
 
 	} else {
-		dev_dbg(dev, "%s: %s @0x%llx (unhandled)\n",
+		dev_dbg(mdev_state->vdev.dev, "%s: %s @0x%llx (unhandled)\n",
 			__func__, is_write ? "WR" : "RD", pos);
 		ret = -1;
 		goto accessfailed;
@@ -493,9 +493,8 @@ static ssize_t mdev_access(struct mdev_device *mdev, char *buf, size_t count,
 	return ret;
 }
 
-static int mbochs_reset(struct mdev_device *mdev)
+static int mbochs_reset(struct mdev_state *mdev_state)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
 	u32 size64k = mdev_state->memsize / (64 * 1024);
 	int i;
 
@@ -506,12 +505,13 @@ static int mbochs_reset(struct mdev_device *mdev)
 	return 0;
 }
 
-static int mbochs_create(struct mdev_device *mdev)
+static int mbochs_probe(struct mdev_device *mdev)
 {
 	const struct mbochs_type *type =
 		&mbochs_types[mdev_get_type_group_id(mdev)];
 	struct device *dev = mdev_dev(mdev);
 	struct mdev_state *mdev_state;
+	int ret = -ENOMEM;
 
 	if (!type)
 		type = &mbochs_types[0];
@@ -521,6 +521,7 @@ static int mbochs_create(struct mdev_device *mdev)
 	mdev_state = kzalloc(sizeof(struct mdev_state), GFP_KERNEL);
 	if (mdev_state == NULL)
 		return -ENOMEM;
+	vfio_init_group_dev(&mdev_state->vdev, &mdev->dev, &mbochs_dev_ops);
 
 	mdev_state->vconfig = kzalloc(MBOCHS_CONFIG_SPACE_SIZE, GFP_KERNEL);
 	if (mdev_state->vconfig == NULL)
@@ -539,7 +540,6 @@ static int mbochs_create(struct mdev_device *mdev)
 
 	mutex_init(&mdev_state->ops_lock);
 	mdev_state->mdev = mdev;
-	mdev_set_drvdata(mdev, mdev_state);
 	INIT_LIST_HEAD(&mdev_state->dmabufs);
 	mdev_state->next_id = 1;
 
@@ -549,32 +549,38 @@ static int mbochs_create(struct mdev_device *mdev)
 	mdev_state->edid_regs.edid_offset = MBOCHS_EDID_BLOB_OFFSET;
 	mdev_state->edid_regs.edid_max_size = sizeof(mdev_state->edid_blob);
 	mbochs_create_config_space(mdev_state);
-	mbochs_reset(mdev);
+	mbochs_reset(mdev_state);
 
 	mbochs_used_mbytes += type->mbytes;
+
+	ret = vfio_register_group_dev(&mdev_state->vdev);
+	if (ret)
+		goto err_mem;
+	dev_set_drvdata(&mdev->dev, mdev_state);
 	return 0;
 
 err_mem:
 	kfree(mdev_state->vconfig);
 	kfree(mdev_state);
-	return -ENOMEM;
+	return ret;
 }
 
-static int mbochs_remove(struct mdev_device *mdev)
+static void mbochs_remove(struct mdev_device *mdev)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
+	struct mdev_state *mdev_state = dev_get_drvdata(&mdev->dev);
 
 	mbochs_used_mbytes -= mdev_state->type->mbytes;
-	mdev_set_drvdata(mdev, NULL);
+	vfio_unregister_group_dev(&mdev_state->vdev);
 	kfree(mdev_state->pages);
 	kfree(mdev_state->vconfig);
 	kfree(mdev_state);
-	return 0;
 }
 
-static ssize_t mbochs_read(struct mdev_device *mdev, char __user *buf,
+static ssize_t mbochs_read(struct vfio_device *vdev, char __user *buf,
 			   size_t count, loff_t *ppos)
 {
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 	unsigned int done = 0;
 	int ret;
 
@@ -584,7 +590,7 @@ static ssize_t mbochs_read(struct mdev_device *mdev, char __user *buf,
 		if (count >= 4 && !(*ppos % 4)) {
 			u32 val;
 
-			ret =  mdev_access(mdev, (char *)&val, sizeof(val),
+			ret =  mdev_access(mdev_state, (char *)&val, sizeof(val),
 					   *ppos, false);
 			if (ret <= 0)
 				goto read_err;
@@ -596,7 +602,7 @@ static ssize_t mbochs_read(struct mdev_device *mdev, char __user *buf,
 		} else if (count >= 2 && !(*ppos % 2)) {
 			u16 val;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, false);
 			if (ret <= 0)
 				goto read_err;
@@ -608,7 +614,7 @@ static ssize_t mbochs_read(struct mdev_device *mdev, char __user *buf,
 		} else {
 			u8 val;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, false);
 			if (ret <= 0)
 				goto read_err;
@@ -631,9 +637,11 @@ static ssize_t mbochs_read(struct mdev_device *mdev, char __user *buf,
 	return -EFAULT;
 }
 
-static ssize_t mbochs_write(struct mdev_device *mdev, const char __user *buf,
+static ssize_t mbochs_write(struct vfio_device *vdev, const char __user *buf,
 			    size_t count, loff_t *ppos)
 {
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 	unsigned int done = 0;
 	int ret;
 
@@ -646,7 +654,7 @@ static ssize_t mbochs_write(struct mdev_device *mdev, const char __user *buf,
 			if (copy_from_user(&val, buf, sizeof(val)))
 				goto write_err;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, true);
 			if (ret <= 0)
 				goto write_err;
@@ -658,7 +666,7 @@ static ssize_t mbochs_write(struct mdev_device *mdev, const char __user *buf,
 			if (copy_from_user(&val, buf, sizeof(val)))
 				goto write_err;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, true);
 			if (ret <= 0)
 				goto write_err;
@@ -670,7 +678,7 @@ static ssize_t mbochs_write(struct mdev_device *mdev, const char __user *buf,
 			if (copy_from_user(&val, buf, sizeof(val)))
 				goto write_err;
 
-			ret = mdev_access(mdev, (char *)&val, sizeof(val),
+			ret = mdev_access(mdev_state, (char *)&val, sizeof(val),
 					  *ppos, true);
 			if (ret <= 0)
 				goto write_err;
@@ -756,9 +764,10 @@ static const struct vm_operations_struct mbochs_region_vm_ops = {
 	.fault = mbochs_region_vm_fault,
 };
 
-static int mbochs_mmap(struct mdev_device *mdev, struct vm_area_struct *vma)
+static int mbochs_mmap(struct vfio_device *vdev, struct vm_area_struct *vma)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 
 	if (vma->vm_pgoff != MBOCHS_MEMORY_BAR_OFFSET >> PAGE_SHIFT)
 		return -EINVAL;
@@ -965,7 +974,7 @@ mbochs_dmabuf_find_by_id(struct mdev_state *mdev_state, u32 id)
 static int mbochs_dmabuf_export(struct mbochs_dmabuf *dmabuf)
 {
 	struct mdev_state *mdev_state = dmabuf->mdev_state;
-	struct device *dev = mdev_dev(mdev_state->mdev);
+	struct device *dev = mdev_state->vdev.dev;
 	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
 	struct dma_buf *buf;
 
@@ -993,15 +1002,10 @@ static int mbochs_dmabuf_export(struct mbochs_dmabuf *dmabuf)
 	return 0;
 }
 
-static int mbochs_get_region_info(struct mdev_device *mdev,
+static int mbochs_get_region_info(struct mdev_state *mdev_state,
 				  struct vfio_region_info_ext *ext)
 {
 	struct vfio_region_info *region_info = &ext->base;
-	struct mdev_state *mdev_state;
-
-	mdev_state = mdev_get_drvdata(mdev);
-	if (!mdev_state)
-		return -EINVAL;
 
 	if (region_info->index >= MBOCHS_NUM_REGIONS)
 		return -EINVAL;
@@ -1049,15 +1053,13 @@ static int mbochs_get_region_info(struct mdev_device *mdev,
 	return 0;
 }
 
-static int mbochs_get_irq_info(struct mdev_device *mdev,
-			       struct vfio_irq_info *irq_info)
+static int mbochs_get_irq_info(struct vfio_irq_info *irq_info)
 {
 	irq_info->count = 0;
 	return 0;
 }
 
-static int mbochs_get_device_info(struct mdev_device *mdev,
-				  struct vfio_device_info *dev_info)
+static int mbochs_get_device_info(struct vfio_device_info *dev_info)
 {
 	dev_info->flags = VFIO_DEVICE_FLAGS_PCI;
 	dev_info->num_regions = MBOCHS_NUM_REGIONS;
@@ -1065,11 +1067,9 @@ static int mbochs_get_device_info(struct mdev_device *mdev,
 	return 0;
 }
 
-static int mbochs_query_gfx_plane(struct mdev_device *mdev,
+static int mbochs_query_gfx_plane(struct mdev_state *mdev_state,
 				  struct vfio_device_gfx_plane_info *plane)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
-	struct device *dev = mdev_dev(mdev);
 	struct mbochs_dmabuf *dmabuf;
 	struct mbochs_mode mode;
 	int ret;
@@ -1123,18 +1123,16 @@ static int mbochs_query_gfx_plane(struct mdev_device *mdev,
 done:
 	if (plane->drm_plane_type == DRM_PLANE_TYPE_PRIMARY &&
 	    mdev_state->active_id != plane->dmabuf_id) {
-		dev_dbg(dev, "%s: primary: %d => %d\n", __func__,
-			mdev_state->active_id, plane->dmabuf_id);
+		dev_dbg(mdev_state->vdev.dev, "%s: primary: %d => %d\n",
+			__func__, mdev_state->active_id, plane->dmabuf_id);
 		mdev_state->active_id = plane->dmabuf_id;
 	}
 	mutex_unlock(&mdev_state->ops_lock);
 	return 0;
 }
 
-static int mbochs_get_gfx_dmabuf(struct mdev_device *mdev,
-				 u32 id)
+static int mbochs_get_gfx_dmabuf(struct mdev_state *mdev_state, u32 id)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
 	struct mbochs_dmabuf *dmabuf;
 
 	mutex_lock(&mdev_state->ops_lock);
@@ -1156,9 +1154,11 @@ static int mbochs_get_gfx_dmabuf(struct mdev_device *mdev,
 	return dma_buf_fd(dmabuf->buf, 0);
 }
 
-static long mbochs_ioctl(struct mdev_device *mdev, unsigned int cmd,
-			unsigned long arg)
+static long mbochs_ioctl(struct vfio_device *vdev, unsigned int cmd,
+			 unsigned long arg)
 {
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 	int ret = 0;
 	unsigned long minsz, outsz;
 
@@ -1175,7 +1175,7 @@ static long mbochs_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = mbochs_get_device_info(mdev, &info);
+		ret = mbochs_get_device_info(&info);
 		if (ret)
 			return ret;
 
@@ -1199,7 +1199,7 @@ static long mbochs_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		if (outsz > sizeof(info))
 			return -EINVAL;
 
-		ret = mbochs_get_region_info(mdev, &info);
+		ret = mbochs_get_region_info(mdev_state, &info);
 		if (ret)
 			return ret;
 
@@ -1222,7 +1222,7 @@ static long mbochs_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		    (info.index >= VFIO_PCI_NUM_IRQS))
 			return -EINVAL;
 
-		ret = mbochs_get_irq_info(mdev, &info);
+		ret = mbochs_get_irq_info(&info);
 		if (ret)
 			return ret;
 
@@ -1245,7 +1245,7 @@ static long mbochs_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		if (plane.argsz < minsz)
 			return -EINVAL;
 
-		ret = mbochs_query_gfx_plane(mdev, &plane);
+		ret = mbochs_query_gfx_plane(mdev_state, &plane);
 		if (ret)
 			return ret;
 
@@ -1262,19 +1262,19 @@ static long mbochs_ioctl(struct mdev_device *mdev, unsigned int cmd,
 		if (get_user(dmabuf_id, (__u32 __user *)arg))
 			return -EFAULT;
 
-		return mbochs_get_gfx_dmabuf(mdev, dmabuf_id);
+		return mbochs_get_gfx_dmabuf(mdev_state, dmabuf_id);
 	}
 
 	case VFIO_DEVICE_SET_IRQS:
 		return -EINVAL;
 
 	case VFIO_DEVICE_RESET:
-		return mbochs_reset(mdev);
+		return mbochs_reset(mdev_state);
 	}
 	return -ENOTTY;
 }
 
-static int mbochs_open(struct mdev_device *mdev)
+static int mbochs_open(struct vfio_device *vdev)
 {
 	if (!try_module_get(THIS_MODULE))
 		return -ENODEV;
@@ -1282,9 +1282,10 @@ static int mbochs_open(struct mdev_device *mdev)
 	return 0;
 }
 
-static void mbochs_close(struct mdev_device *mdev)
+static void mbochs_close(struct vfio_device *vdev)
 {
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
+	struct mdev_state *mdev_state =
+		container_of(vdev, struct mdev_state, vdev);
 	struct mbochs_dmabuf *dmabuf, *tmp;
 
 	mutex_lock(&mdev_state->ops_lock);
@@ -1308,8 +1309,7 @@ static ssize_t
 memory_show(struct device *dev, struct device_attribute *attr,
 	    char *buf)
 {
-	struct mdev_device *mdev = mdev_from_dev(dev);
-	struct mdev_state *mdev_state = mdev_get_drvdata(mdev);
+	struct mdev_state *mdev_state = dev_get_drvdata(dev);
 
 	return sprintf(buf, "%d MB\n", mdev_state->type->mbytes);
 }
@@ -1400,18 +1400,30 @@ static struct attribute_group *mdev_type_groups[] = {
 	NULL,
 };
 
+static const struct vfio_device_ops mbochs_dev_ops = {
+	.open = mbochs_open,
+	.release = mbochs_close,
+	.read = mbochs_read,
+	.write = mbochs_write,
+	.ioctl = mbochs_ioctl,
+	.mmap = mbochs_mmap,
+};
+
+static struct mdev_driver mbochs_driver = {
+	.driver = {
+		.name = "mbochs",
+		.owner = THIS_MODULE,
+		.mod_name = KBUILD_MODNAME,
+		.dev_groups = mdev_dev_groups,
+	},
+	.probe = mbochs_probe,
+	.remove	= mbochs_remove,
+};
+
 static const struct mdev_parent_ops mdev_fops = {
 	.owner			= THIS_MODULE,
-	.mdev_attr_groups	= mdev_dev_groups,
+	.device_driver		= &mbochs_driver,
 	.supported_type_groups	= mdev_type_groups,
-	.create			= mbochs_create,
-	.remove			= mbochs_remove,
-	.open			= mbochs_open,
-	.release		= mbochs_close,
-	.read			= mbochs_read,
-	.write			= mbochs_write,
-	.ioctl			= mbochs_ioctl,
-	.mmap			= mbochs_mmap,
 };
 
 static const struct file_operations vd_fops = {
@@ -1436,11 +1448,15 @@ static int __init mbochs_dev_init(void)
 	cdev_add(&mbochs_cdev, mbochs_devt, MINORMASK + 1);
 	pr_info("%s: major %d\n", __func__, MAJOR(mbochs_devt));
 
+	ret = mdev_register_driver(&mbochs_driver);
+	if (ret)
+		goto err_cdev;
+
 	mbochs_class = class_create(THIS_MODULE, MBOCHS_CLASS_NAME);
 	if (IS_ERR(mbochs_class)) {
 		pr_err("Error: failed to register mbochs_dev class\n");
 		ret = PTR_ERR(mbochs_class);
-		goto failed1;
+		goto err_driver;
 	}
 	mbochs_dev.class = mbochs_class;
 	mbochs_dev.release = mbochs_device_release;
@@ -1448,19 +1464,21 @@ static int __init mbochs_dev_init(void)
 
 	ret = device_register(&mbochs_dev);
 	if (ret)
-		goto failed2;
+		goto err_class;
 
 	ret = mdev_register_device(&mbochs_dev, &mdev_fops);
 	if (ret)
-		goto failed3;
+		goto err_device;
 
 	return 0;
 
-failed3:
+err_device:
 	device_unregister(&mbochs_dev);
-failed2:
+err_class:
 	class_destroy(mbochs_class);
-failed1:
+err_driver:
+	mdev_unregister_driver(&mbochs_driver);
+err_cdev:
 	cdev_del(&mbochs_cdev);
 	unregister_chrdev_region(mbochs_devt, MINORMASK + 1);
 	return ret;
@@ -1472,6 +1490,7 @@ static void __exit mbochs_dev_exit(void)
 	mdev_unregister_device(&mbochs_dev);
 
 	device_unregister(&mbochs_dev);
+	mdev_unregister_driver(&mbochs_driver);
 	cdev_del(&mbochs_cdev);
 	unregister_chrdev_region(mbochs_devt, MINORMASK + 1);
 	class_destroy(mbochs_class);
-- 
2.31.1



* [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2021-04-26 20:00 ` [PATCH v2 05/13] vfio/mbochs: " Jason Gunthorpe
@ 2021-04-26 20:00 ` Jason Gunthorpe
  2021-04-27 20:06   ` Eric Farman
  2021-04-28 17:09   ` Cornelia Huck
  2021-04-26 20:00 ` [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c Jason Gunthorpe
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: Christian Borntraeger, Cornelia Huck, Eric Farman, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

This is more complicated because vfio_ccw is sharing the vfio_device
between both the mdev_device and its vfio_device and the css_driver.

The mdev is a singleton, and the reason for this sharing appears to be to
allow the extra css_driver function callbacks to be delivered to the
vfio_device.

This keeps things as they were, with the css_driver allocating the
singleton rather than the mdev_driver, which is pretty confusing. I'm also
uncertain how the lifetime model for the mdev works in the css_driver
callbacks.

At this point embed the vfio_device in the vfio_ccw_private and
instantiate it as a vfio_device when the mdev probes. The drvdata of both
the css_device and the mdev_device point at the private, and container_of
is used to get it back from the vfio_device.
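
As an illustration only, the embedding described above can be sketched in
plain userspace C. The demo_* structures and demo_open() are hypothetical
stand-ins rather than the kernel definitions, and container_of() is
re-created locally with offsetof(); the point is just that a callback which
receives only the embedded member can recover the enclosing private
structure without going through drvdata.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, stripped-down stand-in for struct vfio_device. */
struct demo_vfio_device {
	int id;
};

/* Hypothetical stand-in for struct vfio_ccw_private. */
struct demo_ccw_private {
	struct demo_vfio_device vdev;	/* embedded, not a pointer */
	int state;
};

/* A callback that is handed only the embedded device, as the ops are. */
static int demo_open(struct demo_vfio_device *vdev)
{
	struct demo_ccw_private *private;

	assert(vdev != NULL);
	/* Step back from the member to the structure that contains it. */
	private = container_of(vdev, struct demo_ccw_private, vdev);
	return private->state;
}
```

Calling demo_open(&p.vdev) on a struct demo_ccw_private p returns p.state.
This only works because the member is embedded by value; with a separately
allocated device the offset arithmetic would not point at the private.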

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/s390/cio/vfio_ccw_drv.c     |  21 +++--
 drivers/s390/cio/vfio_ccw_ops.c     | 135 +++++++++++++++-------------
 drivers/s390/cio/vfio_ccw_private.h |   5 ++
 3 files changed, 94 insertions(+), 67 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 8c625b530035f5..55c4876dfd139d 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -442,7 +442,7 @@ static int __init vfio_ccw_sch_init(void)
 	vfio_ccw_work_q = create_singlethread_workqueue("vfio-ccw");
 	if (!vfio_ccw_work_q) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
 	vfio_ccw_io_region = kmem_cache_create_usercopy("vfio_ccw_io_region",
@@ -451,7 +451,7 @@ static int __init vfio_ccw_sch_init(void)
 					sizeof(struct ccw_io_region), NULL);
 	if (!vfio_ccw_io_region) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
 	vfio_ccw_cmd_region = kmem_cache_create_usercopy("vfio_ccw_cmd_region",
@@ -460,7 +460,7 @@ static int __init vfio_ccw_sch_init(void)
 					sizeof(struct ccw_cmd_region), NULL);
 	if (!vfio_ccw_cmd_region) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
 	vfio_ccw_schib_region = kmem_cache_create_usercopy("vfio_ccw_schib_region",
@@ -470,7 +470,7 @@ static int __init vfio_ccw_sch_init(void)
 
 	if (!vfio_ccw_schib_region) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
 	vfio_ccw_crw_region = kmem_cache_create_usercopy("vfio_ccw_crw_region",
@@ -480,19 +480,25 @@ static int __init vfio_ccw_sch_init(void)
 
 	if (!vfio_ccw_crw_region) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
+	ret = mdev_register_driver(&vfio_ccw_mdev_driver);
+	if (ret)
+		goto out_regions;
+
 	isc_register(VFIO_CCW_ISC);
 	ret = css_driver_register(&vfio_ccw_sch_driver);
 	if (ret) {
 		isc_unregister(VFIO_CCW_ISC);
-		goto out_err;
+		goto out_driver;
 	}
 
 	return ret;
 
-out_err:
+out_driver:
+	mdev_unregister_driver(&vfio_ccw_mdev_driver);
+out_regions:
 	vfio_ccw_destroy_regions();
 	destroy_workqueue(vfio_ccw_work_q);
 	vfio_ccw_debug_exit();
@@ -501,6 +507,7 @@ static int __init vfio_ccw_sch_init(void)
 
 static void __exit vfio_ccw_sch_exit(void)
 {
+	mdev_unregister_driver(&vfio_ccw_mdev_driver);
 	css_driver_unregister(&vfio_ccw_sch_driver);
 	isc_unregister(VFIO_CCW_ISC);
 	vfio_ccw_destroy_regions();
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 491a64c61fff1a..0fcf46031d3821 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -17,13 +17,13 @@
 
 #include "vfio_ccw_private.h"
 
-static int vfio_ccw_mdev_reset(struct mdev_device *mdev)
+static const struct vfio_device_ops vfio_ccw_dev_ops;
+
+static int vfio_ccw_mdev_reset(struct vfio_ccw_private *private)
 {
-	struct vfio_ccw_private *private;
 	struct subchannel *sch;
 	int ret;
 
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	sch = private->sch;
 	/*
 	 * TODO:
@@ -61,7 +61,7 @@ static int vfio_ccw_mdev_notifier(struct notifier_block *nb,
 		if (!cp_iova_pinned(&private->cp, unmap->iova))
 			return NOTIFY_OK;
 
-		if (vfio_ccw_mdev_reset(private->mdev))
+		if (vfio_ccw_mdev_reset(private))
 			return NOTIFY_BAD;
 
 		cp_free(&private->cp);
@@ -113,10 +113,11 @@ static struct attribute_group *mdev_type_groups[] = {
 	NULL,
 };
 
-static int vfio_ccw_mdev_create(struct mdev_device *mdev)
+static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 {
 	struct vfio_ccw_private *private =
 		dev_get_drvdata(mdev_parent_dev(mdev));
+	int ret;
 
 	if (private->state == VFIO_CCW_STATE_NOT_OPER)
 		return -ENODEV;
@@ -124,6 +125,10 @@ static int vfio_ccw_mdev_create(struct mdev_device *mdev)
 	if (atomic_dec_if_positive(&private->avail) < 0)
 		return -EPERM;
 
+	memset(&private->vdev, 0, sizeof(private->vdev));
+	vfio_init_group_dev(&private->vdev, &mdev->dev,
+			    &vfio_ccw_dev_ops);
+
 	private->mdev = mdev;
 	private->state = VFIO_CCW_STATE_IDLE;
 
@@ -132,19 +137,28 @@ static int vfio_ccw_mdev_create(struct mdev_device *mdev)
 			   private->sch->schid.ssid,
 			   private->sch->schid.sch_no);
 
+	ret = vfio_register_group_dev(&private->vdev);
+	if (ret)
+		goto err_atomic;
+	dev_set_drvdata(&mdev->dev, private);
 	return 0;
+
+err_atomic:
+	atomic_inc(&private->avail);
+	return ret;
 }
 
-static int vfio_ccw_mdev_remove(struct mdev_device *mdev)
+static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
 {
-	struct vfio_ccw_private *private =
-		dev_get_drvdata(mdev_parent_dev(mdev));
+	struct vfio_ccw_private *private = dev_get_drvdata(&mdev->dev);
 
 	VFIO_CCW_MSG_EVENT(2, "mdev %pUl, sch %x.%x.%04x: remove\n",
 			   mdev_uuid(mdev), private->sch->schid.cssid,
 			   private->sch->schid.ssid,
 			   private->sch->schid.sch_no);
 
+	vfio_unregister_group_dev(&private->vdev);
+
 	if ((private->state != VFIO_CCW_STATE_NOT_OPER) &&
 	    (private->state != VFIO_CCW_STATE_STANDBY)) {
 		if (!vfio_ccw_sch_quiesce(private->sch))
@@ -155,20 +169,18 @@ static int vfio_ccw_mdev_remove(struct mdev_device *mdev)
 	cp_free(&private->cp);
 	private->mdev = NULL;
 	atomic_inc(&private->avail);
-
-	return 0;
 }
 
-static int vfio_ccw_mdev_open(struct mdev_device *mdev)
+static int vfio_ccw_mdev_open(struct vfio_device *vdev)
 {
 	struct vfio_ccw_private *private =
-		dev_get_drvdata(mdev_parent_dev(mdev));
+		container_of(vdev, struct vfio_ccw_private, vdev);
 	unsigned long events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
 	int ret;
 
 	private->nb.notifier_call = vfio_ccw_mdev_notifier;
 
-	ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
+	ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
 				     &events, &private->nb);
 	if (ret)
 		return ret;
@@ -189,27 +201,26 @@ static int vfio_ccw_mdev_open(struct mdev_device *mdev)
 
 out_unregister:
 	vfio_ccw_unregister_dev_regions(private);
-	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
+	vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
 				 &private->nb);
 	return ret;
 }
 
-static void vfio_ccw_mdev_release(struct mdev_device *mdev)
+static void vfio_ccw_mdev_release(struct vfio_device *vdev)
 {
 	struct vfio_ccw_private *private =
-		dev_get_drvdata(mdev_parent_dev(mdev));
+		container_of(vdev, struct vfio_ccw_private, vdev);
 
 	if ((private->state != VFIO_CCW_STATE_NOT_OPER) &&
 	    (private->state != VFIO_CCW_STATE_STANDBY)) {
-		if (!vfio_ccw_mdev_reset(mdev))
+		if (!vfio_ccw_mdev_reset(private))
 			private->state = VFIO_CCW_STATE_STANDBY;
 		/* The state will be NOT_OPER on error. */
 	}
 
 	cp_free(&private->cp);
 	vfio_ccw_unregister_dev_regions(private);
-	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
-				 &private->nb);
+	vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, &private->nb);
 }
 
 static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
@@ -233,15 +244,14 @@ static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
 	return ret;
 }
 
-static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
+static ssize_t vfio_ccw_mdev_read(struct vfio_device *vdev,
 				  char __user *buf,
 				  size_t count,
 				  loff_t *ppos)
 {
+	struct vfio_ccw_private *private =
+		container_of(vdev, struct vfio_ccw_private, vdev);
 	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
-	struct vfio_ccw_private *private;
-
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 
 	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
 		return -EINVAL;
@@ -288,15 +298,14 @@ static ssize_t vfio_ccw_mdev_write_io_region(struct vfio_ccw_private *private,
 	return ret;
 }
 
-static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
+static ssize_t vfio_ccw_mdev_write(struct vfio_device *vdev,
 				   const char __user *buf,
 				   size_t count,
 				   loff_t *ppos)
 {
+	struct vfio_ccw_private *private =
+		container_of(vdev, struct vfio_ccw_private, vdev);
 	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
-	struct vfio_ccw_private *private;
-
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 
 	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
 		return -EINVAL;
@@ -313,12 +322,9 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
 	return -EINVAL;
 }
 
-static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info,
-					 struct mdev_device *mdev)
+static int vfio_ccw_mdev_get_device_info(struct vfio_ccw_private *private,
+					 struct vfio_device_info *info)
 {
-	struct vfio_ccw_private *private;
-
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	info->flags = VFIO_DEVICE_FLAGS_CCW | VFIO_DEVICE_FLAGS_RESET;
 	info->num_regions = VFIO_CCW_NUM_REGIONS + private->num_regions;
 	info->num_irqs = VFIO_CCW_NUM_IRQS;
@@ -326,14 +332,12 @@ static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info,
 	return 0;
 }
 
-static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
-					 struct mdev_device *mdev,
+static int vfio_ccw_mdev_get_region_info(struct vfio_ccw_private *private,
+					 struct vfio_region_info *info,
 					 unsigned long arg)
 {
-	struct vfio_ccw_private *private;
 	int i;
 
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	switch (info->index) {
 	case VFIO_CCW_CONFIG_REGION_INDEX:
 		info->offset = 0;
@@ -408,19 +412,16 @@ static int vfio_ccw_mdev_get_irq_info(struct vfio_irq_info *info)
 	return 0;
 }
 
-static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
+static int vfio_ccw_mdev_set_irqs(struct vfio_ccw_private *private,
 				  uint32_t flags,
 				  uint32_t index,
 				  void __user *data)
 {
-	struct vfio_ccw_private *private;
 	struct eventfd_ctx **ctx;
 
 	if (!(flags & VFIO_IRQ_SET_ACTION_TRIGGER))
 		return -EINVAL;
 
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
-
 	switch (index) {
 	case VFIO_CCW_IO_IRQ_INDEX:
 		ctx = &private->io_trigger;
@@ -522,10 +523,12 @@ void vfio_ccw_unregister_dev_regions(struct vfio_ccw_private *private)
 	private->region = NULL;
 }
 
-static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
+static ssize_t vfio_ccw_mdev_ioctl(struct vfio_device *vdev,
 				   unsigned int cmd,
 				   unsigned long arg)
 {
+	struct vfio_ccw_private *private =
+		container_of(vdev, struct vfio_ccw_private, vdev);
 	int ret = 0;
 	unsigned long minsz;
 
@@ -542,7 +545,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = vfio_ccw_mdev_get_device_info(&info, mdev);
+		ret = vfio_ccw_mdev_get_device_info(private, &info);
 		if (ret)
 			return ret;
 
@@ -560,7 +563,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = vfio_ccw_mdev_get_region_info(&info, mdev, arg);
+		ret = vfio_ccw_mdev_get_region_info(private, &info, arg);
 		if (ret)
 			return ret;
 
@@ -605,47 +608,59 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 			return ret;
 
 		data = (void __user *)(arg + minsz);
-		return vfio_ccw_mdev_set_irqs(mdev, hdr.flags, hdr.index, data);
+		return vfio_ccw_mdev_set_irqs(private, hdr.flags, hdr.index,
+					      data);
 	}
 	case VFIO_DEVICE_RESET:
-		return vfio_ccw_mdev_reset(mdev);
+		return vfio_ccw_mdev_reset(private);
 	default:
 		return -ENOTTY;
 	}
 }
 
 /* Request removal of the device*/
-static void vfio_ccw_mdev_request(struct mdev_device *mdev, unsigned int count)
+static void vfio_ccw_mdev_request(struct vfio_device *vdev, unsigned int count)
 {
-	struct vfio_ccw_private *private = dev_get_drvdata(mdev_parent_dev(mdev));
-
-	if (!private)
-		return;
+	struct vfio_ccw_private *private =
+		container_of(vdev, struct vfio_ccw_private, vdev);
+	struct device *dev = private->vdev.dev;
 
 	if (private->req_trigger) {
 		if (!(count % 10))
-			dev_notice_ratelimited(mdev_dev(private->mdev),
+			dev_notice_ratelimited(dev,
 					       "Relaying device request to user (#%u)\n",
 					       count);
 
 		eventfd_signal(private->req_trigger, 1);
 	} else if (count == 0) {
-		dev_notice(mdev_dev(private->mdev),
+		dev_notice(dev,
 			   "No device request channel registered, blocked until released by user\n");
 	}
 }
 
+static const struct vfio_device_ops vfio_ccw_dev_ops = {
+	.open = vfio_ccw_mdev_open,
+	.release = vfio_ccw_mdev_release,
+	.read = vfio_ccw_mdev_read,
+	.write = vfio_ccw_mdev_write,
+	.ioctl = vfio_ccw_mdev_ioctl,
+	.request = vfio_ccw_mdev_request,
+};
+
+struct mdev_driver vfio_ccw_mdev_driver = {
+	.driver = {
+		.name = "vfio_ccw_mdev",
+		.owner = THIS_MODULE,
+		.mod_name = KBUILD_MODNAME,
+	},
+	.probe = vfio_ccw_mdev_probe,
+	.remove = vfio_ccw_mdev_remove,
+};
+
 static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
 	.owner			= THIS_MODULE,
+	.device_driver		= &vfio_ccw_mdev_driver,
 	.supported_type_groups  = mdev_type_groups,
-	.create			= vfio_ccw_mdev_create,
-	.remove			= vfio_ccw_mdev_remove,
-	.open			= vfio_ccw_mdev_open,
-	.release		= vfio_ccw_mdev_release,
-	.read			= vfio_ccw_mdev_read,
-	.write			= vfio_ccw_mdev_write,
-	.ioctl			= vfio_ccw_mdev_ioctl,
-	.request		= vfio_ccw_mdev_request,
 };
 
 int vfio_ccw_mdev_reg(struct subchannel *sch)
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index b2c762eb42b9bb..7272eb78861244 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -17,6 +17,7 @@
 #include <linux/eventfd.h>
 #include <linux/workqueue.h>
 #include <linux/vfio_ccw.h>
+#include <linux/vfio.h>
 #include <asm/crw.h>
 #include <asm/debug.h>
 
@@ -67,6 +68,7 @@ struct vfio_ccw_crw {
 
 /**
  * struct vfio_ccw_private
+ * @vdev: Embedded VFIO device
  * @sch: pointer to the subchannel
  * @state: internal state of the device
  * @completion: synchronization helper of the I/O completion
@@ -90,6 +92,7 @@ struct vfio_ccw_crw {
  * @crw_work: work for deferral process of CRW handling
  */
 struct vfio_ccw_private {
+	struct vfio_device vdev;
 	struct subchannel	*sch;
 	int			state;
 	struct completion	*completion;
@@ -121,6 +124,8 @@ extern void vfio_ccw_mdev_unreg(struct subchannel *sch);
 
 extern int vfio_ccw_sch_quiesce(struct subchannel *sch);
 
+extern struct mdev_driver vfio_ccw_mdev_driver;
+
 /*
  * States of the device statemachine.
  */
-- 
2.31.1



* [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c
  2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2021-04-26 20:00 ` [PATCH v2 07/13] vfio/ccw: " Jason Gunthorpe
@ 2021-04-26 20:00 ` Jason Gunthorpe
  2021-04-28  6:07   ` Christoph Hellwig
  2021-04-26 20:00 ` [PATCH v2 10/13] vfio/mdev: Remove mdev_parent_ops dev_attr_groups Jason Gunthorpe
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, Jonathan Corbet, kvm,
	Kirti Wankhede, linux-doc
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

Now that all mdev drivers directly create their own mdev_device driver and
directly register with the vfio core's vfio_device_ops, this is all dead
code.

Delete vfio_mdev.c and the mdev_parent_ops members that are connected to
it.
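
A rough userspace sketch of why the shim becomes redundant, under the
assumption that the core only ever calls through one ops pointer. All
demo_* names are hypothetical and only mirror the shape of the kernel code:
in the old layout the core's ops point at a forwarding layer that
re-dispatches to a second table, in the new layout they point at the
driver's callbacks directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_dev;

/* Hypothetical callback table, standing in for vfio_device_ops. */
struct demo_ops {
	int (*open)(struct demo_dev *dev);
};

struct demo_dev {
	const struct demo_ops *ops;		/* what the core calls */
	const struct demo_ops *parent_ops;	/* consulted only by the shim */
	int opened;
};

/* Old layout: the shim's open just forwards to the parent's callback. */
static int demo_shim_open(struct demo_dev *dev)
{
	if (!dev->parent_ops || !dev->parent_ops->open)
		return -EINVAL;
	return dev->parent_ops->open(dev);
}

static const struct demo_ops demo_shim_ops = { .open = demo_shim_open };

/* The end driver's real callback. */
static int demo_driver_open(struct demo_dev *dev)
{
	dev->opened++;
	return 0;
}

static const struct demo_ops demo_driver_ops = { .open = demo_driver_open };

/* The core's call site is identical in both layouts. */
static int demo_core_open(struct demo_dev *dev)
{
	assert(dev && dev->ops && dev->ops->open);
	return dev->ops->open(dev);
}
```

With ops set to &demo_driver_ops the call reaches demo_driver_open() in one
hop and parent_ops is never read. Once every device is set up that way,
nothing reaches demo_shim_open() any more.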

Preserve VFIO's design of allowing mdev drivers to be !GPL by exporting the
three functions that replace this module without the GPL-only restriction.
This goes along with the other 19 symbols that are already marked !GPL in
VFIO.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 .../driver-api/vfio-mediated-device.rst       |  19 ---
 drivers/vfio/mdev/Makefile                    |   2 +-
 drivers/vfio/mdev/mdev_core.c                 |  49 +-----
 drivers/vfio/mdev/mdev_driver.c               |  21 +--
 drivers/vfio/mdev/mdev_private.h              |   2 -
 drivers/vfio/mdev/vfio_mdev.c                 | 158 ------------------
 drivers/vfio/vfio.c                           |   6 +-
 include/linux/mdev.h                          |  52 ------
 include/linux/vfio.h                          |   4 +
 9 files changed, 16 insertions(+), 297 deletions(-)
 delete mode 100644 drivers/vfio/mdev/vfio_mdev.c

diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst
index 1779b85f014e2f..5f866b17c93e69 100644
--- a/Documentation/driver-api/vfio-mediated-device.rst
+++ b/Documentation/driver-api/vfio-mediated-device.rst
@@ -137,25 +137,6 @@ The structures in the mdev_parent_ops structure are as follows:
 * mdev_attr_groups: attributes of the mediated device
 * supported_config: attributes to define supported configurations
 
-The functions in the mdev_parent_ops structure are as follows:
-
-* create: allocate basic resources in a driver for a mediated device
-* remove: free resources in a driver when a mediated device is destroyed
-
-(Note that mdev-core provides no implicit serialization of create/remove
-callbacks per mdev parent device, per mdev type, or any other categorization.
-Vendor drivers are expected to be fully asynchronous in this respect or
-provide their own internal resource protection.)
-
-The callbacks in the mdev_parent_ops structure are as follows:
-
-* open: open callback of mediated device
-* close: close callback of mediated device
-* ioctl: ioctl callback of mediated device
-* read : read emulation callback
-* write: write emulation callback
-* mmap: mmap emulation callback
-
 A driver should use the mdev_parent_ops structure in the function call to
 register itself with the mdev core driver::
 
diff --git a/drivers/vfio/mdev/Makefile b/drivers/vfio/mdev/Makefile
index ff9ecd80212503..7c236ba1b90eb1 100644
--- a/drivers/vfio/mdev/Makefile
+++ b/drivers/vfio/mdev/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
-mdev-y := mdev_core.o mdev_sysfs.o mdev_driver.o vfio_mdev.o
+mdev-y := mdev_core.o mdev_sysfs.o mdev_driver.o
 
 obj-$(CONFIG_VFIO_MDEV) += mdev.o
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index 51b8a9fcf866ad..d507047e6ecf4a 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -89,17 +89,10 @@ void mdev_release_parent(struct kref *kref)
 static void mdev_device_remove_common(struct mdev_device *mdev)
 {
 	struct mdev_parent *parent = mdev->type->parent;
-	int ret;
 
 	mdev_remove_sysfs_files(mdev);
 	device_del(&mdev->dev);
 	lockdep_assert_held(&parent->unreg_sem);
-	if (parent->ops->remove) {
-		ret = parent->ops->remove(mdev);
-		if (ret)
-			dev_err(&mdev->dev, "Remove failed: err=%d\n", ret);
-	}
-
 	/* Balances with device_initialize() */
 	put_device(&mdev->dev);
 }
@@ -131,17 +124,13 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
 	/* check for mandatory ops */
 	if (!ops || !ops->supported_type_groups)
 		return -EINVAL;
-	if (!ops->device_driver && (!ops->create || !ops->remove))
+	if (!ops->device_driver)
 		return -EINVAL;
 
 	dev = get_device(dev);
 	if (!dev)
 		return -EINVAL;
 
-	/* Not mandatory, but its absence could be a problem */
-	if (!ops->request)
-		dev_info(dev, "Driver cannot be asked to release device\n");
-
 	mutex_lock(&parent_list_lock);
 
 	/* Check for duplicate */
@@ -263,15 +252,12 @@ static void mdev_device_release(struct device *dev)
  */
 static int mdev_bind_driver(struct mdev_device *mdev)
 {
-	struct mdev_driver *drv = mdev->type->parent->ops->device_driver;
 	int ret;
 
-	if (!drv)
-		drv = &vfio_mdev_driver;
-
 	while (1) {
 		device_lock(&mdev->dev);
-		if (mdev->dev.driver == &drv->driver) {
+		if (mdev->dev.driver ==
+		    &mdev->type->parent->ops->device_driver->driver) {
 			ret = 0;
 			goto out_unlock;
 		}
@@ -337,15 +323,9 @@ int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
 		goto out_put_device;
 	}
 
-	if (parent->ops->create) {
-		ret = parent->ops->create(mdev);
-		if (ret)
-			goto out_unlock;
-	}
-
 	ret = device_add(&mdev->dev);
 	if (ret)
-		goto out_remove;
+		goto out_unlock;
 
 	ret = mdev_bind_driver(mdev);
 	if (ret)
@@ -363,9 +343,6 @@ int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
 
 out_del:
 	device_del(&mdev->dev);
-out_remove:
-	if (parent->ops->remove)
-		parent->ops->remove(mdev);
 out_unlock:
 	up_read(&parent->unreg_sem);
 out_put_device:
@@ -408,28 +385,14 @@ int mdev_device_remove(struct mdev_device *mdev)
 
 static int __init mdev_init(void)
 {
-	int rc;
-
-	rc = mdev_bus_register();
-	if (rc)
-		return rc;
-	rc = mdev_register_driver(&vfio_mdev_driver);
-	if (rc)
-		goto err_bus;
-	return 0;
-err_bus:
-	mdev_bus_unregister();
-	return rc;
+	return bus_register(&mdev_bus_type);
 }
 
 static void __exit mdev_exit(void)
 {
-	mdev_unregister_driver(&vfio_mdev_driver);
-
 	if (mdev_bus_compat_class)
 		class_compat_unregister(mdev_bus_compat_class);
-
-	mdev_bus_unregister();
+	bus_unregister(&mdev_bus_type);
 }
 
 module_init(mdev_init)
diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
index 6e96c023d7823d..07ada55efd6228 100644
--- a/drivers/vfio/mdev/mdev_driver.c
+++ b/drivers/vfio/mdev/mdev_driver.c
@@ -74,15 +74,8 @@ static int mdev_remove(struct device *dev)
 static int mdev_match(struct device *dev, struct device_driver *drv)
 {
 	struct mdev_device *mdev = to_mdev_device(dev);
-	struct mdev_driver *target = mdev->type->parent->ops->device_driver;
-
-	/*
-	 * The ops specify the device driver to connect, fall back to the old
-	 * shim driver if the driver hasn't been converted.
-	 */
-	if (!target)
-		target = &vfio_mdev_driver;
-	return drv == &target->driver;
+
+	return drv == &mdev->type->parent->ops->device_driver->driver;
 }
 
 struct bus_type mdev_bus_type = {
@@ -118,13 +111,3 @@ void mdev_unregister_driver(struct mdev_driver *drv)
 	driver_unregister(&drv->driver);
 }
 EXPORT_SYMBOL(mdev_unregister_driver);
-
-int mdev_bus_register(void)
-{
-	return bus_register(&mdev_bus_type);
-}
-
-void mdev_bus_unregister(void)
-{
-	bus_unregister(&mdev_bus_type);
-}
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index 5461b67582289f..a656cfe0346c33 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -37,8 +37,6 @@ struct mdev_type {
 #define to_mdev_type(_kobj)		\
 	container_of(_kobj, struct mdev_type, kobj)
 
-extern struct mdev_driver vfio_mdev_driver;
-
 int  parent_create_sysfs_files(struct mdev_parent *parent);
 void parent_remove_sysfs_files(struct mdev_parent *parent);
 
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
deleted file mode 100644
index d5b4eede47c1a5..00000000000000
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ /dev/null
@@ -1,158 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * VFIO based driver for Mediated device
- *
- * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
- *     Author: Neo Jia <cjia@nvidia.com>
- *             Kirti Wankhede <kwankhede@nvidia.com>
- */
-
-#include <linux/init.h>
-#include <linux/module.h>
-#include <linux/device.h>
-#include <linux/kernel.h>
-#include <linux/slab.h>
-#include <linux/vfio.h>
-#include <linux/mdev.h>
-
-#include "mdev_private.h"
-
-static int vfio_mdev_open(struct vfio_device *core_vdev)
-{
-	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
-	struct mdev_parent *parent = mdev->type->parent;
-
-	int ret;
-
-	if (unlikely(!parent->ops->open))
-		return -EINVAL;
-
-	if (!try_module_get(THIS_MODULE))
-		return -ENODEV;
-
-	ret = parent->ops->open(mdev);
-	if (ret)
-		module_put(THIS_MODULE);
-
-	return ret;
-}
-
-static void vfio_mdev_release(struct vfio_device *core_vdev)
-{
-	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
-	struct mdev_parent *parent = mdev->type->parent;
-
-	if (likely(parent->ops->release))
-		parent->ops->release(mdev);
-
-	module_put(THIS_MODULE);
-}
-
-static long vfio_mdev_unlocked_ioctl(struct vfio_device *core_vdev,
-				     unsigned int cmd, unsigned long arg)
-{
-	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
-	struct mdev_parent *parent = mdev->type->parent;
-
-	if (unlikely(!parent->ops->ioctl))
-		return -EINVAL;
-
-	return parent->ops->ioctl(mdev, cmd, arg);
-}
-
-static ssize_t vfio_mdev_read(struct vfio_device *core_vdev, char __user *buf,
-			      size_t count, loff_t *ppos)
-{
-	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
-	struct mdev_parent *parent = mdev->type->parent;
-
-	if (unlikely(!parent->ops->read))
-		return -EINVAL;
-
-	return parent->ops->read(mdev, buf, count, ppos);
-}
-
-static ssize_t vfio_mdev_write(struct vfio_device *core_vdev,
-			       const char __user *buf, size_t count,
-			       loff_t *ppos)
-{
-	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
-	struct mdev_parent *parent = mdev->type->parent;
-
-	if (unlikely(!parent->ops->write))
-		return -EINVAL;
-
-	return parent->ops->write(mdev, buf, count, ppos);
-}
-
-static int vfio_mdev_mmap(struct vfio_device *core_vdev,
-			  struct vm_area_struct *vma)
-{
-	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
-	struct mdev_parent *parent = mdev->type->parent;
-
-	if (unlikely(!parent->ops->mmap))
-		return -EINVAL;
-
-	return parent->ops->mmap(mdev, vma);
-}
-
-static void vfio_mdev_request(struct vfio_device *core_vdev, unsigned int count)
-{
-	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
-	struct mdev_parent *parent = mdev->type->parent;
-
-	if (parent->ops->request)
-		parent->ops->request(mdev, count);
-	else if (count == 0)
-		dev_notice(mdev_dev(mdev),
-			   "No mdev vendor driver request callback support, blocked until released by user\n");
-}
-
-static const struct vfio_device_ops vfio_mdev_dev_ops = {
-	.name		= "vfio-mdev",
-	.open		= vfio_mdev_open,
-	.release	= vfio_mdev_release,
-	.ioctl		= vfio_mdev_unlocked_ioctl,
-	.read		= vfio_mdev_read,
-	.write		= vfio_mdev_write,
-	.mmap		= vfio_mdev_mmap,
-	.request	= vfio_mdev_request,
-};
-
-static int vfio_mdev_probe(struct mdev_device *mdev)
-{
-	struct vfio_device *vdev;
-	int ret;
-
-	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
-	if (!vdev)
-		return -ENOMEM;
-
-	vfio_init_group_dev(vdev, &mdev->dev, &vfio_mdev_dev_ops);
-	ret = vfio_register_group_dev(vdev);
-	if (ret) {
-		kfree(vdev);
-		return ret;
-	}
-	dev_set_drvdata(&mdev->dev, vdev);
-	return 0;
-}
-
-static void vfio_mdev_remove(struct mdev_device *mdev)
-{
-	struct vfio_device *vdev = dev_get_drvdata(&mdev->dev);
-
-	vfio_unregister_group_dev(vdev);
-	kfree(vdev);
-}
-
-struct mdev_driver vfio_mdev_driver = {
-	.driver = {
-		.name = "vfio_mdev",
-		.owner = THIS_MODULE,
-		.mod_name = KBUILD_MODNAME,
-	},
-	.probe	= vfio_mdev_probe,
-	.remove	= vfio_mdev_remove,
-};
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 5e631c359ef23c..59bbdf6634f934 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -747,7 +747,7 @@ void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 	device->dev = dev;
 	device->ops = ops;
 }
-EXPORT_SYMBOL_GPL(vfio_init_group_dev);
+EXPORT_SYMBOL(vfio_init_group_dev);
 
 int vfio_register_group_dev(struct vfio_device *device)
 {
@@ -796,7 +796,7 @@ int vfio_register_group_dev(struct vfio_device *device)
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(vfio_register_group_dev);
+EXPORT_SYMBOL(vfio_register_group_dev);
 
 /**
  * Get a reference to the vfio_device for a device.  Even if the
@@ -927,7 +927,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	/* Matches the get in vfio_register_group_dev() */
 	vfio_group_put(group);
 }
-EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
+EXPORT_SYMBOL(vfio_unregister_group_dev);
 
 /**
  * VFIO base fd, /dev/vfio/vfio
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 49cc4f65120d57..ea48c401e4fa63 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -61,45 +61,6 @@ struct device *mtype_get_parent_dev(struct mdev_type *mtype);
  * @mdev_attr_groups:	Attributes of the mediated device.
  * @supported_type_groups: Attributes to define supported types. It is mandatory
  *			to provide supported types.
- * @create:		Called to allocate basic resources in parent device's
- *			driver for a particular mediated device. It is
- *			mandatory to provide create ops.
- *			@mdev: mdev_device structure on of mediated device
- *			      that is being created
- *			Returns integer: success (0) or error (< 0)
- * @remove:		Called to free resources in parent device's driver for
- *			a mediated device. It is mandatory to provide 'remove'
- *			ops.
- *			@mdev: mdev_device device structure which is being
- *			       destroyed
- *			Returns integer: success (0) or error (< 0)
- * @open:		Open mediated device.
- *			@mdev: mediated device.
- *			Returns integer: success (0) or error (< 0)
- * @release:		release mediated device
- *			@mdev: mediated device.
- * @read:		Read emulation callback
- *			@mdev: mediated device structure
- *			@buf: read buffer
- *			@count: number of bytes to read
- *			@ppos: address.
- *			Retuns number on bytes read on success or error.
- * @write:		Write emulation callback
- *			@mdev: mediated device structure
- *			@buf: write buffer
- *			@count: number of bytes to be written
- *			@ppos: address.
- *			Retuns number on bytes written on success or error.
- * @ioctl:		IOCTL callback
- *			@mdev: mediated device structure
- *			@cmd: ioctl command
- *			@arg: arguments to ioctl
- * @mmap:		mmap callback
- *			@mdev: mediated device structure
- *			@vma: vma structure
- * @request:		request callback to release device
- *			@mdev: mediated device structure
- *			@count: request sequence number
  * Parent device that support mediated device should be registered with mdev
  * module with mdev_parent_ops structure.
  **/
@@ -109,19 +70,6 @@ struct mdev_parent_ops {
 	const struct attribute_group **dev_attr_groups;
 	const struct attribute_group **mdev_attr_groups;
 	struct attribute_group **supported_type_groups;
-
-	int     (*create)(struct mdev_device *mdev);
-	int     (*remove)(struct mdev_device *mdev);
-	int     (*open)(struct mdev_device *mdev);
-	void    (*release)(struct mdev_device *mdev);
-	ssize_t (*read)(struct mdev_device *mdev, char __user *buf,
-			size_t count, loff_t *ppos);
-	ssize_t (*write)(struct mdev_device *mdev, const char __user *buf,
-			 size_t count, loff_t *ppos);
-	long	(*ioctl)(struct mdev_device *mdev, unsigned int cmd,
-			 unsigned long arg);
-	int	(*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
-	void	(*request)(struct mdev_device *mdev, unsigned int count);
 };
 
 /* interface for exporting mdev supported type attributes */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index a2c5b30e1763ba..c5e08be4c56395 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -64,6 +64,10 @@ void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 int vfio_register_group_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
+static inline void vfio_device_get(struct vfio_device *device)
+{
+	refcount_inc(&device->refcount);
+}
 extern void vfio_device_put(struct vfio_device *device);
 
 /* events for the backend driver notify callback */
-- 
2.31.1



* [PATCH v2 10/13] vfio/mdev: Remove mdev_parent_ops dev_attr_groups
  2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2021-04-26 20:00 ` [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c Jason Gunthorpe
@ 2021-04-26 20:00 ` Jason Gunthorpe
  2021-04-26 20:00 ` [PATCH v2 11/13] vfio/mdev: Remove mdev_parent_ops Jason Gunthorpe
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

This is only used by one sample to print a fixed string that is pointless.

In general, having a device driver attach sysfs attributes to the parent
is horrific. This should never happen, and always leads to some kind of
lifetime bug, as it becomes very difficult for the sysfs attribute to get
back to any data owned by the device driver.

Remove the general mechanism to create this abuse.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/mdev/mdev_sysfs.c | 12 ++----------
 include/linux/mdev.h           |  2 --
 samples/vfio-mdev/mtty.c       | 30 +-----------------------------
 3 files changed, 3 insertions(+), 41 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
index f5cf1931c54e48..66eef08833a4ef 100644
--- a/drivers/vfio/mdev/mdev_sysfs.c
+++ b/drivers/vfio/mdev/mdev_sysfs.c
@@ -197,7 +197,6 @@ void parent_remove_sysfs_files(struct mdev_parent *parent)
 		remove_mdev_supported_type(type);
 	}
 
-	sysfs_remove_groups(&parent->dev->kobj, parent->ops->dev_attr_groups);
 	kset_unregister(parent->mdev_types_kset);
 }
 
@@ -213,17 +212,10 @@ int parent_create_sysfs_files(struct mdev_parent *parent)
 
 	INIT_LIST_HEAD(&parent->type_list);
 
-	ret = sysfs_create_groups(&parent->dev->kobj,
-				  parent->ops->dev_attr_groups);
-	if (ret)
-		goto create_err;
-
 	ret = add_mdev_supported_type_groups(parent);
 	if (ret)
-		sysfs_remove_groups(&parent->dev->kobj,
-				    parent->ops->dev_attr_groups);
-	else
-		return ret;
+		goto create_err;
+	return 0;
 
 create_err:
 	kset_unregister(parent->mdev_types_kset);
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index ea48c401e4fa63..fd9fe1dcf0e230 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -57,7 +57,6 @@ struct device *mtype_get_parent_dev(struct mdev_type *mtype);
  *
  * @owner:		The module owner.
  * @device_driver:	Which device driver to probe() on newly created devices
- * @dev_attr_groups:	Attributes of the parent device.
  * @mdev_attr_groups:	Attributes of the mediated device.
  * @supported_type_groups: Attributes to define supported types. It is mandatory
  *			to provide supported types.
@@ -67,7 +66,6 @@ struct device *mtype_get_parent_dev(struct mdev_type *mtype);
 struct mdev_parent_ops {
 	struct module   *owner;
 	struct mdev_driver *device_driver;
-	const struct attribute_group **dev_attr_groups;
 	const struct attribute_group **mdev_attr_groups;
 	struct attribute_group **supported_type_groups;
 };
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index d2a168420b775d..31eec76bc553ce 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -1207,38 +1207,11 @@ static void mtty_close(struct vfio_device *mdev)
 	pr_info("%s\n", __func__);
 }
 
-static ssize_t
-sample_mtty_dev_show(struct device *dev, struct device_attribute *attr,
-		     char *buf)
-{
-	return sprintf(buf, "This is phy device\n");
-}
-
-static DEVICE_ATTR_RO(sample_mtty_dev);
-
-static struct attribute *mtty_dev_attrs[] = {
-	&dev_attr_sample_mtty_dev.attr,
-	NULL,
-};
-
-static const struct attribute_group mtty_dev_group = {
-	.name  = "mtty_dev",
-	.attrs = mtty_dev_attrs,
-};
-
-static const struct attribute_group *mtty_dev_groups[] = {
-	&mtty_dev_group,
-	NULL,
-};
-
 static ssize_t
 sample_mdev_dev_show(struct device *dev, struct device_attribute *attr,
 		     char *buf)
 {
-	if (mdev_from_dev(dev))
-		return sprintf(buf, "This is MDEV %s\n", dev_name(dev));
-
-	return sprintf(buf, "\n");
+	return sprintf(buf, "This is MDEV %s\n", dev_name(dev));
 }
 
 static DEVICE_ATTR_RO(sample_mdev_dev);
@@ -1340,7 +1313,6 @@ static struct mdev_driver mtty_driver = {
 static const struct mdev_parent_ops mdev_fops = {
 	.owner                  = THIS_MODULE,
 	.device_driver		= &mtty_driver,
-	.dev_attr_groups        = mtty_dev_groups,
 	.supported_type_groups  = mdev_type_groups,
 };
 
-- 
2.31.1



* [PATCH v2 11/13] vfio/mdev: Remove mdev_parent_ops
  2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2021-04-26 20:00 ` [PATCH v2 10/13] vfio/mdev: Remove mdev_parent_ops dev_attr_groups Jason Gunthorpe
@ 2021-04-26 20:00 ` Jason Gunthorpe
  2021-04-26 20:00 ` [PATCH v2 12/13] vfio/mdev: Use the driver core to create the 'remove' file Jason Gunthorpe
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Jonathan Corbet,
	Daniel Vetter, dri-devel, Eric Farman, Harald Freudenberger,
	Vasily Gorbik, Heiko Carstens, intel-gfx, intel-gvt-dev,
	Jani Nikula, Joonas Lahtinen, kvm, Kirti Wankhede, linux-doc,
	linux-s390, Peter Oberparleiter, Halil Pasic, Pierre Morel,
	Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang, Zhi Wang
  Cc: Raj, Ashok, Dan Williams, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

The last useful member of this struct is supported_type_groups. Move it to
the mdev_driver and delete mdev_parent_ops.

Replace it with mdev_driver as the argument to mdev_register_device().
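
As a rough user-space illustration of the resulting shape (not kernel code):
the types below are simplified stand-ins that only borrow names from this
patch, and mock_register_device() and mock_count_types() are hypothetical
helpers. The sketch shows the mandatory supported_type_groups check and the
NULL-terminated walk over the type groups once they live in the mdev_driver.
The extra NULL check on the driver pointer is an addition of this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; illustrative only. */
struct attribute_group {
	const char *name;
};

struct mdev_driver {
	struct attribute_group **supported_type_groups;
};

struct mdev_parent {
	const struct mdev_driver *mdev_driver;
};

/* Hypothetical mock of the mandatory-field check done at registration. */
static int mock_register_device(struct mdev_parent *parent,
				struct mdev_driver *drv)
{
	/* The NULL test on drv is an assumption added for the sketch. */
	if (!drv || !drv->supported_type_groups)
		return -EINVAL;

	parent->mdev_driver = drv;
	return 0;
}

/* Walk the NULL-terminated list, as the sysfs code does per type. */
static int mock_count_types(const struct mdev_parent *parent)
{
	int i;

	for (i = 0; parent->mdev_driver->supported_type_groups[i]; i++)
		;
	return i;
}
```

With this layout a parent needs only one object, the driver, to describe
both how to bind new mdevs and which types may be created.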

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 .../driver-api/vfio-mediated-device.rst       | 37 +++++++------------
 drivers/gpu/drm/i915/gvt/kvmgt.c              |  8 +---
 drivers/s390/cio/vfio_ccw_ops.c               |  7 +---
 drivers/s390/crypto/vfio_ap_ops.c             |  9 +----
 drivers/vfio/mdev/mdev_core.c                 | 13 +++----
 drivers/vfio/mdev/mdev_driver.c               |  2 +-
 drivers/vfio/mdev/mdev_private.h              |  2 +-
 drivers/vfio/mdev/mdev_sysfs.c                |  6 +--
 include/linux/mdev.h                          | 24 ++----------
 samples/vfio-mdev/mbochs.c                    |  9 +----
 samples/vfio-mdev/mdpy.c                      |  9 +----
 samples/vfio-mdev/mtty.c                      |  9 +----
 12 files changed, 39 insertions(+), 96 deletions(-)

diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst
index 5f866b17c93e69..a073d0bb06e7fd 100644
--- a/Documentation/driver-api/vfio-mediated-device.rst
+++ b/Documentation/driver-api/vfio-mediated-device.rst
@@ -93,7 +93,7 @@ interfaces:
 Registration Interface for a Mediated Bus Driver
 ------------------------------------------------
 
-The registration interface for a mediated bus driver provides the following
+The registration interface for a mediated device driver provides the following
 structure to represent a mediated device's driver::
 
      /*
@@ -105,6 +105,7 @@ structure to represent a mediated device's driver::
      struct mdev_driver {
 	     int  (*probe)  (struct mdev_device *dev);
 	     void (*remove) (struct mdev_device *dev);
+	     struct attribute_group **supported_type_groups;
 	     struct device_driver    driver;
      };
 
@@ -119,35 +120,25 @@ to register and unregister itself with the core driver:
 
     extern void mdev_unregister_driver(struct mdev_driver *drv);
 
-The mediated bus driver is responsible for adding mediated devices to the VFIO
-group when devices are bound to the driver and removing mediated devices from
-the VFIO when devices are unbound from the driver.
+The mediated bus driver's probe function should create a vfio_device on top of
+the mdev_device and connect it to an appropriate implementation of
+vfio_device_ops.
 
-
-Physical Device Driver Interface
---------------------------------
-
-The physical device driver interface provides the mdev_parent_ops[3] structure
-to define the APIs to manage work in the mediated core driver that is related
-to the physical device.
-
-The structures in the mdev_parent_ops structure are as follows:
-
-* dev_attr_groups: attributes of the parent device
-* mdev_attr_groups: attributes of the mediated device
-* supported_config: attributes to define supported configurations
-
-A driver should use the mdev_parent_ops structure in the function call to
-register itself with the mdev core driver::
+When a driver wants to add the GUID creation sysfs to an existing device it has
+probe'd to then it should call:
 
 	extern int  mdev_register_device(struct device *dev,
-	                                 const struct mdev_parent_ops *ops);
+	                                 struct mdev_driver *mdev_driver);
+
+This will provide the 'mdev_supported_types/XX/create' files which can then be
+used to trigger the creation of a mdev_device. The created mdev_device will be
+attached to the specified driver.
 
-However, the mdev_parent_ops structure is not required in the function call
-that a driver should use to unregister itself with the mdev core driver::
+When the driver needs to remove itself it calls:
 
 	extern void mdev_unregister_device(struct device *dev);
 
+Which will unbind and destroy all the created mdevs and remove the sysfs files.
 
 Mediated Device Management Interface Through sysfs
 ==================================================
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 85ef300087e091..02089efd15bb92 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1669,10 +1669,6 @@ static struct mdev_driver intel_vgpu_mdev_driver = {
 	.remove	= intel_vgpu_remove,
 };
 
-static struct mdev_parent_ops intel_vgpu_ops = {
-	.device_driver		= &intel_vgpu_mdev_driver,
-};
-
 static int kvmgt_host_init(struct device *dev, void *gvt, const void *ops)
 {
 	struct attribute_group **kvm_vgpu_type_groups;
@@ -1680,9 +1676,9 @@ static int kvmgt_host_init(struct device *dev, void *gvt, const void *ops)
 	intel_gvt_ops = ops;
 	if (!intel_gvt_ops->get_gvt_attrs(&kvm_vgpu_type_groups))
 		return -EFAULT;
-	intel_vgpu_ops.supported_type_groups = kvm_vgpu_type_groups;
+	intel_vgpu_mdev_driver.supported_type_groups = kvm_vgpu_type_groups;
 
-	return mdev_register_device(dev, &intel_vgpu_ops);
+	return mdev_register_device(dev, &intel_vgpu_mdev_driver);
 }
 
 static void kvmgt_host_exit(struct device *dev)
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 0fcf46031d3821..161697529dcc41 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -655,17 +655,12 @@ struct mdev_driver vfio_ccw_mdev_driver = {
 	},
 	.probe = vfio_ccw_mdev_probe,
 	.remove = vfio_ccw_mdev_remove,
-};
-
-static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
-	.owner			= THIS_MODULE,
-	.device_driver		= &vfio_ccw_mdev_driver,
 	.supported_type_groups  = mdev_type_groups,
 };
 
 int vfio_ccw_mdev_reg(struct subchannel *sch)
 {
-	return mdev_register_device(&sch->dev, &vfio_ccw_mdev_ops);
+	return mdev_register_device(&sch->dev, &vfio_ccw_mdev_driver);
 }
 
 void vfio_ccw_mdev_unreg(struct subchannel *sch)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 79872c857dd522..92789257c87639 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1339,12 +1339,7 @@ static struct mdev_driver vfio_ap_matrix_driver = {
 	},
 	.probe = vfio_ap_mdev_probe,
 	.remove = vfio_ap_mdev_remove,
-};
-
-static const struct mdev_parent_ops vfio_ap_matrix_ops = {
-	.owner			= THIS_MODULE,
-	.device_driver		= &vfio_ap_matrix_driver,
-	.supported_type_groups	= vfio_ap_mdev_type_groups,
+	.supported_type_groups = vfio_ap_mdev_type_groups,
 };
 
 int vfio_ap_mdev_register(void)
@@ -1357,7 +1352,7 @@ int vfio_ap_mdev_register(void)
 	if (ret)
 		return ret;
 
-	ret = mdev_register_device(&matrix_dev->device, &vfio_ap_matrix_ops);
+	ret = mdev_register_device(&matrix_dev->device, &vfio_ap_matrix_driver);
 	if (ret)
 		goto err_driver;
 	return 0;
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index d507047e6ecf4a..cd1ab9fe299445 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -109,12 +109,12 @@ static int mdev_device_remove_cb(struct device *dev, void *data)
 /*
  * mdev_register_device : Register a device
  * @dev: device structure representing parent device.
- * @ops: Parent device operation structure to be registered.
+ * @mdev_driver: Device driver to bind to the newly created mdev
  *
  * Add device to list of registered parent devices.
  * Returns a negative value on error, otherwise 0.
  */
-int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
+int mdev_register_device(struct device *dev, struct mdev_driver *mdev_driver)
 {
 	int ret;
 	struct mdev_parent *parent;
@@ -122,9 +122,7 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
 	char *envp[] = { env_string, NULL };
 
 	/* check for mandatory ops */
-	if (!ops || !ops->supported_type_groups)
-		return -EINVAL;
-	if (!ops->device_driver)
+	if (!mdev_driver->supported_type_groups)
 		return -EINVAL;
 
 	dev = get_device(dev);
@@ -151,7 +149,7 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
 	init_rwsem(&parent->unreg_sem);
 
 	parent->dev = dev;
-	parent->ops = ops;
+	parent->mdev_driver = mdev_driver;
 
 	if (!mdev_bus_compat_class) {
 		mdev_bus_compat_class = class_compat_register("mdev_bus");
@@ -257,7 +255,7 @@ static int mdev_bind_driver(struct mdev_device *mdev)
 	while (1) {
 		device_lock(&mdev->dev);
 		if (mdev->dev.driver ==
-		    &mdev->type->parent->ops->device_driver->driver) {
+		    &mdev->type->parent->mdev_driver->driver) {
 			ret = 0;
 			goto out_unlock;
 		}
@@ -304,7 +302,6 @@ int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
 	mdev->dev.parent  = parent->dev;
 	mdev->dev.bus = &mdev_bus_type;
 	mdev->dev.release = mdev_device_release;
-	mdev->dev.groups = parent->ops->mdev_attr_groups;
 	mdev->type = type;
 	/* Pairs with the put in mdev_device_release() */
 	kobject_get(&type->kobj);
diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
index 07ada55efd6228..d743a9f51f4c90 100644
--- a/drivers/vfio/mdev/mdev_driver.c
+++ b/drivers/vfio/mdev/mdev_driver.c
@@ -75,7 +75,7 @@ static int mdev_match(struct device *dev, struct device_driver *drv)
 {
 	struct mdev_device *mdev = to_mdev_device(dev);
 
-	return drv == &mdev->type->parent->ops->device_driver->driver;
+	return drv == &mdev->type->parent->mdev_driver->driver;
 }
 
 struct bus_type mdev_bus_type = {
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index a656cfe0346c33..839567d059a07d 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -15,7 +15,7 @@ void mdev_bus_unregister(void);
 
 struct mdev_parent {
 	struct device *dev;
-	const struct mdev_parent_ops *ops;
+	const struct mdev_driver *mdev_driver;
 	struct kref ref;
 	struct list_head next;
 	struct kset *mdev_types_kset;
diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
index 66eef08833a4ef..5a3873d1a275ae 100644
--- a/drivers/vfio/mdev/mdev_sysfs.c
+++ b/drivers/vfio/mdev/mdev_sysfs.c
@@ -97,7 +97,7 @@ static struct mdev_type *add_mdev_supported_type(struct mdev_parent *parent,
 {
 	struct mdev_type *type;
 	struct attribute_group *group =
-		parent->ops->supported_type_groups[type_group_id];
+		parent->mdev_driver->supported_type_groups[type_group_id];
 	int ret;
 
 	if (!group->name) {
@@ -154,7 +154,7 @@ static struct mdev_type *add_mdev_supported_type(struct mdev_parent *parent,
 static void remove_mdev_supported_type(struct mdev_type *type)
 {
 	struct attribute_group *group =
-		type->parent->ops->supported_type_groups[type->type_group_id];
+		type->parent->mdev_driver->supported_type_groups[type->type_group_id];
 
 	sysfs_remove_files(&type->kobj,
 			   (const struct attribute **)group->attrs);
@@ -168,7 +168,7 @@ static int add_mdev_supported_type_groups(struct mdev_parent *parent)
 {
 	int i;
 
-	for (i = 0; parent->ops->supported_type_groups[i]; i++) {
+	for (i = 0; parent->mdev_driver->supported_type_groups[i]; i++) {
 		struct mdev_type *type;
 
 		type = add_mdev_supported_type(parent, i);
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index fd9fe1dcf0e230..af807c77c1e0f5 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -51,25 +51,6 @@ unsigned int mdev_get_type_group_id(struct mdev_device *mdev);
 unsigned int mtype_get_type_group_id(struct mdev_type *mtype);
 struct device *mtype_get_parent_dev(struct mdev_type *mtype);
 
-/**
- * struct mdev_parent_ops - Structure to be registered for each parent device to
- * register the device to mdev module.
- *
- * @owner:		The module owner.
- * @device_driver:	Which device driver to probe() on newly created devices
- * @mdev_attr_groups:	Attributes of the mediated device.
- * @supported_type_groups: Attributes to define supported types. It is mandatory
- *			to provide supported types.
- * Parent device that support mediated device should be registered with mdev
- * module with mdev_parent_ops structure.
- **/
-struct mdev_parent_ops {
-	struct module   *owner;
-	struct mdev_driver *device_driver;
-	const struct attribute_group **mdev_attr_groups;
-	struct attribute_group **supported_type_groups;
-};
-
 /* interface for exporting mdev supported type attributes */
 struct mdev_type_attribute {
 	struct attribute attr;
@@ -94,12 +75,15 @@ struct mdev_type_attribute mdev_type_attr_##_name =		\
  * struct mdev_driver - Mediated device driver
  * @probe: called when new device created
  * @remove: called when device removed
+ * @supported_type_groups: Attributes to define supported types. It is mandatory
+ *			to provide supported types.
  * @driver: device driver structure
  *
  **/
 struct mdev_driver {
 	int (*probe)(struct mdev_device *dev);
 	void (*remove)(struct mdev_device *dev);
+	struct attribute_group **supported_type_groups;
 	struct device_driver driver;
 };
 
@@ -118,7 +102,7 @@ static inline const guid_t *mdev_uuid(struct mdev_device *mdev)
 
 extern struct bus_type mdev_bus_type;
 
-int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops);
+int mdev_register_device(struct device *dev, struct mdev_driver *mdev_driver);
 void mdev_unregister_device(struct device *dev);
 
 int mdev_register_driver(struct mdev_driver *drv);
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index e18821a8a6beb8..c76ceec584b41b 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -1418,12 +1418,7 @@ static struct mdev_driver mbochs_driver = {
 	},
 	.probe = mbochs_probe,
 	.remove	= mbochs_remove,
-};
-
-static const struct mdev_parent_ops mdev_fops = {
-	.owner			= THIS_MODULE,
-	.device_driver		= &mbochs_driver,
-	.supported_type_groups	= mdev_type_groups,
+	.supported_type_groups = mdev_type_groups,
 };
 
 static const struct file_operations vd_fops = {
@@ -1466,7 +1461,7 @@ static int __init mbochs_dev_init(void)
 	if (ret)
 		goto err_class;
 
-	ret = mdev_register_device(&mbochs_dev, &mdev_fops);
+	ret = mdev_register_device(&mbochs_dev, &mbochs_driver);
 	if (ret)
 		goto err_device;
 
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 82638de333330d..c22b2c808d132d 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -735,12 +735,7 @@ static struct mdev_driver mdpy_driver = {
 	},
 	.probe = mdpy_probe,
 	.remove	= mdpy_remove,
-};
-
-static const struct mdev_parent_ops mdev_fops = {
-	.owner			= THIS_MODULE,
-	.device_driver          = &mdpy_driver,
-	.supported_type_groups	= mdev_type_groups,
+	.supported_type_groups = mdev_type_groups,
 };
 
 static const struct file_operations vd_fops = {
@@ -783,7 +778,7 @@ static int __init mdpy_dev_init(void)
 	if (ret)
 		goto err_class;
 
-	ret = mdev_register_device(&mdpy_dev, &mdev_fops);
+	ret = mdev_register_device(&mdpy_dev, &mdpy_driver);
 	if (ret)
 		goto err_device;
 
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 31eec76bc553ce..87f5ba12a230e3 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -1308,12 +1308,7 @@ static struct mdev_driver mtty_driver = {
 	},
 	.probe = mtty_probe,
 	.remove	= mtty_remove,
-};
-
-static const struct mdev_parent_ops mdev_fops = {
-	.owner                  = THIS_MODULE,
-	.device_driver		= &mtty_driver,
-	.supported_type_groups  = mdev_type_groups,
+	.supported_type_groups = mdev_type_groups,
 };
 
 static void mtty_device_release(struct device *dev)
@@ -1364,7 +1359,7 @@ static int __init mtty_dev_init(void)
 	if (ret)
 		goto err_class;
 
-	ret = mdev_register_device(&mtty_dev.dev, &mdev_fops);
+	ret = mdev_register_device(&mtty_dev.dev, &mtty_driver);
 	if (ret)
 		goto err_device;
 
-- 
2.31.1



* [PATCH v2 12/13] vfio/mdev: Use the driver core to create the 'remove' file
  2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
                   ` (8 preceding siblings ...)
  2021-04-26 20:00 ` [PATCH v2 11/13] vfio/mdev: Remove mdev_parent_ops Jason Gunthorpe
@ 2021-04-26 20:00 ` Jason Gunthorpe
  2021-04-26 20:00 ` [PATCH v2 13/13] vfio/mdev: Remove mdev drvdata Jason Gunthorpe
  2021-04-27 21:30 ` [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Alex Williamson
  11 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

The device creator is supposed to use the dev.groups value to add sysfs
files before device_add is called, not call sysfs_create_files() after
device_add() returns. This creates a race with uevent delivery where the
extra attribute will not be visible.

This was being done because the groups had been co-opted by the mdev
driver. Now that prior patches have moved the driver's groups to the
struct device_driver, dev.groups is properly free for use here.
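
The ordering issue can be modelled in user space, under heavy simplification.
Everything below is a mock: mock_device, count_attrs() and mock_device_add()
are hypothetical names, and the "uevent" is just a snapshot of how many
attributes are reachable at the moment the device is added.

```c
#include <stddef.h>

/* Simplified stand-ins for the sysfs types; illustrative only. */
struct attribute {
	const char *name;
};

struct attribute_group {
	struct attribute **attrs;
};

struct mock_device {
	const struct attribute_group **groups;
	int attrs_at_uevent;	/* what a uevent listener would have seen */
};

/* Count attributes reachable through a NULL-terminated groups array. */
static int count_attrs(const struct attribute_group **groups)
{
	int n = 0;

	if (!groups)
		return 0;
	for (; *groups; groups++)
		for (struct attribute **a = (*groups)->attrs; *a; a++)
			n++;
	return n;
}

/* Mock of device_add(): the snapshot stands in for uevent delivery. */
static void mock_device_add(struct mock_device *dev)
{
	dev->attrs_at_uevent = count_attrs(dev->groups);
}
```

A device whose groups pointer is set before mock_device_add() is seen with
its attributes; one that gains them afterwards is seen with none, which is
the race described above.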

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/mdev/mdev_core.c    |  1 +
 drivers/vfio/mdev/mdev_private.h |  2 ++
 drivers/vfio/mdev/mdev_sysfs.c   | 19 ++++++++++---------
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index cd1ab9fe299445..a61685d8844d44 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -302,6 +302,7 @@ int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
 	mdev->dev.parent  = parent->dev;
 	mdev->dev.bus = &mdev_bus_type;
 	mdev->dev.release = mdev_device_release;
+	mdev->dev.groups = mdev_device_groups;
 	mdev->type = type;
 	/* Pairs with the put in mdev_device_release() */
 	kobject_get(&type->kobj);
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index 839567d059a07d..c6944d3eaf78fa 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -32,6 +32,8 @@ struct mdev_type {
 	unsigned int type_group_id;
 };
 
+extern const struct attribute_group *mdev_device_groups[];
+
 #define to_mdev_type_attr(_attr)	\
 	container_of(_attr, struct mdev_type_attribute, attr)
 #define to_mdev_type(_kobj)		\
diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
index 5a3873d1a275ae..0ccfeb3dda2455 100644
--- a/drivers/vfio/mdev/mdev_sysfs.c
+++ b/drivers/vfio/mdev/mdev_sysfs.c
@@ -244,11 +244,20 @@ static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
 
 static DEVICE_ATTR_WO(remove);
 
-static const struct attribute *mdev_device_attrs[] = {
+static struct attribute *mdev_device_attrs[] = {
 	&dev_attr_remove.attr,
 	NULL,
 };
 
+static const struct attribute_group mdev_device_group = {
+	.attrs = mdev_device_attrs,
+};
+
+const struct attribute_group *mdev_device_groups[] = {
+	&mdev_device_group,
+	NULL
+};
+
 int mdev_create_sysfs_files(struct mdev_device *mdev)
 {
 	struct mdev_type *type = mdev->type;
@@ -262,15 +271,8 @@ int mdev_create_sysfs_files(struct mdev_device *mdev)
 	ret = sysfs_create_link(kobj, &type->kobj, "mdev_type");
 	if (ret)
 		goto type_link_failed;
-
-	ret = sysfs_create_files(kobj, mdev_device_attrs);
-	if (ret)
-		goto create_files_failed;
-
 	return ret;
 
-create_files_failed:
-	sysfs_remove_link(kobj, "mdev_type");
 type_link_failed:
 	sysfs_remove_link(mdev->type->devices_kobj, dev_name(&mdev->dev));
 	return ret;
@@ -280,7 +282,6 @@ void mdev_remove_sysfs_files(struct mdev_device *mdev)
 {
 	struct kobject *kobj = &mdev->dev.kobj;
 
-	sysfs_remove_files(kobj, mdev_device_attrs);
 	sysfs_remove_link(kobj, "mdev_type");
 	sysfs_remove_link(mdev->type->devices_kobj, dev_name(&mdev->dev));
 }
-- 
2.31.1



* [PATCH v2 13/13] vfio/mdev: Remove mdev drvdata
  2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
                   ` (9 preceding siblings ...)
  2021-04-26 20:00 ` [PATCH v2 12/13] vfio/mdev: Use the driver core to create the 'remove' file Jason Gunthorpe
@ 2021-04-26 20:00 ` Jason Gunthorpe
  2021-04-27 21:30 ` [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Alex Williamson
  11 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-26 20:00 UTC (permalink / raw)
  To: kvm, Kirti Wankhede
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

This is no longer used, remove it.

All usages were moved over to either use container_of() from a vfio_device
or to use dev_get_drvdata() directly on the mdev.
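
The container_of() route can be sketched in user space as follows. This is a
minimal illustration, not code from the series: the macro is a simplified
version of the kernel's container_of() without its type checking, and
my_mdev_state and to_state() are hypothetical names.

```c
#include <stddef.h>

/* Simplified container_of(); the kernel version adds type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock of the core object that driver callbacks receive. */
struct vfio_device {
	int open_count;
};

/* Hypothetical per-device driver state embedding the core object. */
struct my_mdev_state {
	int nr_ports;
	struct vfio_device vdev;
};

/* Recover the driver state from the embedded member, no drvdata needed. */
static struct my_mdev_state *to_state(struct vfio_device *vdev)
{
	return container_of(vdev, struct my_mdev_state, vdev);
}
```

Because the driver state is found by pointer arithmetic from the embedded
member, no separate private-data pointer has to be stored or kept in sync.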

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/mdev.h | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index af807c77c1e0f5..2c7267f1356d78 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -15,7 +15,6 @@ struct mdev_type;
 struct mdev_device {
 	struct device dev;
 	guid_t uuid;
-	void *driver_data;
 	struct list_head next;
 	struct mdev_type *type;
 	struct device *iommu_device;
@@ -87,14 +86,6 @@ struct mdev_driver {
 	struct device_driver driver;
 };
 
-static inline void *mdev_get_drvdata(struct mdev_device *mdev)
-{
-	return mdev->driver_data;
-}
-static inline void mdev_set_drvdata(struct mdev_device *mdev, void *data)
-{
-	mdev->driver_data = data;
-}
 static inline const guid_t *mdev_uuid(struct mdev_device *mdev)
 {
 	return &mdev->uuid;
-- 
2.31.1



* Re: [PATCH v2 01/13] vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE
  2021-04-26 20:00 ` [PATCH v2 01/13] vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE Jason Gunthorpe
@ 2021-04-27 11:05   ` Cornelia Huck
  0 siblings, 0 replies; 59+ messages in thread
From: Cornelia Huck @ 2021-04-27 11:05 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Jonathan Corbet, Daniel Vetter, dri-devel,
	Vasily Gorbik, Heiko Carstens, intel-gfx, Jani Nikula,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-doc, linux-s390,
	Halil Pasic, Pierre Morel, Rodrigo Vivi, Raj, Ashok,
	Dan Williams, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Mon, 26 Apr 2021 17:00:03 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> For some reason the vfio_mdev shim mdev_driver has its own module and
> kconfig. As the next patch requires access to it from mdev.ko merge the
> two modules together and remove VFIO_MDEV_DEVICE.
> 
> A later patch deletes this driver entirely.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  Documentation/s390/vfio-ap.rst   |  1 -
>  arch/s390/Kconfig                |  2 +-
>  drivers/gpu/drm/i915/Kconfig     |  2 +-
>  drivers/vfio/mdev/Kconfig        |  7 -------
>  drivers/vfio/mdev/Makefile       |  3 +--
>  drivers/vfio/mdev/mdev_core.c    | 16 ++++++++++++++--
>  drivers/vfio/mdev/mdev_private.h |  2 ++
>  drivers/vfio/mdev/vfio_mdev.c    | 24 +-----------------------
>  samples/Kconfig                  |  6 +++---
>  9 files changed, 23 insertions(+), 40 deletions(-)

This also fixes the dependencies for vfio-ccw, which never depended on
VFIO_MDEV_DEVICE directly...

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-26 20:00 ` [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind Jason Gunthorpe
@ 2021-04-27 12:32   ` Cornelia Huck
  2021-04-27 23:20     ` Jason Gunthorpe
  2021-04-28  6:03   ` Christoph Hellwig
  2021-04-28  6:44   ` Leon Romanovsky
  2 siblings, 1 reply; 59+ messages in thread
From: Cornelia Huck @ 2021-04-27 12:32 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, kvm, Kirti Wankhede, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Mon, 26 Apr 2021 17:00:04 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This allows a mdev driver to opt out of using vfio_mdev.c, instead the
> driver will provide a 'struct mdev_driver' and register directly with the
> driver core.
> 
> Much of mdev_parent_ops becomes unused in this mode:
> - create()/remove() are done via the mdev_driver probe()/remove()
> - mdev_attr_groups becomes mdev_driver driver.dev_groups
> - Wrapper function callbacks are replaced with the same ones from
>   struct vfio_device_ops
> 
> Following patches convert all the drivers.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/mdev/mdev_core.c   | 64 ++++++++++++++++++++++++++++-----
>  drivers/vfio/mdev/mdev_driver.c | 17 ++++++++-
>  include/linux/mdev.h            |  3 ++
>  3 files changed, 75 insertions(+), 9 deletions(-)
> 

(...)

> +/*
> + * mdev drivers can refuse to bind during probe(), in this case we want to fail
> + * the creation of the mdev all the way back to sysfs. This is a weird model
> + * that doesn't fit in the driver core well, nor does it seem to appear any
> + * place else in the kernel, so use a simple hack.
> + */
> +static int mdev_bind_driver(struct mdev_device *mdev)
> +{
> +	struct mdev_driver *drv = mdev->type->parent->ops->device_driver;
> +	int ret;
> +
> +	if (!drv)
> +		drv = &vfio_mdev_driver;
> +
> +	while (1) {
> +		device_lock(&mdev->dev);
> +		if (mdev->dev.driver == &drv->driver) {
> +			ret = 0;
> +			goto out_unlock;
> +		}
> +		if (mdev->probe_err) {
> +			ret = mdev->probe_err;
> +			goto out_unlock;
> +		}
> +		device_unlock(&mdev->dev);
> +		ret = device_attach(&mdev->dev);
> +		if (ret)
> +			return ret;

device_attach() can return 0 (no driver), 1 (bound), or -ENODEV (device
not registered). I would expect mdev_bind_driver() to return 0 in case
of success and !0 otherwise, and I think the calling code does so as
well?

> +		mdev->probe_err = -EINVAL;
> +	}
> +	return 0;
> +
> +out_unlock:
> +	device_unlock(&mdev->dev);
> +	return ret;
> +}
> +

(...)

Rest of the patch looks good to me.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-26 20:00 ` [PATCH v2 07/13] vfio/ccw: " Jason Gunthorpe
@ 2021-04-27 20:06   ` Eric Farman
  2021-04-27 22:10     ` Jason Gunthorpe
  2021-04-28 17:09   ` Cornelia Huck
  1 sibling, 1 reply; 59+ messages in thread
From: Eric Farman @ 2021-04-27 20:06 UTC (permalink / raw)
  To: Jason Gunthorpe, Christian Borntraeger, Cornelia Huck,
	Vasily Gorbik, Heiko Carstens, kvm, linux-s390,
	Peter Oberparleiter, Halil Pasic, Vineeth Vijayan
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Mon, 2021-04-26 at 17:00 -0300, Jason Gunthorpe wrote:
> This is more complicated because vfio_ccw is sharing the vfio_device
> between both the mdev_device and its vfio_device and the css_driver.
> 
> The mdev is a singleton, and the reason for this sharing appears to be to
> allow the extra css_driver function callbacks to be delivered to the
> vfio_device.
> 
> This keeps things as they were, with the css_driver allocating the
> singleton, not the mdev_driver, this is pretty confusing. I'm also
> uncertain how the lifetime model for the mdev works in the css_driver
> callbacks.
> 
> At this point embed the vfio_device in the vfio_ccw_private and
> instantiate it as a vfio_device when the mdev probes. The drvdata of both
> the css_device and the mdev_device point at the private, and container_of
> is used to get it back from the vfio_device.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/s390/cio/vfio_ccw_drv.c     |  21 +++--
>  drivers/s390/cio/vfio_ccw_ops.c     | 135 +++++++++++++++-------------
>  drivers/s390/cio/vfio_ccw_private.h |   5 ++
>  3 files changed, 94 insertions(+), 67 deletions(-)
> 
> 

...snip...

> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
> index 491a64c61fff1a..0fcf46031d3821 100644
> --- a/drivers/s390/cio/vfio_ccw_ops.c
> +++ b/drivers/s390/cio/vfio_ccw_ops.c
> @@ -17,13 +17,13 @@
>  
>  #include "vfio_ccw_private.h"
>  
> -static int vfio_ccw_mdev_reset(struct mdev_device *mdev)
> +static const struct vfio_device_ops vfio_ccw_dev_ops;
> +
> +static int vfio_ccw_mdev_reset(struct vfio_ccw_private *private)
>  {
> -	struct vfio_ccw_private *private;
>  	struct subchannel *sch;
>  	int ret;
>  
> -	private = dev_get_drvdata(mdev_parent_dev(mdev));
>  	sch = private->sch;
>  	/*
>  	 * TODO:
> @@ -61,7 +61,7 @@ static int vfio_ccw_mdev_notifier(struct
> notifier_block *nb,
>  		if (!cp_iova_pinned(&private->cp, unmap->iova))
>  			return NOTIFY_OK;
>  
> -		if (vfio_ccw_mdev_reset(private->mdev))
> +		if (vfio_ccw_mdev_reset(private))
>  			return NOTIFY_BAD;
>  
>  		cp_free(&private->cp);
> @@ -113,10 +113,11 @@ static struct attribute_group
> *mdev_type_groups[] = {
>  	NULL,
>  };
>  
> -static int vfio_ccw_mdev_create(struct mdev_device *mdev)
> +static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
>  {
>  	struct vfio_ccw_private *private =
>  		dev_get_drvdata(mdev_parent_dev(mdev));
> +	int ret;
>  
>  	if (private->state == VFIO_CCW_STATE_NOT_OPER)
>  		return -ENODEV;
> @@ -124,6 +125,10 @@ static int vfio_ccw_mdev_create(struct
> mdev_device *mdev)
>  	if (atomic_dec_if_positive(&private->avail) < 0)
>  		return -EPERM;
>  
> +	memset(&private->vdev, 0, sizeof(private->vdev));
> +	vfio_init_group_dev(&private->vdev, &mdev->dev,
> +			    &vfio_ccw_dev_ops);
> +
>  	private->mdev = mdev;
>  	private->state = VFIO_CCW_STATE_IDLE;
>  
> @@ -132,19 +137,28 @@ static int vfio_ccw_mdev_create(struct
> mdev_device *mdev)
>  			   private->sch->schid.ssid,
>  			   private->sch->schid.sch_no);
>  
> +	ret = vfio_register_group_dev(&private->vdev);
> +	if (ret)
> +		goto err_atomic;
> +	dev_set_drvdata(&mdev->dev, private);
>  	return 0;
> +
> +err_atomic:
> +	atomic_inc(&private->avail);

Since we're unwinding, should also do

private->mdev = NULL
private->state = VFIO_CCW_STATE_STANDBY
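
To make the suggested unwind concrete, here is a minimal userspace sketch,
not the driver code. struct fake_private, fake_probe() and the register_ret
parameter are hypothetical stand-ins; the sketch assumes the device was in
STANDBY before the probe and ignores locking and atomicity entirely.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins; the real states live in vfio_ccw_private.h. */
enum fake_state { FAKE_NOT_OPER, FAKE_STANDBY, FAKE_IDLE };

struct fake_private {
	int avail;		/* models atomic_t avail, without atomicity */
	void *mdev;		/* models private->mdev */
	enum fake_state state;	/* models private->state */
};

/*
 * register_ret stands in for the result of vfio_register_group_dev().
 * On failure every side effect made earlier in the function is undone,
 * in reverse order, so the caller sees the pre-probe state again.
 */
int fake_probe(struct fake_private *priv, void *mdev, int register_ret)
{
	if (priv->state == FAKE_NOT_OPER)
		return -ENODEV;
	if (priv->avail <= 0)
		return -EPERM;
	priv->avail--;

	priv->mdev = mdev;
	priv->state = FAKE_IDLE;

	if (register_ret) {
		priv->state = FAKE_STANDBY;	/* assumes STANDBY was the prior state */
		priv->mdev = NULL;
		priv->avail++;
		return register_ret;
	}
	return 0;
}
```

With only the atomic_inc() in the error path, the sketch would return an
error while still reporting an IDLE state and a stale mdev pointer.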

> +	return ret;
>  }
>  
> -static int vfio_ccw_mdev_remove(struct mdev_device *mdev)
> +static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
>  {
> -	struct vfio_ccw_private *private =
> -		dev_get_drvdata(mdev_parent_dev(mdev));
> +	struct vfio_ccw_private *private = dev_get_drvdata(&mdev->dev);
>  
>  	VFIO_CCW_MSG_EVENT(2, "mdev %pUl, sch %x.%x.%04x: remove\n",
>  			   mdev_uuid(mdev), private->sch->schid.cssid,
>  			   private->sch->schid.ssid,
>  			   private->sch->schid.sch_no);
>  
> +	vfio_unregister_group_dev(&private->vdev);
> +
>  	if ((private->state != VFIO_CCW_STATE_NOT_OPER) &&
>  	    (private->state != VFIO_CCW_STATE_STANDBY)) {
>  		if (!vfio_ccw_sch_quiesce(private->sch))
> @@ -155,20 +169,18 @@ static int vfio_ccw_mdev_remove(struct
> mdev_device *mdev)
>  	cp_free(&private->cp);
>  	private->mdev = NULL;
>  	atomic_inc(&private->avail);
> -
> -	return 0;
>  }
>  
> -static int vfio_ccw_mdev_open(struct mdev_device *mdev)
> +static int vfio_ccw_mdev_open(struct vfio_device *vdev)
>  {
>  	struct vfio_ccw_private *private =
> -		dev_get_drvdata(mdev_parent_dev(mdev));
> +		container_of(vdev, struct vfio_ccw_private, vdev);
>  	unsigned long events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
>  	int ret;
>  
>  	private->nb.notifier_call = vfio_ccw_mdev_notifier;
>  
> -	ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
> +	ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
>  				     &events, &private->nb);
>  	if (ret)
>  		return ret;
> @@ -189,27 +201,26 @@ static int vfio_ccw_mdev_open(struct
> mdev_device *mdev)
>  
>  out_unregister:
>  	vfio_ccw_unregister_dev_regions(private);
> -	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
> +	vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
>  				 &private->nb);
>  	return ret;
>  }
>  
> -static void vfio_ccw_mdev_release(struct mdev_device *mdev)
> +static void vfio_ccw_mdev_release(struct vfio_device *vdev)
>  {
>  	struct vfio_ccw_private *private =
> -		dev_get_drvdata(mdev_parent_dev(mdev));
> +		container_of(vdev, struct vfio_ccw_private, vdev);
>  
>  	if ((private->state != VFIO_CCW_STATE_NOT_OPER) &&
>  	    (private->state != VFIO_CCW_STATE_STANDBY)) {
> -		if (!vfio_ccw_mdev_reset(mdev))
> +		if (!vfio_ccw_mdev_reset(private))
>  			private->state = VFIO_CCW_STATE_STANDBY;
>  		/* The state will be NOT_OPER on error. */
>  	}
>  
>  	cp_free(&private->cp);
>  	vfio_ccw_unregister_dev_regions(private);
> -	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
> -				 &private->nb);
> +	vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, &private->nb);
>  }
>  
>  static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
> @@ -233,15 +244,14 @@ static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
>  	return ret;
>  }
>  
> -static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
> +static ssize_t vfio_ccw_mdev_read(struct vfio_device *vdev,
>  				  char __user *buf,
>  				  size_t count,
>  				  loff_t *ppos)
>  {
> +	struct vfio_ccw_private *private =
> +		container_of(vdev, struct vfio_ccw_private, vdev);
>  	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
> -	struct vfio_ccw_private *private;
> -
> -	private = dev_get_drvdata(mdev_parent_dev(mdev));
>  
>  	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
>  		return -EINVAL;
> @@ -288,15 +298,14 @@ static ssize_t
> vfio_ccw_mdev_write_io_region(struct vfio_ccw_private *private,
>  	return ret;
>  }
>  
> -static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
> +static ssize_t vfio_ccw_mdev_write(struct vfio_device *vdev,
>  				   const char __user *buf,
>  				   size_t count,
>  				   loff_t *ppos)
>  {
> +	struct vfio_ccw_private *private =
> +		container_of(vdev, struct vfio_ccw_private, vdev);
>  	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
> -	struct vfio_ccw_private *private;
> -
> -	private = dev_get_drvdata(mdev_parent_dev(mdev));
>  
>  	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
>  		return -EINVAL;
> @@ -313,12 +322,9 @@ static ssize_t vfio_ccw_mdev_write(struct
> mdev_device *mdev,
>  	return -EINVAL;
>  }
>  
> -static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info,
> -					 struct mdev_device *mdev)
> +static int vfio_ccw_mdev_get_device_info(struct vfio_ccw_private *private,
> +					 struct vfio_device_info *info)
>  {
> -	struct vfio_ccw_private *private;
> -
> -	private = dev_get_drvdata(mdev_parent_dev(mdev));
>  	info->flags = VFIO_DEVICE_FLAGS_CCW | VFIO_DEVICE_FLAGS_RESET;
>  	info->num_regions = VFIO_CCW_NUM_REGIONS + private->num_regions;
>  	info->num_irqs = VFIO_CCW_NUM_IRQS;
> @@ -326,14 +332,12 @@ static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info,
>  	return 0;
>  }
>  
> -static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
> -					 struct mdev_device *mdev,
> +static int vfio_ccw_mdev_get_region_info(struct vfio_ccw_private *private,
> +					 struct vfio_region_info *info,
>  					 unsigned long arg)
>  {
> -	struct vfio_ccw_private *private;
>  	int i;
>  
> -	private = dev_get_drvdata(mdev_parent_dev(mdev));
>  	switch (info->index) {
>  	case VFIO_CCW_CONFIG_REGION_INDEX:
>  		info->offset = 0;
> @@ -408,19 +412,16 @@ static int vfio_ccw_mdev_get_irq_info(struct
> vfio_irq_info *info)
>  	return 0;
>  }
>  
> -static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
> +static int vfio_ccw_mdev_set_irqs(struct vfio_ccw_private *private,
>  				  uint32_t flags,
>  				  uint32_t index,
>  				  void __user *data)
>  {
> -	struct vfio_ccw_private *private;
>  	struct eventfd_ctx **ctx;
>  
>  	if (!(flags & VFIO_IRQ_SET_ACTION_TRIGGER))
>  		return -EINVAL;
>  
> -	private = dev_get_drvdata(mdev_parent_dev(mdev));
> -
>  	switch (index) {
>  	case VFIO_CCW_IO_IRQ_INDEX:
>  		ctx = &private->io_trigger;
> @@ -522,10 +523,12 @@ void vfio_ccw_unregister_dev_regions(struct
> vfio_ccw_private *private)
>  	private->region = NULL;
>  }
>  
> -static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
> +static ssize_t vfio_ccw_mdev_ioctl(struct vfio_device *vdev,
>  				   unsigned int cmd,
>  				   unsigned long arg)
>  {
> +	struct vfio_ccw_private *private =
> +		container_of(vdev, struct vfio_ccw_private, vdev);
>  	int ret = 0;
>  	unsigned long minsz;
>  
> @@ -542,7 +545,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct
> mdev_device *mdev,
>  		if (info.argsz < minsz)
>  			return -EINVAL;
>  
> -		ret = vfio_ccw_mdev_get_device_info(&info, mdev);
> +		ret = vfio_ccw_mdev_get_device_info(private, &info);
>  		if (ret)
>  			return ret;
>  
> @@ -560,7 +563,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct
> mdev_device *mdev,
>  		if (info.argsz < minsz)
>  			return -EINVAL;
>  
> -		ret = vfio_ccw_mdev_get_region_info(&info, mdev, arg);
> +		ret = vfio_ccw_mdev_get_region_info(private, &info, arg);
>  		if (ret)
>  			return ret;
>  
> @@ -605,47 +608,59 @@ static ssize_t vfio_ccw_mdev_ioctl(struct
> mdev_device *mdev,
>  			return ret;
>  
>  		data = (void __user *)(arg + minsz);
> -		return vfio_ccw_mdev_set_irqs(mdev, hdr.flags, hdr.index, data);
> +		return vfio_ccw_mdev_set_irqs(private, hdr.flags, hdr.index,
> +					      data);
>  	}
>  	case VFIO_DEVICE_RESET:
> -		return vfio_ccw_mdev_reset(mdev);
> +		return vfio_ccw_mdev_reset(private);
>  	default:
>  		return -ENOTTY;
>  	}
>  }
>  
>  /* Request removal of the device*/
> -static void vfio_ccw_mdev_request(struct mdev_device *mdev, unsigned int count)
> +static void vfio_ccw_mdev_request(struct vfio_device *vdev, unsigned int count)
>  {
> -	struct vfio_ccw_private *private = dev_get_drvdata(mdev_parent_dev(mdev));
> -
> -	if (!private)
> -		return;
> +	struct vfio_ccw_private *private =
> +		container_of(vdev, struct vfio_ccw_private, vdev);
> +	struct device *dev = private->vdev.dev;

This could be simply vdev->dev.
The rest seems okay.
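
A small userspace sketch of why the two spellings are equivalent; it is an
illustration, not the driver code. The fake_* types and to_private() are
hypothetical, the container_of() here is a simplified version of the kernel
macro without its type checking, and the embedded member is deliberately not
placed first so that the offset arithmetic is actually exercised.

```c
#include <stddef.h>

/* Simplified userspace container_of(); the kernel macro also type-checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct fake_device {
	int id;
};

struct fake_vfio_device {
	struct fake_device *dev;
};

struct fake_ccw_private {
	int state;
	struct fake_vfio_device vdev;	/* embedded, not a pointer */
};

/* Recover the enclosing private structure from the embedded vdev. */
struct fake_ccw_private *to_private(struct fake_vfio_device *vdev)
{
	return container_of(vdev, struct fake_ccw_private, vdev);
}
```

Because vdev is the member embedded in the private structure,
to_private(vdev)->vdev.dev and vdev->dev read the same pointer, so the
shorter form loses nothing.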

Thanks,
Eric

>  
>  	if (private->req_trigger) {
>  		if (!(count % 10))
> -			dev_notice_ratelimited(mdev_dev(private->mdev),
> +			dev_notice_ratelimited(dev,
>  					       "Relaying device request to user (#%u)\n",
>  					       count);
>  
>  		eventfd_signal(private->req_trigger, 1);
>  	} else if (count == 0) {
> -		dev_notice(mdev_dev(private->mdev),
> +		dev_notice(dev,
>  			   "No device request channel registered, blocked until released by user\n");
>  	}
>  }
>  
> +static const struct vfio_device_ops vfio_ccw_dev_ops = {
> +	.open = vfio_ccw_mdev_open,
> +	.release = vfio_ccw_mdev_release,
> +	.read = vfio_ccw_mdev_read,
> +	.write = vfio_ccw_mdev_write,
> +	.ioctl = vfio_ccw_mdev_ioctl,
> +	.request = vfio_ccw_mdev_request,
> +};
> +
> +struct mdev_driver vfio_ccw_mdev_driver = {
> +	.driver = {
> +		.name = "vfio_ccw_mdev",
> +		.owner = THIS_MODULE,
> +		.mod_name = KBUILD_MODNAME,
> +	},
> +	.probe = vfio_ccw_mdev_probe,
> +	.remove = vfio_ccw_mdev_remove,
> +};
> +
>  static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
>  	.owner			= THIS_MODULE,
> +	.device_driver		= &vfio_ccw_mdev_driver,
>  	.supported_type_groups  = mdev_type_groups,
> -	.create			= vfio_ccw_mdev_create,
> -	.remove			= vfio_ccw_mdev_remove,
> -	.open			= vfio_ccw_mdev_open,
> -	.release		= vfio_ccw_mdev_release,
> -	.read			= vfio_ccw_mdev_read,
> -	.write			= vfio_ccw_mdev_write,
> -	.ioctl			= vfio_ccw_mdev_ioctl,
> -	.request		= vfio_ccw_mdev_request,
>  };
>  
>  int vfio_ccw_mdev_reg(struct subchannel *sch)
> diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
> index b2c762eb42b9bb..7272eb78861244 100644
> --- a/drivers/s390/cio/vfio_ccw_private.h
> +++ b/drivers/s390/cio/vfio_ccw_private.h
> @@ -17,6 +17,7 @@
>  #include <linux/eventfd.h>
>  #include <linux/workqueue.h>
>  #include <linux/vfio_ccw.h>
> +#include <linux/vfio.h>
>  #include <asm/crw.h>
>  #include <asm/debug.h>
>  
> @@ -67,6 +68,7 @@ struct vfio_ccw_crw {
>  
>  /**
>   * struct vfio_ccw_private
> + * @vdev: Embedded VFIO device
>   * @sch: pointer to the subchannel
>   * @state: internal state of the device
>   * @completion: synchronization helper of the I/O completion
> @@ -90,6 +92,7 @@ struct vfio_ccw_crw {
>   * @crw_work: work for deferral process of CRW handling
>   */
>  struct vfio_ccw_private {
> +	struct vfio_device vdev;
>  	struct subchannel	*sch;
>  	int			state;
>  	struct completion	*completion;
> @@ -121,6 +124,8 @@ extern void vfio_ccw_mdev_unreg(struct subchannel
> *sch);
>  
>  extern int vfio_ccw_sch_quiesce(struct subchannel *sch);
>  
> +extern struct mdev_driver vfio_ccw_mdev_driver;
> +
>  /*
>   * States of the device statemachine.
>   */


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more
  2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
                   ` (10 preceding siblings ...)
  2021-04-26 20:00 ` [PATCH v2 13/13] vfio/mdev: Remove mdev drvdata Jason Gunthorpe
@ 2021-04-27 21:30 ` Alex Williamson
  2021-04-27 22:20   ` Jason Gunthorpe
  11 siblings, 1 reply; 59+ messages in thread
From: Alex Williamson @ 2021-04-27 21:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Christian Borntraeger, Cornelia Huck,
	Jonathan Corbet, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Joonas Lahtinen, kvm, Kirti Wankhede,
	linux-doc, linux-s390, Peter Oberparleiter, Halil Pasic,
	Pierre Morel, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Raj, Ashok, Dan Williams, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Mon, 26 Apr 2021 17:00:02 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> The mdev bus's core part for managing the lifecycle of devices is mostly
> as one would expect for a driver core bus subsystem.
> 
> However instead of having a normal 'struct device_driver' and binding the
> actual mdev drivers through the standard driver core mechanisms it open
> codes this with the struct mdev_parent_ops and provides a single driver
> that shims between the VFIO core and the actual device driver.
> 
> Make every one of the mdev drivers implement an actual struct mdev_driver
> and directly call vfio_register_group_dev() in the probe() function for
> the mdev.
> 
> Squash what is left of the mdev_parent_ops into the mdev_driver and remap
> create(), remove() and mdev_attr_groups to their driver core
> equivalents. Arrange to bind the created mdev_device to the mdev_driver
> that is provided by the end driver.
> 
> The actual execution flow doesn't change much, eg what was
> parent_ops->create is now device_driver->probe and it is called at almost
> the exact same time - except under the normal control of the driver core.
> 
> This allows deleting the entire mdev_drvdata, and tidying some of the
> sysfs. Many places in the drivers start using container_of()
> 
> This cleanly splits the mdev sysfs GUID lifecycle management stuff from
> the vfio_device implementation part, the only VFIO special part of mdev
> that remains is the mdev specific iommu intervention.
> 
> v2:
>  - Keep && m in samples kconfig
>  - Restore accidentally squashed removal of vfio_mdev.c
>  - Remove indirections to call bus_register()/bus_unregister()
>  - Reflow long doc lines
> v1: https://lore.kernel.org/r/0-v1-d88406ed308e+418-vfio3_jgg@nvidia.com
> 
> Jason
> 
> Cc: Leon Romanovsky <leonro@nvidia.com>
> Cc: "Raj, Ashok" <ashok.raj@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Max Gurtovoy <mgurtovoy@nvidia.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Tarun Gupta <targupta@nvidia.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> 
> 
> Jason Gunthorpe (13):
>   vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE
>   vfio/mdev: Allow the mdev_parent_ops to specify the device driver to
>     bind
>   vfio/mtty: Convert to use vfio_register_group_dev()
>   vfio/mdpy: Convert to use vfio_register_group_dev()
>   vfio/mbochs: Convert to use vfio_register_group_dev()
>   vfio/ap_ops: Convert to use vfio_register_group_dev()
>   vfio/ccw: Convert to use vfio_register_group_dev()
>   vfio/gvt: Convert to use vfio_register_group_dev()
>   vfio/mdev: Remove vfio_mdev.c
>   vfio/mdev: Remove mdev_parent_ops dev_attr_groups
>   vfio/mdev: Remove mdev_parent_ops
>   vfio/mdev: Use the driver core to create the 'remove' file
>   vfio/mdev: Remove mdev drvdata

It'd be really helpful if you could consistently copy at least one
list, preferably one monitored by patchwork, for an entire series.  The
kvm list is missing patches 06 and 08.  I can find the latter hopping
over to the intel-gfx or dri-devel projects as I did for the last
series, but 06 only copied linux-s390, where I need to use lore and
can't find a patchwork.  Thanks,

Alex

> 
>  .../driver-api/vfio-mediated-device.rst       |  56 ++---
>  Documentation/s390/vfio-ap.rst                |   1 -
>  arch/s390/Kconfig                             |   2 +-
>  drivers/gpu/drm/i915/Kconfig                  |   2 +-
>  drivers/gpu/drm/i915/gvt/kvmgt.c              | 210 +++++++++--------
>  drivers/s390/cio/vfio_ccw_drv.c               |  21 +-
>  drivers/s390/cio/vfio_ccw_ops.c               | 136 ++++++-----
>  drivers/s390/cio/vfio_ccw_private.h           |   5 +
>  drivers/s390/crypto/vfio_ap_ops.c             | 138 ++++++-----
>  drivers/s390/crypto/vfio_ap_private.h         |   2 +
>  drivers/vfio/mdev/Kconfig                     |   7 -
>  drivers/vfio/mdev/Makefile                    |   1 -
>  drivers/vfio/mdev/mdev_core.c                 |  67 ++++--
>  drivers/vfio/mdev/mdev_driver.c               |  20 +-
>  drivers/vfio/mdev/mdev_private.h              |   4 +-
>  drivers/vfio/mdev/mdev_sysfs.c                |  37 ++-
>  drivers/vfio/mdev/vfio_mdev.c                 | 180 ---------------
>  drivers/vfio/vfio.c                           |   6 +-
>  include/linux/mdev.h                          |  86 +------
>  include/linux/vfio.h                          |   4 +
>  samples/Kconfig                               |   6 +-
>  samples/vfio-mdev/mbochs.c                    | 166 +++++++------
>  samples/vfio-mdev/mdpy.c                      | 162 +++++++------
>  samples/vfio-mdev/mtty.c                      | 218 +++++++-----------
>  24 files changed, 651 insertions(+), 886 deletions(-)
>  delete mode 100644 drivers/vfio/mdev/vfio_mdev.c
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-27 20:06   ` Eric Farman
@ 2021-04-27 22:10     ` Jason Gunthorpe
  2021-04-28 12:55       ` Eric Farman
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-27 22:10 UTC (permalink / raw)
  To: Eric Farman
  Cc: Christian Borntraeger, Cornelia Huck, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, Apr 27, 2021 at 04:06:04PM -0400, Eric Farman wrote:
> > @@ -132,19 +137,28 @@ static int vfio_ccw_mdev_create(struct
> > mdev_device *mdev)
> >  			   private->sch->schid.ssid,
> >  			   private->sch->schid.sch_no);
> >  
> > +	ret = vfio_register_group_dev(&private->vdev);
> > +	if (ret)
> > +		goto err_atomic;
> > +	dev_set_drvdata(&mdev->dev, private);
> >  	return 0;
> > +
> > +err_atomic:
> > +	atomic_inc(&private->avail);
> 
> Since we're unwinding, should also do
> 
> private->mdev = NULL
> private->state = VFIO_CCW_STATE_STANDBY

I can change this, but it looks quite weird to do stuff like this with
no locking.

eg the only reads are here:

drivers/s390/cio/vfio_ccw_drv.c:        if (private->mdev && is_final)
drivers/s390/cio/vfio_ccw_drv.c:                private->state = private->mdev ? VFIO_CCW_STATE_IDLE :

Which is from a WQ. If someone thinks setting mdev to NULL should
affect those WQs then there are problems...

The non-atomic state is equally confusing.

Jason

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more
  2021-04-27 21:30 ` [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Alex Williamson
@ 2021-04-27 22:20   ` Jason Gunthorpe
  2021-04-27 22:49     ` Alex Williamson
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-27 22:20 UTC (permalink / raw)
  To: Alex Williamson
  Cc: David Airlie, Tony Krowiak, Christian Borntraeger, Cornelia Huck,
	Jonathan Corbet, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Joonas Lahtinen, kvm, Kirti Wankhede,
	linux-doc, linux-s390, Peter Oberparleiter, Halil Pasic,
	Pierre Morel, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Raj, Ashok, Dan Williams, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Tue, Apr 27, 2021 at 03:30:42PM -0600, Alex Williamson wrote:
 
> It'd be really helpful if you could consistently copy at least one
> list, preferably one monitored by patchwork, for an entire series.  The
> kvm list is missing patches 06 and 08.  I can find the latter hopping
> over to the intel-gfx or dri-devel projects as I did for the last
> series, but 06 only copied linux-s390, where I need to use lore and
> can't find a patchwork.  Thanks,

Oh wow, that is not intentional, sorry! Thanks for pointing it out.

I didn't notice this was happening; it is basically a side effect of having
so many different people and lists to get on this series. kvm should
have been CC'd on them all, and I have fixed it up going forward.

FWIW you may be interested in b4 if you haven't seen it before, it is
a good alternative if there isn't an official patchwork.

Jason

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more
  2021-04-27 22:20   ` Jason Gunthorpe
@ 2021-04-27 22:49     ` Alex Williamson
  0 siblings, 0 replies; 59+ messages in thread
From: Alex Williamson @ 2021-04-27 22:49 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Christian Borntraeger, Cornelia Huck,
	Jonathan Corbet, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Joonas Lahtinen, kvm, Kirti Wankhede,
	linux-doc, linux-s390, Peter Oberparleiter, Halil Pasic,
	Pierre Morel, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Raj, Ashok, Dan Williams, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Tue, 27 Apr 2021 19:20:26 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Apr 27, 2021 at 03:30:42PM -0600, Alex Williamson wrote:
>  
> > It'd be really helpful if you could consistently copy at least one
> > list, preferably one monitored by patchwork, for an entire series.  The
> > kvm list is missing patches 06 and 08.  I can find the latter hopping
> > over to the intel-gfx or dri-devel projects as I did for the last
> > series, but 06 only copied linux-s390, where I need to use lore and
> > can't find a patchwork.  Thanks,  
> 
> Oh wow, that is not intentional, sorry! Thanks for pointing it out.
> 
> I didn't notice this was happening; it is basically a side effect of having
> so many different people and lists to get on this series. kvm should
> have been CC'd on them all, and I have fixed it up going forward.
> 
> FWIW you may be interested in b4 if you haven't seen it before, it is
> a good alternative if there isn't an official patchwork.

I'm sold!  Thanks,

Alex


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-27 12:32   ` Cornelia Huck
@ 2021-04-27 23:20     ` Jason Gunthorpe
  0 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-27 23:20 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Alex Williamson, kvm, Kirti Wankhede, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, Apr 27, 2021 at 02:32:27PM +0200, Cornelia Huck wrote:

> > +		device_unlock(&mdev->dev);
> > +		ret = device_attach(&mdev->dev);
> > +		if (ret)
> > +			return ret;
> 
> device_attach() can return 0 (no driver), 1 (bound), or -ENODEV (device
> not registered). I would expect mdev_bind_driver() to return 0 in case
> of success and !0 otherwise, and I think the calling code does so as
> well?

Oops yes it can, I changed it to this, thanks!

@@ -269,7 +269,7 @@ static int mdev_bind_driver(struct mdev_device *mdev)
 		}
 		device_unlock(&mdev->dev);
 		ret = device_attach(&mdev->dev);
-		if (ret)
+		if (ret < 0)
 			return ret;
 		mdev->probe_err = -EINVAL;
 	}
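
For anyone following along, a userspace model of the loop shows what the
one-character change does. This is only a sketch of the control flow of
mdev_bind_driver() as quoted in this thread: struct fake_mdev,
fake_device_attach() and the only_negative_is_error switch are hypothetical,
the locking is left out, and the attach result is scripted rather than
coming from the driver core.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for the few mdev_device fields the loop looks at. */
struct fake_mdev {
	bool bound;	/* models mdev->dev.driver == &drv->driver */
	int probe_err;	/* models mdev->probe_err */
	int attach_ret;	/* scripted result of the fake device_attach() */
};

/* Scripted device_attach(): 1 means a driver got bound, 0 means none did. */
static int fake_device_attach(struct fake_mdev *m)
{
	if (m->attach_ret == 1)
		m->bound = true;
	return m->attach_ret;
}

/*
 * With only_negative_is_error == false this uses the v2 check, if (ret);
 * with true it uses the fixed check, if (ret < 0).
 */
int fake_bind_driver(struct fake_mdev *m, bool only_negative_is_error)
{
	int ret;

	for (;;) {
		if (m->bound)
			return 0;
		if (m->probe_err)
			return m->probe_err;
		ret = fake_device_attach(m);
		if (only_negative_is_error ? ret < 0 : ret != 0)
			return ret;
		m->probe_err = -EINVAL;
	}
}
```

In this model the v2 check leaks the positive "bound" result to the caller,
while the fixed check loops once more and reports 0 for a bound driver,
-EINVAL when nothing bound, and the negative errno otherwise.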

Jason

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-26 20:00 ` [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind Jason Gunthorpe
  2021-04-27 12:32   ` Cornelia Huck
@ 2021-04-28  6:03   ` Christoph Hellwig
  2021-04-28  7:56     ` Dan Williams
  2021-04-28  6:44   ` Leon Romanovsky
  2 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2021-04-28  6:03 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Mon, Apr 26, 2021 at 05:00:04PM -0300, Jason Gunthorpe wrote:
> +/*
> + * mdev drivers can refuse to bind during probe(), in this case we want to fail
> + * the creation of the mdev all the way back to sysfs. This is a weird model
> + * that doesn't fit in the driver core well, nor does it seem to appear any
> + * place else in the kernel, so use a simple hack.
> + */
> +static int mdev_bind_driver(struct mdev_device *mdev)
> +{
> +	struct mdev_driver *drv = mdev->type->parent->ops->device_driver;
> +	int ret;
> +
> +	if (!drv)
> +		drv = &vfio_mdev_driver;
> +
> +	while (1) {
> +		device_lock(&mdev->dev);
> +		if (mdev->dev.driver == &drv->driver) {
> +			ret = 0;
> +			goto out_unlock;
> +		}
> +		if (mdev->probe_err) {
> +			ret = mdev->probe_err;
> +			goto out_unlock;
> +		}
> +		device_unlock(&mdev->dev);
> +		ret = device_attach(&mdev->dev);
> +		if (ret)
> +			return ret;
> +		mdev->probe_err = -EINVAL;
> +	}
> +	return 0;
> +
> +out_unlock:
> +	device_unlock(&mdev->dev);
> +	return ret;
> +}

> +++ b/drivers/vfio/mdev/mdev_driver.c
> @@ -49,7 +49,7 @@ static int mdev_probe(struct device *dev)
>  		return ret;
>  
>  	if (drv->probe) {
> -		ret = drv->probe(mdev);
> +		ret = mdev->probe_err = drv->probe(mdev);
>  		if (ret)
>  			mdev_detach_iommu(mdev);
>  	}

>  	return 0;
>  }
>  
> +static int mdev_match(struct device *dev, struct device_driver *drv)
> +{
> +	struct mdev_device *mdev = to_mdev_device(dev);
> +	struct mdev_driver *target = mdev->type->parent->ops->device_driver;
> +
> +	/*
> +	 * The ops specify the device driver to connect, fall back to the old
> +	 * shim driver if the driver hasn't been converted.
> +	 */
> +	if (!target)
> +		target = &vfio_mdev_driver;
> +	return drv == &target->driver;
> +}

I still think this is going the wrong way.  Why can't we enhance the core
driver code with a version of device_bind_driver() that does call into
->probe?  That probably seems like a better model for those existing
direct users of device_bind_driver or device_attach with a pre-set
->drv anyway.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c
  2021-04-26 20:00 ` [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c Jason Gunthorpe
@ 2021-04-28  6:07   ` Christoph Hellwig
  2021-04-28  6:36     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2021-04-28  6:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Cornelia Huck, Jonathan Corbet, kvm,
	Kirti Wankhede, linux-doc, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta, Greg Kroah-Hartman

On Mon, Apr 26, 2021 at 05:00:11PM -0300, Jason Gunthorpe wrote:
> Preserve VFIO's design of allowing mdev drivers to be !GPL by allowing the
> three functions that replace this module for !GPL usage. This goes along
> with the other 19 symbols that are already marked !GPL in VFIO.

NAK.  This was a sneak by Nvidia to try a GPL condom, and now that we
remove that not working condom it does not mean core symbols can be
just changed.

* Re: [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c
  2021-04-28  6:07   ` Christoph Hellwig
@ 2021-04-28  6:36     ` Greg Kroah-Hartman
  2021-04-28 12:53       ` Jason Gunthorpe
  0 siblings, 1 reply; 59+ messages in thread
From: Greg Kroah-Hartman @ 2021-04-28  6:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jason Gunthorpe, Alex Williamson, Cornelia Huck, Jonathan Corbet,
	kvm, Kirti Wankhede, linux-doc, Raj, Ashok, Dan Williams,
	Daniel Vetter, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Wed, Apr 28, 2021 at 08:07:03AM +0200, Christoph Hellwig wrote:
> On Mon, Apr 26, 2021 at 05:00:11PM -0300, Jason Gunthorpe wrote:
> > Preserve VFIO's design of allowing mdev drivers to be !GPL by allowing the
> > three functions that replace this module for !GPL usage. This goes along
> > with the other 19 symbols that are already marked !GPL in VFIO.
> 
> NAK.  This was a sneak by Nvidia to try a GPL condom, and now that we
> remove that not working condom it does not mean core symbols can be
> just changed.

Agreed, these symbols should not be changed.

thanks,

greg k-h

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-26 20:00 ` [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind Jason Gunthorpe
  2021-04-27 12:32   ` Cornelia Huck
  2021-04-28  6:03   ` Christoph Hellwig
@ 2021-04-28  6:44   ` Leon Romanovsky
  2021-04-28 14:14     ` Jason Gunthorpe
  2 siblings, 1 reply; 59+ messages in thread
From: Leon Romanovsky @ 2021-04-28  6:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Max Gurtovoy,
	Tarun Gupta

On Mon, Apr 26, 2021 at 05:00:04PM -0300, Jason Gunthorpe wrote:
> This allows a mdev driver to opt out of using vfio_mdev.c, instead the
> driver will provide a 'struct mdev_driver' and register directly with the
> driver core.
> 
> Much of mdev_parent_ops becomes unused in this mode:
> - create()/remove() are done via the mdev_driver probe()/remove()
> - mdev_attr_groups becomes mdev_driver driver.dev_groups
> - Wrapper function callbacks are replaced with the same ones from
>   struct vfio_device_ops
> 
> Following patches convert all the drivers.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/mdev/mdev_core.c   | 64 ++++++++++++++++++++++++++++-----
>  drivers/vfio/mdev/mdev_driver.c | 17 ++++++++-
>  include/linux/mdev.h            |  3 ++
>  3 files changed, 75 insertions(+), 9 deletions(-)

<...>

> +/*
> + * mdev drivers can refuse to bind during probe(), in this case we want to fail
> + * the creation of the mdev all the way back to sysfs. This is a weird model
> + * that doesn't fit in the driver core well, nor does it seem to appear any
> + * place else in the kernel, so use a simple hack.
> + */
> +static int mdev_bind_driver(struct mdev_device *mdev)
> +{
> +	struct mdev_driver *drv = mdev->type->parent->ops->device_driver;
> +	int ret;
> +
> +	if (!drv)
> +		drv = &vfio_mdev_driver;
> +
> +	while (1) {
> +		device_lock(&mdev->dev);
> +		if (mdev->dev.driver == &drv->driver) {
> +			ret = 0;
> +			goto out_unlock;
> +		}
> +		if (mdev->probe_err) {
> +			ret = mdev->probe_err;
> +			goto out_unlock;
> +		}
> +		device_unlock(&mdev->dev);
> +		ret = device_attach(&mdev->dev);

The sequence above looks sketchy:
1. lock
2. check for driver
3. unlock
4. device_attach - it internally takes the same lock as in step 1.

Why don't you rely on the driver check internal to device_attach()?

Thanks

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28  6:03   ` Christoph Hellwig
@ 2021-04-28  7:56     ` Dan Williams
  2021-04-28 12:41       ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Dan Williams @ 2021-04-28  7:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm,
	Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Tue, Apr 27, 2021 at 11:04 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Apr 26, 2021 at 05:00:04PM -0300, Jason Gunthorpe wrote:
> > +/*
> > + * mdev drivers can refuse to bind during probe(), in this case we want to fail
> > + * the creation of the mdev all the way back to sysfs. This is a weird model
> > + * that doesn't fit in the driver core well, nor does it seem to appear any
> > + * place else in the kernel, so use a simple hack.
> > + */
> > +static int mdev_bind_driver(struct mdev_device *mdev)
> > +{
> > +     struct mdev_driver *drv = mdev->type->parent->ops->device_driver;
> > +     int ret;
> > +
> > +     if (!drv)
> > +             drv = &vfio_mdev_driver;
> > +
> > +     while (1) {
> > +             device_lock(&mdev->dev);
> > +             if (mdev->dev.driver == &drv->driver) {
> > +                     ret = 0;
> > +                     goto out_unlock;
> > +             }
> > +             if (mdev->probe_err) {
> > +                     ret = mdev->probe_err;
> > +                     goto out_unlock;
> > +             }
> > +             device_unlock(&mdev->dev);
> > +             ret = device_attach(&mdev->dev);
> > +             if (ret)
> > +                     return ret;
> > +             mdev->probe_err = -EINVAL;
> > +     }
> > +     return 0;
> > +
> > +out_unlock:
> > +     device_unlock(&mdev->dev);
> > +     return ret;
> > +}
>
> > +++ b/drivers/vfio/mdev/mdev_driver.c
> > @@ -49,7 +49,7 @@ static int mdev_probe(struct device *dev)
> >               return ret;
> >
> >       if (drv->probe) {
> > -             ret = drv->probe(mdev);
> > +             ret = mdev->probe_err = drv->probe(mdev);
> >               if (ret)
> >                       mdev_detach_iommu(mdev);
> >       }
>
> >       return 0;
> >  }
> >
> > +static int mdev_match(struct device *dev, struct device_driver *drv)
> > +{
> > +     struct mdev_device *mdev = to_mdev_device(dev);
> > +     struct mdev_driver *target = mdev->type->parent->ops->device_driver;
> > +
> > +     /*
> > +      * The ops specify the device driver to connect, fall back to the old
> > +      * shim driver if the driver hasn't been converted.
> > +      */
> > +     if (!target)
> > +             target = &vfio_mdev_driver;
> > +     return drv == &target->driver;
> > +}
>
> I still think this going the wrong way.  Why can't we enhance the core
> driver code with a version of device_bind_driver() that does call into
> ->probe?  That probably seems like a better model for those existing
> direct users of device_bind_driver or device_attach with a pre-set
> ->drv anyway.

Wouldn't that just be "export device_driver_attach()" so that drivers
can implement their own custom bind implementation?

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28  7:56     ` Dan Williams
@ 2021-04-28 12:41       ` Christoph Hellwig
  2021-04-28 14:00         ` Jason Gunthorpe
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2021-04-28 12:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Christoph Hellwig, Jason Gunthorpe, Alex Williamson,
	Cornelia Huck, kvm, Kirti Wankhede, Raj, Ashok, Daniel Vetter,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Wed, Apr 28, 2021 at 12:56:21AM -0700, Dan Williams wrote:
> > I still think this going the wrong way.  Why can't we enhance the core
> > driver code with a version of device_bind_driver() that does call into
> > ->probe?  That probably seems like a better model for those existing
> > direct users of device_bind_driver or device_attach with a pre-set
> > ->drv anyway.
> 
> Wouldn't that just be "export device_driver_attach()" so that drivers
> can implement their own custom bind implementation?

That looks like it might be all that is needed.

* Re: [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c
  2021-04-28  6:36     ` Greg Kroah-Hartman
@ 2021-04-28 12:53       ` Jason Gunthorpe
  2021-04-29  6:53         ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-28 12:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Christoph Hellwig, Alex Williamson, Cornelia Huck,
	Jonathan Corbet, kvm, Kirti Wankhede, linux-doc, Raj, Ashok,
	Dan Williams, Daniel Vetter, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Wed, Apr 28, 2021 at 08:36:06AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Apr 28, 2021 at 08:07:03AM +0200, Christoph Hellwig wrote:
> > On Mon, Apr 26, 2021 at 05:00:11PM -0300, Jason Gunthorpe wrote:
> > > Preserve VFIO's design of allowing mdev drivers to be !GPL by allowing the
> > > three functions that replace this module for !GPL usage. This goes along
> > > with the other 19 symbols that are already marked !GPL in VFIO.
> > 
> > NAK.  This was a sneak by Nvidia to try a GPL condom, and now that we
> > remove that not working condom it does not mean core symbols can be
> > just changed.
> 
> Agreed, these symbols should not be changed.

During the development on this series I got a private email that
people have existing !GPL mdev drivers.

When I checked I saw that VFIO community seems to have decided that
!GPL is OK for mdev. I say this because many essential symbols for
implementing a mdev in vfio.c have been marked !GPL:

drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_info_cap_shift);
drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_info_add_capability);
drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_pin_pages);
drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_unpin_pages);
drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_group_pin_pages);
drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_group_unpin_pages);
drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_dma_rw);
drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_register_notifier);
drivers/vfio/vfio.c:EXPORT_SYMBOL(vfio_unregister_notifier);

Why it is like this, I do not know. IMHO it is not some "condom" if a
chunk of the core vfio.c code is marked !GPL and is called by mdev
drivers.

The Linux standard is one patch, one change. It is inappropriate for me
to sneak in a revert of the VFIO community's past decisions on
licensing inside some unrelated cleanup patch.

If you two want to argue VFIO has erred and should be using GPL for
its API toward mdev then please send a patch to switch the above
symbols and I'll rebase this series on top of it.

That way the change can get a proper airing and not be sneakily buried
inside a cleanup patch.

Otherwise this patch changes nothing - what existed today continues to
exist, and nothing new is being allowed.

Jason

* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-27 22:10     ` Jason Gunthorpe
@ 2021-04-28 12:55       ` Eric Farman
  2021-04-28 13:21         ` Jason Gunthorpe
  0 siblings, 1 reply; 59+ messages in thread
From: Eric Farman @ 2021-04-28 12:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christian Borntraeger, Cornelia Huck, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, 2021-04-27 at 19:10 -0300, Jason Gunthorpe wrote:
> On Tue, Apr 27, 2021 at 04:06:04PM -0400, Eric Farman wrote:
> > > @@ -132,19 +137,28 @@ static int vfio_ccw_mdev_create(struct
> > > mdev_device *mdev)
> > >  			   private->sch->schid.ssid,
> > >  			   private->sch->schid.sch_no);
> > >  
> > > +	ret = vfio_register_group_dev(&private->vdev);
> > > +	if (ret)
> > > +		goto err_atomic;
> > > +	dev_set_drvdata(&mdev->dev, private);
> > >  	return 0;
> > > +
> > > +err_atomic:
> > > +	atomic_inc(&private->avail);
> > 
> > Since we're unwinding, should also do
> > 
> > private->mdev = NULL
> > private->state = VFIO_CCW_STATE_STANDBY
> 
> I can change this, but it looks quite weird to do stuff like this
> with
> no locking.

I agree, but mdev_create didn't fail before, so backing out part of its
work seems weird too.

> 
> eg the only reads are here:
> 
> drivers/s390/cio/vfio_ccw_drv.c:        if (private->mdev &&
> is_final)
> drivers/s390/cio/vfio_ccw_drv.c:                private->state =
> private->mdev ? VFIO_CCW_STATE_IDLE :
> 
> Which is from a WQ, if someone thinks setting mdev to NULL should
> effect those WQs then there are problems...
> 
> The non-atomic state is equally confusing

Agreed, it's already on the list.

Eric

> 
> Jason


* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-28 12:55       ` Eric Farman
@ 2021-04-28 13:21         ` Jason Gunthorpe
  0 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-28 13:21 UTC (permalink / raw)
  To: Eric Farman
  Cc: Christian Borntraeger, Cornelia Huck, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Wed, Apr 28, 2021 at 08:55:51AM -0400, Eric Farman wrote:
> On Tue, 2021-04-27 at 19:10 -0300, Jason Gunthorpe wrote:
> > On Tue, Apr 27, 2021 at 04:06:04PM -0400, Eric Farman wrote:
> > > > @@ -132,19 +137,28 @@ static int vfio_ccw_mdev_create(struct
> > > > mdev_device *mdev)
> > > >  			   private->sch->schid.ssid,
> > > >  			   private->sch->schid.sch_no);
> > > >  
> > > > +	ret = vfio_register_group_dev(&private->vdev);
> > > > +	if (ret)
> > > > +		goto err_atomic;
> > > > +	dev_set_drvdata(&mdev->dev, private);
> > > >  	return 0;
> > > > +
> > > > +err_atomic:
> > > > +	atomic_inc(&private->avail);
> > > 
> > > Since we're unwinding, should also do
> > > 
> > > private->mdev = NULL
> > > private->state = VFIO_CCW_STATE_STANDBY
> > 
> > I can change this, but it looks quite weird to do stuff like this
> > with
> > no locking.
> 
> I agree, but mdev_create didn't fail before, so backing out part of its
> work seems weird too.

Before, if vfio_register_group_dev() failed the device would be left
half created but without a driver attached. It wasn't good.

The way it should work is that, up until vfio_register_group_dev() returns
success, there should be no concurrency and no touches to 'private' -
those WQs should all be shut down.

Ideally the private would be allocated here as well so these rules are
clear and obvious.

Jason

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28 12:41       ` Christoph Hellwig
@ 2021-04-28 14:00         ` Jason Gunthorpe
  2021-04-28 19:58           ` Dan Williams
                             ` (2 more replies)
  0 siblings, 3 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-28 14:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dan Williams, Alex Williamson, Cornelia Huck, kvm,
	Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Wed, Apr 28, 2021 at 02:41:53PM +0200, Christoph Hellwig wrote:
> On Wed, Apr 28, 2021 at 12:56:21AM -0700, Dan Williams wrote:
> > > I still think this going the wrong way.  Why can't we enhance the core
> > > driver code with a version of device_bind_driver() that does call into
> > > ->probe?  That probably seems like a better model for those existing
> > > direct users of device_bind_driver or device_attach with a pre-set
> > > ->drv anyway.
> > 
> > Wouldn't that just be "export device_driver_attach()" so that drivers
> > can implement their own custom bind implementation?
> 
> That looks like it might be all that is needed.

I thought about doing it like that, it is generally a good idea,
however, if I add new API surface to the driver core I really want to
get rid of device_bind_driver(), or at least most of its users.

I'm pretty sure Greg will ask for it too.

So, I need a way to sequence that which doesn't mean I have to shelve
the mdev stuff for ages while I try to get acks from lots of places.

Leave this alone and fix it after? Export device_driver_attach() and
say to try and fix the rest after?

I think this will still need the ugly errno capture though..

Jason

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28  6:44   ` Leon Romanovsky
@ 2021-04-28 14:14     ` Jason Gunthorpe
  2021-04-28 14:24       ` Leon Romanovsky
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-28 14:14 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Max Gurtovoy,
	Tarun Gupta

On Wed, Apr 28, 2021 at 09:44:07AM +0300, Leon Romanovsky wrote:
> On Mon, Apr 26, 2021 at 05:00:04PM -0300, Jason Gunthorpe wrote:
> > This allows a mdev driver to opt out of using vfio_mdev.c, instead the
> > driver will provide a 'struct mdev_driver' and register directly with the
> > driver core.
> > 
> > Much of mdev_parent_ops becomes unused in this mode:
> > - create()/remove() are done via the mdev_driver probe()/remove()
> > - mdev_attr_groups becomes mdev_driver driver.dev_groups
> > - Wrapper function callbacks are replaced with the same ones from
> >   struct vfio_device_ops
> > 
> > Following patches convert all the drivers.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >  drivers/vfio/mdev/mdev_core.c   | 64 ++++++++++++++++++++++++++++-----
> >  drivers/vfio/mdev/mdev_driver.c | 17 ++++++++-
> >  include/linux/mdev.h            |  3 ++
> >  3 files changed, 75 insertions(+), 9 deletions(-)
> 
> <...>
> 
> > +/*
> > + * mdev drivers can refuse to bind during probe(), in this case we want to fail
> > + * the creation of the mdev all the way back to sysfs. This is a weird model
> > + * that doesn't fit in the driver core well, nor does it seem to appear any
> > + * place else in the kernel, so use a simple hack.
> > + */
> > +static int mdev_bind_driver(struct mdev_device *mdev)
> > +{
> > +	struct mdev_driver *drv = mdev->type->parent->ops->device_driver;
> > +	int ret;
> > +
> > +	if (!drv)
> > +		drv = &vfio_mdev_driver;
> > +
> > +	while (1) {
> > +		device_lock(&mdev->dev);
> > +		if (mdev->dev.driver == &drv->driver) {
> > +			ret = 0;
> > +			goto out_unlock;
> > +		}
> > +		if (mdev->probe_err) {
> > +			ret = mdev->probe_err;
> > +			goto out_unlock;
> > +		}
> > +		device_unlock(&mdev->dev);
> > +		ret = device_attach(&mdev->dev);
> 
> The sequence above looks sketchy:
> 1. lock
> 2. check for driver
> 3. unlock
> 4. device_attach - it takes internally same lock as in step 1.
> 
> Why don't you rely on internal to device_attach() driver check?

This is locking both probe_err and the check that the right driver is
bound. device_attach() doesn't tell you the same information
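
The point can be sketched in plain userspace C. This is only a model under
stated assumptions, not the patch: every model_* name is hypothetical, there
is no locking, and model_device_attach() merely mimics the documented
device_attach() return values (1 when a driver is bound, 0 when none is). It
shows why the probe() errno has to be stored on the side, and why "something
is bound" is not the same as "the expected driver is bound".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device_driver and struct device. */
struct model_driver {
	int (*probe)(void);	/* returns 0 or a negative errno */
};

struct model_device {
	const struct model_driver *driver;	/* bound driver, NULL if none */
	int probe_err;				/* last probe() result, saved by the bus */
};

/*
 * Mimics the device_attach() contract: 1 if a driver is bound, 0 if none
 * is. A failing probe() is reported as 0, so the errno would be lost if
 * the bus probe wrapper did not store it in probe_err.
 */
static int model_device_attach(struct model_device *dev,
			       const struct model_driver *drv)
{
	int ret;

	if (dev->driver)
		return 1;	/* already bound, possibly to another driver */
	ret = drv->probe();
	dev->probe_err = ret;	/* the errno capture */
	if (ret)
		return 0;
	dev->driver = drv;
	return 1;
}

/* Returns 0 only if @drv itself ended up bound, otherwise an errno. */
static int model_bind_driver(struct model_device *dev,
			     const struct model_driver *drv)
{
	model_device_attach(dev, drv);

	if (dev->driver == drv)
		return 0;
	if (dev->probe_err)
		return dev->probe_err;	/* probe() refused to bind */
	return -EBUSY;			/* a different driver was already bound */
}

static int probe_ok(void) { return 0; }
static int probe_refuse(void) { return -ENOSPC; }

static const struct model_driver good_driver = { .probe = probe_ok };
static const struct model_driver picky_driver = { .probe = probe_refuse };
```

With a zeroed model_device, binding good_driver returns 0, binding
picky_driver returns -ENOSPC rather than a bare "not bound", and asking for
picky_driver on a device already bound to good_driver returns -EBUSY.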

Jason

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28 14:14     ` Jason Gunthorpe
@ 2021-04-28 14:24       ` Leon Romanovsky
  0 siblings, 0 replies; 59+ messages in thread
From: Leon Romanovsky @ 2021-04-28 14:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Max Gurtovoy,
	Tarun Gupta

On Wed, Apr 28, 2021 at 11:14:46AM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 28, 2021 at 09:44:07AM +0300, Leon Romanovsky wrote:
> > On Mon, Apr 26, 2021 at 05:00:04PM -0300, Jason Gunthorpe wrote:
> > > This allows a mdev driver to opt out of using vfio_mdev.c, instead the
> > > driver will provide a 'struct mdev_driver' and register directly with the
> > > driver core.
> > > 
> > > Much of mdev_parent_ops becomes unused in this mode:
> > > - create()/remove() are done via the mdev_driver probe()/remove()
> > > - mdev_attr_groups becomes mdev_driver driver.dev_groups
> > > - Wrapper function callbacks are replaced with the same ones from
> > >   struct vfio_device_ops
> > > 
> > > Following patches convert all the drivers.
> > > 
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > >  drivers/vfio/mdev/mdev_core.c   | 64 ++++++++++++++++++++++++++++-----
> > >  drivers/vfio/mdev/mdev_driver.c | 17 ++++++++-
> > >  include/linux/mdev.h            |  3 ++
> > >  3 files changed, 75 insertions(+), 9 deletions(-)
> > 
> > <...>
> > 
> > > +/*
> > > + * mdev drivers can refuse to bind during probe(), in this case we want to fail
> > > + * the creation of the mdev all the way back to sysfs. This is a weird model
> > > + * that doesn't fit in the driver core well, nor does it seem to appear any
> > > + * place else in the kernel, so use a simple hack.
> > > + */
> > > +static int mdev_bind_driver(struct mdev_device *mdev)
> > > +{
> > > +	struct mdev_driver *drv = mdev->type->parent->ops->device_driver;
> > > +	int ret;
> > > +
> > > +	if (!drv)
> > > +		drv = &vfio_mdev_driver;
> > > +
> > > +	while (1) {
> > > +		device_lock(&mdev->dev);
> > > +		if (mdev->dev.driver == &drv->driver) {
> > > +			ret = 0;
> > > +			goto out_unlock;
> > > +		}
> > > +		if (mdev->probe_err) {
> > > +			ret = mdev->probe_err;
> > > +			goto out_unlock;
> > > +		}
> > > +		device_unlock(&mdev->dev);
> > > +		ret = device_attach(&mdev->dev);
> > 
> > The sequence above looks sketchy:
> > 1. lock
> > 2. check for driver
> > 3. unlock
> > 4. device_attach - it takes internally same lock as in step 1.
> > 
> > Why don't you rely on internal to device_attach() driver check?
> 
> This is locking both probe_err and the check that the right driver is
> bound. device_attach() doesn't tell you the same information

device_attach() already tells you that a driver is bound, which is
the same thing you are checking here, because you don't
unbind "the wrong driver".

Thanks

> 
> Jason

* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-26 20:00 ` [PATCH v2 07/13] vfio/ccw: " Jason Gunthorpe
  2021-04-27 20:06   ` Eric Farman
@ 2021-04-28 17:09   ` Cornelia Huck
  2021-04-28 17:20     ` Jason Gunthorpe
  1 sibling, 1 reply; 59+ messages in thread
From: Cornelia Huck @ 2021-04-28 17:09 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christian Borntraeger, Eric Farman, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Mon, 26 Apr 2021 17:00:09 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This is more complicated because vfio_ccw is sharing the vfio_device
> between both the mdev_device and its vfio_device and the css_driver.
> 
> The mdev is a singleton, and the reason for this sharing appears to be to
> allow the extra css_driver function callbacks to be delivered to the
> vfio_device.
> 
> This keeps things as they were, with the css_driver allocating the
> singleton, not the mdev_driver, this is pretty confusing. I'm also
> uncertain how the lifetime model for the mdev works in the css_driver
> callbacks.
> 
> At this point embed the vfio_device in the vfio_ccw_private and
> instantiate it as a vfio_device when the mdev probes. The drvdata of both
> the css_device and the mdev_device point at the private, and container_of
> is used to get it back from the vfio_device.

I've been staring at this for some time, and I'm not sure whether this
is a good approach.

We allow at most one mdev per subchannel (slicing it up does not make
sense), so we can be sure that there's a 1:1 relationship between mdev
and parent device, and we can track it via a single pointer.

The vfio_ccw_private driver data is allocated during probe (same as for
other css_drivers.) Embedding a vfio_device here means that we have a
structure tied into it that is operating with different lifetime rules.

What about creating a second structure instead that can embed the
vfio_device, is allocated during mdev probing, and is linked up with
the vfio_ccw_private structure? That would follow the pattern of other
drivers more closely.
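
Roughly what I have in mind, as a userspace sketch only: all model_* names
are made up for illustration, the vfio_device is reduced to a dummy struct,
and locking and the css_driver callbacks are not modelled at all. The
structure allocated at mdev probe time embeds the vfio_device and carries a
back-pointer to the per-subchannel data, which in turn tracks the single
mdev.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vfio_device. */
struct model_vfio_device {
	int registered;
};

struct model_ccw_mdev;

/* Allocated in the css_driver probe; lives as long as the subchannel binding. */
struct model_ccw_private {
	struct model_ccw_mdev *mdev_state;	/* NULL while no mdev exists */
};

/* Allocated in the mdev probe; lives only as long as the mdev. */
struct model_ccw_mdev {
	struct model_vfio_device vdev;		/* embedded, not pointed to */
	struct model_ccw_private *private;	/* back-link to the parent's data */
};

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Models the mdev probe: allocate the second structure and link it up. */
static struct model_ccw_mdev *model_mdev_probe(struct model_ccw_private *private)
{
	struct model_ccw_mdev *m;

	if (private->mdev_state)
		return NULL;	/* at most one mdev per subchannel */
	m = calloc(1, sizeof(*m));
	if (!m)
		return NULL;
	m->private = private;
	m->vdev.registered = 1;
	private->mdev_state = m;
	return m;
}

/* Models the mdev remove: unlink first, then free. */
static void model_mdev_remove(struct model_ccw_mdev *m)
{
	m->private->mdev_state = NULL;
	free(m);
}

/* What a vfio_device callback would do to reach the per-subchannel data. */
static struct model_ccw_private *
model_vdev_to_private(struct model_vfio_device *vdev)
{
	return model_container_of(vdev, struct model_ccw_mdev, vdev)->private;
}
```

The two structures still point at each other, so whoever follows
private->mdev_state has to know the mdev cannot go away underneath it; the
sketch only makes the two lifetimes visible, it does not solve that.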

> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/s390/cio/vfio_ccw_drv.c     |  21 +++--
>  drivers/s390/cio/vfio_ccw_ops.c     | 135 +++++++++++++++-------------
>  drivers/s390/cio/vfio_ccw_private.h |   5 ++
>  3 files changed, 94 insertions(+), 67 deletions(-)


* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-28 17:09   ` Cornelia Huck
@ 2021-04-28 17:20     ` Jason Gunthorpe
  2021-04-29 11:58       ` Cornelia Huck
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-28 17:20 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Christian Borntraeger, Eric Farman, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Wed, Apr 28, 2021 at 07:09:49PM +0200, Cornelia Huck wrote:
> On Mon, 26 Apr 2021 17:00:09 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > This is more complicated because vfio_ccw is sharing the vfio_device
> > between both the mdev_device and its vfio_device and the css_driver.
> > 
> > The mdev is a singleton, and the reason for this sharing appears to be to
> > allow the extra css_driver function callbacks to be delivered to the
> > vfio_device.
> > 
> > This keeps things as they were, with the css_driver allocating the
> > singleton, not the mdev_driver, this is pretty confusing. I'm also
> > uncertain how the lifetime model for the mdev works in the css_driver
> > callbacks.
> > 
> > At this point embed the vfio_device in the vfio_ccw_private and
> > instantiate it as a vfio_device when the mdev probes. The drvdata of both
> > the css_device and the mdev_device point at the private, and container_of
> > is used to get it back from the vfio_device.
> 
> I've been staring at this for some time, and I'm not sure whether this
> is a good approach.
> 
> We allow at most one mdev per subchannel (slicing it up does not make
> sense), so we can be sure that there's a 1:1 relationship between mdev
> and parent device, and we can track it via a single pointer.

This seems like one of those cases where using the mdev GUID API was not a
great fit. The css_driver should have just directly created a
vfio_device and not gone into the mdev GUID lifecycle world.

> The vfio_ccw_private driver data is allocated during probe (same as for
> other css_drivers.) Embedding a vfio_device here means that we have a
> structure tied into it that is operating with different lifetime rules.
> 
> What about creating a second structure instead that can embed the
> vfio_device, is allocated during mdev probing, and is linked up with
> the vfio_ccw_private structure? That would follow the pattern of other
> drivers more closely.

IIRC we still end up with pointers crossing between the two
structs. If you can't convince yourself that is correct (and I could
not) then it is already buggy today.

It is as I said to Eric, either there is no concurrency when there is
no mdev and everything is correct today, or there is concurrency and
it seems buggy today too.

The right answer is to move the allocations out of the css_driver
probe and put them only in the mdev driver probe because they can only
make sense when the mdev driver is instantiated. Then everything is
clear and very understandable how it should work.

I almost did this, but couldn't figure out how the lifetime of the
css_driver callbacks works relative to the lifetime of the mdev
device since they also reach into these structs. Maybe they can't be
called for some css related reason?

Jason

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28 14:00         ` Jason Gunthorpe
@ 2021-04-28 19:58           ` Dan Williams
  2021-04-28 23:38             ` Jason Gunthorpe
  2021-05-26  0:42             ` Jason Gunthorpe
  2021-04-29  6:51           ` Christoph Hellwig
  2021-05-04  9:36           ` Christoph Hellwig
  2 siblings, 2 replies; 59+ messages in thread
From: Dan Williams @ 2021-04-28 19:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Cornelia Huck, kvm,
	Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, dave.jiang

On Wed, Apr 28, 2021 at 7:00 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Apr 28, 2021 at 02:41:53PM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 28, 2021 at 12:56:21AM -0700, Dan Williams wrote:
> > > > I still think this going the wrong way.  Why can't we enhance the core
> > > > driver code with a version of device_bind_driver() that does call into
> > > > ->probe?  That probably seems like a better model for those existing
> > > > direct users of device_bind_driver or device_attach with a pre-set
> > > > ->drv anyway.
> > >
> > > Wouldn't that just be "export device_driver_attach()" so that drivers
> > > can implement their own custom bind implementation?
> >
> > That looks like it might be all that is needed.
>
> I thought about doing it like that, it is generally a good idea,
> however, if I add new API surface to the driver core I really want to
> get rid of device_bind_driver(), or at least most of its users.

I might be missing where you are going with this comment, but
device_driver_attach() isn't a drop-in replacement for
device_bind_driver(). So while I agree with you that it's a
significant escalation of the driver core API surface, I don't see why
it would be necessarily predicated on removing device_bind_driver()?

If this export prevented a new device_bind_driver() user, I think
that's a net positive, because device_bind_driver() seems an odd way
to implement bus code to me.

> I'm pretty sure Greg will ask for it too.

I think it's worth asking.

I have an ulterior motive / additional use case in mind here which is
the work-in-progress cleanup of the DSA driver. It uses the driver
model to assign an engine to different use cases via driver binding.
However, it currently has a custom bind implementation that does not
operate like a typical /sys/bus/$bus/drivers interface. If
device_driver_attach() was exported then some DSA compat code could
model the current way while also allowing a transition path to the
right way. As is, I was telling Dave that the compat code would need to
be built-in because I don't think fixing a DSA device-model problem is
enough justification on its own to ask for a device_driver_attach()
export.

> So, I need a way to sequence that which doesn't mean I have to shelf
> the mdev stuff for ages while I try to get acks from lots of places.

Let's see if it can stand alone.

* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28 19:58           ` Dan Williams
@ 2021-04-28 23:38             ` Jason Gunthorpe
  2021-04-29  0:00               ` Dave Jiang
  2021-05-26  0:42             ` Jason Gunthorpe
  1 sibling, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-28 23:38 UTC (permalink / raw)
  To: Dan Williams
  Cc: Christoph Hellwig, Alex Williamson, Cornelia Huck, kvm,
	Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, dave.jiang

On Wed, Apr 28, 2021 at 12:58:29PM -0700, Dan Williams wrote:
> On Wed, Apr 28, 2021 at 7:00 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 02:41:53PM +0200, Christoph Hellwig wrote:
> > > On Wed, Apr 28, 2021 at 12:56:21AM -0700, Dan Williams wrote:
> > > > > I still think this going the wrong way.  Why can't we enhance the core
> > > > > driver code with a version of device_bind_driver() that does call into
> > > > > ->probe?  That probably seems like a better model for those existing
> > > > > direct users of device_bind_driver or device_attach with a pre-set
> > > > > ->drv anyway.
> > > >
> > > > Wouldn't that just be "export device_driver_attach()" so that drivers
> > > > can implement their own custom bind implementation?
> > >
> > > That looks like it might be all that is needed.
> >
> > I thought about doing it like that, it is generally a good idea,
> > however, if I add new API surface to the driver core I really want to
> > get rid of device_bind_driver(), or at least most of its users.
> 
> I might be missing where you are going with this comment, but
> device_driver_attach() isn't a drop-in replacement for
> device_bind_driver().

Many of the places calling device_bind_driver() are wonky things
like this:

        dev->dev.driver = &drv->link.driver;
        if (pnp_bus_type.probe(&dev->dev))
                goto err_out;
        if (device_bind_driver(&dev->dev))
                goto err_out;

So device_driver_attach() does replace that - with some differences.

Notable is that bind_driver requires the device_lock but driver_attach
gets it internally. However, as far as I can tell, none of the
bind_driver callers do get it, so huh.
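
To make the locking difference concrete, here is a minimal userspace
sketch. Everything prefixed model_ is a hypothetical stand-in that only
imitates the shape of the two calls as described above (who takes the
lock, who calls probe); it is not the driver core code, and the real
functions do considerably more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for illustration; these are not the kernel's types. */
struct model_device;

struct model_driver {
	int (*probe)(struct model_device *dev);
};

struct model_device {
	bool locked;			/* stands in for the per-device lock */
	struct model_driver *driver;	/* stands in for dev->driver */
	int probed;			/* number of probe() calls seen */
};

static void model_lock(struct model_device *dev)
{
	assert(!dev->locked);
	dev->locked = true;
}

static void model_unlock(struct model_device *dev)
{
	assert(dev->locked);
	dev->locked = false;
}

/*
 * Bind-style: the caller has already set dev->driver, has run any probe
 * itself, and is expected to hold the lock. Nothing here calls probe().
 */
static int model_bind_driver(struct model_device *dev)
{
	if (!dev->locked || !dev->driver)
		return -1;
	return 0;
}

/*
 * Attach-style: takes the lock itself, sets dev->driver, calls probe(),
 * and clears dev->driver again if probe() fails.
 */
static int model_driver_attach(struct model_driver *drv,
			       struct model_device *dev)
{
	int ret;

	model_lock(dev);
	dev->driver = drv;
	ret = drv->probe ? drv->probe(dev) : 0;
	if (ret)
		dev->driver = NULL;
	model_unlock(dev);
	return ret;
}

static int model_probe(struct model_device *dev)
{
	dev->probed++;
	return 0;
}

/* Walks both paths once; every assert is expected to hold. */
static void model_demo(void)
{
	struct model_driver drv = { .probe = model_probe };
	struct model_device a = { .driver = &drv };
	struct model_device b = { 0 };

	assert(model_bind_driver(&a) == -1);	/* caller forgot the lock */
	model_lock(&a);
	assert(model_bind_driver(&a) == 0);
	model_unlock(&a);
	assert(a.probed == 0);			/* bind never probes */

	assert(model_driver_attach(&drv, &b) == 0);
	assert(b.probed == 1 && b.driver == &drv && !b.locked);
}
```

In this model a bind-style caller that skips the lock is rejected,
whereas the real callers are simply racy, which is the concern here.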

Aside from the device_lock there are lots of small subtle differences
that are probably not important unless something depends on them for
some very complex reason. :\

Of the callers:
  drivers/input/serio/serio.c
    This definitely doesn't have the device_lock
    It uses connect instead of probe and for some reason uses its own
    mutex instead of the device_lock. Murky.

  drivers/input/gameport/gameport.c
    This looks a lot like serio, same comments

  drivers/net/phy/phy_device.c
    device_driver_attach() is better, looks unlikely that
    device_lock is properly held here. Little unclear on what
    the bus is and if bus->probe will be OK

  drivers/net/wireless/mac80211_hwsim.c
    Definitely does not hold the device lock, the class and the driver
    have NULL probes so this could be changed

  drivers/pnp/card.c
    device_driver_attach() is better, very unlikely that a random
    device pulled from a linked list has the device_lock held

  drivers/usb/core/driver.c
    This comment says the caller must have the device lock, but it
    doesn't call probe, and when I look at cdc_ether.c I wonder
    where the device_lock is hidden? Murky.

Basically, there is some mess here, and eliminating
device_bind_driver() for device_driver_attach() is quite a reasonable
cleanup. But it is hard, and complex enough that each patch needs testing.

The other driver self bind scenario is to directly assign driver
before device_add, but I have a hard time finding those cases in the
tree with grep.

> If this export prevented a new device_bind_driver() user, I think
> that's a net positive, because device_bind_driver() seems an odd way
> to implement bus code to me.

Yes, I looked into why it is like this and concluded it is just very
very old.
 
> I have an ulterior motive / additional use case in mind here which is
> the work-in-progress cleanup of the DSA driver. It uses the driver
> model to assign an engine to different use cases via driver binding.
> However, it currently has a custom bind implementation that does not
> operate like a typical /sys/bus/$bus/drivers interface. If
> device_driver_attach() was exported then some DSA compat code could
> model the current way while also allowing a transition path to the
> right way. As is I was telling Dave that the compat code would need to
> be built-in because I don't think fixing a DSA device-model problem is
> enough justification on its own to ask for a device_driver_attach()
> export.

Can you make and test a DSA patch? If we have two concrete things and
I can sketch two more out of the above, that should meet Greg's "need 4
things" general thinking for driver core API changes.

But I still would like to keep this going while we wait for acks, you
know how long that can take...

Jason


* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28 23:38             ` Jason Gunthorpe
@ 2021-04-29  0:00               ` Dave Jiang
  0 siblings, 0 replies; 59+ messages in thread
From: Dave Jiang @ 2021-04-29  0:00 UTC (permalink / raw)
  To: Jason Gunthorpe, Dan Williams
  Cc: Christoph Hellwig, Alex Williamson, Cornelia Huck, kvm,
	Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta


On 4/28/2021 4:38 PM, Jason Gunthorpe wrote:
> On Wed, Apr 28, 2021 at 12:58:29PM -0700, Dan Williams wrote:
>> On Wed, Apr 28, 2021 at 7:00 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>> On Wed, Apr 28, 2021 at 02:41:53PM +0200, Christoph Hellwig wrote:
>>>> On Wed, Apr 28, 2021 at 12:56:21AM -0700, Dan Williams wrote:
>>>>>> I still think this going the wrong way.  Why can't we enhance the core
>>>>>> driver code with a version of device_bind_driver() that does call into
>>>>>> ->probe?  That probably seems like a better model for those existing
>>>>>> direct users of device_bind_driver or device_attach with a pre-set
>>>>>> ->drv anyway.
>>>>> Wouldn't that just be "export device_driver_attach()" so that drivers
>>>>> can implement their own custom bind implementation?
>>>> That looks like it might be all that is needed.
>>> I thought about doing it like that, it is generally a good idea,
>>> however, if I add new API surface to the driver core I really want to
>>> get rid of device_bind_driver(), or at least most of its users.
>> I might be missing where you are going with this comment, but
>> device_driver_attach() isn't a drop-in replacement for
>> device_bind_driver().
> Many of the places calling device_bind_driver() are wonky things
> like this:
>
>          dev->dev.driver = &drv->link.driver;
>          if (pnp_bus_type.probe(&dev->dev))
>                  goto err_out;
>          if (device_bind_driver(&dev->dev))
>                  goto err_out;
>
> So device_driver_attach() does replace that - with some differences.
>
> Notable is that bind_driver requires the driver_lock but driver_attach
> gets it internally. However, as far as I can tell, none of the
> bind_driver callers do get it, so huh.
>
> Aside from the driver_lock there are lots of small subtle differences
> that are probably not important unless they are for some very complex
> reason. :\
>
> Of the callers:
>    drivers/input/serio/serio.c
>      This definitely doesn't have the device_lock
>      It uses connect instead of probe and for some reason uses its own
>      mutex instead of the device_lock. Murky.
>
>    drivers/input/gameport/gameport.c
>      This looks alot like serio, same comments
>
>    drivers/net/phy/phy_device.c
>      device_driver_attach() is better, looks unlikely that
>      device_lock is properly held here. Little unclear on what
>      the bus is and if bus->probe will be OK
>
>    drivers/net/wireless/mac80211_hwsim.c
>      Definitely does not hold the driver lock, the class and the driver
>      have NULL probes so this could be changed
>
>    drivers/pnp/card.c
>      device_driver_attach() is better, very unlikely that a random
>      device pulled from a linked list has the driver_lock held
>
>    drivers/usb/core/driver.c
>      This comment says the caller must have the device lock, but it
>      doesn't call probe, and when I look at cdc_ether.c I wonder
>      where the device_lock is hidden? Murky.
>
> Basically, there is some mess here, and eliminating
> device_bind_driver() for device_driver_attach() is quite a reasonable
> cleanup. But hard, complex enough it needs testing each patch.
>
> The other driver self bind scenario is to directly assign driver
> before device_add, but I have a hard time finding those cases in the
> tree with grep.
>
>> If this export prevented a new device_bind_driver() user, I think
>> that's a net positive, because device_bind_driver() seems an odd way
>> to implement bus code to me.
> Yes, I looked into why it is like this and concluded it is just very
> very old.
>   
>> I have an ulterior motive / additional use case in mind here which is
>> the work-in-progress cleanup of the DSA driver. It uses the driver
>> model to assign an engine to different use cases via driver binding.
>> However, it currently has a custom bind implementation that does not
>> operate like a typical /sys/bus/$bus/drivers interface. If
>> device_driver_attach() was exported then some DSA compat code could
>> model the current way while also allowing a transition path to the
>> right way. As is I was telling Dave that the compat code would need to
>> be built-in because I don't think fixing a DSA device-model problem is
>> enough justification on its own to ask for a device_driver_attach()
>> export.
> Can you make and test a DSA patch? If we have two concrete things and
> I can sketch two more out of the above that should meet Greg's "need 4
> things" general thinking for driver core API changes.

Working on it. Having device_driver_attach() exported will definitely 
make things easier on my side. Thanks for doing the heavy lifting.


>
> But I still would like to keep this going while we wait for acks, you
> know how long that can take...
>
> Jason


* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28 14:00         ` Jason Gunthorpe
  2021-04-28 19:58           ` Dan Williams
@ 2021-04-29  6:51           ` Christoph Hellwig
  2021-05-04  9:36           ` Christoph Hellwig
  2 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2021-04-29  6:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Dan Williams, Alex Williamson, Cornelia Huck,
	kvm, Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Wed, Apr 28, 2021 at 11:00:05AM -0300, Jason Gunthorpe wrote:
> I thought about doing it like that, it is generally a good idea,
> however, if I add new API surface to the driver core I really want to
> get rid of device_bind_driver(), or at least most of its users.
> 
> I'm pretty sure Greg will ask for it too.

Well, let's ask them.  I thought I had Cc'ed him on my original mail,
but must have missed that.


* Re: [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c
  2021-04-28 12:53       ` Jason Gunthorpe
@ 2021-04-29  6:53         ` Christoph Hellwig
  2021-04-29  6:56           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2021-04-29  6:53 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Greg Kroah-Hartman, Christoph Hellwig, Alex Williamson,
	Cornelia Huck, Jonathan Corbet, kvm, Kirti Wankhede, linux-doc,
	Raj, Ashok, Dan Williams, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Wed, Apr 28, 2021 at 09:53:21AM -0300, Jason Gunthorpe wrote:
> The Linux standard is one patch one change. It is inappropriate for me
> to backdoor sneak revert the VFIO community's past decisions on
> licensing inside some unrelated cleanup patch.

That's not what you are doing.  You are removing weird condom code
that could never work, and removing the sneak attempt of an nvidia employee
to create a derived work that has no legal standing.

> Otherwise this patch changes nothing - what existed today continues to
> exist, and nothing new is being allowed.

No, it changes the existing exports, which is a complete no-go.


* Re: [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c
  2021-04-29  6:53         ` Christoph Hellwig
@ 2021-04-29  6:56           ` Greg Kroah-Hartman
  2021-05-03 17:32             ` Jason Gunthorpe
  0 siblings, 1 reply; 59+ messages in thread
From: Greg Kroah-Hartman @ 2021-04-29  6:56 UTC (permalink / raw)
  To: Christoph Hellwig, Jason Gunthorpe
  Cc: Alex Williamson, Cornelia Huck, Jonathan Corbet, kvm,
	Kirti Wankhede, linux-doc, Raj, Ashok, Dan Williams,
	Daniel Vetter, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Thu, Apr 29, 2021 at 08:53:15AM +0200, Christoph Hellwig wrote:
> On Wed, Apr 28, 2021 at 09:53:21AM -0300, Jason Gunthorpe wrote:
> > The Linux standard is one patch one change. It is inapporiate for me
> > to backdoor sneak revert the VFIO communities past decisions on
> > licensing inside some unrelated cleanup patch.
> 
> That's not what you are doing.  You are removing weird condom code
> that could never work, and remove the sneak attempt of an nvidia employee
> to create a derived work that has no legal standing.
> 
> > Otherwise this patch changes nothing - what existed today continues to
> > exist, and nothing new is being allowed.
> 
> No, it changes the existing exports, which is a complete no-go.

Agreed, Jason, please do not change the existing exports.

thanks,

greg k-h


* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-28 17:20     ` Jason Gunthorpe
@ 2021-04-29 11:58       ` Cornelia Huck
  2021-04-29 18:13         ` Jason Gunthorpe
  0 siblings, 1 reply; 59+ messages in thread
From: Cornelia Huck @ 2021-04-29 11:58 UTC (permalink / raw)
  To: Jason Gunthorpe, Eric Farman
  Cc: Christian Borntraeger, Vasily Gorbik, Heiko Carstens, kvm,
	linux-s390, Peter Oberparleiter, Halil Pasic, Vineeth Vijayan,
	Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Wed, 28 Apr 2021 14:20:08 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Apr 28, 2021 at 07:09:49PM +0200, Cornelia Huck wrote:
> > On Mon, 26 Apr 2021 17:00:09 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > This is more complicated because vfio_ccw is sharing the vfio_device
> > > between both the mdev_device and its vfio_device and the css_driver.
> > > 
> > > The mdev is a singleton, and the reason for this sharing appears to be to
> > > allow the extra css_driver function callbacks to be delivered to the
> > > vfio_device.
> > > 
> > > This keeps things as they were, with the css_driver allocating the
> > > singleton, not the mdev_driver, this is pretty confusing. I'm also
> > > uncertain how the lifetime model for the mdev works in the css_driver
> > > callbacks.
> > > 
> > > At this point embed the vfio_device in the vfio_ccw_private and
> > > instantiate it as a vfio_device when the mdev probes. The drvdata of both
> > > the css_device and the mdev_device point at the private, and container_of
> > > is used to get it back from the vfio_device.  
> > 
> > I've been staring at this for some time, and I'm not sure whether this
> > is a good approach.
> > 
> > We allow at most one mdev per subchannel (slicing it up does not make
> > sense), so we can be sure that there's a 1:1 relationship between mdev
> > and parent device, and we can track it via a single pointer.  
> 
> This seems like one of these cases where using the mdev GUID API was not a
> great fit. The css_driver should have just directly created a
> vfio_device and not gone into the mdev guid lifecycle world.

I don't remember much of the discussion back then, but I don't think
the explicit generation of devices was the part we needed, but rather
some other kind of mediation -- probably iommu related, as subchannels
don't have that concept on their own. Anyway, too late to change now.

> 
> > The vfio_ccw_private driver data is allocated during probe (same as for
> > other css_drivers.) Embedding a vfio_device here means that we have a
> > structure tied into it that is operating with different lifetime rules.
> > 
> > What about creating a second structure instead that can embed the
> > vfio_device, is allocated during mdev probing, and is linked up with
> > the vfio_ccw_private structure? That would follow the pattern of other
> > drivers more closely.  
> 
> IIRC we still end up with pointers crossing between the two
> structs. If you can't convince yourself that is correct (and I could
> not) then it is already buggy today.
> 
> It is as I said to Eric, either there is no concurrency when there is
> no mdev and everything is correct today, or there is concurrency and
> it seems buggy today too.
> 
> The right answer is to move the allocations out of the css_driver
> probe and put them only in the mdev driver probe because they can only
> make sense when the mdev driver is instantiated. Then everything is
> clear and very understandable how it should work.
> 
> I almost did this, but couldn't figure out how the lifetime of the
> css_driver callbacks are working relative to the lifetime of the mdev
> device since they also reach into these structs. Maybe they can't be
> called for some css related reason?

Moving allocations to the mdev driver probe makes sense, I guess. We
should also move enabling the subchannel to that point in time (I don't
remember why we enable it in the css probe function, and can't think of
a good reason for that; obviously needs to be paired with quiescing and
disabling the subchannel in the mdev driver remove function); that
leaves the uevent dance (which can hopefully also be removed, if some
discussed changes are implemented in the common I/O layer) and fencing
QDIO.

Regarding the other callbacks,
- vfio_ccw_sch_irq should not be invoked if the subchannel is not
  enabled; maybe log a message before returning for !private.
- vfio_ccw_sch_remove should be able to return 0 for !private (nothing
  to quiesce, if the subchannel is not enabled).
- vfio_ccw_sch_shutdown has nothing to do for !private (same reason.)
- In vfio_ccw_sch_event, we should either skip the fsm_event and the
  state change for !private, or return 0 in that case.
- vfio_ccw_chp_event already checks for !private. Not sure whether we
  should try to update some control blocks and return -ENODEV if the
  subchannel is not operational, but it's probably not needed.
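
The !private handling listed above could look roughly like the following
userspace sketch. The model_ names are hypothetical stand-ins, not the
vfio_ccw or common I/O layer code, and the counter is only a placeholder
for "log a message". It deliberately leaves out any locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the s390 cio or vfio_ccw structures. */
struct model_private {
	int irqs_handled;
	int events_handled;
};

struct model_subchannel {
	struct model_private *private;	/* NULL while no mdev is probed */
	int dropped_irqs;		/* placeholder for "log a message" */
};

/* irq: should not happen without private, so note it and return. */
static void model_sch_irq(struct model_subchannel *sch)
{
	if (!sch->private) {
		sch->dropped_irqs++;
		return;
	}
	sch->private->irqs_handled++;
}

/* remove: nothing to quiesce if the subchannel was never enabled. */
static int model_sch_remove(struct model_subchannel *sch)
{
	if (!sch->private)
		return 0;
	sch->private = NULL;
	return 0;
}

/* event: skip the state machine entirely without private. */
static int model_sch_event(struct model_subchannel *sch)
{
	if (!sch->private)
		return 0;
	sch->private->events_handled++;
	return 0;
}

/* Exercises the callbacks first without, then with, a private. */
static void model_demo(void)
{
	struct model_private priv = { 0 };
	struct model_subchannel sch = { .private = NULL };

	model_sch_irq(&sch);
	assert(sch.dropped_irqs == 1);
	assert(model_sch_event(&sch) == 0);
	assert(model_sch_remove(&sch) == 0);

	sch.private = &priv;
	model_sch_irq(&sch);
	assert(model_sch_event(&sch) == 0);
	assert(priv.irqs_handled == 1 && priv.events_handled == 1);
	assert(model_sch_remove(&sch) == 0 && sch.private == NULL);
}
```
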

Eric, what do you think?



* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-29 11:58       ` Cornelia Huck
@ 2021-04-29 18:13         ` Jason Gunthorpe
  2021-04-30 12:31           ` Cornelia Huck
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-29 18:13 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Eric Farman, Christian Borntraeger, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Thu, Apr 29, 2021 at 01:58:55PM +0200, Cornelia Huck wrote:

> > This seems like one of these cases where using the mdev GUID API
> > was not a great fit. The ccs_driver should have just directly
> > created a vfio_device and not gone into the mdev guid lifecycle
> > world.
> 
> I don't remember much of the discussion back then, but I don't think
> the explicit generation of devices was the part we needed, but rather
> some other kind of mediation -- probably iommu related, as subchannels
> don't have that concept on their own. Anyway, too late to change now.

The mdev part does three significant things:
 - Provide a lifecycle model based on sysfs and the GUIDs
 - Hackily inject itself into the VFIO IOMMU code as a special case
 - Force the creation of a unique iommu group as the group FD is
   mandatory to get the device FD.

This is why PASID is such a mess for mdev because it requires even
more special hacky stuff to link up the dummy IOMMU but still operate
within the iommu group of the parent device.

I can see an alternative arrangement using the /dev/ioasid idea that
is a lot less hacky and does not force the mdev guid lifecycle on
everyone that wants to create vfio_device.

> > I almost did this, but couldn't figure out how the lifetime of the
> > ccs_driver callbacks are working relative to the lifetime of the mdev
> > device since they also reach into these structs. Maybe they can't be
> > called for some css related reason?
> 
> Moving allocations to the mdev driver probe makes sense, I guess. We
> should also move enabling the subchannel to that point in time (I don't
> remember why we enable it in the css probe function, and can't think of
> a good reason for that; obviously needs to be paired with quiescing and
> disabling the subchannel in the mdev driver remove function); that
> leaves the uevent dance (which can hopefully also be removed, if some
> discussed changes are implemented in the common I/O layer) and fencing
> QDIO.
> 
> Regarding the other callbacks,
> - vfio_ccw_sch_irq should not be invoked if the subchannel is not
>   enabled; maybe log a message before returning for !private.
> - vfio_ccw_sch_remove should be able to return 0 for !private (nothing
>   to quiesce, if the subchannel is not enabled).
> - vfio_ccw_sch_shutdown has nothing to do for !private (same reason.)
> - In vfio_ccw_sch_event, we should either skip the fsm_event and the
>   state change for !private, or return 0 in that case.
> - vfio_ccw_chp_event already checks for !private. Not sure whether we
>   should try to update some control blocks and return -ENODEV if the
>   subchannel is not operational, but it's probably not needed.

All the checks for !private need some kind of locking. The driver core
model is that the 'struct device_driver' callbacks are all called
under the device_lock (this prevents the driver unbinding during the
callback). I didn't check if css does this or not..

So if we NULL drvdata under the device_lock everything can be
quite simple here.

Jason


* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-29 18:13         ` Jason Gunthorpe
@ 2021-04-30 12:31           ` Cornelia Huck
  2021-04-30 17:19             ` Jason Gunthorpe
  0 siblings, 1 reply; 59+ messages in thread
From: Cornelia Huck @ 2021-04-30 12:31 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Eric Farman, Christian Borntraeger, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Thu, 29 Apr 2021 15:13:47 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Apr 29, 2021 at 01:58:55PM +0200, Cornelia Huck wrote:
> 
> > > This seems like one of these cases where using the mdev GUID API
> > > was not a great fit. The ccs_driver should have just directly
> > > created a vfio_device and not gone into the mdev guid lifecycle
> > > world.  
> > 
> > I don't remember much of the discussion back then, but I don't think
> > the explicit generation of devices was the part we needed, but rather
> > some other kind of mediation -- probably iommu related, as subchannels
> > don't have that concept on their own. Anyway, too late to change now.  
> 
> The mdev part does three significant things:
>  - Provide a lifecycle model based on sysfs and the GUIDs
>  - Hackily inject itself into the VFIO IOMMU code as a special case
>  - Force the creation of a unique iommu group as the group FD is
>    mandatory to get the device FD.
> 
> This is why PASID is such a mess for mdev because it requires even
> more special hacky stuff to link up the dummy IOMMU but still operate
> within the iommu group of the parent device.
> 
> I can see an alternative arrangement using the /dev/ioasid idea that
> is a lot less hacky and does not force the mdev guid lifecycle on
> everyone that wants to create vfio_device.

I have not followed that discussion -- do you have a summary or a
pointer?

> 
> > > I almost did this, but couldn't figure out how the lifetime of the
> > > ccs_driver callbacks are working relative to the lifetime of the mdev
> > > device since they also reach into these structs. Maybe they can't be
> > > called for some css related reason?  
> > 
> > Moving allocations to the mdev driver probe makes sense, I guess. We
> > should also move enabling the subchannel to that point in time (I don't
> > remember why we enable it in the css probe function, and can't think of
> > a good reason for that; obviously needs to be paired with quiescing and
> > disabling the subchannel in the mdev driver remove function); that
> > leaves the uevent dance (which can hopefully also be removed, if some
> > discussed changes are implemented in the common I/O layer) and fencing
> > QDIO.
> > 
> > Regarding the other callbacks,
> > - vfio_ccw_sch_irq should not be invoked if the subchannel is not
> >   enabled; maybe log a message before returning for !private.
> > - vfio_ccw_sch_remove should be able to return 0 for !private (nothing
> >   to quiesce, if the subchannel is not enabled).
> > - vfio_ccw_sch_shutdown has nothing to do for !private (same reason.)
> > - In vfio_ccw_sch_event, we should either skip the fsm_event and the
> >   state change for !private, or return 0 in that case.
> > - vfio_ccw_chp_event already checks for !private. Not sure whether we
> >   should try to update some control blocks and return -ENODEV if the
> >   subchannel is not operational, but it's probably not needed.  
> 
> All the checks for !private need some kind of locking. The driver core
> model is that the 'struct device_driver' callbacks are all called
> under the device_lock (this prevents the driver unbinding during the
> callback). I didn't check if ccs does this or not..

probe/remove/shutdown are basically a forward of the callbacks at the
bus level. The css bus should make sure that we serialize
irq/sch_event/chp_event with probe/remove.

> 
> So if we NULL drvdata under the device_lock everything can be
> quite simple here.
> 
> Jason
> 



* Re: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-04-30 12:31           ` Cornelia Huck
@ 2021-04-30 17:19             ` Jason Gunthorpe
  2021-05-03 10:54               ` s390 common I/O layer locking (was: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()) Cornelia Huck
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-04-30 17:19 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Eric Farman, Christian Borntraeger, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, Apr 30, 2021 at 02:31:40PM +0200, Cornelia Huck wrote:
> On Thu, 29 Apr 2021 15:13:47 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Thu, Apr 29, 2021 at 01:58:55PM +0200, Cornelia Huck wrote:
> > 
> > > > This seems like one of these cases where using the mdev GUID API
> > > > was not a great fit. The ccs_driver should have just directly
> > > > created a vfio_device and not gone into the mdev guid lifecycle
> > > > world.  
> > > 
> > > I don't remember much of the discussion back then, but I don't think
> > > the explicit generation of devices was the part we needed, but rather
> > > some other kind of mediation -- probably iommu related, as subchannels
> > > don't have that concept on their own. Anyway, too late to change now.  
> > 
> > The mdev part does three significant things:
> >  - Provide a lifecycle model based on sysfs and the GUIDs
> >  - Hackily inject itself into the VFIO IOMMU code as a special case
> >  - Force the creation of a unique iommu group as the group FD is
> >    mandatory to get the device FD.
> > 
> > This is why PASID is such a mess for mdev because it requires even
> > more special hacky stuff to link up the dummy IOMMU but still operate
> > within the iommu group of the parent device.
> > 
> > I can see an alternative arrangement using the /dev/ioasid idea that
> > is a lot less hacky and does not force the mdev guid lifecycle on
> > everyone that wants to create vfio_device.
> 
> I have not followed that discussion -- do you have a summary or a
> pointer?

I think it is still evolving; I'm hoping Intel can draft some RFC
soonish.

Basically, I'd imagine putting the mdev driver itself directly in
charge of how the iommu is operated. When the driver is commanded to
connect to an ioasid (which is sort of like a VFIO container) it can
tell drivers/iommu exactly what it wants, be it a PASID in a physical
iommu device or a simple SW "page table" like the current mdevs use.

This would replace all the roundabout stuff to try and get other
components to set up things the way they hope the mdev driver needs.

> > All the checks for !private need some kind of locking. The driver core
> > model is that the 'struct device_driver' callbacks are all called
> > under the device_lock (this prevents the driver unbinding during the
> > callback). I didn't check if ccs does this or not..
> 
> probe/remove/shutdown are basically a forward of the callbacks at the
> bus level.

These are all covered by device_lock

> The css bus should make sure that we serialize
> irq/sch_event/chp_event with probe/remove.

Hum it doesn't look OK, like here:

css_process_crw()
  css_evaluate_subchannel()
   sch = bus_find_device()
      -- So we have a refcount on the struct device
   css_evaluate_known_subchannel() {
	if (sch->driver) {
		if (sch->driver->sch_event)
			ret = sch->driver->sch_event(sch, slow);
	}
   }

But the above call and touches to sch->driver (which is really just
sch->dev.driver) are unlocked and racy.

I would hold the device_lock() over all touches to sch->driver outside
of a driver core callback.
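
As a rough userspace sketch of that rule: a pthread mutex stands in for
the device_lock, and all model_ names are hypothetical, not the css bus
code. The event path reads the driver pointer and calls through it under
the lock, and the unbind path clears the pointer under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct model_subchannel;

struct model_sch_driver {
	int (*sch_event)(struct model_subchannel *sch, int slow);
};

struct model_subchannel {
	pthread_mutex_t lock;			/* stands in for device_lock */
	struct model_sch_driver *driver;	/* stands in for sch->driver */
	int events;
};

/* Event path: test and call through sch->driver only under the lock. */
static int model_evaluate_known_subchannel(struct model_subchannel *sch,
					   int slow)
{
	int ret = 0;

	pthread_mutex_lock(&sch->lock);
	if (sch->driver && sch->driver->sch_event)
		ret = sch->driver->sch_event(sch, slow);
	pthread_mutex_unlock(&sch->lock);
	return ret;
}

/* Unbind path: clears the pointer under the same lock. */
static void model_unbind(struct model_subchannel *sch)
{
	pthread_mutex_lock(&sch->lock);
	sch->driver = NULL;
	pthread_mutex_unlock(&sch->lock);
}

static int model_count_event(struct model_subchannel *sch, int slow)
{
	(void)slow;
	sch->events++;
	return 1;
}

/* One event while bound, one after unbind; only the first is delivered. */
static void model_demo(void)
{
	struct model_sch_driver drv = { .sch_event = model_count_event };
	struct model_subchannel sch = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.driver = &drv,
	};

	assert(model_evaluate_known_subchannel(&sch, 0) == 1);
	model_unbind(&sch);
	assert(model_evaluate_known_subchannel(&sch, 0) == 0);
	assert(sch.events == 1);
}
```

With the lock held across both the test and the call, the unbind cannot
slip in between them, which is the window left open by the unlocked
version.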

Jason


* s390 common I/O layer locking (was: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev())
  2021-04-30 17:19             ` Jason Gunthorpe
@ 2021-05-03 10:54               ` Cornelia Huck
  2021-05-04 15:10                 ` s390 common I/O layer locking Vineeth Vijayan
  0 siblings, 1 reply; 59+ messages in thread
From: Cornelia Huck @ 2021-05-03 10:54 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Eric Farman, Christian Borntraeger, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Vineeth Vijayan, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, 30 Apr 2021 14:19:08 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Fri, Apr 30, 2021 at 02:31:40PM +0200, Cornelia Huck wrote:
> > On Thu, 29 Apr 2021 15:13:47 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:

> > > All the checks for !private need some kind of locking. The driver core
> > > model is that the 'struct device_driver' callbacks are all called
> > > under the device_lock (this prevents the driver unbinding during the
> > > callback). I didn't check if ccs does this or not..  
> > 
> > probe/remove/shutdown are basically a forward of the callbacks at the
> > bus level.  
> 
> These are all covered by device_lock
> 
> > The css bus should make sure that we serialize
> > irq/sch_event/chp_event with probe/remove.  
> 
> Hum it doesn't look OK, like here:
> 
> css_process_crw()
>   css_evaluate_subchannel()
>    sch = bus_find_device()
>       -- So we have a refcount on the struct device
>    css_evaluate_known_subchannel() {
> 	if (sch->driver) {
> 		if (sch->driver->sch_event)
> 			ret = sch->driver->sch_event(sch, slow);
>    }
> 
> But the above call and touches to sch->driver (which is really just
> sch->dev.driver) are unlocked and racy.
> 
> I would hold the device_lock() over all touches to sch->driver outside
> of a driver core callback.

I think this issue did not come up much before, as most drivers on the
css bus tend to stay put during the lifetime of the device; but yes, it
seems we're missing some locking.

For the css bus, we need locking for the event callbacks; for irq, this
may interact with the subchannel lock and likely needs some care.

I also looked at the other busses in the common I/O layer: scm looks
good at a glance, ccwgroup and ccw have locking for online/offline; the
other callbacks for the ccw drivers probably need to take the device
lock as well.

Common I/O layer maintainers, does that look right?



* Re: [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c
  2021-04-29  6:56           ` Greg Kroah-Hartman
@ 2021-05-03 17:32             ` Jason Gunthorpe
  2021-05-04  9:38               ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-05-03 17:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Christoph Hellwig, Alex Williamson, Cornelia Huck,
	Jonathan Corbet, kvm, Kirti Wankhede, linux-doc, Raj, Ashok,
	Dan Williams, Daniel Vetter, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Thu, Apr 29, 2021 at 08:56:31AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Apr 29, 2021 at 08:53:15AM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 28, 2021 at 09:53:21AM -0300, Jason Gunthorpe wrote:
> > > The Linux standard is one patch one change. It is inappropriate for me
> > > to backdoor sneak revert the VFIO community's past decisions on
> > > licensing inside some unrelated cleanup patch.
> > 
> > That's not what you are doing.  You are removing weird condom code
> > that could never work, and remove the sneak attempt of an nvidia employee
> > to create a derived work that has no legal standing.
> > 
> > > Otherwise this patch changes nothing - what existed today continues to
> > > exist, and nothing new is being allowed.
> > 
> > No, it changes the existing exports, which is a complete no-go.
> 
> Agreed, Jason, please do not change the existing exports.

I respect both of your positions on this topic, but.. I can't be part
of a licensing discussion here.

This is a cleanup project from the Mellanox BU at NVIDIA to get VFIO
more in line with kernel design patterns. Mellanox is fully open
source for all our kernel work and has no stake in these licensing
topics.

As Christoph notes, it seems some other BU at NVIDIA has an interest
here. I hope you'll both understand that I can't get involved in a
licensing topic between the community and some other BU at NVIDIA.

Since none of the past discussions on EXPORT_SYMBOL resulted in any
concrete guidelines to follow, I feel the basic "don't change things"
(in the pragmatic, it worked before, so don't break it sense)
guideline should be applied here.

Since that is not agreeable I will shrink this patch series to remove
the ccw conversion that already has complex feedback and drop this
patch. I'll sadly shelve the rest of the work until something changes.

This will at least allow the coming Intel IDXD mdev driver to be
implemented more cleanly from the start.

Regards,
Jason


* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28 14:00         ` Jason Gunthorpe
  2021-04-28 19:58           ` Dan Williams
  2021-04-29  6:51           ` Christoph Hellwig
@ 2021-05-04  9:36           ` Christoph Hellwig
  2021-05-04 11:30             ` Jason Gunthorpe
  2 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2021-05-04  9:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Dan Williams, Alex Williamson, Cornelia Huck,
	kvm, Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Wed, Apr 28, 2021 at 11:00:05AM -0300, Jason Gunthorpe wrote:
> I thought about doing it like that, it is generally a good idea,
> however, if I add new API surface to the driver core I really want to
> get rid of device_bind_driver(), or at least most of its users.
> 
> I'm pretty sure Greg will ask for it too.
> 
> So, I need a way to sequence that which doesn't mean I have to shelf
> the mdev stuff for ages while I try to get acks from lots of places.
> 
> Leave this alone and fix it after? Export device_driver_attach() and
> say to try and fix the rest after?

Maybe.  Or convert one or two samples.

> I think this will still need the ugly errno capture though..

Why?


* Re: [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c
  2021-05-03 17:32             ` Jason Gunthorpe
@ 2021-05-04  9:38               ` Christoph Hellwig
  2021-05-04 16:20                 ` Jason Gunthorpe
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2021-05-04  9:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Greg Kroah-Hartman, Christoph Hellwig, Alex Williamson,
	Cornelia Huck, Jonathan Corbet, kvm, Kirti Wankhede, linux-doc,
	Raj, Ashok, Dan Williams, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Mon, May 03, 2021 at 02:32:20PM -0300, Jason Gunthorpe wrote:
> Since that is not agreeable I will shrink this patch series to remove
> the ccw conversion that already has complex feedback and drop this
> patch. I'll sadly shelve the rest of the work until something changes.

Please don't.  I'll happily take the stance that this is the right work, and
should not be damaged by a bad actor (Nvidia corporate that has been
sneaking weird backdoors into Linux for a while) directing someone
that now works for them through an acquisition.

And we really need to put Nvidia on the watchlist unfortunately as they
have caused so much damage to Linux through all their crazy backdoors.


* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-05-04  9:36           ` Christoph Hellwig
@ 2021-05-04 11:30             ` Jason Gunthorpe
  0 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-05-04 11:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dan Williams, Alex Williamson, Cornelia Huck, kvm,
	Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Tue, May 04, 2021 at 11:36:36AM +0200, Christoph Hellwig wrote:
> On Wed, Apr 28, 2021 at 11:00:05AM -0300, Jason Gunthorpe wrote:
> > I thought about doing it like that, it is generally a good idea,
> > however, if I add new API surface to the driver core I really want to
> > get rid of device_bind_driver(), or at least most of its users.
> > 
> > I'm pretty sure Greg will ask for it too.
> > 
> > So, I need a way to sequence that which doesn't mean I have to shelf
> > the mdev stuff for ages while I try to get acks from lots of places.
> > 
> > Leave this alone and fix it after? Export device_driver_attach() and
> > say to try and fix the rest after?
> 
> Maybe.  Or convert one or two samples.

The conversions are easy, I just can't test them or completely tell if
they are correct..
 
> > I think this will still need the ugly errno capture though..
> 
> Why?

Several of the mdev drivers are checking some predicate during their
new probe function, like total # of devices. So if userspace exceeds
that then the old behavior was to fail the sysfs create operation. eg:


static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
{
	if (private->state == VFIO_CCW_STATE_NOT_OPER)
		return -ENODEV;

	if (atomic_dec_if_positive(&private->avail) < 0)
		return -EPERM;

Without the errno capture this doesn't work anymore and things end up
succeeding in creating a device but failing to attach a driver.

It could be changed to lose the errno and just return with some
generic -EINVAL if no driver bound, but that seems pretty ugly too.

Returning the probe error from some device_driver_attach() also makes
some sense, but revising the code to do that is a big touch and this
is so strange I don't know if it is worth it.
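
To make the two options concrete, here is a small userspace sketch. It is
an assumption-laden model, not the patch: the model_* names and the
probe_err field are hypothetical, and the attach step is reduced to a
function that only reports whether a driver ended up bound. One create path
returns a generic -EINVAL, the other returns the captured probe errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an mdev whose probe can fail on a predicate. */
struct model_mdev {
	int avail;     /* remaining instances the parent can provide */
	int bound;     /* non-zero once a driver is attached */
	int probe_err; /* captured return value of the last probe */
};

/* Probe refuses to bind when no instances are left, like the ccw check. */
static int model_probe(struct model_mdev *mdev)
{
	if (mdev->avail <= 0)
		return -EPERM;
	mdev->avail--;
	return 0;
}

/*
 * Models an attach step that does not hand probe's errno to its caller.
 * The errno is stashed on the side, which is the "capture".
 */
static void model_attach(struct model_mdev *mdev)
{
	int ret = model_probe(mdev);

	mdev->probe_err = ret;
	mdev->bound = (ret == 0);
}

/* Option 1: lose the errno and report a generic failure. */
static int model_create_generic(struct model_mdev *mdev)
{
	model_attach(mdev);
	return mdev->bound ? 0 : -EINVAL;
}

/* Option 2: give userspace the same errno the old create() returned. */
static int model_create_captured(struct model_mdev *mdev)
{
	model_attach(mdev);
	return mdev->probe_err;
}
```

With avail exhausted, the captured variant fails the create with -EPERM as
before, while the generic variant can only say -EINVAL.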

Jason


* Re: s390 common I/O layer locking
  2021-05-03 10:54               ` s390 common I/O layer locking (was: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()) Cornelia Huck
@ 2021-05-04 15:10                 ` Vineeth Vijayan
  2021-07-24 13:24                   ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Vineeth Vijayan @ 2021-05-04 15:10 UTC (permalink / raw)
  To: Cornelia Huck, Jason Gunthorpe
  Cc: Eric Farman, Christian Borntraeger, Vasily Gorbik,
	Heiko Carstens, kvm, linux-s390, Peter Oberparleiter,
	Halil Pasic, Raj, Ashok, Dan Williams, Daniel Vetter,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta


On 5/3/21 12:54 PM, Cornelia Huck wrote:
> On Fri, 30 Apr 2021 14:19:08 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
>> On Fri, Apr 30, 2021 at 02:31:40PM +0200, Cornelia Huck wrote:
>>> On Thu, 29 Apr 2021 15:13:47 -0300
>>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>> All the checks for !private need some kind of locking. The driver core
>>>> model is that the 'struct device_driver' callbacks are all called
>>>> under the device_lock (this prevents the driver unbinding during the
>>>> callback). I didn't check if ccs does this or not..
>>> probe/remove/shutdown are basically a forward of the callbacks at the
>>> bus level.
>> These are all covered by device_lock
>>
>>> The css bus should make sure that we serialize
>>> irq/sch_event/chp_event with probe/remove.
>> Hum it doesn't look OK, like here:
>>
>> css_process_crw()
>>    css_evaluate_subchannel()
>>     sch = bus_find_device()
>>        -- So we have a refcount on the struct device
>>     css_evaluate_known_subchannel() {
>> 	if (sch->driver) {
>> 		if (sch->driver->sch_event)
>> 			ret = sch->driver->sch_event(sch, slow);
>>     }
>>
>> But the above call and touches to sch->driver (which is really just
>> sch->dev.driver) are unlocked and racy.
>>
>> I would hold the device_lock() over all touches to sch->driver outside
>> of a driver core callback.
> I think this issue did not come up much before, as most drivers on the
> css bus tend to stay put during the lifetime of the device; but yes, it
> seems we're missing some locking.
>
> For the css bus, we need locking for the event callbacks; for irq, this
> may interact with the subchannel lock and likely needs some care.
>
> I also looked at the other busses in the common I/O layer: scm looks
> good at a glance, ccwgroup and ccw have locking for online/offline; the
> other callbacks for the ccw drivers probably need to take the device
> lock as well.
>
> Common I/O layer maintainers, does that look right?
>
I just had a quick glance at the CIO layer drivers, and at first look
you are right.
It looks like we need modifications in the event callbacks (referring to
css here).
Let me go through this thoroughly and update.
Thank you.


* Re: [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c
  2021-05-04  9:38               ` Christoph Hellwig
@ 2021-05-04 16:20                 ` Jason Gunthorpe
  0 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-05-04 16:20 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Greg Kroah-Hartman, Alex Williamson, Cornelia Huck,
	Jonathan Corbet, kvm, Kirti Wankhede, linux-doc, Raj, Ashok,
	Dan Williams, Daniel Vetter, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, May 04, 2021 at 11:38:57AM +0200, Christoph Hellwig wrote:
> On Mon, May 03, 2021 at 02:32:20PM -0300, Jason Gunthorpe wrote:
> > Since that is not agreeable I will shrink this patch series to remove
> > the ccw conversion that already has complex feedback and drop this
> > patch. I'll sadly shelve the rest of the work until something changes.
> 
> Please don't.  I'll happily takes on that this is the right work, and
> should not be damaged by a bad actor (Nvidia corporate that has been
> sneaking weird backdoors into Linux for a while) directing someone
> that now works for them through an acquisition.

If everyone can have a solid agreement on licensing for vfio-mdev
modules then it is fine from my perspective. IMHO that needs to be
settled outside this patch series. If it has to wait while that is
done, then fine.

I'm not being "directed" by NVIDIA. My limitation is I can't be
involved in licensing discussions, and frankly after 25 years of this
I'm tired of them anyhow.

This licensing topic in particular never seems to go anywhere. Half
the participants want EXPORT_SYMBOL() abolished and the other half
view it as an existential requirement. The whole thing is toxic to the
community.

> And we realy need to put Nvidia in the watchlist unfortunately as they
> have caused so much damage to Linux through all their crazy backdoors.

Well that seems impractical.

Check the lwn statistics. NVIDIA is fairly regularly the 10th largest
changeset contributor. We have > 100 people in Mellanox writing kernel
patches and we employ several kernel maintainers now. If the
NVIDIA/ARM purchase goes ahead it will get even bigger.

All these big amalgamations of people seem to have their unique
challenges, and I'm not convinced NVIDIA is significantly more
damaging to the kernel than Intel, the Android world or other places.

So, let's not paint > 100 developers with such a broad brush please.

I prefer the optimistic view: Mellanox's continued open source
success will be inspiring.

Jason


* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-04-28 19:58           ` Dan Williams
  2021-04-28 23:38             ` Jason Gunthorpe
@ 2021-05-26  0:42             ` Jason Gunthorpe
  2021-05-26  1:42               ` Dan Williams
  2021-05-27 11:44               ` Christoph Hellwig
  1 sibling, 2 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2021-05-26  0:42 UTC (permalink / raw)
  To: Dan Williams
  Cc: Christoph Hellwig, Alex Williamson, Cornelia Huck, kvm,
	Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, dave.jiang

On Wed, Apr 28, 2021 at 12:58:29PM -0700, Dan Williams wrote:

> I have an ulterior motive / additional use case in mind here which is
> the work-in-progress cleanup of the DSA driver. 

Well, I worked on it for a while, please take a look at this:

https://github.com/jgunthorpe/linux/commits/device_driver_attach

It makes device_driver_attach() into what this mdev stuff needs, and I
think improves the sysfs bind file as a side effect.

Is it what you need for DSA?

Thanks,
Jason


* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-05-26  0:42             ` Jason Gunthorpe
@ 2021-05-26  1:42               ` Dan Williams
  2021-05-27 11:44               ` Christoph Hellwig
  1 sibling, 0 replies; 59+ messages in thread
From: Dan Williams @ 2021-05-26  1:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Cornelia Huck, KVM list,
	Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, Dave Jiang

On Tue, May 25, 2021 at 5:42 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Apr 28, 2021 at 12:58:29PM -0700, Dan Williams wrote:
>
> > I have an ulterior motive / additional use case in mind here which is
> > the work-in-progress cleanup of the DSA driver.
>
> Well, I worked on it for a while, please take a look at this:
>
> https://github.com/jgunthorpe/linux/commits/device_driver_attach
>
> It makes device_driver_attach() into what this mdev stuff needs, and I
> think improves the sysfs bind file as a side effect.

Nice, yes, it looks like it does.

> Is it what you need for DSA?

Yes.


* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-05-26  0:42             ` Jason Gunthorpe
  2021-05-26  1:42               ` Dan Williams
@ 2021-05-27 11:44               ` Christoph Hellwig
  2021-05-27 14:53                 ` Jason Gunthorpe
  1 sibling, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2021-05-27 11:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dan Williams, Christoph Hellwig, Alex Williamson, Cornelia Huck,
	kvm, Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, dave.jiang

On Tue, May 25, 2021 at 09:42:30PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 28, 2021 at 12:58:29PM -0700, Dan Williams wrote:
> 
> > I have an ulterior motive / additional use case in mind here which is
> > the work-in-progress cleanup of the DSA driver. 
> 
> Well, I worked on it for a while, please take a look at this:
> 
> https://github.com/jgunthorpe/linux/commits/device_driver_attach
> 
> It makes device_driver_attach() into what this mdev stuff needs, and I
> think improves the sysfs bind file as a side effect.

This looks great.  Please get it out to Greg ASAP!


* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-05-27 11:44               ` Christoph Hellwig
@ 2021-05-27 14:53                 ` Jason Gunthorpe
  2021-05-27 15:13                   ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2021-05-27 14:53 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dan Williams, Alex Williamson, Cornelia Huck, kvm,
	Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, dave.jiang

On Thu, May 27, 2021 at 01:44:52PM +0200, Christoph Hellwig wrote:
> On Tue, May 25, 2021 at 09:42:30PM -0300, Jason Gunthorpe wrote:
> > On Wed, Apr 28, 2021 at 12:58:29PM -0700, Dan Williams wrote:
> > 
> > > I have an ulterior motive / additional use case in mind here which is
> > > the work-in-progress cleanup of the DSA driver. 
> > 
> > Well, I worked on it for a while, please take a look at this:
> > 
> > https://github.com/jgunthorpe/linux/commits/device_driver_attach
> > 
> > It makes device_driver_attach() into what this mdev stuff needs, and I
> > think improves the sysfs bind file as a side effect.
> 
> This looks great.  Please get it out to Greg ASAP!

Thanks, I need to test it all carefully, it is pretty tricky. Are you
OK with the funky in/out flag? That was the most ick part.

Jason


* Re: [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
  2021-05-27 14:53                 ` Jason Gunthorpe
@ 2021-05-27 15:13                   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2021-05-27 15:13 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Dan Williams, Alex Williamson, Cornelia Huck,
	kvm, Kirti Wankhede, Raj, Ashok, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, dave.jiang

On Thu, May 27, 2021 at 11:53:42AM -0300, Jason Gunthorpe wrote:
> > This looks great.  Please get it out to Greg ASAP!
> 
> Thanks, I need to test it all carefully it is pretty tricky. You are
> OK with the funky in/out flag? That was the most ick part

It is a little ugly, but what are the alternatives?  Would an input
flag to never return EPROBE_DEFER work instead?


* Re: s390 common I/O layer locking
  2021-05-04 15:10                 ` s390 common I/O layer locking Vineeth Vijayan
@ 2021-07-24 13:24                   ` Christoph Hellwig
  2021-08-03 14:27                     ` Vineeth Vijayan
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2021-07-24 13:24 UTC (permalink / raw)
  To: Vineeth Vijayan
  Cc: Cornelia Huck, Jason Gunthorpe, Eric Farman,
	Christian Borntraeger, Vasily Gorbik, Heiko Carstens, kvm,
	linux-s390, Peter Oberparleiter, Halil Pasic, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Tue, May 04, 2021 at 05:10:42PM +0200, Vineeth Vijayan wrote:
>> For the css bus, we need locking for the event callbacks; for irq, this
>> may interact with the subchannel lock and likely needs some care.
>>
>> I also looked at the other busses in the common I/O layer: scm looks
>> good at a glance, ccwgroup and ccw have locking for online/offline; the
>> other callbacks for the ccw drivers probably need to take the device
>> lock as well.
>>
>> Common I/O layer maintainers, does that look right?
>>
> I just had a quick glance on the CIO layer drivers. And at first look, you 
> are right.
> It looks like we need modifications in the event callbacks (referring css
> here)
> Let me go through this thoroughly and update.

Did this go anywhere?


* Re: s390 common I/O layer locking
  2021-07-24 13:24                   ` Christoph Hellwig
@ 2021-08-03 14:27                     ` Vineeth Vijayan
  2021-08-10 15:00                       ` Cornelia Huck
  0 siblings, 1 reply; 59+ messages in thread
From: Vineeth Vijayan @ 2021-08-03 14:27 UTC (permalink / raw)
  To: Christoph Hellwig, Cornelia Huck
  Cc: Jason Gunthorpe, Eric Farman, Christian Borntraeger,
	Vasily Gorbik, Heiko Carstens, kvm, linux-s390,
	Peter Oberparleiter, Halil Pasic, Raj, Ashok, Dan Williams,
	Daniel Vetter, Leon Romanovsky, Max Gurtovoy, Tarun Gupta


On 7/24/21 3:24 PM, Christoph Hellwig wrote:
> On Tue, May 04, 2021 at 05:10:42PM +0200, Vineeth Vijayan wrote:
...snip...
>> I just had a quick glance on the CIO layer drivers. And at first 
>> look, you
>> are right.
>> It looks like we need modifications in the event callbacks (referring css
>> here)
>> Let me go through this thoroughly and update.
> Did this go anywhere?
Hello Christoph,

Thank you for this reminder. Also, my apologies for the slow reply; this
was one of those items which really needed this reminder :-)

Coming to the point: the event callbacks are under sch->lock, which I
think is the right thing to do. But I also agree with your feedback about
the sch->driver accesses in the css_evaluate_known_subchannel() call. My
first impression was to put them under device_lock(). As Conny
mentioned, most of the drivers on the css bus remain stable during the
lifetime of the devices, and we never hit this racy scenario. And with
this device_lock() change, as you mentioned, the code base
would need significant changes in the sch_event callbacks. I am not sure
there is a straightforward solution for this locking issue.

Currently, I am trying to find the minimal change I can make to the
event callbacks and the css_evaluate_known_subchannel() call to make
sure that this race condition can never occur.

Conny,

Please do let me know if you think I am missing something here. I would
like to concentrate on the sch->driver access scenario first and
see how it can have minimal impact on the event callbacks,
especially io_subchannel_sch_event.


Vineeth


* Re: s390 common I/O layer locking
  2021-08-03 14:27                     ` Vineeth Vijayan
@ 2021-08-10 15:00                       ` Cornelia Huck
  0 siblings, 0 replies; 59+ messages in thread
From: Cornelia Huck @ 2021-08-10 15:00 UTC (permalink / raw)
  To: Vineeth Vijayan, Christoph Hellwig
  Cc: Jason Gunthorpe, Eric Farman, Christian Borntraeger,
	Vasily Gorbik, Heiko Carstens, kvm, linux-s390,
	Peter Oberparleiter, Halil Pasic, Raj, Ashok, Dan Williams,
	Daniel Vetter, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Tue, Aug 03 2021, Vineeth Vijayan <vneethv@linux.ibm.com> wrote:

> On 7/24/21 3:24 PM, Christoph Hellwig wrote:
>> On Tue, May 04, 2021 at 05:10:42PM +0200, Vineeth Vijayan wrote:
> ...snip...
>>> I just had a quick glance on the CIO layer drivers. And at first 
>>> look, you
>>> are right.
>>> It looks like we need modifications in the event callbacks (referring css
>>> here)
>>> Let me go through this thoroughly and update.
>> Did this go anywhere?
> Hello Christoph,
>
> Thank you for this reminder. Also, my apologies for the slow reply; This 
> was one of those item which really needed this reminder :-)
>
> Coming to the point, The event-callbacks  are under sch->lock, which i 
> think is the right thing to do. But i also agree on your feedback about 
> the sch->driver accesses in the css_evaluate_known_subchannel() call. My 
> first impression was to add them under device_lock(). As Conny 
> mentioned, most of the drivers on the css-bus remained-stable during the 
> lifetime of the devices, and we never got this racy scenario.  And then 
> having this change with device_lock(), as you mentioned,this code-base 
> would need significant change in the sch_event callbacks. I am not sure 
> if there is a straight forward solution for this locking-issue
> scenario.

Hm, I may have lost my way in the code, but I think ->sch_event is
called _without_ the subchannel lock being held? It is only taken in
e.g. io_subchannel_sch_event.

->chp_event is called with the subchannel lock held, though.

>
> Currently, i am trying to see the "minimal" change i can work on on the 
> event-callbacks and the css_evaluate_known_subchannel() call, to make 
> sure that, this racy condition can never occur.
>
> Conny,
>
> Please do let me know if you think i am missing something here. I would 
> like to concentrate more on the sch->driver() access scenario first and 
> would like to see how it can have minimal impact on the event-callbacks. 
> especially io_subchannel_sch_event.

Given that the code changing sch->driver holds the device lock, but not
the subchannel lock, you probably need to make sure that the device lock
is held? It has been some time since I've done more complicated work in
the common I/O layer, though, and I might be missing something.
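
A small userspace sketch of the nesting I have in mind, purely as a model:
pthread mutexes stand in for the device lock and the subchannel lock, all
model_* names are hypothetical, and taking the device lock around
->chp_event as well is an assumption of the sketch, not something I have
checked against the code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace model; the two mutexes stand in for device lock and sch->lock. */
struct model_sch {
	pthread_mutex_t dev_lock;
	pthread_mutex_t sch_lock;
	int sch_lock_held; /* demo-only flag, set while sch_lock is held */
	int sch_event_calls;
	int chp_event_calls;
};

static void model_sch_lock(struct model_sch *sch)
{
	pthread_mutex_lock(&sch->sch_lock);
	sch->sch_lock_held = 1;
}

static void model_sch_unlock(struct model_sch *sch)
{
	sch->sch_lock_held = 0;
	pthread_mutex_unlock(&sch->sch_lock);
}

/* Entered without the subchannel lock; takes it itself. */
static int model_sch_event(struct model_sch *sch)
{
	assert(!sch->sch_lock_held);
	model_sch_lock(sch);
	sch->sch_event_calls++;
	model_sch_unlock(sch);
	return 0;
}

/* Entered with the subchannel lock already held by the caller. */
static int model_chp_event(struct model_sch *sch)
{
	assert(sch->sch_lock_held);
	sch->chp_event_calls++;
	return 0;
}

/* Device lock is the outer lock; the callback nests sch_lock inside it. */
static int model_dispatch_sch_event(struct model_sch *sch)
{
	int ret;

	pthread_mutex_lock(&sch->dev_lock);
	ret = model_sch_event(sch);
	pthread_mutex_unlock(&sch->dev_lock);
	return ret;
}

/* Same outer lock, but here the dispatcher takes sch_lock for the callback. */
static int model_dispatch_chp_event(struct model_sch *sch)
{
	int ret;

	pthread_mutex_lock(&sch->dev_lock);
	model_sch_lock(sch);
	ret = model_chp_event(sch);
	model_sch_unlock(sch);
	pthread_mutex_unlock(&sch->dev_lock);
	return ret;
}
```

Both paths take the device lock first and the subchannel lock second, so
the ordering stays consistent; the irq path is not modelled here.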



Thread overview: 59+ messages
2021-04-26 20:00 [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Jason Gunthorpe
2021-04-26 20:00 ` [PATCH v2 01/13] vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE Jason Gunthorpe
2021-04-27 11:05   ` Cornelia Huck
2021-04-26 20:00 ` [PATCH v2 02/13] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind Jason Gunthorpe
2021-04-27 12:32   ` Cornelia Huck
2021-04-27 23:20     ` Jason Gunthorpe
2021-04-28  6:03   ` Christoph Hellwig
2021-04-28  7:56     ` Dan Williams
2021-04-28 12:41       ` Christoph Hellwig
2021-04-28 14:00         ` Jason Gunthorpe
2021-04-28 19:58           ` Dan Williams
2021-04-28 23:38             ` Jason Gunthorpe
2021-04-29  0:00               ` Dave Jiang
2021-05-26  0:42             ` Jason Gunthorpe
2021-05-26  1:42               ` Dan Williams
2021-05-27 11:44               ` Christoph Hellwig
2021-05-27 14:53                 ` Jason Gunthorpe
2021-05-27 15:13                   ` Christoph Hellwig
2021-04-29  6:51           ` Christoph Hellwig
2021-05-04  9:36           ` Christoph Hellwig
2021-05-04 11:30             ` Jason Gunthorpe
2021-04-28  6:44   ` Leon Romanovsky
2021-04-28 14:14     ` Jason Gunthorpe
2021-04-28 14:24       ` Leon Romanovsky
2021-04-26 20:00 ` [PATCH v2 03/13] vfio/mtty: Convert to use vfio_register_group_dev() Jason Gunthorpe
2021-04-26 20:00 ` [PATCH v2 04/13] vfio/mdpy: " Jason Gunthorpe
2021-04-26 20:00 ` [PATCH v2 05/13] vfio/mbochs: " Jason Gunthorpe
2021-04-26 20:00 ` [PATCH v2 07/13] vfio/ccw: " Jason Gunthorpe
2021-04-27 20:06   ` Eric Farman
2021-04-27 22:10     ` Jason Gunthorpe
2021-04-28 12:55       ` Eric Farman
2021-04-28 13:21         ` Jason Gunthorpe
2021-04-28 17:09   ` Cornelia Huck
2021-04-28 17:20     ` Jason Gunthorpe
2021-04-29 11:58       ` Cornelia Huck
2021-04-29 18:13         ` Jason Gunthorpe
2021-04-30 12:31           ` Cornelia Huck
2021-04-30 17:19             ` Jason Gunthorpe
2021-05-03 10:54               ` s390 common I/O layer locking (was: [PATCH v2 07/13] vfio/ccw: Convert to use vfio_register_group_dev()) Cornelia Huck
2021-05-04 15:10                 ` s390 common I/O layer locking Vineeth Vijayan
2021-07-24 13:24                   ` Christoph Hellwig
2021-08-03 14:27                     ` Vineeth Vijayan
2021-08-10 15:00                       ` Cornelia Huck
2021-04-26 20:00 ` [PATCH v2 09/13] vfio/mdev: Remove vfio_mdev.c Jason Gunthorpe
2021-04-28  6:07   ` Christoph Hellwig
2021-04-28  6:36     ` Greg Kroah-Hartman
2021-04-28 12:53       ` Jason Gunthorpe
2021-04-29  6:53         ` Christoph Hellwig
2021-04-29  6:56           ` Greg Kroah-Hartman
2021-05-03 17:32             ` Jason Gunthorpe
2021-05-04  9:38               ` Christoph Hellwig
2021-05-04 16:20                 ` Jason Gunthorpe
2021-04-26 20:00 ` [PATCH v2 10/13] vfio/mdev: Remove mdev_parent_ops dev_attr_groups Jason Gunthorpe
2021-04-26 20:00 ` [PATCH v2 11/13] vfio/mdev: Remove mdev_parent_ops Jason Gunthorpe
2021-04-26 20:00 ` [PATCH v2 12/13] vfio/mdev: Use the driver core to create the 'remove' file Jason Gunthorpe
2021-04-26 20:00 ` [PATCH v2 13/13] vfio/mdev: Remove mdev drvdata Jason Gunthorpe
2021-04-27 21:30 ` [PATCH v2 00/13] Remove vfio_mdev.c, mdev_parent_ops and more Alex Williamson
2021-04-27 22:20   ` Jason Gunthorpe
2021-04-27 22:49     ` Alex Williamson
