* [PATCH v2 00/14] Embed struct vfio_device in all sub-structures
@ 2021-03-13  0:55 Jason Gunthorpe
  2021-03-13  0:55 ` [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group Jason Gunthorpe
                   ` (13 more replies)
  0 siblings, 14 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:55 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, Jonathan Corbet, Diana Craciun,
	Eric Auger, kvm, Kirti Wankhede, linux-doc
  Cc: Raj, Ashok, Bharat Bhushan, Christian Ehrhardt, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Kevin Tian, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, Liu Yi L

The main focus of this series is to make VFIO follow the normal kernel
convention of structure embedding for structure inheritance instead of
linking using a 'void *opaque'. Here we focus on moving the vfio_device to
be a member of every struct vfio_XX_device that is linked by a
vfio_add_group_dev().

In turn this allows 'struct vfio_device *' to be used everywhere, and the
public API out of vfio.c can be cleaned to remove places using 'struct
device *' and 'void *' as surrogates to refer to the device.

While this has the minor trade-off of moving 'struct vfio_device', the
clarity of the design is worth it. I can speak directly to this idea, as
I've invested a fair amount of time carefully working out what all the
type-erased APIs are supposed to be, and it is certainly not trivial or
intuitive.

When we get into mdev land things become even more inscrutable, and while
I now have a pretty clear picture, it was hard to obtain. I think this
agrees with the kernel style ideal of being explicit in typing and not
sacrificing clarity to create opaque structs.

After this series the general rules are:
 - Any vfio_XX_device * can be obtained at no cost from a vfio_device *
   using container_of(), and the reverse is possible by &XXdev->vdev

   This is similar to how 'struct pci_dev' and 'struct device' are
   interrelated.

   This allows 'device_data' to be completely removed from the vfio.c API.

 - The drvdata for a struct device points at the vfio_XX_device that
   belongs to the driver that was probed. drvdata is removed from the core
   code, and only used as part of the implementation of the struct
   device_driver.

 - The lifetime of vfio_XX_device and vfio_device are identical, they are
   the same memory.

   This follows the existing model where vfio_del_group_dev() blocks until
   all vfio_device_put()'s are completed. This in turn means the struct
   device_driver remove() blocks, and thus under the device_lock() a bound
   driver must have a valid drvdata pointing at both vfio device
   structs. A following series exploits this further.
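
As a rough illustration of the first rule, here is a userspace sketch of
the embedding. It is not code from this series: 'struct vfio_xx_device',
its 'driver_state' field, the helper names, and the reduced 'struct
vfio_device' are all made up for the example, and container_of() is
re-implemented locally because the kernel macro is not available outside
the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-in for struct vfio_device; only one field is modelled. */
struct vfio_device {
	int refcount;
};

/* Hypothetical driver structure that embeds the core structure. */
struct vfio_xx_device {
	int driver_state;
	struct vfio_device vdev;	/* embedded, not a pointer */
};

/* vfio_device * -> vfio_xx_device *: pointer arithmetic only. */
static struct vfio_xx_device *to_xx_device(struct vfio_device *vdev)
{
	assert(vdev != NULL);
	return container_of(vdev, struct vfio_xx_device, vdev);
}

/* Returns 1 when the round trip lands back on the same object. */
static int embedding_round_trip(void)
{
	struct vfio_xx_device xx = { .driver_state = 42 };
	struct vfio_device *core = &xx.vdev;	/* the reverse direction */

	return to_xx_device(core) == &xx &&
	       to_xx_device(core)->driver_state == 42;
}
```

Because both structs are the same memory, neither direction needs a
lookup or a stored back-pointer.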

Most vfio_XX_device structs have data that duplicates the 'struct
device *dev' member of vfio_device, a following series removes that
duplication too.

v2:
 - Split the get/put changes out of "Simplify the lifetime logic for
   vfio_device"
 - Add a patch to fix probe ordering in fsl-mc and remove FIXME
 - Add a patch to re-org pci probe
 - Add a patch to fix probe ordering in pci and remove FIXME
 - Remove the **pf_dev output from get_pf_vdev()
v1: https://lore.kernel.org/r/0-v1-7355d38b9344+17481-vfio1_jgg@nvidia.com

Thanks,
Jason

Jason Gunthorpe (14):
  vfio: Remove extra put/gets around vfio_device->group
  vfio: Simplify the lifetime logic for vfio_device
  vfio: Split creation of a vfio_device into init and register ops
  vfio/platform: Use vfio_init/register/unregister_group_dev
  vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
  vfio/fsl-mc: Use vfio_init/register/unregister_group_dev
  vfio/pci: Move VGA and VF initialization to functions
  vfio/pci: Re-order vfio_pci_probe()
  vfio/pci: Use vfio_init/register/unregister_group_dev
  vfio/mdev: Use vfio_init/register/unregister_group_dev
  vfio/mdev: Make to_mdev_device() into a static inline
  vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of
    'void *'
  vfio/pci: Replace uses of vfio_device_data() with container_of
  vfio: Remove device_data from the vfio bus driver API

 Documentation/driver-api/vfio.rst             |  48 ++--
 drivers/vfio/fsl-mc/vfio_fsl_mc.c             |  96 ++++---
 drivers/vfio/fsl-mc/vfio_fsl_mc_private.h     |   1 +
 drivers/vfio/mdev/mdev_private.h              |   5 +-
 drivers/vfio/mdev/vfio_mdev.c                 |  57 ++--
 drivers/vfio/pci/vfio_pci.c                   | 253 ++++++++++--------
 drivers/vfio/pci/vfio_pci_private.h           |   1 +
 drivers/vfio/platform/vfio_amba.c             |   8 +-
 drivers/vfio/platform/vfio_platform.c         |  21 +-
 drivers/vfio/platform/vfio_platform_common.c  |  56 ++--
 drivers/vfio/platform/vfio_platform_private.h |   5 +-
 drivers/vfio/vfio.c                           | 210 +++++----------
 include/linux/vfio.h                          |  37 ++-
 13 files changed, 397 insertions(+), 401 deletions(-)

-- 
2.30.2



* [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
@ 2021-03-13  0:55 ` Jason Gunthorpe
  2021-03-16  7:33   ` Tian, Kevin
                     ` (3 more replies)
  2021-03-13  0:55 ` [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device Jason Gunthorpe
                   ` (12 subsequent siblings)
  13 siblings, 4 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:55 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

The vfio_device->group value has a get obtained during
vfio_add_group_dev() which gets moved from the stack to vfio_device->group
in vfio_group_create_device().

The reference remains until we reach the end of vfio_del_group_dev() when
it is put back.

Thus anything that already has a kref on the vfio_device is guaranteed a
valid group pointer. Remove all the extra reference traffic.

It is tricky to see, but the get at the start of vfio_del_group_dev() is
actually pairing with the put hidden inside vfio_device_put() a few lines
below.

A later patch merges vfio_group_create_device() into vfio_add_group_dev()
which makes the ownership and error flow on the create side easier to
follow.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio.c | 21 ++-------------------
 1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 38779e6fd80cb4..15d8e678e5563a 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -546,14 +546,12 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
 
 	kref_init(&device->kref);
 	device->dev = dev;
+	/* Our reference on group is moved to the device */
 	device->group = group;
 	device->ops = ops;
 	device->device_data = device_data;
 	dev_set_drvdata(dev, device);
 
-	/* No need to get group_lock, caller has group reference */
-	vfio_group_get(group);
-
 	mutex_lock(&group->device_lock);
 	list_add(&device->group_next, &group->device_list);
 	group->dev_counter++;
@@ -585,13 +583,11 @@ void vfio_device_put(struct vfio_device *device)
 {
 	struct vfio_group *group = device->group;
 	kref_put_mutex(&device->kref, vfio_device_release, &group->device_lock);
-	vfio_group_put(group);
 }
 EXPORT_SYMBOL_GPL(vfio_device_put);
 
 static void vfio_device_get(struct vfio_device *device)
 {
-	vfio_group_get(device->group);
 	kref_get(&device->kref);
 }
 
@@ -841,14 +837,6 @@ int vfio_add_group_dev(struct device *dev,
 		vfio_group_put(group);
 		return PTR_ERR(device);
 	}
-
-	/*
-	 * Drop all but the vfio_device reference.  The vfio_device holds
-	 * a reference to the vfio_group, which holds a reference to the
-	 * iommu_group.
-	 */
-	vfio_group_put(group);
-
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vfio_add_group_dev);
@@ -928,12 +916,6 @@ void *vfio_del_group_dev(struct device *dev)
 	unsigned int i = 0;
 	bool interrupted = false;
 
-	/*
-	 * The group exists so long as we have a device reference.  Get
-	 * a group reference and use it to scan for the device going away.
-	 */
-	vfio_group_get(group);
-
 	/*
 	 * When the device is removed from the group, the group suddenly
 	 * becomes non-viable; the device has a driver (until the unbind
@@ -1008,6 +990,7 @@ void *vfio_del_group_dev(struct device *dev)
 	if (list_empty(&group->device_list))
 		wait_event(group->container_q, !group->container);
 
+	/* Matches the get in vfio_group_create_device() */
 	vfio_group_put(group);
 
 	return device_data;
-- 
2.30.2



* [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
  2021-03-13  0:55 ` [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group Jason Gunthorpe
@ 2021-03-13  0:55 ` Jason Gunthorpe
  2021-03-16  7:38   ` Tian, Kevin
  2021-03-18 13:10   ` Auger Eric
  2021-03-13  0:55 ` [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops Jason Gunthorpe
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:55 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

The vfio_device is using a 'sleep until all refs go to zero' pattern for
its lifetime, but it is indirectly coded by repeatedly scanning the group
list waiting for the device to be removed on its own.

Switch this around to be a direct representation: use a refcount to count
the number of places that are blocking destruction, and sleep directly on
a completion until that counter goes to zero. kfree the device after other
accesses have been excluded in vfio_del_group_dev(). This is a fairly
common Linux idiom.

Due to this we can now remove kref_put_mutex(), which is very rarely used
in the kernel. Here it is being used to prevent a zero ref device from
being seen in the group list. Instead allow the zero ref device to
continue to exist in the device_list and use refcount_inc_not_zero() to
exclude it once refs go to zero.
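
For readers unfamiliar with the idiom, a minimal single-threaded userspace
model of it follows. This is only a sketch under stated assumptions: the
'mock_*' names are invented for the example, C11 atomics stand in for
refcount_t, and a plain flag stands in for struct completion (so nothing
actually sleeps here).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the refcount + completion pair. */
struct mock_device {
	atomic_uint refcount;	/* number of holders blocking destruction */
	bool done;		/* stands in for struct completion */
};

static void mock_device_init(struct mock_device *dev)
{
	atomic_init(&dev->refcount, 1);	/* the registration reference */
	dev->done = false;
}

/* Like refcount_inc_not_zero(): never revive a count that reached zero. */
static bool mock_device_try_get(struct mock_device *dev)
{
	unsigned int old = atomic_load(&dev->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&dev->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Like vfio_device_put() in this patch: the final put signals the waiter. */
static void mock_device_put(struct mock_device *dev)
{
	assert(atomic_load(&dev->refcount) > 0);
	if (atomic_fetch_sub(&dev->refcount, 1) == 1)
		dev->done = true;	/* complete(&device->comp) in the patch */
}

/* One user takes and drops a reference around an unregister; 1 on success. */
static int mock_lifetime_demo(void)
{
	struct mock_device dev;

	mock_device_init(&dev);
	if (!mock_device_try_get(&dev))	/* e.g. an open file descriptor */
		return 0;
	mock_device_put(&dev);		/* unregister drops the initial ref */
	if (dev.done)			/* still held, so not complete yet */
		return 0;
	mock_device_put(&dev);		/* last user goes away */
	return dev.done && !mock_device_try_get(&dev);
}
```

The last line of the demo is the property the list walkers rely on: once
the count has hit zero, a try_get fails even though the object is still
reachable.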

This patch is organized so the next patch will be able to alter the API to
allow drivers to provide the kfree.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio.c | 79 ++++++++++++++-------------------------------
 1 file changed, 25 insertions(+), 54 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 15d8e678e5563a..32660e8a69ae20 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -46,7 +46,6 @@ static struct vfio {
 	struct mutex			group_lock;
 	struct cdev			group_cdev;
 	dev_t				group_devt;
-	wait_queue_head_t		release_q;
 } vfio;
 
 struct vfio_iommu_driver {
@@ -91,7 +90,8 @@ struct vfio_group {
 };
 
 struct vfio_device {
-	struct kref			kref;
+	refcount_t			refcount;
+	struct completion		comp;
 	struct device			*dev;
 	const struct vfio_device_ops	*ops;
 	struct vfio_group		*group;
@@ -544,7 +544,8 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
 	if (!device)
 		return ERR_PTR(-ENOMEM);
 
-	kref_init(&device->kref);
+	refcount_set(&device->refcount, 1);
+	init_completion(&device->comp);
 	device->dev = dev;
 	/* Our reference on group is moved to the device */
 	device->group = group;
@@ -560,35 +561,17 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
 	return device;
 }
 
-static void vfio_device_release(struct kref *kref)
-{
-	struct vfio_device *device = container_of(kref,
-						  struct vfio_device, kref);
-	struct vfio_group *group = device->group;
-
-	list_del(&device->group_next);
-	group->dev_counter--;
-	mutex_unlock(&group->device_lock);
-
-	dev_set_drvdata(device->dev, NULL);
-
-	kfree(device);
-
-	/* vfio_del_group_dev may be waiting for this device */
-	wake_up(&vfio.release_q);
-}
-
 /* Device reference always implies a group reference */
 void vfio_device_put(struct vfio_device *device)
 {
-	struct vfio_group *group = device->group;
-	kref_put_mutex(&device->kref, vfio_device_release, &group->device_lock);
+	if (refcount_dec_and_test(&device->refcount))
+		complete(&device->comp);
 }
 EXPORT_SYMBOL_GPL(vfio_device_put);
 
-static void vfio_device_get(struct vfio_device *device)
+static bool vfio_device_try_get(struct vfio_device *device)
 {
-	kref_get(&device->kref);
+	return refcount_inc_not_zero(&device->refcount);
 }
 
 static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
@@ -598,8 +581,7 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
 
 	mutex_lock(&group->device_lock);
 	list_for_each_entry(device, &group->device_list, group_next) {
-		if (device->dev == dev) {
-			vfio_device_get(device);
+		if (device->dev == dev && vfio_device_try_get(device)) {
 			mutex_unlock(&group->device_lock);
 			return device;
 		}
@@ -883,9 +865,8 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
 			ret = !strcmp(dev_name(it->dev), buf);
 		}
 
-		if (ret) {
+		if (ret && vfio_device_try_get(it)) {
 			device = it;
-			vfio_device_get(device);
 			break;
 		}
 	}
@@ -908,13 +889,13 @@ EXPORT_SYMBOL_GPL(vfio_device_data);
  * removed.  Open file descriptors for the device... */
 void *vfio_del_group_dev(struct device *dev)
 {
-	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 	struct vfio_device *device = dev_get_drvdata(dev);
 	struct vfio_group *group = device->group;
 	void *device_data = device->device_data;
 	struct vfio_unbound_dev *unbound;
 	unsigned int i = 0;
 	bool interrupted = false;
+	long rc;
 
 	/*
 	 * When the device is removed from the group, the group suddenly
@@ -935,32 +916,18 @@ void *vfio_del_group_dev(struct device *dev)
 	WARN_ON(!unbound);
 
 	vfio_device_put(device);
-
-	/*
-	 * If the device is still present in the group after the above
-	 * 'put', then it is in use and we need to request it from the
-	 * bus driver.  The driver may in turn need to request the
-	 * device from the user.  We send the request on an arbitrary
-	 * interval with counter to allow the driver to take escalating
-	 * measures to release the device if it has the ability to do so.
-	 */
-	add_wait_queue(&vfio.release_q, &wait);
-
-	do {
-		device = vfio_group_get_device(group, dev);
-		if (!device)
-			break;
-
+	rc = try_wait_for_completion(&device->comp);
+	while (rc <= 0) {
 		if (device->ops->request)
 			device->ops->request(device_data, i++);
 
-		vfio_device_put(device);
-
 		if (interrupted) {
-			wait_woken(&wait, TASK_UNINTERRUPTIBLE, HZ * 10);
+			rc = wait_for_completion_timeout(&device->comp,
+							 HZ * 10);
 		} else {
-			wait_woken(&wait, TASK_INTERRUPTIBLE, HZ * 10);
-			if (signal_pending(current)) {
+			rc = wait_for_completion_interruptible_timeout(
+				&device->comp, HZ * 10);
+			if (rc < 0) {
 				interrupted = true;
 				dev_warn(dev,
 					 "Device is currently in use, task"
@@ -969,10 +936,13 @@ void *vfio_del_group_dev(struct device *dev)
 					 current->comm, task_pid_nr(current));
 			}
 		}
+	}
 
-	} while (1);
+	mutex_lock(&group->device_lock);
+	list_del(&device->group_next);
+	group->dev_counter--;
+	mutex_unlock(&group->device_lock);
 
-	remove_wait_queue(&vfio.release_q, &wait);
 	/*
 	 * In order to support multiple devices per group, devices can be
 	 * plucked from the group while other devices in the group are still
@@ -992,6 +962,8 @@ void *vfio_del_group_dev(struct device *dev)
 
 	/* Matches the get in vfio_group_create_device() */
 	vfio_group_put(group);
+	dev_set_drvdata(dev, NULL);
+	kfree(device);
 
 	return device_data;
 }
@@ -2362,7 +2334,6 @@ static int __init vfio_init(void)
 	mutex_init(&vfio.iommu_drivers_lock);
 	INIT_LIST_HEAD(&vfio.group_list);
 	INIT_LIST_HEAD(&vfio.iommu_drivers_list);
-	init_waitqueue_head(&vfio.release_q);
 
 	ret = misc_register(&vfio_dev);
 	if (ret) {
-- 
2.30.2



* [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
  2021-03-13  0:55 ` [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group Jason Gunthorpe
  2021-03-13  0:55 ` [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device Jason Gunthorpe
@ 2021-03-13  0:55 ` Jason Gunthorpe
  2021-03-16  7:55   ` Tian, Kevin
                     ` (3 more replies)
  2021-03-13  0:55 ` [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
                   ` (10 subsequent siblings)
  13 siblings, 4 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:55 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, Jonathan Corbet, kvm, linux-doc
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu Yi L

This makes the struct vfio_device part of the public interface so it
can be used with container_of and so forth, as is typical for a Linux
subsystem.

This is the first step to bring some type-safety to the vfio interface by
allowing the replacement of 'void *' and 'struct device *' inputs with a
simple and clear 'struct vfio_device *'.

For now the self-allocating vfio_add_group_dev() interface is kept so each
user can be updated as a separate patch.

The expected usage pattern is

  driver core probe() function:
     my_device = kzalloc(sizeof(*my_device), GFP_KERNEL);
     vfio_init_group_dev(&my_device->vdev, dev, ops, my_device);
     /* other driver specific prep */
     vfio_register_group_dev(&my_device->vdev);
     dev_set_drvdata(dev, my_device);

  driver core remove() function:
     my_device = dev_get_drvdata(dev);
     vfio_unregister_group_dev(&my_device->vdev);
     /* other driver specific tear down */
     kfree(my_device);

This allows the driver to use the drvdata and vfio_device to go to/from
its own data.

The pattern also makes it clear that vfio_register_group_dev() must be
last in the sequence, as once it is called the core code can immediately
start calling ops. The init/register gap is provided to allow for the
driver to do setup before ops can be called and thus avoid races.
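
The ordering constraint can be modelled in userspace roughly as below.
This is an illustrative sketch only: every 'mock_*' and 'my_*' name is
hypothetical, calloc()/free() stand in for kzalloc()/kfree(), and the
mock register function calls an op immediately to stand in for the core
being free to do so at any time after registration.

```c
#include <assert.h>
#include <stdlib.h>

struct mock_ops {
	int (*open)(void *device_data);
};

/* Reduced, hypothetical stand-in for struct vfio_device. */
struct mock_vfio_device {
	const struct mock_ops *ops;
	void *device_data;
	int registered;
};

/* Hypothetical driver structure embedding the core structure. */
struct my_device {
	struct mock_vfio_device vdev;
	int ready;			/* set by driver specific prep */
};

static void mock_init_group_dev(struct mock_vfio_device *vdev,
				const struct mock_ops *ops, void *device_data)
{
	vdev->ops = ops;
	vdev->device_data = device_data;
}

static int mock_register_group_dev(struct mock_vfio_device *vdev)
{
	assert(vdev->ops != NULL);
	vdev->registered = 1;
	/* From here on ops may be called at any time; model one call. */
	return vdev->ops->open(vdev->device_data);
}

/* Fails if it is reached before the driver has finished its setup. */
static int my_open(void *device_data)
{
	struct my_device *my = device_data;

	return my->ready ? 0 : -1;
}

static const struct mock_ops my_ops = { .open = my_open };

static struct my_device *my_probe(void)
{
	struct my_device *my = calloc(1, sizeof(*my));

	if (!my)
		return NULL;
	mock_init_group_dev(&my->vdev, &my_ops, my);
	my->ready = 1;			/* other driver specific prep */
	if (mock_register_group_dev(&my->vdev)) {	/* must be last */
		free(my);
		return NULL;
	}
	return my;
}

static void my_remove(struct my_device *my)
{
	my->vdev.registered = 0;	/* stands in for unregister */
	free(my);
}

/* Returns 1 when probe succeeds with the init/prep/register ordering. */
static int probe_order_demo(void)
{
	struct my_device *my = my_probe();
	int ok = my != NULL && my->vdev.registered;

	if (my)
		my_remove(my);
	return ok;
}
```

Moving the 'ready = 1' line below the register call makes my_open() fail
in this model, which is the race the init/register gap is there to avoid.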

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 Documentation/driver-api/vfio.rst |  31 ++++----
 drivers/vfio/vfio.c               | 123 ++++++++++++++++--------------
 include/linux/vfio.h              |  16 ++++
 3 files changed, 98 insertions(+), 72 deletions(-)

diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
index f1a4d3c3ba0bb1..d3a02300913a7f 100644
--- a/Documentation/driver-api/vfio.rst
+++ b/Documentation/driver-api/vfio.rst
@@ -249,18 +249,23 @@ VFIO bus driver API
 
 VFIO bus drivers, such as vfio-pci make use of only a few interfaces
 into VFIO core.  When devices are bound and unbound to the driver,
-the driver should call vfio_add_group_dev() and vfio_del_group_dev()
-respectively::
-
-	extern int vfio_add_group_dev(struct device *dev,
-				      const struct vfio_device_ops *ops,
-				      void *device_data);
-
-	extern void *vfio_del_group_dev(struct device *dev);
-
-vfio_add_group_dev() indicates to the core to begin tracking the
-iommu_group of the specified dev and register the dev as owned by
-a VFIO bus driver.  The driver provides an ops structure for callbacks
+the driver should call vfio_register_group_dev() and
+vfio_unregister_group_dev() respectively::
+
+	void vfio_init_group_dev(struct vfio_device *device,
+				struct device *dev,
+				const struct vfio_device_ops *ops,
+				void *device_data);
+	int vfio_register_group_dev(struct vfio_device *device);
+	void vfio_unregister_group_dev(struct vfio_device *device);
+
+The driver should embed the vfio_device in its own structure and call
+vfio_init_group_dev() to pre-configure it before going to registration.
+vfio_register_group_dev() indicates to the core to begin tracking the
+iommu_group of the specified dev and register the dev as owned by a VFIO bus
+driver. Once vfio_register_group_dev() returns it is possible for userspace to
+start accessing the driver, thus the driver should ensure it is completely
+ready before calling it. The driver provides an ops structure for callbacks
 similar to a file operations structure::
 
 	struct vfio_device_ops {
@@ -276,7 +281,7 @@ similar to a file operations structure::
 	};
 
 Each function is passed the device_data that was originally registered
-in the vfio_add_group_dev() call above.  This allows the bus driver
+in the vfio_register_group_dev() call above.  This allows the bus driver
 an easy place to store its opaque, private data.  The open/release
 callbacks are issued when a new file descriptor is created for a
 device (via VFIO_GROUP_GET_DEVICE_FD).  The ioctl interface provides
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 32660e8a69ae20..cfa06ae3b9018b 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -89,16 +89,6 @@ struct vfio_group {
 	struct blocking_notifier_head	notifier;
 };
 
-struct vfio_device {
-	refcount_t			refcount;
-	struct completion		comp;
-	struct device			*dev;
-	const struct vfio_device_ops	*ops;
-	struct vfio_group		*group;
-	struct list_head		group_next;
-	void				*device_data;
-};
-
 #ifdef CONFIG_VFIO_NOIOMMU
 static bool noiommu __read_mostly;
 module_param_named(enable_unsafe_noiommu_mode,
@@ -532,35 +522,6 @@ static struct vfio_group *vfio_group_get_from_dev(struct device *dev)
 /**
  * Device objects - create, release, get, put, search
  */
-static
-struct vfio_device *vfio_group_create_device(struct vfio_group *group,
-					     struct device *dev,
-					     const struct vfio_device_ops *ops,
-					     void *device_data)
-{
-	struct vfio_device *device;
-
-	device = kzalloc(sizeof(*device), GFP_KERNEL);
-	if (!device)
-		return ERR_PTR(-ENOMEM);
-
-	refcount_set(&device->refcount, 1);
-	init_completion(&device->comp);
-	device->dev = dev;
-	/* Our reference on group is moved to the device */
-	device->group = group;
-	device->ops = ops;
-	device->device_data = device_data;
-	dev_set_drvdata(dev, device);
-
-	mutex_lock(&group->device_lock);
-	list_add(&device->group_next, &group->device_list);
-	group->dev_counter++;
-	mutex_unlock(&group->device_lock);
-
-	return device;
-}
-
 /* Device reference always implies a group reference */
 void vfio_device_put(struct vfio_device *device)
 {
@@ -779,14 +740,23 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
 /**
  * VFIO driver API
  */
-int vfio_add_group_dev(struct device *dev,
-		       const struct vfio_device_ops *ops, void *device_data)
+void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
+			 const struct vfio_device_ops *ops, void *device_data)
+{
+	init_completion(&device->comp);
+	device->dev = dev;
+	device->ops = ops;
+	device->device_data = device_data;
+}
+EXPORT_SYMBOL_GPL(vfio_init_group_dev);
+
+int vfio_register_group_dev(struct vfio_device *device)
 {
+	struct vfio_device *existing_device;
 	struct iommu_group *iommu_group;
 	struct vfio_group *group;
-	struct vfio_device *device;
 
-	iommu_group = iommu_group_get(dev);
+	iommu_group = iommu_group_get(device->dev);
 	if (!iommu_group)
 		return -EINVAL;
 
@@ -805,21 +775,50 @@ int vfio_add_group_dev(struct device *dev,
 		iommu_group_put(iommu_group);
 	}
 
-	device = vfio_group_get_device(group, dev);
-	if (device) {
-		dev_WARN(dev, "Device already exists on group %d\n",
+	existing_device = vfio_group_get_device(group, device->dev);
+	if (existing_device) {
+		dev_WARN(device->dev, "Device already exists on group %d\n",
 			 iommu_group_id(iommu_group));
-		vfio_device_put(device);
+		vfio_device_put(existing_device);
 		vfio_group_put(group);
 		return -EBUSY;
 	}
 
-	device = vfio_group_create_device(group, dev, ops, device_data);
-	if (IS_ERR(device)) {
-		vfio_group_put(group);
-		return PTR_ERR(device);
-	}
+	/* Our reference on group is moved to the device */
+	device->group = group;
+
+	/* Refcounting can't start until the driver calls register */
+	refcount_set(&device->refcount, 1);
+
+	mutex_lock(&group->device_lock);
+	list_add(&device->group_next, &group->device_list);
+	group->dev_counter++;
+	mutex_unlock(&group->device_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_register_group_dev);
+
+int vfio_add_group_dev(struct device *dev, const struct vfio_device_ops *ops,
+		       void *device_data)
+{
+	struct vfio_device *device;
+	int ret;
+
+	device = kzalloc(sizeof(*device), GFP_KERNEL);
+	if (!device)
+		return -ENOMEM;
+
+	vfio_init_group_dev(device, dev, ops, device_data);
+	ret = vfio_register_group_dev(device);
+	if (ret)
+		goto err_kfree;
+	dev_set_drvdata(dev, device);
 	return 0;
+
+err_kfree:
+	kfree(device);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(vfio_add_group_dev);
 
@@ -887,11 +886,9 @@ EXPORT_SYMBOL_GPL(vfio_device_data);
 /*
  * Decrement the device reference count and wait for the device to be
  * removed.  Open file descriptors for the device... */
-void *vfio_del_group_dev(struct device *dev)
+void vfio_unregister_group_dev(struct vfio_device *device)
 {
-	struct vfio_device *device = dev_get_drvdata(dev);
 	struct vfio_group *group = device->group;
-	void *device_data = device->device_data;
 	struct vfio_unbound_dev *unbound;
 	unsigned int i = 0;
 	bool interrupted = false;
@@ -908,7 +905,7 @@ void *vfio_del_group_dev(struct device *dev)
 	 */
 	unbound = kzalloc(sizeof(*unbound), GFP_KERNEL);
 	if (unbound) {
-		unbound->dev = dev;
+		unbound->dev = device->dev;
 		mutex_lock(&group->unbound_lock);
 		list_add(&unbound->unbound_next, &group->unbound_list);
 		mutex_unlock(&group->unbound_lock);
@@ -919,7 +916,7 @@ void *vfio_del_group_dev(struct device *dev)
 	rc = try_wait_for_completion(&device->comp);
 	while (rc <= 0) {
 		if (device->ops->request)
-			device->ops->request(device_data, i++);
+			device->ops->request(device->device_data, i++);
 
 		if (interrupted) {
 			rc = wait_for_completion_timeout(&device->comp,
@@ -929,7 +926,7 @@ void *vfio_del_group_dev(struct device *dev)
 				&device->comp, HZ * 10);
 			if (rc < 0) {
 				interrupted = true;
-				dev_warn(dev,
+				dev_warn(device->dev,
 					 "Device is currently in use, task"
 					 " \"%s\" (%d) "
 					 "blocked until device is released",
@@ -962,9 +959,17 @@ void *vfio_del_group_dev(struct device *dev)
 
 	/* Matches the get in vfio_group_create_device() */
 	vfio_group_put(group);
+}
+EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
+
+void *vfio_del_group_dev(struct device *dev)
+{
+	struct vfio_device *device = dev_get_drvdata(dev);
+	void *device_data = device->device_data;
+
+	vfio_unregister_group_dev(device);
 	dev_set_drvdata(dev, NULL);
 	kfree(device);
-
 	return device_data;
 }
 EXPORT_SYMBOL_GPL(vfio_del_group_dev);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index b7e18bde5aa8b3..ad8b579d67d34a 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -15,6 +15,18 @@
 #include <linux/poll.h>
 #include <uapi/linux/vfio.h>
 
+struct vfio_device {
+	struct device *dev;
+	const struct vfio_device_ops *ops;
+	struct vfio_group *group;
+
+	/* Members below here are private, not for driver use */
+	refcount_t refcount;
+	struct completion comp;
+	struct list_head group_next;
+	void *device_data;
+};
+
 /**
  * struct vfio_device_ops - VFIO bus driver device callbacks
  *
@@ -48,11 +60,15 @@ struct vfio_device_ops {
 extern struct iommu_group *vfio_iommu_group_get(struct device *dev);
 extern void vfio_iommu_group_put(struct iommu_group *group, struct device *dev);
 
+void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
+			 const struct vfio_device_ops *ops, void *device_data);
+int vfio_register_group_dev(struct vfio_device *device);
 extern int vfio_add_group_dev(struct device *dev,
 			      const struct vfio_device_ops *ops,
 			      void *device_data);
 
 extern void *vfio_del_group_dev(struct device *dev);
+void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
 extern void vfio_device_put(struct vfio_device *device);
 extern void *vfio_device_data(struct vfio_device *device);
-- 
2.30.2



* [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2021-03-13  0:55 ` [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops Jason Gunthorpe
@ 2021-03-13  0:55 ` Jason Gunthorpe
  2021-03-16 16:22   ` Cornelia Huck
                     ` (2 more replies)
  2021-03-13  0:55 ` [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe() Jason Gunthorpe
                   ` (9 subsequent siblings)
  13 siblings, 3 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:55 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, Eric Auger, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

platform already allocates a struct vfio_platform_device with exactly
the same lifetime as vfio_device. Switch to the new API and embed
vfio_device in vfio_platform_device.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/platform/vfio_amba.c             |  8 ++++---
 drivers/vfio/platform/vfio_platform.c         | 21 ++++++++---------
 drivers/vfio/platform/vfio_platform_common.c  | 23 +++++++------------
 drivers/vfio/platform/vfio_platform_private.h |  5 ++--
 4 files changed, 26 insertions(+), 31 deletions(-)

diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index 3626c21501017e..f970eb2a999f29 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -66,16 +66,18 @@ static int vfio_amba_probe(struct amba_device *adev, const struct amba_id *id)
 	if (ret) {
 		kfree(vdev->name);
 		kfree(vdev);
+		return ret;
 	}
 
-	return ret;
+	dev_set_drvdata(&adev->dev, vdev);
+	return 0;
 }
 
 static void vfio_amba_remove(struct amba_device *adev)
 {
-	struct vfio_platform_device *vdev =
-		vfio_platform_remove_common(&adev->dev);
+	struct vfio_platform_device *vdev = dev_get_drvdata(&adev->dev);
 
+	vfio_platform_remove_common(vdev);
 	kfree(vdev->name);
 	kfree(vdev);
 }
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 9fb6818cea12cb..f7b3f64ecc7f6c 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -54,23 +54,22 @@ static int vfio_platform_probe(struct platform_device *pdev)
 	vdev->reset_required = reset_required;
 
 	ret = vfio_platform_probe_common(vdev, &pdev->dev);
-	if (ret)
+	if (ret) {
 		kfree(vdev);
-
-	return ret;
+		return ret;
+	}
+	dev_set_drvdata(&pdev->dev, vdev);
+	return 0;
 }
 
 static int vfio_platform_remove(struct platform_device *pdev)
 {
-	struct vfio_platform_device *vdev;
-
-	vdev = vfio_platform_remove_common(&pdev->dev);
-	if (vdev) {
-		kfree(vdev);
-		return 0;
-	}
+	struct vfio_platform_device *vdev = dev_get_drvdata(&pdev->dev);
 
-	return -EINVAL;
+	vfio_platform_remove_common(vdev);
+	kfree(vdev->name);
+	kfree(vdev);
+	return 0;
 }
 
 static struct platform_driver vfio_platform_driver = {
diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index fb4b385191f288..6eb749250ee41c 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -659,8 +659,7 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 	struct iommu_group *group;
 	int ret;
 
-	if (!vdev)
-		return -EINVAL;
+	vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops, vdev);
 
 	ret = vfio_platform_acpi_probe(vdev, dev);
 	if (ret)
@@ -685,13 +684,13 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 		goto put_reset;
 	}
 
-	ret = vfio_add_group_dev(dev, &vfio_platform_ops, vdev);
+	ret = vfio_register_group_dev(&vdev->vdev);
 	if (ret)
 		goto put_iommu;
 
 	mutex_init(&vdev->igate);
 
-	pm_runtime_enable(vdev->device);
+	pm_runtime_enable(dev);
 	return 0;
 
 put_iommu:
@@ -702,19 +701,13 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 }
 EXPORT_SYMBOL_GPL(vfio_platform_probe_common);
 
-struct vfio_platform_device *vfio_platform_remove_common(struct device *dev)
+void vfio_platform_remove_common(struct vfio_platform_device *vdev)
 {
-	struct vfio_platform_device *vdev;
-
-	vdev = vfio_del_group_dev(dev);
+	vfio_unregister_group_dev(&vdev->vdev);
 
-	if (vdev) {
-		pm_runtime_disable(vdev->device);
-		vfio_platform_put_reset(vdev);
-		vfio_iommu_group_put(dev->iommu_group, dev);
-	}
-
-	return vdev;
+	pm_runtime_disable(vdev->device);
+	vfio_platform_put_reset(vdev);
+	vfio_iommu_group_put(vdev->vdev.dev->iommu_group, vdev->vdev.dev);
 }
 EXPORT_SYMBOL_GPL(vfio_platform_remove_common);
 
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 289089910643ac..a5ba82c8cbc354 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -9,6 +9,7 @@
 
 #include <linux/types.h>
 #include <linux/interrupt.h>
+#include <linux/vfio.h>
 
 #define VFIO_PLATFORM_OFFSET_SHIFT   40
 #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
@@ -42,6 +43,7 @@ struct vfio_platform_region {
 };
 
 struct vfio_platform_device {
+	struct vfio_device		vdev;
 	struct vfio_platform_region	*regions;
 	u32				num_regions;
 	struct vfio_platform_irq	*irqs;
@@ -80,8 +82,7 @@ struct vfio_platform_reset_node {
 
 extern int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 				      struct device *dev);
-extern struct vfio_platform_device *vfio_platform_remove_common
-				     (struct device *dev);
+void vfio_platform_remove_common(struct vfio_platform_device *vdev);
 
 extern int vfio_platform_irq_init(struct vfio_platform_device *vdev);
 extern void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev);
-- 
2.30.2



* [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2021-03-13  0:55 ` [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
@ 2021-03-13  0:55 ` Jason Gunthorpe
  2021-03-15  8:44   ` Christoph Hellwig
                     ` (3 more replies)
  2021-03-13  0:55 ` [PATCH v2 06/14] vfio/fsl-mc: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
                   ` (8 subsequent siblings)
  13 siblings, 4 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:55 UTC (permalink / raw)
  To: Cornelia Huck, kvm
  Cc: Alex Williamson, Raj, Ashok, Bharat Bhushan, Dan Williams,
	Daniel Vetter, Diana Craciun, Eric Auger, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

vfio_add_group_dev() must be called only after all of the private data in
vdev is fully set up and ready, otherwise there could be races with user
space instantiating a device file descriptor and starting to call ops.

For instance, vfio_fsl_mc_reflck_attach() sets vdev->reflck and
vfio_fsl_mc_open(), called by fops open, unconditionally dereferences it,
which will crash if open runs before reflck has been set.

This driver started life with the right sequence, but three later commits
added initialization steps after vfio_add_group_dev().

Fixes: 2e0d29561f59 ("vfio/fsl-mc: Add irq infrastructure for fsl-mc devices")
Fixes: f2ba7e8c947b ("vfio/fsl-mc: Added lock support in preparation for interrupt handling")
Fixes: 704f5082d845 ("vfio/fsl-mc: Scan DPRC objects on vfio-fsl-mc driver bind")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
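Note for readers: the ordering rule in the commit message above can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the driver code: demo_dev, demo_lock, demo_register() and
the other demo_* names are hypothetical stand-ins for vdev, reflck and the
registration call. The point is that the step which makes open() reachable
comes last, and the error path unwinds only what was already set up.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names are VFIO APIs. */
struct demo_lock {
	int users;
};

struct demo_dev {
	struct demo_lock *lock;	/* plays the role of vdev->reflck */
	int registered;		/* nonzero once open() may be reached */
};

/* Like an open() op: dereferences dev->lock without checking it. */
static int demo_open(struct demo_dev *dev)
{
	dev->lock->users++;
	return dev->lock->users;
}

static int demo_lock_attach(struct demo_dev *dev)
{
	dev->lock = calloc(1, sizeof(*dev->lock));
	return dev->lock ? 0 : -ENOMEM;
}

static void demo_lock_put(struct demo_dev *dev)
{
	free(dev->lock);
	dev->lock = NULL;
}

/* Publishing step: once this returns 0, demo_open() can be called. */
static int demo_register(struct demo_dev *dev, int simulate_failure)
{
	if (simulate_failure)
		return -EBUSY;
	dev->registered = 1;
	return 0;
}

/* Probe in the corrected order: finish private setup, then publish. */
static int demo_probe(struct demo_dev *dev, int simulate_failure)
{
	int ret;

	ret = demo_lock_attach(dev);
	if (ret)
		return ret;

	ret = demo_register(dev, simulate_failure);
	if (ret)
		goto out_lock;
	return 0;

out_lock:
	demo_lock_put(dev);
	return ret;
}
```

If demo_register() ran before demo_lock_attach(), a concurrent demo_open()
could dereference a NULL dev->lock, which is the kind of crash described
above.
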
 drivers/vfio/fsl-mc/vfio_fsl_mc.c | 43 ++++++++++++++++---------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index f27e25112c4037..881849723b4dfb 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -582,11 +582,21 @@ static int vfio_fsl_mc_init_device(struct vfio_fsl_mc_device *vdev)
 	dprc_cleanup(mc_dev);
 out_nc_unreg:
 	bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
-	vdev->nb.notifier_call = NULL;
-
 	return ret;
 }
 
+static void vfio_fsl_uninit_device(struct vfio_fsl_mc_device *vdev)
+{
+	struct fsl_mc_device *mc_dev = vdev->mc_dev;
+
+	if (!is_fsl_mc_bus_dprc(mc_dev))
+		return;
+
+	dprc_remove_devices(mc_dev, NULL, 0);
+	dprc_cleanup(mc_dev);
+	bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
+}
+
 static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
 {
 	struct iommu_group *group;
@@ -607,29 +617,27 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
 	}
 
 	vdev->mc_dev = mc_dev;
-
-	ret = vfio_add_group_dev(dev, &vfio_fsl_mc_ops, vdev);
-	if (ret) {
-		dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
-		goto out_group_put;
-	}
+	mutex_init(&vdev->igate);
 
 	ret = vfio_fsl_mc_reflck_attach(vdev);
 	if (ret)
-		goto out_group_dev;
+		goto out_group_put;
 
 	ret = vfio_fsl_mc_init_device(vdev);
 	if (ret)
 		goto out_reflck;
 
-	mutex_init(&vdev->igate);
-
+	ret = vfio_add_group_dev(dev, &vfio_fsl_mc_ops, vdev);
+	if (ret) {
+		dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
+		goto out_device;
+	}
 	return 0;
 
+out_device:
+	vfio_fsl_uninit_device(vdev);
 out_reflck:
 	vfio_fsl_mc_reflck_put(vdev->reflck);
-out_group_dev:
-	vfio_del_group_dev(dev);
 out_group_put:
 	vfio_iommu_group_put(group, dev);
 	return ret;
@@ -646,16 +654,9 @@ static int vfio_fsl_mc_remove(struct fsl_mc_device *mc_dev)
 
 	mutex_destroy(&vdev->igate);
 
+	vfio_fsl_uninit_device(vdev);
 	vfio_fsl_mc_reflck_put(vdev->reflck);
 
-	if (is_fsl_mc_bus_dprc(mc_dev)) {
-		dprc_remove_devices(mc_dev, NULL, 0);
-		dprc_cleanup(mc_dev);
-	}
-
-	if (vdev->nb.notifier_call)
-		bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
-
 	vfio_iommu_group_put(mc_dev->dev.iommu_group, dev);
 
 	return 0;
-- 
2.30.2



* [PATCH v2 06/14] vfio/fsl-mc: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2021-03-13  0:55 ` [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe() Jason Gunthorpe
@ 2021-03-13  0:55 ` Jason Gunthorpe
  2021-03-15  8:44   ` Christoph Hellwig
  2021-03-16 16:43   ` Cornelia Huck
  2021-03-13  0:55 ` [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions Jason Gunthorpe
                   ` (7 subsequent siblings)
  13 siblings, 2 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:55 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, Diana Craciun, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

fsl-mc already allocates a struct vfio_fsl_mc_device with exactly the same
lifetime as vfio_device, so switch to the new API and embed vfio_device in
vfio_fsl_mc_device. While here, remove the devm usage for the vdev; this
code is clean and doesn't need devm.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
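Note for readers: a minimal userspace sketch of the embedding idea, assuming
hypothetical types (base_dev stands in for vfio_device, drv_dev for the
driver structure) and a simplified container_of() without the kernel's type
checking. Because the base object is a member of the driver structure, one
allocation and one free cover both, so their lifetimes cannot diverge.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace container_of(); the kernel macro adds type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct vfio_device and a driver struct. */
struct base_dev {
	int id;
};

struct drv_dev {
	struct base_dev base;	/* embedded, so both share one allocation */
	int irq_count;
};

/* One allocation covers the base object and the driver's private data. */
static struct drv_dev *drv_alloc(int id, int irq_count)
{
	struct drv_dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->base.id = id;
	d->irq_count = irq_count;
	return d;
}

/* Recover the driver struct from a pointer to its embedded base. */
static struct drv_dev *to_drv_dev(struct base_dev *base)
{
	return container_of(base, struct drv_dev, base);
}

/* Freeing the driver struct frees the embedded base along with it. */
static void drv_free(struct drv_dev *d)
{
	free(d);
}
```

The explicit alloc/free pair mirrors the kzalloc()/kfree() pairing in the
diff below; the sketch does not model devm.
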
 drivers/vfio/fsl-mc/vfio_fsl_mc.c         | 18 ++++++++++--------
 drivers/vfio/fsl-mc/vfio_fsl_mc_private.h |  1 +
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 881849723b4dfb..87ea8368aa510a 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -610,34 +610,38 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
 		return -EINVAL;
 	}
 
-	vdev = devm_kzalloc(dev, sizeof(*vdev), GFP_KERNEL);
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
 	if (!vdev) {
 		ret = -ENOMEM;
 		goto out_group_put;
 	}
 
+	vfio_init_group_dev(&vdev->vdev, dev, &vfio_fsl_mc_ops, vdev);
 	vdev->mc_dev = mc_dev;
 	mutex_init(&vdev->igate);
 
 	ret = vfio_fsl_mc_reflck_attach(vdev);
 	if (ret)
-		goto out_group_put;
+		goto out_kfree;
 
 	ret = vfio_fsl_mc_init_device(vdev);
 	if (ret)
 		goto out_reflck;
 
-	ret = vfio_add_group_dev(dev, &vfio_fsl_mc_ops, vdev);
+	ret = vfio_register_group_dev(&vdev->vdev);
 	if (ret) {
 		dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
 		goto out_device;
 	}
+	dev_set_drvdata(dev, vdev);
 	return 0;
 
 out_device:
 	vfio_fsl_uninit_device(vdev);
 out_reflck:
 	vfio_fsl_mc_reflck_put(vdev->reflck);
+out_kfree:
+	kfree(vdev);
 out_group_put:
 	vfio_iommu_group_put(group, dev);
 	return ret;
@@ -645,18 +649,16 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
 
 static int vfio_fsl_mc_remove(struct fsl_mc_device *mc_dev)
 {
-	struct vfio_fsl_mc_device *vdev;
 	struct device *dev = &mc_dev->dev;
+	struct vfio_fsl_mc_device *vdev = dev_get_drvdata(dev);
 
-	vdev = vfio_del_group_dev(dev);
-	if (!vdev)
-		return -EINVAL;
-
+	vfio_unregister_group_dev(&vdev->vdev);
 	mutex_destroy(&vdev->igate);
 
 	vfio_fsl_uninit_device(vdev);
 	vfio_fsl_mc_reflck_put(vdev->reflck);
 
+	kfree(vdev);
 	vfio_iommu_group_put(mc_dev->dev.iommu_group, dev);
 
 	return 0;
diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc_private.h b/drivers/vfio/fsl-mc/vfio_fsl_mc_private.h
index a97ee691ed47ec..89700e00e77d10 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc_private.h
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc_private.h
@@ -36,6 +36,7 @@ struct vfio_fsl_mc_region {
 };
 
 struct vfio_fsl_mc_device {
+	struct vfio_device		vdev;
 	struct fsl_mc_device		*mc_dev;
 	struct notifier_block        nb;
 	int				refcnt;
-- 
2.30.2



* [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2021-03-13  0:55 ` [PATCH v2 06/14] vfio/fsl-mc: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
@ 2021-03-13  0:55 ` Jason Gunthorpe
  2021-03-15  8:45   ` Christoph Hellwig
                     ` (4 more replies)
  2021-03-13  0:56 ` [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe() Jason Gunthorpe
                   ` (6 subsequent siblings)
  13 siblings, 5 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:55 UTC (permalink / raw)
  To: Cornelia Huck, kvm
  Cc: Alex Williamson, Raj, Ashok, Dan Williams, Daniel Vetter,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

vfio_pci_probe() is quite complicated, with optional VF and VGA sub
components. Move these into clear init/uninit functions and have a linear
flow in probe/remove.

This fixes a few little buglets:
 - vfio_pci_remove() is in the wrong order: vga_client_register() removes
   a notifier and is called after kfree(vdev), but the notifier refers to
   vdev, so a race can cause a use after free.
 - vga_client_register() can fail, but the failure was ignored.

Organize things so destruction order is the reverse of creation order.

Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
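Note for readers: the "destruction order is the reverse of creation order"
rule from the commit message above, sketched in userspace C. Everything
here is hypothetical (demo_dev, the vf/vga helpers, the log buffer); it only
records the order of steps so the pattern is visible. Each optional
component gets an init/uninit pair that is a no-op when the component is
absent, which keeps probe and remove linear.

```c
#include <stddef.h>

/* Hypothetical device with two optional sub-components. */
struct demo_dev {
	int has_vf, has_vga;	/* which optional parts exist */
	int fail_vga;		/* force vga init to fail, for illustration */
	char log[8];		/* records the order of init/uninit steps */
	size_t n;
};

static void note(struct demo_dev *d, char c)
{
	d->log[d->n++] = c;
}

static int vf_init(struct demo_dev *d)
{
	if (!d->has_vf)
		return 0;
	note(d, 'F');
	return 0;
}

static void vf_uninit(struct demo_dev *d)
{
	if (!d->has_vf)
		return;
	note(d, 'f');
}

static int vga_init(struct demo_dev *d)
{
	if (!d->has_vga)
		return 0;
	if (d->fail_vga)
		return -1;
	note(d, 'G');
	return 0;
}

static void vga_uninit(struct demo_dev *d)
{
	if (!d->has_vga)
		return;
	note(d, 'g');
}

/* Linear probe: each failure unwinds only the steps already done. */
static int demo_probe(struct demo_dev *d)
{
	int ret;

	ret = vf_init(d);
	if (ret)
		return ret;
	ret = vga_init(d);
	if (ret)
		goto out_vf;
	return 0;

out_vf:
	vf_uninit(d);
	return ret;
}

/* Remove tears down in the reverse of the probe order. */
static void demo_remove(struct demo_dev *d)
{
	vga_uninit(d);
	vf_uninit(d);
}
```
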
 drivers/vfio/pci/vfio_pci.c | 116 +++++++++++++++++++++++-------------
 1 file changed, 74 insertions(+), 42 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 65e7e6b44578c2..f95b58376156a0 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1922,6 +1922,68 @@ static int vfio_pci_bus_notifier(struct notifier_block *nb,
 	return 0;
 }
 
+static int vfio_pci_vf_init(struct vfio_pci_device *vdev)
+{
+	struct pci_dev *pdev = vdev->pdev;
+	int ret;
+
+	if (!pdev->is_physfn)
+		return 0;
+
+	vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
+	if (!vdev->vf_token)
+		return -ENOMEM;
+
+	mutex_init(&vdev->vf_token->lock);
+	uuid_gen(&vdev->vf_token->uuid);
+
+	vdev->nb.notifier_call = vfio_pci_bus_notifier;
+	ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
+	if (ret) {
+		kfree(vdev->vf_token);
+		return ret;
+	}
+	return 0;
+}
+
+static void vfio_pci_vf_uninit(struct vfio_pci_device *vdev)
+{
+	if (!vdev->vf_token)
+		return;
+
+	bus_unregister_notifier(&pci_bus_type, &vdev->nb);
+	WARN_ON(vdev->vf_token->users);
+	mutex_destroy(&vdev->vf_token->lock);
+	kfree(vdev->vf_token);
+}
+
+static int vfio_pci_vga_init(struct vfio_pci_device *vdev)
+{
+	struct pci_dev *pdev = vdev->pdev;
+	int ret;
+
+	if (!vfio_pci_is_vga(pdev))
+		return 0;
+
+	ret = vga_client_register(pdev, vdev, NULL, vfio_pci_set_vga_decode);
+	if (ret)
+		return ret;
+	vga_set_legacy_decoding(pdev, vfio_pci_set_vga_decode(vdev, false));
+	return 0;
+}
+
+static void vfio_pci_vga_uninit(struct vfio_pci_device *vdev)
+{
+	struct pci_dev *pdev = vdev->pdev;
+
+	if (!vfio_pci_is_vga(pdev))
+		return;
+	vga_client_register(pdev, NULL, NULL, NULL);
+	vga_set_legacy_decoding(pdev, VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM |
+					      VGA_RSRC_LEGACY_IO |
+					      VGA_RSRC_LEGACY_MEM);
+}
+
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct vfio_pci_device *vdev;
@@ -1975,28 +2037,12 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	ret = vfio_pci_reflck_attach(vdev);
 	if (ret)
 		goto out_del_group_dev;
-
-	if (pdev->is_physfn) {
-		vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
-		if (!vdev->vf_token) {
-			ret = -ENOMEM;
-			goto out_reflck;
-		}
-
-		mutex_init(&vdev->vf_token->lock);
-		uuid_gen(&vdev->vf_token->uuid);
-
-		vdev->nb.notifier_call = vfio_pci_bus_notifier;
-		ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
-		if (ret)
-			goto out_vf_token;
-	}
-
-	if (vfio_pci_is_vga(pdev)) {
-		vga_client_register(pdev, vdev, NULL, vfio_pci_set_vga_decode);
-		vga_set_legacy_decoding(pdev,
-					vfio_pci_set_vga_decode(vdev, false));
-	}
+	ret = vfio_pci_vf_init(vdev);
+	if (ret)
+		goto out_reflck;
+	ret = vfio_pci_vga_init(vdev);
+	if (ret)
+		goto out_vf;
 
 	vfio_pci_probe_power_state(vdev);
 
@@ -2016,8 +2062,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	return ret;
 
-out_vf_token:
-	kfree(vdev->vf_token);
+out_vf:
+	vfio_pci_vf_uninit(vdev);
 out_reflck:
 	vfio_pci_reflck_put(vdev->reflck);
 out_del_group_dev:
@@ -2039,33 +2085,19 @@ static void vfio_pci_remove(struct pci_dev *pdev)
 	if (!vdev)
 		return;
 
-	if (vdev->vf_token) {
-		WARN_ON(vdev->vf_token->users);
-		mutex_destroy(&vdev->vf_token->lock);
-		kfree(vdev->vf_token);
-	}
-
-	if (vdev->nb.notifier_call)
-		bus_unregister_notifier(&pci_bus_type, &vdev->nb);
-
+	vfio_pci_vf_uninit(vdev);
 	vfio_pci_reflck_put(vdev->reflck);
+	vfio_pci_vga_uninit(vdev);
 
 	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);
-	kfree(vdev->region);
-	mutex_destroy(&vdev->ioeventfds_lock);
 
 	if (!disable_idle_d3)
 		vfio_pci_set_power_state(vdev, PCI_D0);
 
+	mutex_destroy(&vdev->ioeventfds_lock);
+	kfree(vdev->region);
 	kfree(vdev->pm_save);
 	kfree(vdev);
-
-	if (vfio_pci_is_vga(pdev)) {
-		vga_client_register(pdev, NULL, NULL, NULL);
-		vga_set_legacy_decoding(pdev,
-				VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM |
-				VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM);
-	}
 }
 
 static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
-- 
2.30.2



* [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe()
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2021-03-13  0:55 ` [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions Jason Gunthorpe
@ 2021-03-13  0:56 ` Jason Gunthorpe
  2021-03-15  8:46   ` Christoph Hellwig
                     ` (4 more replies)
  2021-03-13  0:56 ` [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
                   ` (5 subsequent siblings)
  13 siblings, 5 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:56 UTC (permalink / raw)
  To: kvm
  Cc: Alex Williamson, Raj, Ashok, Christian Ehrhardt, Cornelia Huck,
	Dan Williams, Daniel Vetter, Eric Auger, Christoph Hellwig,
	Kevin Tian, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

vfio_add_group_dev() must be called only after all of the private data in
vdev is fully set up and ready, otherwise there could be races with user
space instantiating a device file descriptor and starting to call ops.

For instance, vfio_pci_reflck_attach() sets vdev->reflck and
vfio_pci_open(), called by fops open, unconditionally dereferences it,
which will crash if open runs before reflck has been set.

Fixes: cc20d7999000 ("vfio/pci: Introduce VF token")
Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release")
Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state")
Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/pci/vfio_pci.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index f95b58376156a0..0e7682e7a0b478 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -2030,13 +2030,9 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	INIT_LIST_HEAD(&vdev->vma_list);
 	init_rwsem(&vdev->memory_lock);
 
-	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
-	if (ret)
-		goto out_free;
-
 	ret = vfio_pci_reflck_attach(vdev);
 	if (ret)
-		goto out_del_group_dev;
+		goto out_free;
 	ret = vfio_pci_vf_init(vdev);
 	if (ret)
 		goto out_reflck;
@@ -2060,15 +2056,20 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		vfio_pci_set_power_state(vdev, PCI_D3hot);
 	}
 
-	return ret;
+	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
+	if (ret)
+		goto out_power;
+	return 0;
 
+out_power:
+	if (!disable_idle_d3)
+		vfio_pci_set_power_state(vdev, PCI_D0);
 out_vf:
 	vfio_pci_vf_uninit(vdev);
 out_reflck:
 	vfio_pci_reflck_put(vdev->reflck);
-out_del_group_dev:
-	vfio_del_group_dev(&pdev->dev);
 out_free:
+	kfree(vdev->pm_save);
 	kfree(vdev);
 out_group_put:
 	vfio_iommu_group_put(group, &pdev->dev);
-- 
2.30.2



* [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2021-03-13  0:56 ` [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe() Jason Gunthorpe
@ 2021-03-13  0:56 ` Jason Gunthorpe
  2021-03-16  8:06   ` Tian, Kevin
                     ` (2 more replies)
  2021-03-13  0:56 ` [PATCH v2 10/14] vfio/mdev: " Jason Gunthorpe
                   ` (4 subsequent siblings)
  13 siblings, 3 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:56 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu Yi L

pci already allocates a struct vfio_pci_device with exactly the same
lifetime as vfio_device, so switch to the new API and embed vfio_device in
vfio_pci_device.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/pci/vfio_pci.c         | 10 +++++-----
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 0e7682e7a0b478..a0ac20a499cf6c 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -2019,6 +2019,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		goto out_group_put;
 	}
 
+	vfio_init_group_dev(&vdev->vdev, &pdev->dev, &vfio_pci_ops, vdev);
 	vdev->pdev = pdev;
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	mutex_init(&vdev->igate);
@@ -2056,9 +2057,10 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		vfio_pci_set_power_state(vdev, PCI_D3hot);
 	}
 
-	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
+	ret = vfio_register_group_dev(&vdev->vdev);
 	if (ret)
 		goto out_power;
+	dev_set_drvdata(&pdev->dev, vdev);
 	return 0;
 
 out_power:
@@ -2078,13 +2080,11 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 static void vfio_pci_remove(struct pci_dev *pdev)
 {
-	struct vfio_pci_device *vdev;
+	struct vfio_pci_device *vdev = dev_get_drvdata(&pdev->dev);
 
 	pci_disable_sriov(pdev);
 
-	vdev = vfio_del_group_dev(&pdev->dev);
-	if (!vdev)
-		return;
+	vfio_unregister_group_dev(&vdev->vdev);
 
 	vfio_pci_vf_uninit(vdev);
 	vfio_pci_reflck_put(vdev->reflck);
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 9cd1882a05af69..8755a0febd054a 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -100,6 +100,7 @@ struct vfio_pci_mmap_vma {
 };
 
 struct vfio_pci_device {
+	struct vfio_device	vdev;
 	struct pci_dev		*pdev;
 	void __iomem		*barmap[PCI_STD_NUM_BARS];
 	bool			bar_mmap_supported[PCI_STD_NUM_BARS];
-- 
2.30.2



* [PATCH v2 10/14] vfio/mdev: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (8 preceding siblings ...)
  2021-03-13  0:56 ` [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
@ 2021-03-13  0:56 ` Jason Gunthorpe
  2021-03-16  8:09   ` Tian, Kevin
  2021-03-17 10:36   ` Cornelia Huck
  2021-03-13  0:56 ` [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline Jason Gunthorpe
                   ` (3 subsequent siblings)
  13 siblings, 2 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:56 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu Yi L

mdev gets little benefit because it doesn't actually do anything with the
embedded struct; however, it is the last user of vfio_add_group_dev() and
vfio_del_group_dev(), so move their allocation code here for now.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/mdev/vfio_mdev.c | 24 +++++++++++++++++++--
 drivers/vfio/vfio.c           | 39 ++---------------------------------
 include/linux/vfio.h          |  5 -----
 3 files changed, 24 insertions(+), 44 deletions(-)

diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index b52eea128549ee..4469aaf31b56cb 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -21,6 +21,10 @@
 #define DRIVER_AUTHOR   "NVIDIA Corporation"
 #define DRIVER_DESC     "VFIO based driver for Mediated device"
 
+struct mdev_vfio_device {
+	struct vfio_device vdev;
+};
+
 static int vfio_mdev_open(void *device_data)
 {
 	struct mdev_device *mdev = device_data;
@@ -124,13 +128,29 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
 static int vfio_mdev_probe(struct device *dev)
 {
 	struct mdev_device *mdev = to_mdev_device(dev);
+	struct mdev_vfio_device *mvdev;
+	int ret;
 
-	return vfio_add_group_dev(dev, &vfio_mdev_dev_ops, mdev);
+	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
+	if (!mvdev)
+		return -ENOMEM;
+
+	vfio_init_group_dev(&mvdev->vdev, &mdev->dev, &vfio_mdev_dev_ops, mdev);
+	ret = vfio_register_group_dev(&mvdev->vdev);
+	if (ret) {
+		kfree(mvdev);
+		return ret;
+	}
+	dev_set_drvdata(&mdev->dev, mvdev);
+	return 0;
 }
 
 static void vfio_mdev_remove(struct device *dev)
 {
-	vfio_del_group_dev(dev);
+	struct mdev_vfio_device *mvdev = dev_get_drvdata(dev);
+
+	vfio_unregister_group_dev(&mvdev->vdev);
+	kfree(mvdev);
 }
 
 static struct mdev_driver vfio_mdev_driver = {
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index cfa06ae3b9018b..2d6d7cc1d1ebf9 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -99,8 +99,8 @@ MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU mode.  Thi
 /*
  * vfio_iommu_group_{get,put} are only intended for VFIO bus driver probe
  * and remove functions, any use cases other than acquiring the first
- * reference for the purpose of calling vfio_add_group_dev() or removing
- * that symmetric reference after vfio_del_group_dev() should use the raw
+ * reference for the purpose of calling vfio_register_group_dev() or removing
+ * that symmetric reference after vfio_unregister_group_dev() should use the raw
  * iommu_group_{get,put} functions.  In particular, vfio_iommu_group_put()
  * removes the device from the dummy group and cannot be nested.
  */
@@ -799,29 +799,6 @@ int vfio_register_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_register_group_dev);
 
-int vfio_add_group_dev(struct device *dev, const struct vfio_device_ops *ops,
-		       void *device_data)
-{
-	struct vfio_device *device;
-	int ret;
-
-	device = kzalloc(sizeof(*device), GFP_KERNEL);
-	if (!device)
-		return -ENOMEM;
-
-	vfio_init_group_dev(device, dev, ops, device_data);
-	ret = vfio_register_group_dev(device);
-	if (ret)
-		goto err_kfree;
-	dev_set_drvdata(dev, device);
-	return 0;
-
-err_kfree:
-	kfree(device);
-	return ret;
-}
-EXPORT_SYMBOL_GPL(vfio_add_group_dev);
-
 /**
  * Get a reference to the vfio_device for a device.  Even if the
  * caller thinks they own the device, they could be racing with a
@@ -962,18 +939,6 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
 
-void *vfio_del_group_dev(struct device *dev)
-{
-	struct vfio_device *device = dev_get_drvdata(dev);
-	void *device_data = device->device_data;
-
-	vfio_unregister_group_dev(device);
-	dev_set_drvdata(dev, NULL);
-	kfree(device);
-	return device_data;
-}
-EXPORT_SYMBOL_GPL(vfio_del_group_dev);
-
 /**
  * VFIO base fd, /dev/vfio/vfio
  */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ad8b579d67d34a..4995faf51efeae 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -63,11 +63,6 @@ extern void vfio_iommu_group_put(struct iommu_group *group, struct device *dev);
 void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 			 const struct vfio_device_ops *ops, void *device_data);
 int vfio_register_group_dev(struct vfio_device *device);
-extern int vfio_add_group_dev(struct device *dev,
-			      const struct vfio_device_ops *ops,
-			      void *device_data);
-
-extern void *vfio_del_group_dev(struct device *dev);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
 extern void vfio_device_put(struct vfio_device *device);
-- 
2.30.2



* [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (9 preceding siblings ...)
  2021-03-13  0:56 ` [PATCH v2 10/14] vfio/mdev: " Jason Gunthorpe
@ 2021-03-13  0:56 ` Jason Gunthorpe
  2021-03-16  8:10   ` Tian, Kevin
                     ` (2 more replies)
  2021-03-13  0:56 ` [PATCH v2 12/14] vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *' Jason Gunthorpe
                   ` (2 subsequent siblings)
  13 siblings, 3 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:56 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

The macro wrongly uses 'dev' as both the macro argument and the member
name, which means it fails compilation if any caller uses a word other
than 'dev' as the single argument. Fix this defect by making it into a
proper static inline, which is clearer and typesafe anyhow.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
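Note for readers: the macro defect described above, reproduced in a small
userspace sketch. The types demo_device and mdev_like are hypothetical
stand-ins, and container_of() is a simplified version of the kernel macro.
The broken macro is shown only in a comment since it would not compile when
used with a differently named argument.

```c
#include <stddef.h>

/* Simplified userspace container_of(); the kernel macro adds type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct device and struct mdev_device. */
struct demo_device {
	int id;
};

struct mdev_like {
	int active;
	struct demo_device dev;	/* the member is named 'dev' */
};

/*
 * Macro form, for reference:
 *   #define to_mdev_like(dev) container_of(dev, struct mdev_like, dev)
 * The preprocessor substitutes the argument for BOTH uses of 'dev', so
 * to_mdev_like(d) expands to container_of(d, struct mdev_like, d) and
 * fails to compile because struct mdev_like has no member named 'd'.
 */

/* Function form: the parameter name is local and cannot leak. */
static inline struct mdev_like *to_mdev_like(struct demo_device *dev)
{
	return container_of(dev, struct mdev_like, dev);
}

/* The caller's variable name no longer matters. */
static int is_active(struct demo_device *d)
{
	return to_mdev_like(d)->active;
}
```

The function form also makes the compiler check that the argument really is
a struct demo_device pointer, which the macro form did not.
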
 drivers/vfio/mdev/mdev_private.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index 7d922950caaf3c..74c2e541146999 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -35,7 +35,10 @@ struct mdev_device {
 	bool active;
 };
 
-#define to_mdev_device(dev)	container_of(dev, struct mdev_device, dev)
+static inline struct mdev_device *to_mdev_device(struct device *dev)
+{
+	return container_of(dev, struct mdev_device, dev);
+}
 #define dev_is_mdev(d)		((d)->bus == &mdev_bus_type)
 
 struct mdev_type {
-- 
2.30.2



* [PATCH v2 12/14] vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *'
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (10 preceding siblings ...)
  2021-03-13  0:56 ` [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline Jason Gunthorpe
@ 2021-03-13  0:56 ` Jason Gunthorpe
  2021-03-15  8:58   ` Christoph Hellwig
  2021-03-17 11:33   ` Cornelia Huck
  2021-03-13  0:56 ` [PATCH v2 13/14] vfio/pci: Replace uses of vfio_device_data() with container_of Jason Gunthorpe
  2021-03-13  0:56 ` [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API Jason Gunthorpe
  13 siblings, 2 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:56 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, Jonathan Corbet, Diana Craciun,
	Eric Auger, kvm, Kirti Wankhede, linux-doc
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

This is the standard kernel pattern: the ops associated with a struct get
the struct pointer passed in for type safety. The expected design is to
use container_of to cleanly go from the subsystem level type to the driver
level type without having any type erasure in a void *.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
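Note for readers: a userspace sketch of the ops pattern described above.
The names core_dev, core_ops, my_dev and core_open() are hypothetical and
only mirror the shape of vfio_device and vfio_device_ops; container_of() is
a simplified version of the kernel macro. The subsystem passes its own typed
object to the callback, and the driver recovers its private structure
without any void * in between.

```c
#include <stddef.h>

/* Simplified userspace container_of(); the kernel macro adds type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct core_dev;

/* Subsystem-level ops: every callback takes the subsystem-level type. */
struct core_ops {
	int (*open)(struct core_dev *cdev);
};

struct core_dev {
	const struct core_ops *ops;
};

/* Driver-level type embedding the subsystem-level one. */
struct my_dev {
	struct core_dev core;
	int open_count;
};

/* Driver callback: typed argument in, driver struct via container_of(). */
static int my_open(struct core_dev *cdev)
{
	struct my_dev *my = container_of(cdev, struct my_dev, core);

	return ++my->open_count;
}

static const struct core_ops my_ops = {
	.open = my_open,
};

/* What the subsystem does: call through ops, passing the core object. */
static int core_open(struct core_dev *cdev)
{
	return cdev->ops->open(cdev);
}
```

With a void * argument, passing the wrong object to my_open() would compile
silently; with the typed pointer the compiler rejects it.
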
 Documentation/driver-api/vfio.rst            | 18 ++++----
 drivers/vfio/fsl-mc/vfio_fsl_mc.c            | 36 +++++++++------
 drivers/vfio/mdev/vfio_mdev.c                | 33 +++++++-------
 drivers/vfio/pci/vfio_pci.c                  | 47 ++++++++++++--------
 drivers/vfio/platform/vfio_platform_common.c | 33 ++++++++------
 drivers/vfio/vfio.c                          | 20 ++++-----
 include/linux/vfio.h                         | 16 +++----
 7 files changed, 117 insertions(+), 86 deletions(-)

diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
index d3a02300913a7f..3337f337293a32 100644
--- a/Documentation/driver-api/vfio.rst
+++ b/Documentation/driver-api/vfio.rst
@@ -269,20 +269,22 @@ ready before calling it. The driver provides an ops structure for callbacks
 similar to a file operations structure::
 
 	struct vfio_device_ops {
-		int	(*open)(void *device_data);
-		void	(*release)(void *device_data);
-		ssize_t	(*read)(void *device_data, char __user *buf,
+		int	(*open)(struct vfio_device *vdev);
+		void	(*release)(struct vfio_device *vdev);
+		ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
 				size_t count, loff_t *ppos);
-		ssize_t	(*write)(void *device_data, const char __user *buf,
+		ssize_t	(*write)(struct vfio_device *vdev,
+				 const char __user *buf,
 				 size_t size, loff_t *ppos);
-		long	(*ioctl)(void *device_data, unsigned int cmd,
+		long	(*ioctl)(struct vfio_device *vdev, unsigned int cmd,
 				 unsigned long arg);
-		int	(*mmap)(void *device_data, struct vm_area_struct *vma);
+		int	(*mmap)(struct vfio_device *vdev,
+				struct vm_area_struct *vma);
 	};
 
-Each function is passed the device_data that was originally registered
+Each function is passed the vdev that was originally registered
 in the vfio_register_group_dev() call above.  This allows the bus driver
-an easy place to store its opaque, private data.  The open/release
+to obtain its private data using container_of().  The open/release
 callbacks are issued when a new file descriptor is created for a
 device (via VFIO_GROUP_GET_DEVICE_FD).  The ioctl interface provides
 a direct pass through for VFIO_DEVICE_* ioctls.  The read/write/mmap
diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 87ea8368aa510a..023b2222806424 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -135,9 +135,10 @@ static void vfio_fsl_mc_regions_cleanup(struct vfio_fsl_mc_device *vdev)
 	kfree(vdev->regions);
 }
 
-static int vfio_fsl_mc_open(void *device_data)
+static int vfio_fsl_mc_open(struct vfio_device *core_vdev)
 {
-	struct vfio_fsl_mc_device *vdev = device_data;
+	struct vfio_fsl_mc_device *vdev =
+		container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
 	int ret;
 
 	if (!try_module_get(THIS_MODULE))
@@ -161,9 +162,10 @@ static int vfio_fsl_mc_open(void *device_data)
 	return ret;
 }
 
-static void vfio_fsl_mc_release(void *device_data)
+static void vfio_fsl_mc_release(struct vfio_device *core_vdev)
 {
-	struct vfio_fsl_mc_device *vdev = device_data;
+	struct vfio_fsl_mc_device *vdev =
+		container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
 	int ret;
 
 	mutex_lock(&vdev->reflck->lock);
@@ -197,11 +199,12 @@ static void vfio_fsl_mc_release(void *device_data)
 	module_put(THIS_MODULE);
 }
 
-static long vfio_fsl_mc_ioctl(void *device_data, unsigned int cmd,
-			      unsigned long arg)
+static long vfio_fsl_mc_ioctl(struct vfio_device *core_vdev,
+			      unsigned int cmd, unsigned long arg)
 {
 	unsigned long minsz;
-	struct vfio_fsl_mc_device *vdev = device_data;
+	struct vfio_fsl_mc_device *vdev =
+		container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
 	struct fsl_mc_device *mc_dev = vdev->mc_dev;
 
 	switch (cmd) {
@@ -327,10 +330,11 @@ static long vfio_fsl_mc_ioctl(void *device_data, unsigned int cmd,
 	}
 }
 
-static ssize_t vfio_fsl_mc_read(void *device_data, char __user *buf,
+static ssize_t vfio_fsl_mc_read(struct vfio_device *core_vdev, char __user *buf,
 				size_t count, loff_t *ppos)
 {
-	struct vfio_fsl_mc_device *vdev = device_data;
+	struct vfio_fsl_mc_device *vdev =
+		container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
 	unsigned int index = VFIO_FSL_MC_OFFSET_TO_INDEX(*ppos);
 	loff_t off = *ppos & VFIO_FSL_MC_OFFSET_MASK;
 	struct fsl_mc_device *mc_dev = vdev->mc_dev;
@@ -404,10 +408,12 @@ static int vfio_fsl_mc_send_command(void __iomem *ioaddr, uint64_t *cmd_data)
 	return 0;
 }
 
-static ssize_t vfio_fsl_mc_write(void *device_data, const char __user *buf,
-				 size_t count, loff_t *ppos)
+static ssize_t vfio_fsl_mc_write(struct vfio_device *core_vdev,
+				 const char __user *buf, size_t count,
+				 loff_t *ppos)
 {
-	struct vfio_fsl_mc_device *vdev = device_data;
+	struct vfio_fsl_mc_device *vdev =
+		container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
 	unsigned int index = VFIO_FSL_MC_OFFSET_TO_INDEX(*ppos);
 	loff_t off = *ppos & VFIO_FSL_MC_OFFSET_MASK;
 	struct fsl_mc_device *mc_dev = vdev->mc_dev;
@@ -468,9 +474,11 @@ static int vfio_fsl_mc_mmap_mmio(struct vfio_fsl_mc_region region,
 			       size, vma->vm_page_prot);
 }
 
-static int vfio_fsl_mc_mmap(void *device_data, struct vm_area_struct *vma)
+static int vfio_fsl_mc_mmap(struct vfio_device *core_vdev,
+			    struct vm_area_struct *vma)
 {
-	struct vfio_fsl_mc_device *vdev = device_data;
+	struct vfio_fsl_mc_device *vdev =
+		container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
 	struct fsl_mc_device *mc_dev = vdev->mc_dev;
 	unsigned int index;
 
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 4469aaf31b56cb..e7309caa99c71b 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -25,10 +25,11 @@ struct mdev_vfio_device {
 	struct vfio_device vdev;
 };
 
-static int vfio_mdev_open(void *device_data)
+static int vfio_mdev_open(struct vfio_device *core_vdev)
 {
-	struct mdev_device *mdev = device_data;
+	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
 	struct mdev_parent *parent = mdev->parent;
+
 	int ret;
 
 	if (unlikely(!parent->ops->open))
@@ -44,9 +45,9 @@ static int vfio_mdev_open(void *device_data)
 	return ret;
 }
 
-static void vfio_mdev_release(void *device_data)
+static void vfio_mdev_release(struct vfio_device *core_vdev)
 {
-	struct mdev_device *mdev = device_data;
+	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
 	struct mdev_parent *parent = mdev->parent;
 
 	if (likely(parent->ops->release))
@@ -55,10 +56,10 @@ static void vfio_mdev_release(void *device_data)
 	module_put(THIS_MODULE);
 }
 
-static long vfio_mdev_unlocked_ioctl(void *device_data,
+static long vfio_mdev_unlocked_ioctl(struct vfio_device *core_vdev,
 				     unsigned int cmd, unsigned long arg)
 {
-	struct mdev_device *mdev = device_data;
+	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
 	struct mdev_parent *parent = mdev->parent;
 
 	if (unlikely(!parent->ops->ioctl))
@@ -67,10 +68,10 @@ static long vfio_mdev_unlocked_ioctl(void *device_data,
 	return parent->ops->ioctl(mdev, cmd, arg);
 }
 
-static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
+static ssize_t vfio_mdev_read(struct vfio_device *core_vdev, char __user *buf,
 			      size_t count, loff_t *ppos)
 {
-	struct mdev_device *mdev = device_data;
+	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
 	struct mdev_parent *parent = mdev->parent;
 
 	if (unlikely(!parent->ops->read))
@@ -79,10 +80,11 @@ static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
 	return parent->ops->read(mdev, buf, count, ppos);
 }
 
-static ssize_t vfio_mdev_write(void *device_data, const char __user *buf,
-			       size_t count, loff_t *ppos)
+static ssize_t vfio_mdev_write(struct vfio_device *core_vdev,
+			       const char __user *buf, size_t count,
+			       loff_t *ppos)
 {
-	struct mdev_device *mdev = device_data;
+	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
 	struct mdev_parent *parent = mdev->parent;
 
 	if (unlikely(!parent->ops->write))
@@ -91,9 +93,10 @@ static ssize_t vfio_mdev_write(void *device_data, const char __user *buf,
 	return parent->ops->write(mdev, buf, count, ppos);
 }
 
-static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
+static int vfio_mdev_mmap(struct vfio_device *core_vdev,
+			  struct vm_area_struct *vma)
 {
-	struct mdev_device *mdev = device_data;
+	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
 	struct mdev_parent *parent = mdev->parent;
 
 	if (unlikely(!parent->ops->mmap))
@@ -102,9 +105,9 @@ static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
 	return parent->ops->mmap(mdev, vma);
 }
 
-static void vfio_mdev_request(void *device_data, unsigned int count)
+static void vfio_mdev_request(struct vfio_device *core_vdev, unsigned int count)
 {
-	struct mdev_device *mdev = device_data;
+	struct mdev_device *mdev = to_mdev_device(core_vdev->dev);
 	struct mdev_parent *parent = mdev->parent;
 
 	if (parent->ops->request)
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index a0ac20a499cf6c..5f1a782d1c65ae 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -553,9 +553,10 @@ static void vfio_pci_vf_token_user_add(struct vfio_pci_device *vdev, int val)
 	vfio_device_put(pf_dev);
 }
 
-static void vfio_pci_release(void *device_data)
+static void vfio_pci_release(struct vfio_device *core_vdev)
 {
-	struct vfio_pci_device *vdev = device_data;
+	struct vfio_pci_device *vdev =
+		container_of(core_vdev, struct vfio_pci_device, vdev);
 
 	mutex_lock(&vdev->reflck->lock);
 
@@ -581,9 +582,10 @@ static void vfio_pci_release(void *device_data)
 	module_put(THIS_MODULE);
 }
 
-static int vfio_pci_open(void *device_data)
+static int vfio_pci_open(struct vfio_device *core_vdev)
 {
-	struct vfio_pci_device *vdev = device_data;
+	struct vfio_pci_device *vdev =
+		container_of(core_vdev, struct vfio_pci_device, vdev);
 	int ret = 0;
 
 	if (!try_module_get(THIS_MODULE))
@@ -797,10 +799,11 @@ struct vfio_devices {
 	int max_index;
 };
 
-static long vfio_pci_ioctl(void *device_data,
+static long vfio_pci_ioctl(struct vfio_device *core_vdev,
 			   unsigned int cmd, unsigned long arg)
 {
-	struct vfio_pci_device *vdev = device_data;
+	struct vfio_pci_device *vdev =
+		container_of(core_vdev, struct vfio_pci_device, vdev);
 	unsigned long minsz;
 
 	if (cmd == VFIO_DEVICE_GET_INFO) {
@@ -1402,11 +1405,10 @@ static long vfio_pci_ioctl(void *device_data,
 	return -ENOTTY;
 }
 
-static ssize_t vfio_pci_rw(void *device_data, char __user *buf,
+static ssize_t vfio_pci_rw(struct vfio_pci_device *vdev, char __user *buf,
 			   size_t count, loff_t *ppos, bool iswrite)
 {
 	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
-	struct vfio_pci_device *vdev = device_data;
 
 	if (index >= VFIO_PCI_NUM_REGIONS + vdev->num_regions)
 		return -EINVAL;
@@ -1434,22 +1436,28 @@ static ssize_t vfio_pci_rw(void *device_data, char __user *buf,
 	return -EINVAL;
 }
 
-static ssize_t vfio_pci_read(void *device_data, char __user *buf,
+static ssize_t vfio_pci_read(struct vfio_device *core_vdev, char __user *buf,
 			     size_t count, loff_t *ppos)
 {
+	struct vfio_pci_device *vdev =
+		container_of(core_vdev, struct vfio_pci_device, vdev);
+
 	if (!count)
 		return 0;
 
-	return vfio_pci_rw(device_data, buf, count, ppos, false);
+	return vfio_pci_rw(vdev, buf, count, ppos, false);
 }
 
-static ssize_t vfio_pci_write(void *device_data, const char __user *buf,
+static ssize_t vfio_pci_write(struct vfio_device *core_vdev, const char __user *buf,
 			      size_t count, loff_t *ppos)
 {
+	struct vfio_pci_device *vdev =
+		container_of(core_vdev, struct vfio_pci_device, vdev);
+
 	if (!count)
 		return 0;
 
-	return vfio_pci_rw(device_data, (char __user *)buf, count, ppos, true);
+	return vfio_pci_rw(vdev, (char __user *)buf, count, ppos, true);
 }
 
 /* Return 1 on zap and vma_lock acquired, 0 on contention (only with @try) */
@@ -1646,9 +1654,10 @@ static const struct vm_operations_struct vfio_pci_mmap_ops = {
 	.fault = vfio_pci_mmap_fault,
 };
 
-static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
+static int vfio_pci_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma)
 {
-	struct vfio_pci_device *vdev = device_data;
+	struct vfio_pci_device *vdev =
+		container_of(core_vdev, struct vfio_pci_device, vdev);
 	struct pci_dev *pdev = vdev->pdev;
 	unsigned int index;
 	u64 phys_len, req_len, pgoff, req_start;
@@ -1714,9 +1723,10 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
 	return 0;
 }
 
-static void vfio_pci_request(void *device_data, unsigned int count)
+static void vfio_pci_request(struct vfio_device *core_vdev, unsigned int count)
 {
-	struct vfio_pci_device *vdev = device_data;
+	struct vfio_pci_device *vdev =
+		container_of(core_vdev, struct vfio_pci_device, vdev);
 	struct pci_dev *pdev = vdev->pdev;
 
 	mutex_lock(&vdev->igate);
@@ -1830,9 +1840,10 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_device *vdev,
 
 #define VF_TOKEN_ARG "vf_token="
 
-static int vfio_pci_match(void *device_data, char *buf)
+static int vfio_pci_match(struct vfio_device *core_vdev, char *buf)
 {
-	struct vfio_pci_device *vdev = device_data;
+	struct vfio_pci_device *vdev =
+		container_of(core_vdev, struct vfio_pci_device, vdev);
 	bool vf_token = false;
 	uuid_t uuid;
 	int ret;
diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index 6eb749250ee41c..f5f6b537084a67 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -218,9 +218,10 @@ static int vfio_platform_call_reset(struct vfio_platform_device *vdev,
 	return -EINVAL;
 }
 
-static void vfio_platform_release(void *device_data)
+static void vfio_platform_release(struct vfio_device *core_vdev)
 {
-	struct vfio_platform_device *vdev = device_data;
+	struct vfio_platform_device *vdev =
+		container_of(core_vdev, struct vfio_platform_device, vdev);
 
 	mutex_lock(&driver_lock);
 
@@ -244,9 +245,10 @@ static void vfio_platform_release(void *device_data)
 	module_put(vdev->parent_module);
 }
 
-static int vfio_platform_open(void *device_data)
+static int vfio_platform_open(struct vfio_device *core_vdev)
 {
-	struct vfio_platform_device *vdev = device_data;
+	struct vfio_platform_device *vdev =
+		container_of(core_vdev, struct vfio_platform_device, vdev);
 	int ret;
 
 	if (!try_module_get(vdev->parent_module))
@@ -293,10 +295,12 @@ static int vfio_platform_open(void *device_data)
 	return ret;
 }
 
-static long vfio_platform_ioctl(void *device_data,
+static long vfio_platform_ioctl(struct vfio_device *core_vdev,
 				unsigned int cmd, unsigned long arg)
 {
-	struct vfio_platform_device *vdev = device_data;
+	struct vfio_platform_device *vdev =
+		container_of(core_vdev, struct vfio_platform_device, vdev);
+
 	unsigned long minsz;
 
 	if (cmd == VFIO_DEVICE_GET_INFO) {
@@ -455,10 +459,11 @@ static ssize_t vfio_platform_read_mmio(struct vfio_platform_region *reg,
 	return -EFAULT;
 }
 
-static ssize_t vfio_platform_read(void *device_data, char __user *buf,
-				  size_t count, loff_t *ppos)
+static ssize_t vfio_platform_read(struct vfio_device *core_vdev,
+				  char __user *buf, size_t count, loff_t *ppos)
 {
-	struct vfio_platform_device *vdev = device_data;
+	struct vfio_platform_device *vdev =
+		container_of(core_vdev, struct vfio_platform_device, vdev);
 	unsigned int index = VFIO_PLATFORM_OFFSET_TO_INDEX(*ppos);
 	loff_t off = *ppos & VFIO_PLATFORM_OFFSET_MASK;
 
@@ -531,10 +536,11 @@ static ssize_t vfio_platform_write_mmio(struct vfio_platform_region *reg,
 	return -EFAULT;
 }
 
-static ssize_t vfio_platform_write(void *device_data, const char __user *buf,
+static ssize_t vfio_platform_write(struct vfio_device *core_vdev, const char __user *buf,
 				   size_t count, loff_t *ppos)
 {
-	struct vfio_platform_device *vdev = device_data;
+	struct vfio_platform_device *vdev =
+		container_of(core_vdev, struct vfio_platform_device, vdev);
 	unsigned int index = VFIO_PLATFORM_OFFSET_TO_INDEX(*ppos);
 	loff_t off = *ppos & VFIO_PLATFORM_OFFSET_MASK;
 
@@ -573,9 +579,10 @@ static int vfio_platform_mmap_mmio(struct vfio_platform_region region,
 			       req_len, vma->vm_page_prot);
 }
 
-static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)
+static int vfio_platform_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma)
 {
-	struct vfio_platform_device *vdev = device_data;
+	struct vfio_platform_device *vdev =
+		container_of(core_vdev, struct vfio_platform_device, vdev);
 	unsigned int index;
 
 	index = vma->vm_pgoff >> (VFIO_PLATFORM_OFFSET_SHIFT - PAGE_SHIFT);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 2d6d7cc1d1ebf9..01de47d1810b6b 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -832,7 +832,7 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
 		int ret;
 
 		if (it->ops->match) {
-			ret = it->ops->match(it->device_data, buf);
+			ret = it->ops->match(it, buf);
 			if (ret < 0) {
 				device = ERR_PTR(ret);
 				break;
@@ -893,7 +893,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	rc = try_wait_for_completion(&device->comp);
 	while (rc <= 0) {
 		if (device->ops->request)
-			device->ops->request(device->device_data, i++);
+			device->ops->request(device, i++);
 
 		if (interrupted) {
 			rc = wait_for_completion_timeout(&device->comp,
@@ -1379,7 +1379,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	if (IS_ERR(device))
 		return PTR_ERR(device);
 
-	ret = device->ops->open(device->device_data);
+	ret = device->ops->open(device);
 	if (ret) {
 		vfio_device_put(device);
 		return ret;
@@ -1391,7 +1391,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	 */
 	ret = get_unused_fd_flags(O_CLOEXEC);
 	if (ret < 0) {
-		device->ops->release(device->device_data);
+		device->ops->release(device);
 		vfio_device_put(device);
 		return ret;
 	}
@@ -1401,7 +1401,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	if (IS_ERR(filep)) {
 		put_unused_fd(ret);
 		ret = PTR_ERR(filep);
-		device->ops->release(device->device_data);
+		device->ops->release(device);
 		vfio_device_put(device);
 		return ret;
 	}
@@ -1558,7 +1558,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 {
 	struct vfio_device *device = filep->private_data;
 
-	device->ops->release(device->device_data);
+	device->ops->release(device);
 
 	vfio_group_try_dissolve_container(device->group);
 
@@ -1575,7 +1575,7 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
 	if (unlikely(!device->ops->ioctl))
 		return -EINVAL;
 
-	return device->ops->ioctl(device->device_data, cmd, arg);
+	return device->ops->ioctl(device, cmd, arg);
 }
 
 static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf,
@@ -1586,7 +1586,7 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf,
 	if (unlikely(!device->ops->read))
 		return -EINVAL;
 
-	return device->ops->read(device->device_data, buf, count, ppos);
+	return device->ops->read(device, buf, count, ppos);
 }
 
 static ssize_t vfio_device_fops_write(struct file *filep,
@@ -1598,7 +1598,7 @@ static ssize_t vfio_device_fops_write(struct file *filep,
 	if (unlikely(!device->ops->write))
 		return -EINVAL;
 
-	return device->ops->write(device->device_data, buf, count, ppos);
+	return device->ops->write(device, buf, count, ppos);
 }
 
 static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma)
@@ -1608,7 +1608,7 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma)
 	if (unlikely(!device->ops->mmap))
 		return -EINVAL;
 
-	return device->ops->mmap(device->device_data, vma);
+	return device->ops->mmap(device, vma);
 }
 
 static const struct file_operations vfio_device_fops = {
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 4995faf51efeae..784c34c0a28763 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -44,17 +44,17 @@ struct vfio_device {
  */
 struct vfio_device_ops {
 	char	*name;
-	int	(*open)(void *device_data);
-	void	(*release)(void *device_data);
-	ssize_t	(*read)(void *device_data, char __user *buf,
+	int	(*open)(struct vfio_device *vdev);
+	void	(*release)(struct vfio_device *vdev);
+	ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
 			size_t count, loff_t *ppos);
-	ssize_t	(*write)(void *device_data, const char __user *buf,
+	ssize_t	(*write)(struct vfio_device *vdev, const char __user *buf,
 			 size_t count, loff_t *size);
-	long	(*ioctl)(void *device_data, unsigned int cmd,
+	long	(*ioctl)(struct vfio_device *vdev, unsigned int cmd,
 			 unsigned long arg);
-	int	(*mmap)(void *device_data, struct vm_area_struct *vma);
-	void	(*request)(void *device_data, unsigned int count);
-	int	(*match)(void *device_data, char *buf);
+	int	(*mmap)(struct vfio_device *vdev, struct vm_area_struct *vma);
+	void	(*request)(struct vfio_device *vdev, unsigned int count);
+	int	(*match)(struct vfio_device *vdev, char *buf);
 };
 
 extern struct iommu_group *vfio_iommu_group_get(struct device *dev);
-- 
2.30.2



* [PATCH v2 13/14] vfio/pci: Replace uses of vfio_device_data() with container_of
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (11 preceding siblings ...)
  2021-03-13  0:56 ` [PATCH v2 12/14] vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *' Jason Gunthorpe
@ 2021-03-13  0:56 ` Jason Gunthorpe
  2021-03-16  8:20   ` Tian, Kevin
  2021-03-17 12:06   ` Cornelia Huck
  2021-03-13  0:56 ` [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API Jason Gunthorpe
  13 siblings, 2 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:56 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

This tidies a few confused places that hold a refcount on the vfio_device
yet still check whether device_data is NULL. That cannot happen by
design.

Most of the change falls out when struct vfio_devices is updated to just
store the struct vfio_pci_device itself. This wasn't possible before
because there was no easy way to get from the 'struct vfio_pci_device' to
the 'struct vfio_device' to put back the refcount.
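
A small userspace sketch of why storing the driver-level pointer is now
enough. The toy_* names are hypothetical, the refcount is a plain int
rather than the real vfio reference, and container_of() is a local
stand-in. Once the core struct is embedded, the reference can be dropped
through &tmp->core, so the array no longer needs to hold the core pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); no type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical refcounted subsystem-level object. */
struct toy_core_device {
	int refs;
};

struct toy_pci_device {
	int needs_reset;
	struct toy_core_device core;
};

static void toy_core_put(struct toy_core_device *c)
{
	assert(c->refs > 0);
	c->refs--;
}

/* Holds driver-level pointers directly, like the updated struct vfio_devices. */
struct toy_devices {
	struct toy_pci_device **devices;
	int cur_index;
	int max_index;
};

/* Stores a device whose reference the caller already holds. */
static int toy_collect(struct toy_devices *devs, struct toy_core_device *c)
{
	if (devs->cur_index == devs->max_index)
		return -1;

	devs->devices[devs->cur_index++] =
		container_of(c, struct toy_pci_device, core);
	return 0;
}

/* Drops every held reference by going back through the embedded member. */
static void toy_put_all(struct toy_devices *devs)
{
	for (int i = 0; i < devs->cur_index; i++)
		toy_core_put(&devs->devices[i]->core);
	devs->cur_index = 0;
}
```

The loop in toy_put_all() has the same shape as the put_devs loop after
this patch: one pointer per slot serves both for driver state and for
releasing the reference.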

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/pci/vfio_pci.c | 67 +++++++++++++------------------------
 1 file changed, 24 insertions(+), 43 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 5f1a782d1c65ae..1f70387c8afe37 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -517,30 +517,29 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
 
 static struct pci_driver vfio_pci_driver;
 
-static struct vfio_pci_device *get_pf_vdev(struct vfio_pci_device *vdev,
-					   struct vfio_device **pf_dev)
+static struct vfio_pci_device *get_pf_vdev(struct vfio_pci_device *vdev)
 {
 	struct pci_dev *physfn = pci_physfn(vdev->pdev);
+	struct vfio_device *pf_dev;
 
 	if (!vdev->pdev->is_virtfn)
 		return NULL;
 
-	*pf_dev = vfio_device_get_from_dev(&physfn->dev);
-	if (!*pf_dev)
+	pf_dev = vfio_device_get_from_dev(&physfn->dev);
+	if (!pf_dev)
 		return NULL;
 
 	if (pci_dev_driver(physfn) != &vfio_pci_driver) {
-		vfio_device_put(*pf_dev);
+		vfio_device_put(pf_dev);
 		return NULL;
 	}
 
-	return vfio_device_data(*pf_dev);
+	return container_of(pf_dev, struct vfio_pci_device, vdev);
 }
 
 static void vfio_pci_vf_token_user_add(struct vfio_pci_device *vdev, int val)
 {
-	struct vfio_device *pf_dev;
-	struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev, &pf_dev);
+	struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev);
 
 	if (!pf_vdev)
 		return;
@@ -550,7 +549,7 @@ static void vfio_pci_vf_token_user_add(struct vfio_pci_device *vdev, int val)
 	WARN_ON(pf_vdev->vf_token->users < 0);
 	mutex_unlock(&pf_vdev->vf_token->lock);
 
-	vfio_device_put(pf_dev);
+	vfio_device_put(&pf_vdev->vdev);
 }
 
 static void vfio_pci_release(struct vfio_device *core_vdev)
@@ -794,7 +793,7 @@ int vfio_pci_register_dev_region(struct vfio_pci_device *vdev,
 }
 
 struct vfio_devices {
-	struct vfio_device **devices;
+	struct vfio_pci_device **devices;
 	int cur_index;
 	int max_index;
 };
@@ -1283,9 +1282,7 @@ static long vfio_pci_ioctl(struct vfio_device *core_vdev,
 			goto hot_reset_release;
 
 		for (; mem_idx < devs.cur_index; mem_idx++) {
-			struct vfio_pci_device *tmp;
-
-			tmp = vfio_device_data(devs.devices[mem_idx]);
+			struct vfio_pci_device *tmp = devs.devices[mem_idx];
 
 			ret = down_write_trylock(&tmp->memory_lock);
 			if (!ret) {
@@ -1300,17 +1297,13 @@ static long vfio_pci_ioctl(struct vfio_device *core_vdev,
 
 hot_reset_release:
 		for (i = 0; i < devs.cur_index; i++) {
-			struct vfio_device *device;
-			struct vfio_pci_device *tmp;
-
-			device = devs.devices[i];
-			tmp = vfio_device_data(device);
+			struct vfio_pci_device *tmp = devs.devices[i];
 
 			if (i < mem_idx)
 				up_write(&tmp->memory_lock);
 			else
 				mutex_unlock(&tmp->vma_lock);
-			vfio_device_put(device);
+			vfio_device_put(&tmp->vdev);
 		}
 		kfree(devs.devices);
 
@@ -1777,8 +1770,7 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_device *vdev,
 		return 0; /* No VF token provided or required */
 
 	if (vdev->pdev->is_virtfn) {
-		struct vfio_device *pf_dev;
-		struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev, &pf_dev);
+		struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev);
 		bool match;
 
 		if (!pf_vdev) {
@@ -1791,7 +1783,7 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_device *vdev,
 		}
 
 		if (!vf_token) {
-			vfio_device_put(pf_dev);
+			vfio_device_put(&pf_vdev->vdev);
 			pci_info_ratelimited(vdev->pdev,
 				"VF token required to access device\n");
 			return -EACCES;
@@ -1801,7 +1793,7 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_device *vdev,
 		match = uuid_equal(uuid, &pf_vdev->vf_token->uuid);
 		mutex_unlock(&pf_vdev->vf_token->lock);
 
-		vfio_device_put(pf_dev);
+		vfio_device_put(&pf_vdev->vdev);
 
 		if (!match) {
 			pci_info_ratelimited(vdev->pdev,
@@ -2122,11 +2114,7 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
 	if (device == NULL)
 		return PCI_ERS_RESULT_DISCONNECT;
 
-	vdev = vfio_device_data(device);
-	if (vdev == NULL) {
-		vfio_device_put(device);
-		return PCI_ERS_RESULT_DISCONNECT;
-	}
+	vdev = container_of(device, struct vfio_pci_device, vdev);
 
 	mutex_lock(&vdev->igate);
 
@@ -2142,7 +2130,6 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
 
 static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
 {
-	struct vfio_pci_device *vdev;
 	struct vfio_device *device;
 	int ret = 0;
 
@@ -2155,12 +2142,6 @@ static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
 	if (!device)
 		return -ENODEV;
 
-	vdev = vfio_device_data(device);
-	if (!vdev) {
-		vfio_device_put(device);
-		return -ENODEV;
-	}
-
 	if (nr_virtfn == 0)
 		pci_disable_sriov(pdev);
 	else
@@ -2220,7 +2201,7 @@ static int vfio_pci_reflck_find(struct pci_dev *pdev, void *data)
 		return 0;
 	}
 
-	vdev = vfio_device_data(device);
+	vdev = container_of(device, struct vfio_pci_device, vdev);
 
 	if (vdev->reflck) {
 		vfio_pci_reflck_get(vdev->reflck);
@@ -2282,7 +2263,7 @@ static int vfio_pci_get_unused_devs(struct pci_dev *pdev, void *data)
 		return -EBUSY;
 	}
 
-	vdev = vfio_device_data(device);
+	vdev = container_of(device, struct vfio_pci_device, vdev);
 
 	/* Fault if the device is not unused */
 	if (vdev->refcnt) {
@@ -2290,7 +2271,7 @@ static int vfio_pci_get_unused_devs(struct pci_dev *pdev, void *data)
 		return -EBUSY;
 	}
 
-	devs->devices[devs->cur_index++] = device;
+	devs->devices[devs->cur_index++] = vdev;
 	return 0;
 }
 
@@ -2312,7 +2293,7 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data)
 		return -EBUSY;
 	}
 
-	vdev = vfio_device_data(device);
+	vdev = container_of(device, struct vfio_pci_device, vdev);
 
 	/*
 	 * Locking multiple devices is prone to deadlock, runaway and
@@ -2323,7 +2304,7 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data)
 		return -EBUSY;
 	}
 
-	devs->devices[devs->cur_index++] = device;
+	devs->devices[devs->cur_index++] = vdev;
 	return 0;
 }
 
@@ -2371,7 +2352,7 @@ static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev)
 
 	/* Does at least one need a reset? */
 	for (i = 0; i < devs.cur_index; i++) {
-		tmp = vfio_device_data(devs.devices[i]);
+		tmp = devs.devices[i];
 		if (tmp->needs_reset) {
 			ret = pci_reset_bus(vdev->pdev);
 			break;
@@ -2380,7 +2361,7 @@ static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev)
 
 put_devs:
 	for (i = 0; i < devs.cur_index; i++) {
-		tmp = vfio_device_data(devs.devices[i]);
+		tmp = devs.devices[i];
 
 		/*
 		 * If reset was successful, affected devices no longer need
@@ -2396,7 +2377,7 @@ static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev)
 				vfio_pci_set_power_state(tmp, PCI_D3hot);
 		}
 
-		vfio_device_put(devs.devices[i]);
+		vfio_device_put(&tmp->vdev);
 	}
 
 	kfree(devs.devices);
-- 
2.30.2



* [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API
  2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
                   ` (12 preceding siblings ...)
  2021-03-13  0:56 ` [PATCH v2 13/14] vfio/pci: Replace uses of vfio_device_data() with container_of Jason Gunthorpe
@ 2021-03-13  0:56 ` Jason Gunthorpe
  2021-03-16  8:22   ` Tian, Kevin
                     ` (2 more replies)
  13 siblings, 3 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-13  0:56 UTC (permalink / raw)
  To: Alex Williamson, Cornelia Huck, Jonathan Corbet, Diana Craciun,
	Eric Auger, kvm, Kirti Wankhede, linux-doc
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

There are no longer any users, so it can go away. Everything is using
container_of now.
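
To illustrate why the back pointer is redundant, here is a minimal
userspace sketch. The mini_* names are invented, the 'struct device *'
argument of the real vfio_init_group_dev() is left out for brevity, and
container_of() is a local stand-in. The init helper records only the ops,
and the driver still reaches its private state from the core pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); no type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mini_ops {
	const char *name;
};

/* No 'void *device_data' member: the containing struct is the private data. */
struct mini_core_device {
	const struct mini_ops *ops;
};

/* Hypothetical init helper; note there is no device_data parameter. */
static void mini_core_init(struct mini_core_device *c,
			   const struct mini_ops *ops)
{
	assert(c && ops);
	c->ops = ops;
}

struct mini_drv_device {
	int irq_type;
	struct mini_core_device core;
};

/* Typed accessor a driver might define once and reuse in every callback. */
static struct mini_drv_device *to_mini_drv(struct mini_core_device *c)
{
	return container_of(c, struct mini_drv_device, core);
}
```

Nothing the driver stored in its own struct is lost by dropping the extra
argument, since to_mini_drv(&d.core) always yields &d.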

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 Documentation/driver-api/vfio.rst            |  3 +--
 drivers/vfio/fsl-mc/vfio_fsl_mc.c            |  5 +++--
 drivers/vfio/mdev/vfio_mdev.c                |  2 +-
 drivers/vfio/pci/vfio_pci.c                  |  2 +-
 drivers/vfio/platform/vfio_platform_common.c |  2 +-
 drivers/vfio/vfio.c                          | 12 +-----------
 include/linux/vfio.h                         |  4 +---
 7 files changed, 9 insertions(+), 21 deletions(-)

diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
index 3337f337293a32..decc68cb8114ac 100644
--- a/Documentation/driver-api/vfio.rst
+++ b/Documentation/driver-api/vfio.rst
@@ -254,8 +254,7 @@ vfio_unregister_group_dev() respectively::
 
 	void vfio_init_group_dev(struct vfio_device *device,
 				struct device *dev,
-				const struct vfio_device_ops *ops,
-				void *device_data);
+				const struct vfio_device_ops *ops);
 	int vfio_register_group_dev(struct vfio_device *device);
 	void vfio_unregister_group_dev(struct vfio_device *device);
 
diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 023b2222806424..3af3ca59478f94 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -75,7 +75,8 @@ static int vfio_fsl_mc_reflck_attach(struct vfio_fsl_mc_device *vdev)
 			goto unlock;
 		}
 
-		cont_vdev = vfio_device_data(device);
+		cont_vdev =
+			container_of(device, struct vfio_fsl_mc_device, vdev);
 		if (!cont_vdev || !cont_vdev->reflck) {
 			vfio_device_put(device);
 			ret = -ENODEV;
@@ -624,7 +625,7 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
 		goto out_group_put;
 	}
 
-	vfio_init_group_dev(&vdev->vdev, dev, &vfio_fsl_mc_ops, vdev);
+	vfio_init_group_dev(&vdev->vdev, dev, &vfio_fsl_mc_ops);
 	vdev->mc_dev = mc_dev;
 	mutex_init(&vdev->igate);
 
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index e7309caa99c71b..71bd28f976e5af 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -138,7 +138,7 @@ static int vfio_mdev_probe(struct device *dev)
 	if (!mvdev)
 		return -ENOMEM;
 
-	vfio_init_group_dev(&mvdev->vdev, &mdev->dev, &vfio_mdev_dev_ops, mdev);
+	vfio_init_group_dev(&mvdev->vdev, &mdev->dev, &vfio_mdev_dev_ops);
 	ret = vfio_register_group_dev(&mvdev->vdev);
 	if (ret) {
 		kfree(mvdev);
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 1f70387c8afe37..55ef27a15d4d3f 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -2022,7 +2022,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		goto out_group_put;
 	}
 
-	vfio_init_group_dev(&vdev->vdev, &pdev->dev, &vfio_pci_ops, vdev);
+	vfio_init_group_dev(&vdev->vdev, &pdev->dev, &vfio_pci_ops);
 	vdev->pdev = pdev;
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	mutex_init(&vdev->igate);
diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index f5f6b537084a67..361e5b57e36932 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -666,7 +666,7 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 	struct iommu_group *group;
 	int ret;
 
-	vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops, vdev);
+	vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops);
 
 	ret = vfio_platform_acpi_probe(vdev, dev);
 	if (ret)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 01de47d1810b6b..39ea77557ba0c4 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -741,12 +741,11 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
  * VFIO driver API
  */
 void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
-			 const struct vfio_device_ops *ops, void *device_data)
+			 const struct vfio_device_ops *ops)
 {
 	init_completion(&device->comp);
 	device->dev = dev;
 	device->ops = ops;
-	device->device_data = device_data;
 }
 EXPORT_SYMBOL_GPL(vfio_init_group_dev);
 
@@ -851,15 +850,6 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
 	return device;
 }
 
-/*
- * Caller must hold a reference to the vfio_device
- */
-void *vfio_device_data(struct vfio_device *device)
-{
-	return device->device_data;
-}
-EXPORT_SYMBOL_GPL(vfio_device_data);
-
 /*
  * Decrement the device reference count and wait for the device to be
  * removed.  Open file descriptors for the device... */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 784c34c0a28763..a2c5b30e1763ba 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -24,7 +24,6 @@ struct vfio_device {
 	refcount_t refcount;
 	struct completion comp;
 	struct list_head group_next;
-	void *device_data;
 };
 
 /**
@@ -61,12 +60,11 @@ extern struct iommu_group *vfio_iommu_group_get(struct device *dev);
 extern void vfio_iommu_group_put(struct iommu_group *group, struct device *dev);
 
 void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
-			 const struct vfio_device_ops *ops, void *device_data);
+			 const struct vfio_device_ops *ops);
 int vfio_register_group_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
 extern void vfio_device_put(struct vfio_device *device);
-extern void *vfio_device_data(struct vfio_device *device);
 
 /* events for the backend driver notify callback */
 enum vfio_iommu_notify_type {
-- 
2.30.2
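
The diff above replaces the stored 'void *device_data' back-pointer with
container_of() on the embedded struct vfio_device. The following is a minimal
userspace sketch of that idea, not kernel code: the container_of() macro is a
simplified stand-in for the kernel's, and struct my_vfio_device, to_my_dev()
and round_trip_ok() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down core object; not the real struct vfio_device. */
struct vfio_device {
	int refcount;
};

/* Hypothetical driver object that embeds the core object. */
struct my_vfio_device {
	int irq_type;
	struct vfio_device vdev;
};

/* Recover the driver object from a pointer to its embedded member. */
static struct my_vfio_device *to_my_dev(struct vfio_device *core)
{
	return container_of(core, struct my_vfio_device, vdev);
}

/* Returns 1 if going core -> driver object yields the original object. */
static int round_trip_ok(void)
{
	struct my_vfio_device dev = { .irq_type = 7 };
	struct vfio_device *core = &dev.vdev;

	return to_my_dev(core) == &dev && to_my_dev(core)->irq_type == 7;
}
```

Because the conversion is plain pointer arithmetic on a known offset, no
per-device back-pointer has to be stored or kept in sync, which is what lets
the device_data field go away.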



* Re: [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
  2021-03-13  0:55 ` [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe() Jason Gunthorpe
@ 2021-03-15  8:44   ` Christoph Hellwig
  2021-03-16  9:16   ` Diana Craciun OSS
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 82+ messages in thread
From: Christoph Hellwig @ 2021-03-15  8:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Cornelia Huck, kvm, Alex Williamson, Raj, Ashok, Bharat Bhushan,
	Dan Williams, Daniel Vetter, Diana Craciun, Eric Auger,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v2 06/14] vfio/fsl-mc: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:55 ` [PATCH v2 06/14] vfio/fsl-mc: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
@ 2021-03-15  8:44   ` Christoph Hellwig
  2021-03-16 16:43   ` Cornelia Huck
  1 sibling, 0 replies; 82+ messages in thread
From: Christoph Hellwig @ 2021-03-15  8:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Cornelia Huck, Diana Craciun, kvm, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Fri, Mar 12, 2021 at 08:55:58PM -0400, Jason Gunthorpe wrote:
> fsl-mc already allocates a struct vfio_fsl_mc_device with exactly the same
> lifetime as vfio_device, switch to the new API and embed vfio_device in
> vfio_fsl_mc_device. While here remove the devm usage for the vdev, this
> code is clean and doesn't need devm.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions
  2021-03-13  0:55 ` [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions Jason Gunthorpe
@ 2021-03-15  8:45   ` Christoph Hellwig
  2021-03-15 23:07     ` Jason Gunthorpe
  2021-03-16  7:57   ` Tian, Kevin
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 82+ messages in thread
From: Christoph Hellwig @ 2021-03-15  8:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Cornelia Huck, kvm, Alex Williamson, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

> +static int vfio_pci_vf_init(struct vfio_pci_device *vdev)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	int ret;
> +
> +	if (!pdev->is_physfn)
> +		return 0;
> +
> +	vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
> +	if (!vdev->vf_token)
> +		return -ENOMEM;

> +static void vfio_pci_vf_uninit(struct vfio_pci_device *vdev)
> +{
> +	if (!vdev->vf_token)
> +		return;

I'd really prefer to keep these checks in the callers, as it makes the
intent of the code much more clear.  Same for the VGA side.

But in general I like these helpers.


* Re: [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe()
  2021-03-13  0:56 ` [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe() Jason Gunthorpe
@ 2021-03-15  8:46   ` Christoph Hellwig
  2021-03-16  8:04   ` Tian, Kevin
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 82+ messages in thread
From: Christoph Hellwig @ 2021-03-15  8:46 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm, Alex Williamson, Raj, Ashok, Christian Ehrhardt,
	Cornelia Huck, Dan Williams, Daniel Vetter, Eric Auger,
	Christoph Hellwig, Kevin Tian, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, Mar 12, 2021 at 08:56:00PM -0400, Jason Gunthorpe wrote:
> vfio_add_group_dev() must be called only after all of the private data in
> vdev is fully setup and ready, otherwise there could be races with user
> space instantiating a device file descriptor and starting to call ops.
> 
> For instance vfio_pci_reflck_attach() sets vdev->reflck and
> vfio_pci_open(), called by fops open, unconditionally derefs it, which
> will crash if things get out of order.
> 
> Fixes: cc20d7999000 ("vfio/pci: Introduce VF token")
> Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release")
> Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state")
> Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v2 12/14] vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *'
  2021-03-13  0:56 ` [PATCH v2 12/14] vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *' Jason Gunthorpe
@ 2021-03-15  8:58   ` Christoph Hellwig
  2021-03-17 11:33   ` Cornelia Huck
  1 sibling, 0 replies; 82+ messages in thread
From: Christoph Hellwig @ 2021-03-15  8:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Cornelia Huck, Jonathan Corbet, Diana Craciun,
	Eric Auger, kvm, Kirti Wankhede, linux-doc, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions
  2021-03-15  8:45   ` Christoph Hellwig
@ 2021-03-15 23:07     ` Jason Gunthorpe
  2021-03-16  6:27       ` Christoph Hellwig
  0 siblings, 1 reply; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-15 23:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Cornelia Huck, kvm, Alex Williamson, Raj, Ashok, Dan Williams,
	Daniel Vetter, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Mon, Mar 15, 2021 at 09:45:34AM +0100, Christoph Hellwig wrote:
> > +static int vfio_pci_vf_init(struct vfio_pci_device *vdev)
> > +{
> > +	struct pci_dev *pdev = vdev->pdev;
> > +	int ret;
> > +
> > +	if (!pdev->is_physfn)
> > +		return 0;
> > +
> > +	vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
> > +	if (!vdev->vf_token)
> > +		return -ENOMEM;
> 
> > +static void vfio_pci_vf_uninit(struct vfio_pci_device *vdev)
> > +{
> > +	if (!vdev->vf_token)
> > +		return;
> 
> I'd really prefer to keep these checks in the callers, as it makes the
> intent of the code much more clear.  Same for the VGA side.
> 
> But in general I like these helpers.

I'm here because I needed to make the error unwind tidy before I could
re-order everything in the next patch, as re-ordering with the
existing unwind quickly became a mess.

It ends up like this:

out_power:
	if (!disable_idle_d3)
		vfio_pci_set_power_state(vdev, PCI_D0);
out_vf:
	vfio_pci_vf_uninit(vdev);
out_reflck:
	vfio_pci_reflck_put(vdev->reflck);
out_free:
	kfree(vdev->pm_save);
	kfree(vdev);

I'm always leery about adding conditionals to these unwinds, it is
easy to make a mistake.

Particularly in this case the init/uninit checks are not symmetric:

 +	if (!pdev->is_physfn)
 +		return 0;

vs

 +	if (!vdev->vf_token)
 +		return;

So the goto unwind looks quite odd when this is open coded. At least
with the helpers you can read the init then uninit and go 'yah, OK,
this makes sense'

Thanks,
Jason
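
The unwind argument above can be sketched in userspace C. This is an
illustration under stated assumptions, not the vfio-pci code: struct my_dev,
my_vf_init(), my_vf_uninit() and my_probe() are hypothetical names, and the
fail_late parameter exists only to force the error path. It shows the
asymmetric checks living inside the helpers so that the goto ladder itself
carries no conditionals.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device; field names only loosely follow the quoted patch. */
struct my_dev {
	int is_physfn;	/* does this device need a token at all? */
	int *vf_token;	/* allocated only when is_physfn is set */
	int *pm_save;
};

static int my_vf_init(struct my_dev *d)
{
	if (!d->is_physfn)
		return 0;	/* nothing to set up */

	d->vf_token = calloc(1, sizeof(*d->vf_token));
	return d->vf_token ? 0 : -ENOMEM;
}

static void my_vf_uninit(struct my_dev *d)
{
	if (!d->vf_token)
		return;		/* init was a no-op for this device */

	free(d->vf_token);
	d->vf_token = NULL;
}

/* fail_late simulates a failure in a step that runs after my_vf_init(). */
static int my_probe(struct my_dev *d, int fail_late)
{
	int ret;

	d->pm_save = calloc(1, sizeof(*d->pm_save));
	if (!d->pm_save)
		return -ENOMEM;

	ret = my_vf_init(d);
	if (ret)
		goto out_free;

	if (fail_late) {
		ret = -EIO;
		goto out_vf;
	}
	return 0;

out_vf:
	my_vf_uninit(d);	/* unconditional: the helper decides */
out_free:
	free(d->pm_save);
	d->pm_save = NULL;
	return ret;
}
```

The init helper keys off is_physfn while the uninit helper keys off vf_token,
so open coding both checks in the caller would put two different conditions
into the same unwind sequence.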


* Re: [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions
  2021-03-15 23:07     ` Jason Gunthorpe
@ 2021-03-16  6:27       ` Christoph Hellwig
  0 siblings, 0 replies; 82+ messages in thread
From: Christoph Hellwig @ 2021-03-16  6:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Cornelia Huck, kvm, Alex Williamson, Raj,
	Ashok, Dan Williams, Daniel Vetter, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Mon, Mar 15, 2021 at 08:07:46PM -0300, Jason Gunthorpe wrote:
> So the goto unwind looks quite odd when this is open coded. At least
> with the helpers you can read the init then uninit and go 'yah, OK,
> this makes sense'

Still looks odd to me.  But this is your series and overall a major
improvement, so:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* RE: [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group
  2021-03-13  0:55 ` [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group Jason Gunthorpe
@ 2021-03-16  7:33   ` Tian, Kevin
  2021-03-16 23:07     ` Jason Gunthorpe
  2021-03-16 11:15   ` Max Gurtovoy
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  7:33 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> The vfio_device->group value has a get obtained during
> vfio_add_group_dev() which gets moved from the stack to
> vfio_device->group in vfio_group_create_device().
> 
> The reference remains until we reach the end of vfio_del_group_dev() when
> it is put back.
> 
> Thus anything that already has a kref on the vfio_device is guaranteed a
> valid group pointer. Remove all the extra reference traffic.
> 
> It is tricky to see, but the get at the start of vfio_del_group_dev() is
> actually pairing with the put hidden inside vfio_device_put() a few lines
> below.

I feel that the put inside vfio_device_put() was meant to pair with the get
in vfio_group_create_device() before this patch is applied. Because
vfio_device_put() may drop the last reference to the group,
vfio_del_group_dev() then issues its own get to hold the reference until
the put at the end of the function.

Nevertheless this patch does make the flow much cleaner:

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> 
> A later patch merges vfio_group_create_device() into vfio_add_group_dev()
> which makes the ownership and error flow on the create side easier to
> follow.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/vfio.c | 21 ++-------------------
>  1 file changed, 2 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 38779e6fd80cb4..15d8e678e5563a 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -546,14 +546,12 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
> 
>  	kref_init(&device->kref);
>  	device->dev = dev;
> +	/* Our reference on group is moved to the device */
>  	device->group = group;
>  	device->ops = ops;
>  	device->device_data = device_data;
>  	dev_set_drvdata(dev, device);
> 
> -	/* No need to get group_lock, caller has group reference */
> -	vfio_group_get(group);
> -
>  	mutex_lock(&group->device_lock);
>  	list_add(&device->group_next, &group->device_list);
>  	group->dev_counter++;
> @@ -585,13 +583,11 @@ void vfio_device_put(struct vfio_device *device)
>  {
>  	struct vfio_group *group = device->group;
>  	kref_put_mutex(&device->kref, vfio_device_release, &group->device_lock);
> -	vfio_group_put(group);
>  }
>  EXPORT_SYMBOL_GPL(vfio_device_put);
> 
>  static void vfio_device_get(struct vfio_device *device)
>  {
> -	vfio_group_get(device->group);
>  	kref_get(&device->kref);
>  }
> 
> @@ -841,14 +837,6 @@ int vfio_add_group_dev(struct device *dev,
>  		vfio_group_put(group);
>  		return PTR_ERR(device);
>  	}
> -
> -	/*
> -	 * Drop all but the vfio_device reference.  The vfio_device holds
> -	 * a reference to the vfio_group, which holds a reference to the
> -	 * iommu_group.
> -	 */
> -	vfio_group_put(group);
> -
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(vfio_add_group_dev);
> @@ -928,12 +916,6 @@ void *vfio_del_group_dev(struct device *dev)
>  	unsigned int i = 0;
>  	bool interrupted = false;
> 
> -	/*
> -	 * The group exists so long as we have a device reference.  Get
> -	 * a group reference and use it to scan for the device going away.
> -	 */
> -	vfio_group_get(group);
> -
>  	/*
>  	 * When the device is removed from the group, the group suddenly
>  	 * becomes non-viable; the device has a driver (until the unbind
> @@ -1008,6 +990,7 @@ void *vfio_del_group_dev(struct device *dev)
>  	if (list_empty(&group->device_list))
>  		wait_event(group->container_q, !group->container);
> 
> +	/* Matches the get in vfio_group_create_device() */

There is no get there now.

>  	vfio_group_put(group);
> 
>  	return device_data;
> --
> 2.30.2



* RE: [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device
  2021-03-13  0:55 ` [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device Jason Gunthorpe
@ 2021-03-16  7:38   ` Tian, Kevin
  2021-03-16 12:10     ` Cornelia Huck
  2021-03-16 20:24     ` Alex Williamson
  2021-03-18 13:10   ` Auger Eric
  1 sibling, 2 replies; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  7:38 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> The vfio_device is using a 'sleep until all refs go to zero' pattern for
> its lifetime, but it is indirectly coded by repeatedly scanning the group
> list waiting for the device to be removed on its own.
> 
> Switch this around to be a direct representation, use a refcount to count
> the number of places that are blocking destruction and sleep directly on a
> completion until that counter goes to zero. kfree the device after other
> accesses have been excluded in vfio_del_group_dev(). This is a fairly
> common Linux idiom.
> 
> Due to this we can now remove kref_put_mutex(), which is very rarely used
> in the kernel. Here it is being used to prevent a zero ref device from
> being seen in the group list. Instead allow the zero ref device to
> continue to exist in the device_list and use refcount_inc_not_zero() to
> exclude it once refs go to zero.
> 
> This patch is organized so the next patch will be able to alter the API to
> allow drivers to provide the kfree.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/vfio.c | 79 ++++++++++++++-------------------------------
>  1 file changed, 25 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 15d8e678e5563a..32660e8a69ae20 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -46,7 +46,6 @@ static struct vfio {
>  	struct mutex			group_lock;
>  	struct cdev			group_cdev;
>  	dev_t				group_devt;
> -	wait_queue_head_t		release_q;
>  } vfio;
> 
>  struct vfio_iommu_driver {
> @@ -91,7 +90,8 @@ struct vfio_group {
>  };
> 
>  struct vfio_device {
> -	struct kref			kref;
> +	refcount_t			refcount;
> +	struct completion		comp;
>  	struct device			*dev;
>  	const struct vfio_device_ops	*ops;
>  	struct vfio_group		*group;
> @@ -544,7 +544,8 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
>  	if (!device)
>  		return ERR_PTR(-ENOMEM);
> 
> -	kref_init(&device->kref);
> +	refcount_set(&device->refcount, 1);
> +	init_completion(&device->comp);
>  	device->dev = dev;
>  	/* Our reference on group is moved to the device */
>  	device->group = group;
> @@ -560,35 +561,17 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
>  	return device;
>  }
> 
> -static void vfio_device_release(struct kref *kref)
> -{
> -	struct vfio_device *device = container_of(kref,
> -						  struct vfio_device, kref);
> -	struct vfio_group *group = device->group;
> -
> -	list_del(&device->group_next);
> -	group->dev_counter--;
> -	mutex_unlock(&group->device_lock);
> -
> -	dev_set_drvdata(device->dev, NULL);
> -
> -	kfree(device);
> -
> -	/* vfio_del_group_dev may be waiting for this device */
> -	wake_up(&vfio.release_q);
> -}
> -
>  /* Device reference always implies a group reference */
>  void vfio_device_put(struct vfio_device *device)
>  {
> -	struct vfio_group *group = device->group;
> -	kref_put_mutex(&device->kref, vfio_device_release, &group->device_lock);
> +	if (refcount_dec_and_test(&device->refcount))
> +		complete(&device->comp);
>  }
>  EXPORT_SYMBOL_GPL(vfio_device_put);
> 
> -static void vfio_device_get(struct vfio_device *device)
> +static bool vfio_device_try_get(struct vfio_device *device)
>  {
> -	kref_get(&device->kref);
> +	return refcount_inc_not_zero(&device->refcount);
>  }
> 
>  static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
> @@ -598,8 +581,7 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
> 
>  	mutex_lock(&group->device_lock);
>  	list_for_each_entry(device, &group->device_list, group_next) {
> -		if (device->dev == dev) {
> -			vfio_device_get(device);
> +		if (device->dev == dev && vfio_device_try_get(device)) {
>  			mutex_unlock(&group->device_lock);
>  			return device;
>  		}
> @@ -883,9 +865,8 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
>  			ret = !strcmp(dev_name(it->dev), buf);
>  		}
> 
> -		if (ret) {
> +		if (ret && vfio_device_try_get(it)) {
>  			device = it;
> -			vfio_device_get(device);
>  			break;
>  		}
>  	}
> @@ -908,13 +889,13 @@ EXPORT_SYMBOL_GPL(vfio_device_data);
>   * removed.  Open file descriptors for the device... */
>  void *vfio_del_group_dev(struct device *dev)
>  {
> -	DEFINE_WAIT_FUNC(wait, woken_wake_function);
>  	struct vfio_device *device = dev_get_drvdata(dev);
>  	struct vfio_group *group = device->group;
>  	void *device_data = device->device_data;
>  	struct vfio_unbound_dev *unbound;
>  	unsigned int i = 0;
>  	bool interrupted = false;
> +	long rc;
> 
>  	/*
>  	 * When the device is removed from the group, the group suddenly
> @@ -935,32 +916,18 @@ void *vfio_del_group_dev(struct device *dev)
>  	WARN_ON(!unbound);
> 
>  	vfio_device_put(device);
> -
> -	/*
> -	 * If the device is still present in the group after the above
> -	 * 'put', then it is in use and we need to request it from the
> -	 * bus driver.  The driver may in turn need to request the
> -	 * device from the user.  We send the request on an arbitrary
> -	 * interval with counter to allow the driver to take escalating
> -	 * measures to release the device if it has the ability to do so.
> -	 */

The above comment still makes sense even with this patch. What about
keeping it? Otherwise:

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> -	add_wait_queue(&vfio.release_q, &wait);
> -
> -	do {
> -		device = vfio_group_get_device(group, dev);
> -		if (!device)
> -			break;
> -
> +	rc = try_wait_for_completion(&device->comp);
> +	while (rc <= 0) {
>  		if (device->ops->request)
>  			device->ops->request(device_data, i++);
> 
> -		vfio_device_put(device);
> -
>  		if (interrupted) {
> -			wait_woken(&wait, TASK_UNINTERRUPTIBLE, HZ * 10);
> +			rc = wait_for_completion_timeout(&device->comp,
> +							 HZ * 10);
>  		} else {
> -			wait_woken(&wait, TASK_INTERRUPTIBLE, HZ * 10);
> -			if (signal_pending(current)) {
> +			rc = wait_for_completion_interruptible_timeout(
> +				&device->comp, HZ * 10);
> +			if (rc < 0) {
>  				interrupted = true;
>  				dev_warn(dev,
>  					 "Device is currently in use, task"
> @@ -969,10 +936,13 @@ void *vfio_del_group_dev(struct device *dev)
>  					 current->comm, task_pid_nr(current));
>  			}
>  		}
> +	}
> 
> -	} while (1);
> +	mutex_lock(&group->device_lock);
> +	list_del(&device->group_next);
> +	group->dev_counter--;
> +	mutex_unlock(&group->device_lock);
> 
> -	remove_wait_queue(&vfio.release_q, &wait);
>  	/*
>  	 * In order to support multiple devices per group, devices can be
>  	 * plucked from the group while other devices in the group are still
> @@ -992,6 +962,8 @@ void *vfio_del_group_dev(struct device *dev)
> 
>  	/* Matches the get in vfio_group_create_device() */
>  	vfio_group_put(group);
> +	dev_set_drvdata(dev, NULL);
> +	kfree(device);
> 
>  	return device_data;
>  }
> @@ -2362,7 +2334,6 @@ static int __init vfio_init(void)
>  	mutex_init(&vfio.iommu_drivers_lock);
>  	INIT_LIST_HEAD(&vfio.group_list);
>  	INIT_LIST_HEAD(&vfio.iommu_drivers_list);
> -	init_waitqueue_head(&vfio.release_q);
> 
>  	ret = misc_register(&vfio_dev);
>  	if (ret) {
> --
> 2.30.2



* RE: [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops
  2021-03-13  0:55 ` [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops Jason Gunthorpe
@ 2021-03-16  7:55   ` Tian, Kevin
  2021-03-16 13:34     ` Jason Gunthorpe
  2021-03-16 12:25   ` Cornelia Huck
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  7:55 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, Jonathan Corbet,
	kvm, linux-doc
  Cc: Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> This makes the struct vfio_pci_device part of the public interface so it
> can be used with container_of and so forth, as is typical for a Linux
> subsystem.
> 
> This is the first step to bring some type-safety to the vfio interface by
> allowing the replacement of 'void *' and 'struct device *' inputs with a
> simple and clear 'struct vfio_pci_device *'
> 
> For now the self-allocating vfio_add_group_dev() interface is kept so each
> user can be updated as a separate patch.
> 
> The expected usage pattern is
> 
>   driver core probe() function:
>      my_device = kzalloc(sizeof(*my_device), GFP_KERNEL);
>      vfio_init_group_dev(&my_device->vdev, dev, ops, my_device);
>      /* other driver specific prep */
>      vfio_register_group_dev(&my_device->vdev);
>      dev_set_drvdata(my_device);

dev_set_drvdata(dev, my_device);

> 
>   driver core remove() function:
>      my_device = dev_get_drvdata(dev);
>      vfio_unregister_group_dev(&my_device->vdev);
>      /* other driver specific tear down */
>      kfree(my_device);
> 
> This allows the driver to use the drvdata and the vfio_device to go
> to/from its own data.
> 
> The pattern also makes it clear that vfio_register_group_dev() must be
> last in the sequence, as once it is called the core code can immediately
> start calling ops. The init/register gap is provided to allow for the
> driver to do setup before ops can be called and thus avoid races.
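
A minimal userspace sketch of the ordering described above, assuming a mock
core: core_init(), core_register(), core_unregister(), struct core_dev and
struct my_device are hypothetical stand-ins and not the VFIO API (the mock
init also omits the device_data argument). It shows private state being set up
in the gap between init and register, so that an op invoked right after
registration finds everything in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct core_dev;

struct core_ops {
	int (*open)(struct core_dev *core);
};

/* Hypothetical core object, playing the role of struct vfio_device. */
struct core_dev {
	const struct core_ops *ops;
	int registered;
};

static void core_init(struct core_dev *core, const struct core_ops *ops)
{
	core->ops = ops;
	core->registered = 0;
}

/* After this returns, the mock core is allowed to call ops. */
static int core_register(struct core_dev *core)
{
	core->registered = 1;
	return 0;
}

static void core_unregister(struct core_dev *core)
{
	core->registered = 0;
}

/* Hypothetical driver object embedding the core object. */
struct my_device {
	struct core_dev vdev;
	int *state;
};

static int my_open(struct core_dev *core)
{
	struct my_device *my = container_of(core, struct my_device, vdev);

	return *my->state;	/* relies on state being set up before register */
}

static const struct core_ops my_ops = { .open = my_open };

static struct my_device *my_probe(void)
{
	struct my_device *my = calloc(1, sizeof(*my));

	if (!my)
		return NULL;

	core_init(&my->vdev, &my_ops);

	/* Driver-specific prep happens before registration. */
	my->state = malloc(sizeof(*my->state));
	if (!my->state)
		goto err_free;
	*my->state = 42;

	if (core_register(&my->vdev))	/* last step: ops may run from here on */
		goto err_state;
	return my;

err_state:
	free(my->state);
err_free:
	free(my);
	return NULL;
}

static void my_remove(struct my_device *my)
{
	core_unregister(&my->vdev);	/* first step: stop ops before teardown */
	free(my->state);
	free(my);
}
```

Remove mirrors probe in reverse: unregistering first means no op can observe
the private state while it is being freed.
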
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  Documentation/driver-api/vfio.rst |  31 ++++----
>  drivers/vfio/vfio.c               | 123 ++++++++++++++++--------------
>  include/linux/vfio.h              |  16 ++++
>  3 files changed, 98 insertions(+), 72 deletions(-)
> 
> diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
> index f1a4d3c3ba0bb1..d3a02300913a7f 100644
> --- a/Documentation/driver-api/vfio.rst
> +++ b/Documentation/driver-api/vfio.rst
> @@ -249,18 +249,23 @@ VFIO bus driver API
> 
>  VFIO bus drivers, such as vfio-pci make use of only a few interfaces
>  into VFIO core.  When devices are bound and unbound to the driver,
> -the driver should call vfio_add_group_dev() and vfio_del_group_dev()
> -respectively::
> -
> -	extern int vfio_add_group_dev(struct device *dev,
> -				      const struct vfio_device_ops *ops,
> -				      void *device_data);
> -
> -	extern void *vfio_del_group_dev(struct device *dev);
> -
> -vfio_add_group_dev() indicates to the core to begin tracking the
> -iommu_group of the specified dev and register the dev as owned by
> -a VFIO bus driver.  The driver provides an ops structure for callbacks
> +the driver should call vfio_register_group_dev() and
> +vfio_unregister_group_dev() respectively::
> +
> +	void vfio_init_group_dev(struct vfio_device *device,
> +				struct device *dev,
> +				const struct vfio_device_ops *ops,
> +				void *device_data);
> +	int vfio_register_group_dev(struct vfio_device *device);
> +	void vfio_unregister_group_dev(struct vfio_device *device);
> +
> +The driver should embed the vfio_device in its own structure and call
> +vfio_init_group_dev() to pre-configure it before going to registration.
> +vfio_register_group_dev() indicates to the core to begin tracking the
> +iommu_group of the specified dev and register the dev as owned by a VFIO bus
> +driver. Once vfio_register_group_dev() returns it is possible for userspace to
> +start accessing the driver, thus the driver should ensure it is completely
> +ready before calling it. The driver provides an ops structure for callbacks
>  similar to a file operations structure::
> 
>  	struct vfio_device_ops {
> @@ -276,7 +281,7 @@ similar to a file operations structure::
>  	};
> 
>  Each function is passed the device_data that was originally registered
> -in the vfio_add_group_dev() call above.  This allows the bus driver
> +in the vfio_register_group_dev() call above.  This allows the bus driver
>  an easy place to store its opaque, private data.  The open/release
>  callbacks are issued when a new file descriptor is created for a
>  device (via VFIO_GROUP_GET_DEVICE_FD).  The ioctl interface provides
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 32660e8a69ae20..cfa06ae3b9018b 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -89,16 +89,6 @@ struct vfio_group {
>  	struct blocking_notifier_head	notifier;
>  };
> 
> -struct vfio_device {
> -	refcount_t			refcount;
> -	struct completion		comp;
> -	struct device			*dev;
> -	const struct vfio_device_ops	*ops;
> -	struct vfio_group		*group;
> -	struct list_head		group_next;
> -	void				*device_data;
> -};
> -
>  #ifdef CONFIG_VFIO_NOIOMMU
>  static bool noiommu __read_mostly;
>  module_param_named(enable_unsafe_noiommu_mode,
> @@ -532,35 +522,6 @@ static struct vfio_group *vfio_group_get_from_dev(struct device *dev)
>  /**
>   * Device objects - create, release, get, put, search
>   */
> -static
> -struct vfio_device *vfio_group_create_device(struct vfio_group *group,
> -					     struct device *dev,
> -					     const struct vfio_device_ops *ops,
> -					     void *device_data)
> -{
> -	struct vfio_device *device;
> -
> -	device = kzalloc(sizeof(*device), GFP_KERNEL);
> -	if (!device)
> -		return ERR_PTR(-ENOMEM);
> -
> -	refcount_set(&device->refcount, 1);
> -	init_completion(&device->comp);
> -	device->dev = dev;
> -	/* Our reference on group is moved to the device */
> -	device->group = group;
> -	device->ops = ops;
> -	device->device_data = device_data;
> -	dev_set_drvdata(dev, device);
> -
> -	mutex_lock(&group->device_lock);
> -	list_add(&device->group_next, &group->device_list);
> -	group->dev_counter++;
> -	mutex_unlock(&group->device_lock);
> -
> -	return device;
> -}
> -
>  /* Device reference always implies a group reference */
>  void vfio_device_put(struct vfio_device *device)
>  {
> @@ -779,14 +740,23 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
>  /**
>   * VFIO driver API
>   */
> -int vfio_add_group_dev(struct device *dev,
> -		       const struct vfio_device_ops *ops, void *device_data)
> +void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
> +			 const struct vfio_device_ops *ops, void *device_data)
> +{
> +	init_completion(&device->comp);
> +	device->dev = dev;
> +	device->ops = ops;
> +	device->device_data = device_data;
> +}
> +EXPORT_SYMBOL_GPL(vfio_init_group_dev);
> +
> +int vfio_register_group_dev(struct vfio_device *device)
>  {
> +	struct vfio_device *existing_device;
>  	struct iommu_group *iommu_group;
>  	struct vfio_group *group;
> -	struct vfio_device *device;
> 
> -	iommu_group = iommu_group_get(dev);
> +	iommu_group = iommu_group_get(device->dev);
>  	if (!iommu_group)
>  		return -EINVAL;
> 
> @@ -805,21 +775,50 @@ int vfio_add_group_dev(struct device *dev,
>  		iommu_group_put(iommu_group);
>  	}
> 
> -	device = vfio_group_get_device(group, dev);
> -	if (device) {
> -		dev_WARN(dev, "Device already exists on group %d\n",
> +	existing_device = vfio_group_get_device(group, device->dev);
> +	if (existing_device) {
> +		dev_WARN(device->dev, "Device already exists on group %d\n",
>  			 iommu_group_id(iommu_group));
> -		vfio_device_put(device);
> +		vfio_device_put(existing_device);
>  		vfio_group_put(group);
>  		return -EBUSY;
>  	}
> 
> -	device = vfio_group_create_device(group, dev, ops, device_data);
> -	if (IS_ERR(device)) {
> -		vfio_group_put(group);
> -		return PTR_ERR(device);
> -	}
> +	/* Our reference on group is moved to the device */
> +	device->group = group;
> +
> +	/* Refcounting can't start until the driver calls register */
> +	refcount_set(&device->refcount, 1);
> +
> +	mutex_lock(&group->device_lock);
> +	list_add(&device->group_next, &group->device_list);
> +	group->dev_counter++;
> +	mutex_unlock(&group->device_lock);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(vfio_register_group_dev);
> +
> +int vfio_add_group_dev(struct device *dev, const struct vfio_device_ops
> *ops,
> +		       void *device_data)
> +{
> +	struct vfio_device *device;
> +	int ret;
> +
> +	device = kzalloc(sizeof(*device), GFP_KERNEL);
> +	if (!device)
> +		return -ENOMEM;
> +
> +	vfio_init_group_dev(device, dev, ops, device_data);
> +	ret = vfio_register_group_dev(device);
> +	if (ret)
> +		goto err_kfree;
> +	dev_set_drvdata(dev, device);
>  	return 0;
> +
> +err_kfree:
> +	kfree(device);
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(vfio_add_group_dev);
> 
> @@ -887,11 +886,9 @@ EXPORT_SYMBOL_GPL(vfio_device_data);
>  /*
>   * Decrement the device reference count and wait for the device to be
>   * removed.  Open file descriptors for the device... */
> -void *vfio_del_group_dev(struct device *dev)
> +void vfio_unregister_group_dev(struct vfio_device *device)
>  {
> -	struct vfio_device *device = dev_get_drvdata(dev);
>  	struct vfio_group *group = device->group;
> -	void *device_data = device->device_data;
>  	struct vfio_unbound_dev *unbound;
>  	unsigned int i = 0;
>  	bool interrupted = false;
> @@ -908,7 +905,7 @@ void *vfio_del_group_dev(struct device *dev)
>  	 */
>  	unbound = kzalloc(sizeof(*unbound), GFP_KERNEL);
>  	if (unbound) {
> -		unbound->dev = dev;
> +		unbound->dev = device->dev;
>  		mutex_lock(&group->unbound_lock);
>  		list_add(&unbound->unbound_next, &group->unbound_list);
>  		mutex_unlock(&group->unbound_lock);
> @@ -919,7 +916,7 @@ void *vfio_del_group_dev(struct device *dev)
>  	rc = try_wait_for_completion(&device->comp);
>  	while (rc <= 0) {
>  		if (device->ops->request)
> -			device->ops->request(device_data, i++);
> +			device->ops->request(device->device_data, i++);
> 
>  		if (interrupted) {
>  			rc = wait_for_completion_timeout(&device->comp,
> @@ -929,7 +926,7 @@ void *vfio_del_group_dev(struct device *dev)
>  				&device->comp, HZ * 10);
>  			if (rc < 0) {
>  				interrupted = true;
> -				dev_warn(dev,
> +				dev_warn(device->dev,
>  					 "Device is currently in use, task"
>  					 " \"%s\" (%d) "
>  					 "blocked until device is released",
> @@ -962,9 +959,17 @@ void *vfio_del_group_dev(struct device *dev)
> 
>  	/* Matches the get in vfio_group_create_device() */
>  	vfio_group_put(group);
> +}
> +EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
> +
> +void *vfio_del_group_dev(struct device *dev)
> +{
> +	struct vfio_device *device = dev_get_drvdata(dev);
> +	void *device_data = device->device_data;
> +
> +	vfio_unregister_group_dev(device);
>  	dev_set_drvdata(dev, NULL);

Move this to vfio_unregister_group_dev()? In the cover letter you mentioned
that drvdata is managed by the driver but removed from the core. It looks
like that is also the rule obeyed by the following patches.

Thanks
Kevin

>  	kfree(device);
> -
>  	return device_data;
>  }
>  EXPORT_SYMBOL_GPL(vfio_del_group_dev);
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index b7e18bde5aa8b3..ad8b579d67d34a 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -15,6 +15,18 @@
>  #include <linux/poll.h>
>  #include <uapi/linux/vfio.h>
> 
> +struct vfio_device {
> +	struct device *dev;
> +	const struct vfio_device_ops *ops;
> +	struct vfio_group *group;
> +
> +	/* Members below here are private, not for driver use */
> +	refcount_t refcount;
> +	struct completion comp;
> +	struct list_head group_next;
> +	void *device_data;
> +};
> +
>  /**
>   * struct vfio_device_ops - VFIO bus driver device callbacks
>   *
> @@ -48,11 +60,15 @@ struct vfio_device_ops {
>  extern struct iommu_group *vfio_iommu_group_get(struct device *dev);
>  extern void vfio_iommu_group_put(struct iommu_group *group, struct
> device *dev);
> 
> +void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
> +			 const struct vfio_device_ops *ops, void
> *device_data);
> +int vfio_register_group_dev(struct vfio_device *device);
>  extern int vfio_add_group_dev(struct device *dev,
>  			      const struct vfio_device_ops *ops,
>  			      void *device_data);
> 
>  extern void *vfio_del_group_dev(struct device *dev);
> +void vfio_unregister_group_dev(struct vfio_device *device);
>  extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
>  extern void vfio_device_put(struct vfio_device *device);
>  extern void *vfio_device_data(struct vfio_device *device);
> --
> 2.30.2



* RE: [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions
  2021-03-13  0:55 ` [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions Jason Gunthorpe
  2021-03-15  8:45   ` Christoph Hellwig
@ 2021-03-16  7:57   ` Tian, Kevin
  2021-03-16 13:02   ` Max Gurtovoy
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  7:57 UTC (permalink / raw)
  To: Jason Gunthorpe, Cornelia Huck, kvm
  Cc: Alex Williamson, Raj, Ashok, Williams, Dan J, Daniel Vetter,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> vfio_pci_probe() is quite complicated, with optional VF and VGA sub
> components. Move these into clear init/uninit functions and have a linear
> flow in probe/remove.
> 
> This fixes a few little buglets:
>  - vfio_pci_remove() is in the wrong order, vga_client_register() removes
>    a notifier and is after kfree(vdev), but the notifier refers to vdev,
>    so it can use after free in a race.
>  - vga_client_register() can fail but was ignored
> 
> Organize things so destruction order is the reverse of creation order.
> 
> Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
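
For readers following along, the "destruction order is the reverse of
creation order" idea in the commit message can be sketched in plain
userspace C. This is only an illustration under stated assumptions: the
demo_* names, the struct layout and the fail_vga knob are hypothetical and
are not part of vfio-pci.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for driver private state; not struct vfio_pci_device. */
struct demo_dev {
	int *vf_token;      /* optional sub-component A */
	int vga_registered; /* optional sub-component B */
	int has_vf;
	int has_vga;
	int fail_vga;       /* illustration only: force the VGA step to fail */
};

static int demo_vf_init(struct demo_dev *d)
{
	if (!d->has_vf)
		return 0; /* nothing to set up for this device */
	d->vf_token = calloc(1, sizeof(*d->vf_token));
	return d->vf_token ? 0 : -1;
}

static void demo_vf_uninit(struct demo_dev *d)
{
	if (!d->vf_token)
		return;
	free(d->vf_token);
	d->vf_token = NULL;
}

static int demo_vga_init(struct demo_dev *d)
{
	if (!d->has_vga)
		return 0;
	if (d->fail_vga)
		return -1;
	d->vga_registered = 1;
	return 0;
}

static void demo_vga_uninit(struct demo_dev *d)
{
	if (!d->has_vga)
		return;
	d->vga_registered = 0;
}

/* Linear flow: each failure unwinds only the steps that already succeeded. */
static int demo_probe(struct demo_dev *d)
{
	int ret;

	ret = demo_vf_init(d);
	if (ret)
		return ret;
	ret = demo_vga_init(d);
	if (ret)
		goto out_vf;
	return 0;

out_vf:
	demo_vf_uninit(d);
	return ret;
}

/* Teardown mirrors probe in reverse order. */
static void demo_remove(struct demo_dev *d)
{
	demo_vga_uninit(d);
	demo_vf_uninit(d);
	assert(!d->vf_token && !d->vga_registered);
}
```

Because every init helper has a matching uninit helper that tolerates the
"not present" case, the probe error path and the remove path can share the
same teardown code.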

> ---
>  drivers/vfio/pci/vfio_pci.c | 116 +++++++++++++++++++++++-------------
>  1 file changed, 74 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 65e7e6b44578c2..f95b58376156a0 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1922,6 +1922,68 @@ static int vfio_pci_bus_notifier(struct
> notifier_block *nb,
>  	return 0;
>  }
> 
> +static int vfio_pci_vf_init(struct vfio_pci_device *vdev)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	int ret;
> +
> +	if (!pdev->is_physfn)
> +		return 0;
> +
> +	vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
> +	if (!vdev->vf_token)
> +		return -ENOMEM;
> +
> +	mutex_init(&vdev->vf_token->lock);
> +	uuid_gen(&vdev->vf_token->uuid);
> +
> +	vdev->nb.notifier_call = vfio_pci_bus_notifier;
> +	ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
> +	if (ret) {
> +		kfree(vdev->vf_token);
> +		return ret;
> +	}
> +	return 0;
> +}
> +
> +static void vfio_pci_vf_uninit(struct vfio_pci_device *vdev)
> +{
> +	if (!vdev->vf_token)
> +		return;
> +
> +	bus_unregister_notifier(&pci_bus_type, &vdev->nb);
> +	WARN_ON(vdev->vf_token->users);
> +	mutex_destroy(&vdev->vf_token->lock);
> +	kfree(vdev->vf_token);
> +}
> +
> +static int vfio_pci_vga_init(struct vfio_pci_device *vdev)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	int ret;
> +
> +	if (!vfio_pci_is_vga(pdev))
> +		return 0;
> +
> +	ret = vga_client_register(pdev, vdev, NULL, vfio_pci_set_vga_decode);
> +	if (ret)
> +		return ret;
> +	vga_set_legacy_decoding(pdev, vfio_pci_set_vga_decode(vdev,
> false));
> +	return 0;
> +}
> +
> +static void vfio_pci_vga_uninit(struct vfio_pci_device *vdev)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +
> +	if (!vfio_pci_is_vga(pdev))
> +		return;
> +	vga_client_register(pdev, NULL, NULL, NULL);
> +	vga_set_legacy_decoding(pdev, VGA_RSRC_NORMAL_IO |
> VGA_RSRC_NORMAL_MEM |
> +					      VGA_RSRC_LEGACY_IO |
> +					      VGA_RSRC_LEGACY_MEM);
> +}
> +
>  static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct vfio_pci_device *vdev;
> @@ -1975,28 +2037,12 @@ static int vfio_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *id)
>  	ret = vfio_pci_reflck_attach(vdev);
>  	if (ret)
>  		goto out_del_group_dev;
> -
> -	if (pdev->is_physfn) {
> -		vdev->vf_token = kzalloc(sizeof(*vdev->vf_token),
> GFP_KERNEL);
> -		if (!vdev->vf_token) {
> -			ret = -ENOMEM;
> -			goto out_reflck;
> -		}
> -
> -		mutex_init(&vdev->vf_token->lock);
> -		uuid_gen(&vdev->vf_token->uuid);
> -
> -		vdev->nb.notifier_call = vfio_pci_bus_notifier;
> -		ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
> -		if (ret)
> -			goto out_vf_token;
> -	}
> -
> -	if (vfio_pci_is_vga(pdev)) {
> -		vga_client_register(pdev, vdev, NULL,
> vfio_pci_set_vga_decode);
> -		vga_set_legacy_decoding(pdev,
> -					vfio_pci_set_vga_decode(vdev,
> false));
> -	}
> +	ret = vfio_pci_vf_init(vdev);
> +	if (ret)
> +		goto out_reflck;
> +	ret = vfio_pci_vga_init(vdev);
> +	if (ret)
> +		goto out_vf;
> 
>  	vfio_pci_probe_power_state(vdev);
> 
> @@ -2016,8 +2062,8 @@ static int vfio_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *id)
> 
>  	return ret;
> 
> -out_vf_token:
> -	kfree(vdev->vf_token);
> +out_vf:
> +	vfio_pci_vf_uninit(vdev);
>  out_reflck:
>  	vfio_pci_reflck_put(vdev->reflck);
>  out_del_group_dev:
> @@ -2039,33 +2085,19 @@ static void vfio_pci_remove(struct pci_dev
> *pdev)
>  	if (!vdev)
>  		return;
> 
> -	if (vdev->vf_token) {
> -		WARN_ON(vdev->vf_token->users);
> -		mutex_destroy(&vdev->vf_token->lock);
> -		kfree(vdev->vf_token);
> -	}
> -
> -	if (vdev->nb.notifier_call)
> -		bus_unregister_notifier(&pci_bus_type, &vdev->nb);
> -
> +	vfio_pci_vf_uninit(vdev);
>  	vfio_pci_reflck_put(vdev->reflck);
> +	vfio_pci_vga_uninit(vdev);
> 
>  	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);
> -	kfree(vdev->region);
> -	mutex_destroy(&vdev->ioeventfds_lock);
> 
>  	if (!disable_idle_d3)
>  		vfio_pci_set_power_state(vdev, PCI_D0);
> 
> +	mutex_destroy(&vdev->ioeventfds_lock);
> +	kfree(vdev->region);
>  	kfree(vdev->pm_save);
>  	kfree(vdev);
> -
> -	if (vfio_pci_is_vga(pdev)) {
> -		vga_client_register(pdev, NULL, NULL, NULL);
> -		vga_set_legacy_decoding(pdev,
> -				VGA_RSRC_NORMAL_IO |
> VGA_RSRC_NORMAL_MEM |
> -				VGA_RSRC_LEGACY_IO |
> VGA_RSRC_LEGACY_MEM);
> -	}
>  }
> 
>  static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
> --
> 2.30.2



* RE: [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe()
  2021-03-13  0:56 ` [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe() Jason Gunthorpe
  2021-03-15  8:46   ` Christoph Hellwig
@ 2021-03-16  8:04   ` Tian, Kevin
  2021-03-16 13:20     ` Jason Gunthorpe
  2021-03-16 11:28   ` Max Gurtovoy
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  8:04 UTC (permalink / raw)
  To: Jason Gunthorpe, kvm
  Cc: Alex Williamson, Raj, Ashok, Christian Ehrhardt, Cornelia Huck,
	Williams, Dan J, Daniel Vetter, Eric Auger, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> vfio_add_group_dev() must be called only after all of the private data in
> vdev is fully setup and ready, otherwise there could be races with user
> space instantiating a device file descriptor and starting to call ops.
> 
> For instance vfio_pci_reflck_attach() sets vdev->reflck and
> vfio_pci_open(), called by fops open, unconditionally derefs it, which
> will crash if things get out of order.
> 
> Fixes: cc20d7999000 ("vfio/pci: Introduce VF token")
> Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release")
> Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state")
> Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/pci/vfio_pci.c | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index f95b58376156a0..0e7682e7a0b478 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -2030,13 +2030,9 @@ static int vfio_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *id)
>  	INIT_LIST_HEAD(&vdev->vma_list);
>  	init_rwsem(&vdev->memory_lock);
> 
> -	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> -	if (ret)
> -		goto out_free;
> -
>  	ret = vfio_pci_reflck_attach(vdev);
>  	if (ret)
> -		goto out_del_group_dev;
> +		goto out_free;
>  	ret = vfio_pci_vf_init(vdev);
>  	if (ret)
>  		goto out_reflck;
> @@ -2060,15 +2056,20 @@ static int vfio_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *id)
>  		vfio_pci_set_power_state(vdev, PCI_D3hot);
>  	}
> 
> -	return ret;
> +	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> +	if (ret)
> +		goto out_power;
> +	return 0;
> 
> +out_power:
> +	if (!disable_idle_d3)
> +		vfio_pci_set_power_state(vdev, PCI_D0);

Just curious whether the power state must be recovered upon failure here.
From the comment several lines above, the power state is set to an unknown
state before doing the D3 transition. From this point of view it looks fine
to leave the device in D3, since there is no expected state to be recovered?

>  out_vf:
>  	vfio_pci_vf_uninit(vdev);
>  out_reflck:
>  	vfio_pci_reflck_put(vdev->reflck);
> -out_del_group_dev:
> -	vfio_del_group_dev(&pdev->dev);
>  out_free:
> +	kfree(vdev->pm_save);
>  	kfree(vdev);
>  out_group_put:
>  	vfio_iommu_group_put(group, &pdev->dev);
> --
> 2.30.2



* RE: [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:56 ` [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
@ 2021-03-16  8:06   ` Tian, Kevin
  2021-03-17 10:33   ` Cornelia Huck
  2021-03-18 13:43   ` Auger Eric
  2 siblings, 0 replies; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  8:06 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> pci already allocates a struct vfio_pci_device with exactly the same
> lifetime as vfio_device, switch to the new API and embed vfio_device in
> vfio_pci_device.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
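
As a side note for readers, the embedding described in the commit message
can be sketched in userspace C. This is a minimal sketch under stated
assumptions: demo_base plays the role of struct vfio_device, demo_pci plays
the role of struct vfio_pci_device, and demo_container_of is a simplified
stand-in for the kernel's container_of() without its type checking. All
demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(); no type checking. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Plays the role of the core structure. */
struct demo_base {
	int refcount;
};

/* Plays the role of the driver structure that embeds the core one. */
struct demo_pci {
	struct demo_base vdev;
	int irq_type;
};

/* Core pointer to driver pointer: pure pointer arithmetic, no lookup. */
static struct demo_pci *to_demo_pci(struct demo_base *base)
{
	return demo_container_of(base, struct demo_pci, vdev);
}

/* A core-style helper that only knows about the base type. */
static void demo_base_put(struct demo_base *base)
{
	assert(base->refcount > 0);
	base->refcount--;
}

/* A driver-style callback that receives the base type and recovers its own. */
static int demo_irq_type(struct demo_base *base)
{
	return to_demo_pci(base)->irq_type;
}
```

The reverse direction needs no helper at all: given a struct demo_pci *p,
the base pointer is simply &p->vdev, which is what lets a caller holding the
driver structure hand the core structure back to functions such as
demo_base_put().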

> ---
>  drivers/vfio/pci/vfio_pci.c         | 10 +++++-----
>  drivers/vfio/pci/vfio_pci_private.h |  1 +
>  2 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 0e7682e7a0b478..a0ac20a499cf6c 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -2019,6 +2019,7 @@ static int vfio_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *id)
>  		goto out_group_put;
>  	}
> 
> +	vfio_init_group_dev(&vdev->vdev, &pdev->dev, &vfio_pci_ops, vdev);
>  	vdev->pdev = pdev;
>  	vdev->irq_type = VFIO_PCI_NUM_IRQS;
>  	mutex_init(&vdev->igate);
> @@ -2056,9 +2057,10 @@ static int vfio_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *id)
>  		vfio_pci_set_power_state(vdev, PCI_D3hot);
>  	}
> 
> -	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> +	ret = vfio_register_group_dev(&vdev->vdev);
>  	if (ret)
>  		goto out_power;
> +	dev_set_drvdata(&pdev->dev, vdev);
>  	return 0;
> 
>  out_power:
> @@ -2078,13 +2080,11 @@ static int vfio_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *id)
> 
>  static void vfio_pci_remove(struct pci_dev *pdev)
>  {
> -	struct vfio_pci_device *vdev;
> +	struct vfio_pci_device *vdev = dev_get_drvdata(&pdev->dev);
> 
>  	pci_disable_sriov(pdev);
> 
> -	vdev = vfio_del_group_dev(&pdev->dev);
> -	if (!vdev)
> -		return;
> +	vfio_unregister_group_dev(&vdev->vdev);
> 
>  	vfio_pci_vf_uninit(vdev);
>  	vfio_pci_reflck_put(vdev->reflck);
> diff --git a/drivers/vfio/pci/vfio_pci_private.h
> b/drivers/vfio/pci/vfio_pci_private.h
> index 9cd1882a05af69..8755a0febd054a 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -100,6 +100,7 @@ struct vfio_pci_mmap_vma {
>  };
> 
>  struct vfio_pci_device {
> +	struct vfio_device	vdev;
>  	struct pci_dev		*pdev;
>  	void __iomem		*barmap[PCI_STD_NUM_BARS];
>  	bool			bar_mmap_supported[PCI_STD_NUM_BARS];
> --
> 2.30.2



* RE: [PATCH v2 10/14] vfio/mdev: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:56 ` [PATCH v2 10/14] vfio/mdev: " Jason Gunthorpe
@ 2021-03-16  8:09   ` Tian, Kevin
  2021-03-16 22:51     ` Alex Williamson
  2021-03-16 23:19     ` Jason Gunthorpe
  2021-03-17 10:36   ` Cornelia Huck
  1 sibling, 2 replies; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  8:09 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede
  Cc: Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> mdev gets little benefit because it doesn't actually do anything, however
> it is the last user, so move the code here for now.

Please also indicate that vfio_add/del_group_dev is removed in this patch.

> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/mdev/vfio_mdev.c | 24 +++++++++++++++++++--
>  drivers/vfio/vfio.c           | 39 ++---------------------------------
>  include/linux/vfio.h          |  5 -----
>  3 files changed, 24 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> index b52eea128549ee..4469aaf31b56cb 100644
> --- a/drivers/vfio/mdev/vfio_mdev.c
> +++ b/drivers/vfio/mdev/vfio_mdev.c
> @@ -21,6 +21,10 @@
>  #define DRIVER_AUTHOR   "NVIDIA Corporation"
>  #define DRIVER_DESC     "VFIO based driver for Mediated device"
> 
> +struct mdev_vfio_device {
> +	struct vfio_device vdev;
> +};

Following the convention of the other vfio_XXX_device structures, what about
calling it vfio_mdev_device? Otherwise,

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> +
>  static int vfio_mdev_open(void *device_data)
>  {
>  	struct mdev_device *mdev = device_data;
> @@ -124,13 +128,29 @@ static const struct vfio_device_ops
> vfio_mdev_dev_ops = {
>  static int vfio_mdev_probe(struct device *dev)
>  {
>  	struct mdev_device *mdev = to_mdev_device(dev);
> +	struct mdev_vfio_device *mvdev;
> +	int ret;
> 
> -	return vfio_add_group_dev(dev, &vfio_mdev_dev_ops, mdev);
> +	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
> +	if (!mvdev)
> +		return -ENOMEM;
> +
> +	vfio_init_group_dev(&mvdev->vdev, &mdev->dev,
> &vfio_mdev_dev_ops, mdev);
> +	ret = vfio_register_group_dev(&mvdev->vdev);
> +	if (ret) {
> +		kfree(mvdev);
> +		return ret;
> +	}
> +	dev_set_drvdata(&mdev->dev, mvdev);
> +	return 0;
>  }
> 
>  static void vfio_mdev_remove(struct device *dev)
>  {
> -	vfio_del_group_dev(dev);
> +	struct mdev_vfio_device *mvdev = dev_get_drvdata(dev);
> +
> +	vfio_unregister_group_dev(&mvdev->vdev);
> +	kfree(mvdev);
>  }
> 
>  static struct mdev_driver vfio_mdev_driver = {
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index cfa06ae3b9018b..2d6d7cc1d1ebf9 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -99,8 +99,8 @@
> MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE,
> no-IOMMU mode.  Thi
>  /*
>   * vfio_iommu_group_{get,put} are only intended for VFIO bus driver probe
>   * and remove functions, any use cases other than acquiring the first
> - * reference for the purpose of calling vfio_add_group_dev() or removing
> - * that symmetric reference after vfio_del_group_dev() should use the raw
> + * reference for the purpose of calling vfio_register_group_dev() or
> removing
> + * that symmetric reference after vfio_unregister_group_dev() should use
> the raw
>   * iommu_group_{get,put} functions.  In particular, vfio_iommu_group_put()
>   * removes the device from the dummy group and cannot be nested.
>   */
> @@ -799,29 +799,6 @@ int vfio_register_group_dev(struct vfio_device
> *device)
>  }
>  EXPORT_SYMBOL_GPL(vfio_register_group_dev);
> 
> -int vfio_add_group_dev(struct device *dev, const struct vfio_device_ops
> *ops,
> -		       void *device_data)
> -{
> -	struct vfio_device *device;
> -	int ret;
> -
> -	device = kzalloc(sizeof(*device), GFP_KERNEL);
> -	if (!device)
> -		return -ENOMEM;
> -
> -	vfio_init_group_dev(device, dev, ops, device_data);
> -	ret = vfio_register_group_dev(device);
> -	if (ret)
> -		goto err_kfree;
> -	dev_set_drvdata(dev, device);
> -	return 0;
> -
> -err_kfree:
> -	kfree(device);
> -	return ret;
> -}
> -EXPORT_SYMBOL_GPL(vfio_add_group_dev);
> -
>  /**
>   * Get a reference to the vfio_device for a device.  Even if the
>   * caller thinks they own the device, they could be racing with a
> @@ -962,18 +939,6 @@ void vfio_unregister_group_dev(struct vfio_device
> *device)
>  }
>  EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
> 
> -void *vfio_del_group_dev(struct device *dev)
> -{
> -	struct vfio_device *device = dev_get_drvdata(dev);
> -	void *device_data = device->device_data;
> -
> -	vfio_unregister_group_dev(device);
> -	dev_set_drvdata(dev, NULL);
> -	kfree(device);
> -	return device_data;
> -}
> -EXPORT_SYMBOL_GPL(vfio_del_group_dev);
> -
>  /**
>   * VFIO base fd, /dev/vfio/vfio
>   */
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index ad8b579d67d34a..4995faf51efeae 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -63,11 +63,6 @@ extern void vfio_iommu_group_put(struct
> iommu_group *group, struct device *dev);
>  void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
>  			 const struct vfio_device_ops *ops, void
> *device_data);
>  int vfio_register_group_dev(struct vfio_device *device);
> -extern int vfio_add_group_dev(struct device *dev,
> -			      const struct vfio_device_ops *ops,
> -			      void *device_data);
> -
> -extern void *vfio_del_group_dev(struct device *dev);
>  void vfio_unregister_group_dev(struct vfio_device *device);
>  extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
>  extern void vfio_device_put(struct vfio_device *device);
> --
> 2.30.2



* RE: [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline
  2021-03-13  0:56 ` [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline Jason Gunthorpe
@ 2021-03-16  8:10   ` Tian, Kevin
  2021-03-16 22:55   ` Alex Williamson
  2021-03-17 10:36   ` Cornelia Huck
  2 siblings, 0 replies; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  8:10 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede
  Cc: Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> The macro wrongly uses 'dev' as both the macro argument and the member
> name, which means it fails compilation if any caller uses a word other
> than 'dev' as the single argument. Fix this defect by making it into
> proper static inline, which is more clear and typesafe anyhow.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
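
To illustrate the defect described in the commit message, here is a small
userspace sketch. It is an assumption-laden illustration, not the mdev code:
demo_device, demo_mdev and demo_container_of are hypothetical stand-ins, and
the broken macro is shown only inside a comment so the example still
compiles.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(); no type checking. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_device {
	int id;
};

/* The embedded member is deliberately not first, so the offset is non-zero. */
struct demo_mdev {
	int active;
	struct demo_device dev;
};

/*
 * Broken form, shown for comparison only:
 *
 *   #define to_demo_mdev(dev) demo_container_of(dev, struct demo_mdev, dev)
 *
 * The preprocessor replaces every 'dev' token in the body with the macro
 * argument, including the member name. to_demo_mdev(d) therefore asks for
 * a member called 'd', which does not exist, and compilation fails.
 */

/* Fixed form: the parameter name can no longer leak into the member name. */
static inline struct demo_mdev *to_demo_mdev(struct demo_device *d)
{
	return demo_container_of(d, struct demo_mdev, dev);
}
```

With the static inline, callers may name their variable anything, and the
compiler also checks that the argument really is a struct demo_device *.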

> ---
>  drivers/vfio/mdev/mdev_private.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/mdev/mdev_private.h
> b/drivers/vfio/mdev/mdev_private.h
> index 7d922950caaf3c..74c2e541146999 100644
> --- a/drivers/vfio/mdev/mdev_private.h
> +++ b/drivers/vfio/mdev/mdev_private.h
> @@ -35,7 +35,10 @@ struct mdev_device {
>  	bool active;
>  };
> 
> -#define to_mdev_device(dev)	container_of(dev, struct mdev_device, dev)
> +static inline struct mdev_device *to_mdev_device(struct device *dev)
> +{
> +	return container_of(dev, struct mdev_device, dev);
> +}
>  #define dev_is_mdev(d)		((d)->bus == &mdev_bus_type)
> 
>  struct mdev_type {
> --
> 2.30.2



* RE: [PATCH v2 13/14] vfio/pci: Replace uses of vfio_device_data() with container_of
  2021-03-13  0:56 ` [PATCH v2 13/14] vfio/pci: Replace uses of vfio_device_data() with container_of Jason Gunthorpe
@ 2021-03-16  8:20   ` Tian, Kevin
  2021-03-17 12:06   ` Cornelia Huck
  1 sibling, 0 replies; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  8:20 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> This tidies a few confused places that think they can have a refcount on
> the vfio_device but the device_data could be NULL, that isn't possible by
> design.
> 
> Most of the change falls out when struct vfio_devices is updated to just
> store the struct vfio_pci_device itself. This wasn't possible before
> because there was no easy way to get from the 'struct vfio_pci_device' to
> the 'struct vfio_device' to put back the refcount.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  drivers/vfio/pci/vfio_pci.c | 67 +++++++++++++------------------------
>  1 file changed, 24 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 5f1a782d1c65ae..1f70387c8afe37 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -517,30 +517,29 @@ static void vfio_pci_disable(struct vfio_pci_device
> *vdev)
> 
>  static struct pci_driver vfio_pci_driver;
> 
> -static struct vfio_pci_device *get_pf_vdev(struct vfio_pci_device *vdev,
> -					   struct vfio_device **pf_dev)
> +static struct vfio_pci_device *get_pf_vdev(struct vfio_pci_device *vdev)
>  {
>  	struct pci_dev *physfn = pci_physfn(vdev->pdev);
> +	struct vfio_device *pf_dev;
> 
>  	if (!vdev->pdev->is_virtfn)
>  		return NULL;
> 
> -	*pf_dev = vfio_device_get_from_dev(&physfn->dev);
> -	if (!*pf_dev)
> +	pf_dev = vfio_device_get_from_dev(&physfn->dev);
> +	if (!pf_dev)
>  		return NULL;
> 
>  	if (pci_dev_driver(physfn) != &vfio_pci_driver) {
> -		vfio_device_put(*pf_dev);
> +		vfio_device_put(pf_dev);
>  		return NULL;
>  	}
> 
> -	return vfio_device_data(*pf_dev);
> +	return container_of(pf_dev, struct vfio_pci_device, vdev);
>  }
> 
>  static void vfio_pci_vf_token_user_add(struct vfio_pci_device *vdev, int val)
>  {
> -	struct vfio_device *pf_dev;
> -	struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev, &pf_dev);
> +	struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev);
> 
>  	if (!pf_vdev)
>  		return;
> @@ -550,7 +549,7 @@ static void vfio_pci_vf_token_user_add(struct
> vfio_pci_device *vdev, int val)
>  	WARN_ON(pf_vdev->vf_token->users < 0);
>  	mutex_unlock(&pf_vdev->vf_token->lock);
> 
> -	vfio_device_put(pf_dev);
> +	vfio_device_put(&pf_vdev->vdev);
>  }
> 
>  static void vfio_pci_release(struct vfio_device *core_vdev)
> @@ -794,7 +793,7 @@ int vfio_pci_register_dev_region(struct
> vfio_pci_device *vdev,
>  }
> 
>  struct vfio_devices {
> -	struct vfio_device **devices;
> +	struct vfio_pci_device **devices;
>  	int cur_index;
>  	int max_index;
>  };
> @@ -1283,9 +1282,7 @@ static long vfio_pci_ioctl(struct vfio_device
> *core_vdev,
>  			goto hot_reset_release;
> 
>  		for (; mem_idx < devs.cur_index; mem_idx++) {
> -			struct vfio_pci_device *tmp;
> -
> -			tmp = vfio_device_data(devs.devices[mem_idx]);
> +			struct vfio_pci_device *tmp = devs.devices[mem_idx];
> 
>  			ret = down_write_trylock(&tmp->memory_lock);
>  			if (!ret) {
> @@ -1300,17 +1297,13 @@ static long vfio_pci_ioctl(struct vfio_device
> *core_vdev,
> 
>  hot_reset_release:
>  		for (i = 0; i < devs.cur_index; i++) {
> -			struct vfio_device *device;
> -			struct vfio_pci_device *tmp;
> -
> -			device = devs.devices[i];
> -			tmp = vfio_device_data(device);
> +			struct vfio_pci_device *tmp = devs.devices[i];
> 
>  			if (i < mem_idx)
>  				up_write(&tmp->memory_lock);
>  			else
>  				mutex_unlock(&tmp->vma_lock);
> -			vfio_device_put(device);
> +			vfio_device_put(&tmp->vdev);
>  		}
>  		kfree(devs.devices);
> 
> @@ -1777,8 +1770,7 @@ static int vfio_pci_validate_vf_token(struct
> vfio_pci_device *vdev,
>  		return 0; /* No VF token provided or required */
> 
>  	if (vdev->pdev->is_virtfn) {
> -		struct vfio_device *pf_dev;
> -		struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev,
> &pf_dev);
> +		struct vfio_pci_device *pf_vdev = get_pf_vdev(vdev);
>  		bool match;
> 
>  		if (!pf_vdev) {
> @@ -1791,7 +1783,7 @@ static int vfio_pci_validate_vf_token(struct
> vfio_pci_device *vdev,
>  		}
> 
>  		if (!vf_token) {
> -			vfio_device_put(pf_dev);
> +			vfio_device_put(&pf_vdev->vdev);
>  			pci_info_ratelimited(vdev->pdev,
>  				"VF token required to access device\n");
>  			return -EACCES;
> @@ -1801,7 +1793,7 @@ static int vfio_pci_validate_vf_token(struct
> vfio_pci_device *vdev,
>  		match = uuid_equal(uuid, &pf_vdev->vf_token->uuid);
>  		mutex_unlock(&pf_vdev->vf_token->lock);
> 
> -		vfio_device_put(pf_dev);
> +		vfio_device_put(&pf_vdev->vdev);
> 
>  		if (!match) {
>  			pci_info_ratelimited(vdev->pdev,
> @@ -2122,11 +2114,7 @@ static pci_ers_result_t
> vfio_pci_aer_err_detected(struct pci_dev *pdev,
>  	if (device == NULL)
>  		return PCI_ERS_RESULT_DISCONNECT;
> 
> -	vdev = vfio_device_data(device);
> -	if (vdev == NULL) {
> -		vfio_device_put(device);
> -		return PCI_ERS_RESULT_DISCONNECT;
> -	}
> +	vdev = container_of(device, struct vfio_pci_device, vdev);
> 
>  	mutex_lock(&vdev->igate);
> 
> @@ -2142,7 +2130,6 @@ static pci_ers_result_t
> vfio_pci_aer_err_detected(struct pci_dev *pdev,
> 
>  static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
>  {
> -	struct vfio_pci_device *vdev;
>  	struct vfio_device *device;
>  	int ret = 0;
> 
> @@ -2155,12 +2142,6 @@ static int vfio_pci_sriov_configure(struct pci_dev
> *pdev, int nr_virtfn)
>  	if (!device)
>  		return -ENODEV;
> 
> -	vdev = vfio_device_data(device);
> -	if (!vdev) {
> -		vfio_device_put(device);
> -		return -ENODEV;
> -	}
> -
>  	if (nr_virtfn == 0)
>  		pci_disable_sriov(pdev);
>  	else
> @@ -2220,7 +2201,7 @@ static int vfio_pci_reflck_find(struct pci_dev *pdev,
> void *data)
>  		return 0;
>  	}
> 
> -	vdev = vfio_device_data(device);
> +	vdev = container_of(device, struct vfio_pci_device, vdev);
> 
>  	if (vdev->reflck) {
>  		vfio_pci_reflck_get(vdev->reflck);
> @@ -2282,7 +2263,7 @@ static int vfio_pci_get_unused_devs(struct pci_dev
> *pdev, void *data)
>  		return -EBUSY;
>  	}
> 
> -	vdev = vfio_device_data(device);
> +	vdev = container_of(device, struct vfio_pci_device, vdev);
> 
>  	/* Fault if the device is not unused */
>  	if (vdev->refcnt) {
> @@ -2290,7 +2271,7 @@ static int vfio_pci_get_unused_devs(struct pci_dev *pdev, void *data)
>  		return -EBUSY;
>  	}
> 
> -	devs->devices[devs->cur_index++] = device;
> +	devs->devices[devs->cur_index++] = vdev;
>  	return 0;
>  }
> 
> @@ -2312,7 +2293,7 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data)
>  		return -EBUSY;
>  	}
> 
> -	vdev = vfio_device_data(device);
> +	vdev = container_of(device, struct vfio_pci_device, vdev);
> 
>  	/*
>  	 * Locking multiple devices is prone to deadlock, runaway and
> @@ -2323,7 +2304,7 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data)
>  		return -EBUSY;
>  	}
> 
> -	devs->devices[devs->cur_index++] = device;
> +	devs->devices[devs->cur_index++] = vdev;
>  	return 0;
>  }
> 
> @@ -2371,7 +2352,7 @@ static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev)
> 
>  	/* Does at least one need a reset? */
>  	for (i = 0; i < devs.cur_index; i++) {
> -		tmp = vfio_device_data(devs.devices[i]);
> +		tmp = devs.devices[i];
>  		if (tmp->needs_reset) {
>  			ret = pci_reset_bus(vdev->pdev);
>  			break;
> @@ -2380,7 +2361,7 @@ static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev)
> 
>  put_devs:
>  	for (i = 0; i < devs.cur_index; i++) {
> -		tmp = vfio_device_data(devs.devices[i]);
> +		tmp = devs.devices[i];
> 
>  		/*
>  		 * If reset was successful, affected devices no longer need
> @@ -2396,7 +2377,7 @@ static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev)
>  				vfio_pci_set_power_state(tmp, PCI_D3hot);
>  		}
> 
> -		vfio_device_put(devs.devices[i]);
> +		vfio_device_put(&tmp->vdev);
>  	}
> 
>  	kfree(devs.devices);
> --
> 2.30.2



* RE: [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API
  2021-03-13  0:56 ` [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API Jason Gunthorpe
@ 2021-03-16  8:22   ` Tian, Kevin
  2021-03-17 12:08   ` Cornelia Huck
  2021-03-17 23:24   ` Max Gurtovoy
  2 siblings, 0 replies; 82+ messages in thread
From: Tian, Kevin @ 2021-03-16  8:22 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, Jonathan Corbet,
	Diana Craciun, Eric Auger, kvm, Kirti Wankhede, linux-doc
  Cc: Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, March 13, 2021 8:56 AM
> 
> There are no longer any users, so it can go away. Everything is using
> container_of now.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  Documentation/driver-api/vfio.rst            |  3 +--
>  drivers/vfio/fsl-mc/vfio_fsl_mc.c            |  5 +++--
>  drivers/vfio/mdev/vfio_mdev.c                |  2 +-
>  drivers/vfio/pci/vfio_pci.c                  |  2 +-
>  drivers/vfio/platform/vfio_platform_common.c |  2 +-
>  drivers/vfio/vfio.c                          | 12 +-----------
>  include/linux/vfio.h                         |  4 +---
>  7 files changed, 9 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
> index 3337f337293a32..decc68cb8114ac 100644
> --- a/Documentation/driver-api/vfio.rst
> +++ b/Documentation/driver-api/vfio.rst
> @@ -254,8 +254,7 @@ vfio_unregister_group_dev() respectively::
> 
>  	void vfio_init_group_dev(struct vfio_device *device,
>  				struct device *dev,
> -				const struct vfio_device_ops *ops,
> -				void *device_data);
> +				const struct vfio_device_ops *ops);
>  	int vfio_register_group_dev(struct vfio_device *device);
>  	void vfio_unregister_group_dev(struct vfio_device *device);
> 
> diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> index 023b2222806424..3af3ca59478f94 100644
> --- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> @@ -75,7 +75,8 @@ static int vfio_fsl_mc_reflck_attach(struct vfio_fsl_mc_device *vdev)
>  			goto unlock;
>  		}
> 
> -		cont_vdev = vfio_device_data(device);
> +		cont_vdev =
> +			container_of(device, struct vfio_fsl_mc_device, vdev);
>  		if (!cont_vdev || !cont_vdev->reflck) {
>  			vfio_device_put(device);
>  			ret = -ENODEV;
> @@ -624,7 +625,7 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
>  		goto out_group_put;
>  	}
> 
> -	vfio_init_group_dev(&vdev->vdev, dev, &vfio_fsl_mc_ops, vdev);
> +	vfio_init_group_dev(&vdev->vdev, dev, &vfio_fsl_mc_ops);
>  	vdev->mc_dev = mc_dev;
>  	mutex_init(&vdev->igate);
> 
> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> index e7309caa99c71b..71bd28f976e5af 100644
> --- a/drivers/vfio/mdev/vfio_mdev.c
> +++ b/drivers/vfio/mdev/vfio_mdev.c
> @@ -138,7 +138,7 @@ static int vfio_mdev_probe(struct device *dev)
>  	if (!mvdev)
>  		return -ENOMEM;
> 
> -	vfio_init_group_dev(&mvdev->vdev, &mdev->dev, &vfio_mdev_dev_ops, mdev);
> +	vfio_init_group_dev(&mvdev->vdev, &mdev->dev, &vfio_mdev_dev_ops);
>  	ret = vfio_register_group_dev(&mvdev->vdev);
>  	if (ret) {
>  		kfree(mvdev);
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 1f70387c8afe37..55ef27a15d4d3f 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -2022,7 +2022,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		goto out_group_put;
>  	}
> 
> -	vfio_init_group_dev(&vdev->vdev, &pdev->dev, &vfio_pci_ops, vdev);
> +	vfio_init_group_dev(&vdev->vdev, &pdev->dev, &vfio_pci_ops);
>  	vdev->pdev = pdev;
>  	vdev->irq_type = VFIO_PCI_NUM_IRQS;
>  	mutex_init(&vdev->igate);
> diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
> index f5f6b537084a67..361e5b57e36932 100644
> --- a/drivers/vfio/platform/vfio_platform_common.c
> +++ b/drivers/vfio/platform/vfio_platform_common.c
> @@ -666,7 +666,7 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
>  	struct iommu_group *group;
>  	int ret;
> 
> -	vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops, vdev);
> +	vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops);
> 
>  	ret = vfio_platform_acpi_probe(vdev, dev);
>  	if (ret)
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 01de47d1810b6b..39ea77557ba0c4 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -741,12 +741,11 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
>   * VFIO driver API
>   */
>  void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
> -			 const struct vfio_device_ops *ops, void *device_data)
> +			 const struct vfio_device_ops *ops)
>  {
>  	init_completion(&device->comp);
>  	device->dev = dev;
>  	device->ops = ops;
> -	device->device_data = device_data;
>  }
>  EXPORT_SYMBOL_GPL(vfio_init_group_dev);
> 
> @@ -851,15 +850,6 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
>  	return device;
>  }
> 
> -/*
> - * Caller must hold a reference to the vfio_device
> - */
> -void *vfio_device_data(struct vfio_device *device)
> -{
> -	return device->device_data;
> -}
> -EXPORT_SYMBOL_GPL(vfio_device_data);
> -
>  /*
>   * Decrement the device reference count and wait for the device to be
>   * removed.  Open file descriptors for the device... */
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index 784c34c0a28763..a2c5b30e1763ba 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -24,7 +24,6 @@ struct vfio_device {
>  	refcount_t refcount;
>  	struct completion comp;
>  	struct list_head group_next;
> -	void *device_data;
>  };
> 
>  /**
> @@ -61,12 +60,11 @@ extern struct iommu_group *vfio_iommu_group_get(struct device *dev);
>  extern void vfio_iommu_group_put(struct iommu_group *group, struct device *dev);
> 
>  void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
> -			 const struct vfio_device_ops *ops, void *device_data);
> +			 const struct vfio_device_ops *ops);
>  int vfio_register_group_dev(struct vfio_device *device);
>  void vfio_unregister_group_dev(struct vfio_device *device);
>  extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
>  extern void vfio_device_put(struct vfio_device *device);
> -extern void *vfio_device_data(struct vfio_device *device);
> 
>  /* events for the backend driver notify callback */
>  enum vfio_iommu_notify_type {
> --
> 2.30.2
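
For readers following along, here is a minimal userspace sketch of the
container_of conversion that replaces vfio_device_data() in this patch. The
structures are trimmed-down, hypothetical stand-ins for the real ones, the
helper names to_vfio_pci() and to_core() are invented for illustration, and
the macro is a simplified version without the kernel's type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down stand-ins for the real structures. */
struct vfio_device {
	int refcount;
};

struct vfio_pci_device {
	int irq_type;
	struct vfio_device vdev;	/* embedded, not pointed to */
};

/* Core -> driver: recover the structure that contains the vfio_device. */
static struct vfio_pci_device *to_vfio_pci(struct vfio_device *device)
{
	return container_of(device, struct vfio_pci_device, vdev);
}

/* Driver -> core: take the address of the embedded member. */
static struct vfio_device *to_core(struct vfio_pci_device *vdev)
{
	return &vdev->vdev;
}
```

Both directions are pure pointer arithmetic, so a valid vfio_device pointer
never yields NULL. That appears to be why the vfio_pci.c hunks can drop the
NULL checks that used to follow vfio_device_data().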



* Re: [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
  2021-03-13  0:55 ` [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe() Jason Gunthorpe
  2021-03-15  8:44   ` Christoph Hellwig
@ 2021-03-16  9:16   ` Diana Craciun OSS
  2021-03-16 16:28   ` Cornelia Huck
  2021-03-17 16:36   ` Diana Craciun OSS
  3 siblings, 0 replies; 82+ messages in thread
From: Diana Craciun OSS @ 2021-03-16  9:16 UTC (permalink / raw)
  To: Jason Gunthorpe, Cornelia Huck, kvm
  Cc: Alex Williamson, Raj, Ashok, Bharat Bhushan, Dan Williams,
	Daniel Vetter, Eric Auger, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

Hi,

I will test the fsl-mc related changes in the next couple of days.

Thanks,
Diana

On 3/13/2021 2:55 AM, Jason Gunthorpe wrote:
> vfio_add_group_dev() must be called only after all of the private data in
> vdev is fully setup and ready, otherwise there could be races with user
> space instantiating a device file descriptor and starting to call ops.
> 
> For instance vfio_fsl_mc_reflck_attach() sets vdev->reflck and
> vfio_fsl_mc_open(), called by fops open, unconditionally derefs it, which
> will crash if things get out of order.
> 
> This driver started life with the right sequence, but three commits added
> stuff after vfio_add_group_dev().
> 
> Fixes: 2e0d29561f59 ("vfio/fsl-mc: Add irq infrastructure for fsl-mc devices")
> Fixes: f2ba7e8c947b ("vfio/fsl-mc: Added lock support in preparation for interrupt handling")
> Fixes: 704f5082d845 ("vfio/fsl-mc: Scan DPRC objects on vfio-fsl-mc driver bind")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/vfio/fsl-mc/vfio_fsl_mc.c | 43 ++++++++++++++++---------------
>   1 file changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> index f27e25112c4037..881849723b4dfb 100644
> --- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> @@ -582,11 +582,21 @@ static int vfio_fsl_mc_init_device(struct vfio_fsl_mc_device *vdev)
>   	dprc_cleanup(mc_dev);
>   out_nc_unreg:
>   	bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
> -	vdev->nb.notifier_call = NULL;
> -
>   	return ret;
>   }
>   
> +static void vfio_fsl_uninit_device(struct vfio_fsl_mc_device *vdev)
> +{
> +	struct fsl_mc_device *mc_dev = vdev->mc_dev;
> +
> +	if (!is_fsl_mc_bus_dprc(mc_dev))
> +		return;
> +
> +	dprc_remove_devices(mc_dev, NULL, 0);
> +	dprc_cleanup(mc_dev);
> +	bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
> +}
> +
>   static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
>   {
>   	struct iommu_group *group;
> @@ -607,29 +617,27 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
>   	}
>   
>   	vdev->mc_dev = mc_dev;
> -
> -	ret = vfio_add_group_dev(dev, &vfio_fsl_mc_ops, vdev);
> -	if (ret) {
> -		dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
> -		goto out_group_put;
> -	}
> +	mutex_init(&vdev->igate);
>   
>   	ret = vfio_fsl_mc_reflck_attach(vdev);
>   	if (ret)
> -		goto out_group_dev;
> +		goto out_group_put;
>   
>   	ret = vfio_fsl_mc_init_device(vdev);
>   	if (ret)
>   		goto out_reflck;
>   
> -	mutex_init(&vdev->igate);
> -
> +	ret = vfio_add_group_dev(dev, &vfio_fsl_mc_ops, vdev);
> +	if (ret) {
> +		dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
> +		goto out_device;
> +	}
>   	return 0;
>   
> +out_device:
> +	vfio_fsl_uninit_device(vdev);
>   out_reflck:
>   	vfio_fsl_mc_reflck_put(vdev->reflck);
> -out_group_dev:
> -	vfio_del_group_dev(dev);
>   out_group_put:
>   	vfio_iommu_group_put(group, dev);
>   	return ret;
> @@ -646,16 +654,9 @@ static int vfio_fsl_mc_remove(struct fsl_mc_device *mc_dev)
>   
>   	mutex_destroy(&vdev->igate);
>   
> +	vfio_fsl_uninit_device(vdev);
>   	vfio_fsl_mc_reflck_put(vdev->reflck);
>   
> -	if (is_fsl_mc_bus_dprc(mc_dev)) {
> -		dprc_remove_devices(mc_dev, NULL, 0);
> -		dprc_cleanup(mc_dev);
> -	}
> -
> -	if (vdev->nb.notifier_call)
> -		bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
> -
>   	vfio_iommu_group_put(mc_dev->dev.iommu_group, dev);
>   
>   	return 0;
> 



* Re: [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group
  2021-03-13  0:55 ` [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group Jason Gunthorpe
  2021-03-16  7:33   ` Tian, Kevin
@ 2021-03-16 11:15   ` Max Gurtovoy
  2021-03-16 11:59   ` Cornelia Huck
  2021-03-18  9:32   ` Auger Eric
  3 siblings, 0 replies; 82+ messages in thread
From: Max Gurtovoy @ 2021-03-16 11:15 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Tarun Gupta


On 3/13/2021 2:55 AM, Jason Gunthorpe wrote:
> The vfio_device->group value has a get obtained during
> vfio_add_group_dev() which gets moved from the stack to vfio_device->group
> in vfio_group_create_device().
>
> The reference remains until we reach the end of vfio_del_group_dev() when
> it is put back.
>
> Thus anything that already has a kref on the vfio_device is guaranteed a
> valid group pointer. Remove all the extra reference traffic.
>
> It is tricky to see, but the get at the start of vfio_del_group_dev() is
> actually pairing with the put hidden inside vfio_device_put() a few lines
> below.
>
> A later patch merges vfio_group_create_device() into vfio_add_group_dev()
> which makes the ownership and error flow on the create side easier to
> follow.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/vfio/vfio.c | 21 ++-------------------
>   1 file changed, 2 insertions(+), 19 deletions(-)

Looks good,

Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>




* Re: [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe()
  2021-03-13  0:56 ` [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe() Jason Gunthorpe
  2021-03-15  8:46   ` Christoph Hellwig
  2021-03-16  8:04   ` Tian, Kevin
@ 2021-03-16 11:28   ` Max Gurtovoy
  2021-03-17 10:32   ` Cornelia Huck
  2021-03-18 16:50   ` Auger Eric
  4 siblings, 0 replies; 82+ messages in thread
From: Max Gurtovoy @ 2021-03-16 11:28 UTC (permalink / raw)
  To: Jason Gunthorpe, kvm
  Cc: Alex Williamson, Raj, Ashok, Christian Ehrhardt, Cornelia Huck,
	Dan Williams, Daniel Vetter, Eric Auger, Christoph Hellwig,
	Kevin Tian, Leon Romanovsky, Tarun Gupta


On 3/13/2021 2:56 AM, Jason Gunthorpe wrote:
> vfio_add_group_dev() must be called only after all of the private data in
> vdev is fully setup and ready, otherwise there could be races with user
> space instantiating a device file descriptor and starting to call ops.
>
> For instance vfio_pci_reflck_attach() sets vdev->reflck and
> vfio_pci_open(), called by fops open, unconditionally derefs it, which
> will crash if things get out of order.
>
> Fixes: cc20d7999000 ("vfio/pci: Introduce VF token")
> Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release")
> Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state")
> Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/vfio/pci/vfio_pci.c | 17 +++++++++--------
>   1 file changed, 9 insertions(+), 8 deletions(-)

Looks good,

Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>




* Re: [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group
  2021-03-13  0:55 ` [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group Jason Gunthorpe
  2021-03-16  7:33   ` Tian, Kevin
  2021-03-16 11:15   ` Max Gurtovoy
@ 2021-03-16 11:59   ` Cornelia Huck
  2021-03-18  9:32   ` Auger Eric
  3 siblings, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-16 11:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, kvm, Raj, Ashok, Dan Williams, Daniel Vetter,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Fri, 12 Mar 2021 20:55:53 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> The vfio_device->group value has a get obtained during
> vfio_add_group_dev() which gets moved from the stack to vfio_device->group
> in vfio_group_create_device().
> 
> The reference remains until we reach the end of vfio_del_group_dev() when
> it is put back.
> 
> Thus anything that already has a kref on the vfio_device is guaranteed a
> valid group pointer. Remove all the extra reference traffic.
> 
> It is tricky to see, but the get at the start of vfio_del_group_dev() is
> actually pairing with the put hidden inside vfio_device_put() a few lines
> below.
> 
> A later patch merges vfio_group_create_device() into vfio_add_group_dev()
> which makes the ownership and error flow on the create side easier to
> follow.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/vfio.c | 21 ++-------------------
>  1 file changed, 2 insertions(+), 19 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device
  2021-03-16  7:38   ` Tian, Kevin
@ 2021-03-16 12:10     ` Cornelia Huck
  2021-03-16 20:24     ` Alex Williamson
  1 sibling, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-16 12:10 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Jason Gunthorpe, Alex Williamson, kvm, Raj, Ashok, Williams,
	Dan J, Daniel Vetter, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta

On Tue, 16 Mar 2021 07:38:09 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, March 13, 2021 8:56 AM
> > 
> > The vfio_device is using a 'sleep until all refs go to zero' pattern for
> > its lifetime, but it is indirectly coded by repeatedly scanning the group
> > list waiting for the device to be removed on its own.
> > 
> > Switch this around to be a direct representation, use a refcount to count
> > the number of places that are blocking destruction and sleep directly on a
> > completion until that counter goes to zero. kfree the device after other
> > accesses have been excluded in vfio_del_group_dev(). This is a fairly
> > common Linux idiom.
> > 
> > Due to this we can now remove kref_put_mutex(), which is very rarely used
> > in the kernel. Here it is being used to prevent a zero ref device from
> > being seen in the group list. Instead allow the zero ref device to
> > continue to exist in the device_list and use refcount_inc_not_zero() to
> > exclude it once refs go to zero.
> > 
> > This patch is organized so the next patch will be able to alter the API to
> > allow drivers to provide the kfree.
> > 
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >  drivers/vfio/vfio.c | 79 ++++++++++++++-------------------------------
> >  1 file changed, 25 insertions(+), 54 deletions(-)

> > @@ -935,32 +916,18 @@ void *vfio_del_group_dev(struct device *dev)
> >  	WARN_ON(!unbound);
> > 
> >  	vfio_device_put(device);
> > -
> > -	/*
> > -	 * If the device is still present in the group after the above
> > -	 * 'put', then it is in use and we need to request it from the
> > -	 * bus driver.  The driver may in turn need to request the
> > -	 * device from the user.  We send the request on an arbitrary
> > -	 * interval with counter to allow the driver to take escalating
> > -	 * measures to release the device if it has the ability to do so.
> > -	 */  
> 
> The above comment still makes sense even with this patch. What about
> keeping it? Otherwise:
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

I agree, this still looks useful.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
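
As an aside, the lifetime scheme described in the commit message can be
modelled in userspace roughly as follows. This is only a sketch under stated
assumptions: the names (model_device, model_try_get, model_put) are
hypothetical, C11 atomics stand in for refcount_t, and a boolean flag stands
in for the completion that the unregister path would sleep on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model, not the kernel implementation. */
struct model_device {
	atomic_uint refcount;
	atomic_bool released;	/* stands in for complete(&device->comp) */
};

static void model_init(struct model_device *d)
{
	atomic_init(&d->refcount, 1);	/* registration holds the first ref */
	atomic_init(&d->released, false);
}

/* Like refcount_inc_not_zero(): never revive a device that reached zero. */
static bool model_try_get(struct model_device *d)
{
	unsigned int old = atomic_load(&d->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&d->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; the final put signals whoever waits in unregister. */
static void model_put(struct model_device *d)
{
	if (atomic_fetch_sub(&d->refcount, 1) == 1)
		atomic_store(&d->released, true);
}
```

The point of the try-get is that a zero-ref device may still sit in the
device list, but lookups can no longer take a new reference to it.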



* Re: [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops
  2021-03-13  0:55 ` [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops Jason Gunthorpe
  2021-03-16  7:55   ` Tian, Kevin
@ 2021-03-16 12:25   ` Cornelia Huck
  2021-03-16 21:13     ` Alex Williamson
  2021-03-16 12:54   ` Max Gurtovoy
  2021-03-18 13:18   ` Auger Eric
  3 siblings, 1 reply; 82+ messages in thread
From: Cornelia Huck @ 2021-03-16 12:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Jonathan Corbet, kvm, linux-doc, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, Liu Yi L

On Fri, 12 Mar 2021 20:55:55 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This makes the struct vfio_pci_device part of the public interface so it
> can be used with container_of and so forth, as is typical for a Linux
> subsystem.
> 
> This is the first step to bring some type-safety to the vfio interface by
> allowing the replacement of 'void *' and 'struct device *' inputs with a
> simple and clear 'struct vfio_pci_device *'
> 
> For now the self-allocating vfio_add_group_dev() interface is kept so each
> user can be updated as a separate patch.
> 
> The expected usage pattern is
> 
>   driver core probe() function:
>      my_device = kzalloc(sizeof(*my_device), GFP_KERNEL);
>      vfio_init_group_dev(&my_device->vdev, dev, ops, my_device);
>      /* other driver specific prep */
>      vfio_register_group_dev(&my_device->vdev);
>      dev_set_drvdata(dev, my_device);
> 
>   driver core remove() function:
>      my_device = dev_get_drvdata(dev);
>      vfio_unregister_group_dev(&my_device->vdev);
>      /* other driver specific tear down */
>      kfree(my_device);
> 
> Allowing the driver to be able to use the drvdata and vifo_device to go

s/vifo_device/vfio_device/

> to/from its own data.
> 
> The pattern also makes it clear that vfio_register_group_dev() must be
> last in the sequence, as once it is called the core code can immediately
> start calling ops. The init/register gap is provided to allow for the
> driver to do setup before ops can be called and thus avoid races.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  Documentation/driver-api/vfio.rst |  31 ++++----
>  drivers/vfio/vfio.c               | 123 ++++++++++++++++--------------
>  include/linux/vfio.h              |  16 ++++
>  3 files changed, 98 insertions(+), 72 deletions(-)
> 
> diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
> index f1a4d3c3ba0bb1..d3a02300913a7f 100644
> --- a/Documentation/driver-api/vfio.rst
> +++ b/Documentation/driver-api/vfio.rst
> @@ -249,18 +249,23 @@ VFIO bus driver API
>  
>  VFIO bus drivers, such as vfio-pci make use of only a few interfaces
>  into VFIO core.  When devices are bound and unbound to the driver,
> -the driver should call vfio_add_group_dev() and vfio_del_group_dev()
> -respectively::
> -
> -	extern int vfio_add_group_dev(struct device *dev,
> -				      const struct vfio_device_ops *ops,
> -				      void *device_data);
> -
> -	extern void *vfio_del_group_dev(struct device *dev);
> -
> -vfio_add_group_dev() indicates to the core to begin tracking the
> -iommu_group of the specified dev and register the dev as owned by
> -a VFIO bus driver.  The driver provides an ops structure for callbacks
> +the driver should call vfio_register_group_dev() and
> +vfio_unregister_group_dev() respectively::
> +
> +	void vfio_init_group_dev(struct vfio_device *device,
> +				struct device *dev,
> +				const struct vfio_device_ops *ops,
> +				void *device_data);
> +	int vfio_register_group_dev(struct vfio_device *device);
> +	void vfio_unregister_group_dev(struct vfio_device *device);
> +
> +The driver should embed the vfio_device in its own structure and call
> +vfio_init_group_dev() to pre-configure it before going to registration.

s/it/that structure/ (I guess?)

> +vfio_register_group_dev() indicates to the core to begin tracking the
> +iommu_group of the specified dev and register the dev as owned by a VFIO bus
> +driver. Once vfio_register_group_dev() returns it is possible for userspace to
> +start accessing the driver, thus the driver should ensure it is completely
> +ready before calling it. The driver provides an ops structure for callbacks
>  similar to a file operations structure::
>  
>  	struct vfio_device_ops {

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
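
To illustrate why vfio_register_group_dev() has to come last, here is a small
userspace mock of the init/register split. Everything prefixed mock_ or my_
is hypothetical and exists only for this sketch; the mock register function
calls open() immediately to model the worst case of userspace racing with
probe.

```c
#include <assert.h>
#include <stddef.h>

struct mock_vfio_device;

struct mock_vfio_ops {
	int (*open)(struct mock_vfio_device *device);
};

struct mock_vfio_device {
	const struct mock_vfio_ops *ops;
};

struct my_device {
	const int *reflck;		/* private state that open() relies on */
	struct mock_vfio_device vdev;	/* embedded core object */
};

static void mock_init_group_dev(struct mock_vfio_device *device,
				const struct mock_vfio_ops *ops)
{
	device->ops = ops;
}

/* Model the worst case: an open arrives the instant registration happens. */
static int mock_register_group_dev(struct mock_vfio_device *device)
{
	return device->ops->open(device);
}

static int my_open(struct mock_vfio_device *device)
{
	struct my_device *my = (struct my_device *)
		((char *)device - offsetof(struct my_device, vdev));

	/* A real driver would dereference reflck and crash; report it instead. */
	return my->reflck ? 0 : -1;
}

static const struct mock_vfio_ops my_ops = { .open = my_open };

static int my_probe(struct my_device *my, const int *reflck)
{
	mock_init_group_dev(&my->vdev, &my_ops);
	my->reflck = reflck;				/* driver specific prep */
	return mock_register_group_dev(&my->vdev);	/* always last */
}
```

Registering before the private state is set makes the mock open() fail,
which is the userspace analogue of the crash described for the fsl-mc and
pci probe functions elsewhere in this series.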



* Re: [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops
  2021-03-13  0:55 ` [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops Jason Gunthorpe
  2021-03-16  7:55   ` Tian, Kevin
  2021-03-16 12:25   ` Cornelia Huck
@ 2021-03-16 12:54   ` Max Gurtovoy
  2021-03-18 13:18   ` Auger Eric
  3 siblings, 0 replies; 82+ messages in thread
From: Max Gurtovoy @ 2021-03-16 12:54 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, Jonathan Corbet,
	kvm, linux-doc
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Tarun Gupta, Liu Yi L


On 3/13/2021 2:55 AM, Jason Gunthorpe wrote:
> This makes the struct vfio_pci_device part of the public interface so it
> can be used with container_of and so forth, as is typical for a Linux
> subsystem.
>
> This is the first step to bring some type-safety to the vfio interface by
> allowing the replacement of 'void *' and 'struct device *' inputs with a
> simple and clear 'struct vfio_pci_device *'
>
> For now the self-allocating vfio_add_group_dev() interface is kept so each
> user can be updated as a separate patch.
>
> The expected usage pattern is
>
>    driver core probe() function:
>       my_device = kzalloc(sizeof(*my_device), GFP_KERNEL);
>       vfio_init_group_dev(&my_device->vdev, dev, ops, my_device);
>       /* other driver specific prep */
>       vfio_register_group_dev(&my_device->vdev);
>       dev_set_drvdata(dev, my_device);
>
>    driver core remove() function:
>       my_device = dev_get_drvdata(dev);
>       vfio_unregister_group_dev(&my_device->vdev);
>       /* other driver specific tear down */
>       kfree(my_device);
>
> Allowing the driver to be able to use the drvdata and vfio_device to go
> to/from its own data.
>
> The pattern also makes it clear that vfio_register_group_dev() must be
> last in the sequence, as once it is called the core code can immediately
> start calling ops. The init/register gap is provided to allow for the
> driver to do setup before ops can be called and thus avoid races.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   Documentation/driver-api/vfio.rst |  31 ++++----
>   drivers/vfio/vfio.c               | 123 ++++++++++++++++--------------
>   include/linux/vfio.h              |  16 ++++
>   3 files changed, 98 insertions(+), 72 deletions(-)

With the comments from Cornelia and Kevin addressed, looks good.

Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>




* Re: [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions
  2021-03-13  0:55 ` [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions Jason Gunthorpe
  2021-03-15  8:45   ` Christoph Hellwig
  2021-03-16  7:57   ` Tian, Kevin
@ 2021-03-16 13:02   ` Max Gurtovoy
  2021-03-16 23:04     ` Jason Gunthorpe
  2021-03-16 16:51   ` Cornelia Huck
  2021-03-18 16:34   ` Auger Eric
  4 siblings, 1 reply; 82+ messages in thread
From: Max Gurtovoy @ 2021-03-16 13:02 UTC (permalink / raw)
  To: Jason Gunthorpe, Cornelia Huck, kvm
  Cc: Alex Williamson, Raj, Ashok, Dan Williams, Daniel Vetter,
	Christoph Hellwig, Leon Romanovsky, Tarun Gupta


On 3/13/2021 2:55 AM, Jason Gunthorpe wrote:
> vfio_pci_probe() is quite complicated, with optional VF and VGA sub
> components. Move these into clear init/uninit functions and have a linear
> flow in probe/remove.
>
> This fixes a few little buglets:
>   - vfio_pci_remove() is in the wrong order, vga_client_register() removes
>     a notifier and is after kfree(vdev), but the notifier refers to vdev,
>     so it can use after free in a race.
>   - vga_client_register() can fail but was ignored
>
> Organize things so destruction order is the reverse of creation order.
>
> Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/vfio/pci/vfio_pci.c | 116 +++++++++++++++++++++++-------------
>   1 file changed, 74 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 65e7e6b44578c2..f95b58376156a0 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1922,6 +1922,68 @@ static int vfio_pci_bus_notifier(struct notifier_block *nb,
>   	return 0;
>   }
>   
> +static int vfio_pci_vf_init(struct vfio_pci_device *vdev)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	int ret;
> +
> +	if (!pdev->is_physfn)
> +		return 0;
> +
> +	vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
> +	if (!vdev->vf_token)
> +		return -ENOMEM;
> +
> +	mutex_init(&vdev->vf_token->lock);
> +	uuid_gen(&vdev->vf_token->uuid);
> +
> +	vdev->nb.notifier_call = vfio_pci_bus_notifier;
> +	ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
> +	if (ret) {
> +		kfree(vdev->vf_token);

You could add "mutex_destroy(&vdev->vf_token->lock);" here, as the uninit
function does.

I know it's not in the original code; it would only be for symmetry.

Otherwise looks good,

Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>

> +		return ret;
> +	}
> +	return 0;
> +}
> +
> +static void vfio_pci_vf_uninit(struct vfio_pci_device *vdev)
> +{
> +	if (!vdev->vf_token)
> +		return;
> +
> +	bus_unregister_notifier(&pci_bus_type, &vdev->nb);
> +	WARN_ON(vdev->vf_token->users);
> +	mutex_destroy(&vdev->vf_token->lock);
> +	kfree(vdev->vf_token);
> +}
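
The "destruction order is the reverse of creation order" rule from the commit
message can be sketched in userspace like this. The three steps and all names
(setup, model_probe, model_remove, the trace helpers) are hypothetical; each
step only records a letter so the resulting order can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char trace[32];		/* records setup (lower) and teardown (upper) */
static size_t trace_len;

static void trace_reset(void)
{
	trace_len = 0;
	trace[0] = '\0';
}

static void trace_add(char tag)
{
	if (trace_len + 1 < sizeof(trace)) {
		trace[trace_len++] = tag;
		trace[trace_len] = '\0';
	}
}

/* One setup step; 'fail' simulates an error before anything is acquired. */
static int setup(char tag, int fail)
{
	if (fail)
		return -1;
	trace_add(tag);
	return 0;
}

/* fail_at selects which step (1..3) fails; 0 means all steps succeed. */
static int model_probe(int fail_at)
{
	int ret;

	ret = setup('a', fail_at == 1);
	if (ret)
		return ret;
	ret = setup('b', fail_at == 2);
	if (ret)
		goto out_a;
	ret = setup('c', fail_at == 3);
	if (ret)
		goto out_b;
	return 0;

out_b:
	trace_add('B');		/* undo step b */
out_a:
	trace_add('A');		/* undo step a */
	return ret;
}

/* Remove mirrors a fully successful probe, last step first. */
static void model_remove(void)
{
	trace_add('C');
	trace_add('B');
	trace_add('A');
}
```

With this layout the error labels in probe and the body of remove read as
the same list in the same order, which makes an out-of-order teardown easier
to spot in review.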






* Re: [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe()
  2021-03-16  8:04   ` Tian, Kevin
@ 2021-03-16 13:20     ` Jason Gunthorpe
  2021-03-16 22:27       ` Alex Williamson
  0 siblings, 1 reply; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-16 13:20 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: kvm, Alex Williamson, Raj, Ashok, Christian Ehrhardt,
	Cornelia Huck, Williams, Dan J, Daniel Vetter, Eric Auger,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Tue, Mar 16, 2021 at 08:04:55AM +0000, Tian, Kevin wrote:
> > @@ -2060,15 +2056,20 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  		vfio_pci_set_power_state(vdev, PCI_D3hot);
> >  	}
> > 
> > -	return ret;
> > +	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> > +	if (ret)
> > +		goto out_power;
> > +	return 0;
> > 
> > +out_power:
> > +	if (!disable_idle_d3)
> > +		vfio_pci_set_power_state(vdev, PCI_D0);
> 
> Just curious whether the power state must be restored upon failure here.
> From the comment several lines above, the device is in an unknown power
> state before the D3 transition. Given that, it looks fine to leave the
> device in D3, since there is no expected state to restore?

I don't know, this is what the remove function does, so I can't see a
reason why remove should do it but not here.

Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops
  2021-03-16  7:55   ` Tian, Kevin
@ 2021-03-16 13:34     ` Jason Gunthorpe
  2021-03-17  0:55       ` Tian, Kevin
  0 siblings, 1 reply; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-16 13:34 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Cornelia Huck, Jonathan Corbet, kvm, linux-doc,
	Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu, Yi L

On Tue, Mar 16, 2021 at 07:55:11AM +0000, Tian, Kevin wrote:

> > +void *vfio_del_group_dev(struct device *dev)
> > +{
> > +	struct vfio_device *device = dev_get_drvdata(dev);
> > +	void *device_data = device->device_data;
> > +
> > +	vfio_unregister_group_dev(device);
> >  	dev_set_drvdata(dev, NULL);
> 
> Move to vfio_unregister_group_dev? In the cover letter you mentioned
> that drvdata is managed by the driver but removed from the core. 

"removed from the core" means the core code doesn't touch drvdata at
all.

> Looks it's also the rule obeyed by the following patches.

The dev_set_drvdata(NULL) on remove is mostly cargo-cult nonsense. The
driver core sets it to NULL immediately after the remove function
returns, so adding another set needs a very strong reason.

It is only left here temporarily; the last patch deletes it.

Jason
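
A toy model of the ordering described above, as a hedged userspace
sketch: fake_device, fake_driver and fake_core_unbind() are hypothetical
names, and the real driver core does considerably more than this. It only
shows why a remove callback that clears drvdata itself is redundant if
the core clears it right after the callback returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device and a driver's callbacks. */
struct fake_device {
	void *drvdata;
};

struct fake_driver {
	void (*remove)(struct fake_device *dev);
};

/* Toy model of unbind: call remove(), then the core clears drvdata. */
void fake_core_unbind(struct fake_device *dev, const struct fake_driver *drv)
{
	if (drv->remove)
		drv->remove(dev);
	dev->drvdata = NULL;
}

/* Records what the remove callback saw, for the sake of the example. */
int sketch_seen_state;

void sketch_remove(struct fake_device *dev)
{
	int *state = dev->drvdata;

	/* drvdata is still valid while remove() runs. */
	sketch_seen_state = *state;
	/* No "dev->drvdata = NULL" here; the core does it afterwards. */
}
```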

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:55 ` [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
@ 2021-03-16 16:22   ` Cornelia Huck
  2021-03-16 21:33   ` Alex Williamson
  2021-03-18 13:40   ` Auger Eric
  2 siblings, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-16 16:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Eric Auger, kvm, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, 12 Mar 2021 20:55:56 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> platform already allocates a struct vfio_platform_device with exactly
> the same lifetime as vfio_device, switch to the new API and embed
> vfio_device in vfio_platform_device.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/platform/vfio_amba.c             |  8 ++++---
>  drivers/vfio/platform/vfio_platform.c         | 21 ++++++++---------
>  drivers/vfio/platform/vfio_platform_common.c  | 23 +++++++------------
>  drivers/vfio/platform/vfio_platform_private.h |  5 ++--
>  4 files changed, 26 insertions(+), 31 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
  2021-03-13  0:55 ` [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe() Jason Gunthorpe
  2021-03-15  8:44   ` Christoph Hellwig
  2021-03-16  9:16   ` Diana Craciun OSS
@ 2021-03-16 16:28   ` Cornelia Huck
  2021-03-17 16:36   ` Diana Craciun OSS
  3 siblings, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-16 16:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm, Alex Williamson, Raj, Ashok, Bharat Bhushan, Dan Williams,
	Daniel Vetter, Diana Craciun, Eric Auger, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Fri, 12 Mar 2021 20:55:57 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> vfio_add_group_dev() must be called only after all of the private data in
> vdev is fully setup and ready, otherwise there could be races with user
> space instantiating a device file descriptor and starting to call ops.
> 
> For instance vfio_fsl_mc_reflck_attach() sets vdev->reflck and
> vfio_fsl_mc_open(), called by fops open, unconditionally derefs it, which
> will crash if things get out of order.
> 
> This driver started life with the right sequence, but three commits added
> stuff after vfio_add_group_dev().
> 
> Fixes: 2e0d29561f59 ("vfio/fsl-mc: Add irq infrastructure for fsl-mc devices")
> Fixes: f2ba7e8c947b ("vfio/fsl-mc: Added lock support in preparation for interrupt handling")
> Fixes: 704f5082d845 ("vfio/fsl-mc: Scan DPRC objects on vfio-fsl-mc driver bind")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/fsl-mc/vfio_fsl_mc.c | 43 ++++++++++++++++---------------
>  1 file changed, 22 insertions(+), 21 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 06/14] vfio/fsl-mc: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:55 ` [PATCH v2 06/14] vfio/fsl-mc: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
  2021-03-15  8:44   ` Christoph Hellwig
@ 2021-03-16 16:43   ` Cornelia Huck
  1 sibling, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-16 16:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Diana Craciun, kvm, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, 12 Mar 2021 20:55:58 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> fsl-mc already allocates a struct vfio_fsl_mc_device with exactly the same
> lifetime as vfio_device, switch to the new API and embed vfio_device in
> vfio_fsl_mc_device. While here remove the devm usage for the vdev, this
> code is clean and doesn't need devm.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/fsl-mc/vfio_fsl_mc.c         | 18 ++++++++++--------
>  drivers/vfio/fsl-mc/vfio_fsl_mc_private.h |  1 +
>  2 files changed, 11 insertions(+), 8 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions
  2021-03-13  0:55 ` [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions Jason Gunthorpe
                     ` (2 preceding siblings ...)
  2021-03-16 13:02   ` Max Gurtovoy
@ 2021-03-16 16:51   ` Cornelia Huck
  2021-03-18 16:34   ` Auger Eric
  4 siblings, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-16 16:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm, Alex Williamson, Raj, Ashok, Dan Williams, Daniel Vetter,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Fri, 12 Mar 2021 20:55:59 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> vfio_pci_probe() is quite complicated, with optional VF and VGA sub
> components. Move these into clear init/uninit functions and have a linear
> flow in probe/remove.
> 
> This fixes a few little buglets:
>  - vfio_pci_remove() is in the wrong order, vga_client_register() removes
>    a notifier and is after kfree(vdev), but the notifier refers to vdev,
>    so it can use after free in a race.
>  - vga_client_register() can fail but was ignored
> 
> Organize things so destruction order is the reverse of creation order.
> 
> Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/pci/vfio_pci.c | 116 +++++++++++++++++++++++-------------
>  1 file changed, 74 insertions(+), 42 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device
  2021-03-16  7:38   ` Tian, Kevin
  2021-03-16 12:10     ` Cornelia Huck
@ 2021-03-16 20:24     ` Alex Williamson
  2021-03-16 23:08       ` Jason Gunthorpe
  2021-03-17  8:12       ` Cornelia Huck
  1 sibling, 2 replies; 82+ messages in thread
From: Alex Williamson @ 2021-03-16 20:24 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Jason Gunthorpe, Cornelia Huck, kvm, Raj, Ashok, Williams, Dan J,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, 16 Mar 2021 07:38:09 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, March 13, 2021 8:56 AM
> > 
> > The vfio_device is using a 'sleep until all refs go to zero' pattern for
> > its lifetime, but it is indirectly coded by repeatedly scanning the group
> > list waiting for the device to be removed on its own.
> > 
> > Switch this around to be a direct representation, use a refcount to count
> > the number of places that are blocking destruction and sleep directly on a
> > completion until that counter goes to zero. kfree the device after other
> > accesses have been excluded in vfio_del_group_dev(). This is a fairly
> > common Linux idiom.
> > 
> > Due to this we can now remove kref_put_mutex(), which is very rarely used
> > in the kernel. Here it is being used to prevent a zero ref device from
> > being seen in the group list. Instead allow the zero ref device to
> > continue to exist in the device_list and use refcount_inc_not_zero() to
> > exclude it once refs go to zero.
> > 
> > This patch is organized so the next patch will be able to alter the API to
> > allow drivers to provide the kfree.
> > 
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >  drivers/vfio/vfio.c | 79 ++++++++++++++-------------------------------
> >  1 file changed, 25 insertions(+), 54 deletions(-)
> > 
> > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > index 15d8e678e5563a..32660e8a69ae20 100644
> > --- a/drivers/vfio/vfio.c
> > +++ b/drivers/vfio/vfio.c
> > @@ -46,7 +46,6 @@ static struct vfio {
> >  	struct mutex			group_lock;
> >  	struct cdev			group_cdev;
> >  	dev_t				group_devt;
> > -	wait_queue_head_t		release_q;
> >  } vfio;
> > 
> >  struct vfio_iommu_driver {
> > @@ -91,7 +90,8 @@ struct vfio_group {
> >  };
> > 
> >  struct vfio_device {
> > -	struct kref			kref;
> > +	refcount_t			refcount;
> > +	struct completion		comp;
> >  	struct device			*dev;
> >  	const struct vfio_device_ops	*ops;
> >  	struct vfio_group		*group;
> > @@ -544,7 +544,8 @@ struct vfio_device *vfio_group_create_device(struct
> > vfio_group *group,
> >  	if (!device)
> >  		return ERR_PTR(-ENOMEM);
> > 
> > -	kref_init(&device->kref);
> > +	refcount_set(&device->refcount, 1);
> > +	init_completion(&device->comp);
> >  	device->dev = dev;
> >  	/* Our reference on group is moved to the device */
> >  	device->group = group;
> > @@ -560,35 +561,17 @@ struct vfio_device
> > *vfio_group_create_device(struct vfio_group *group,
> >  	return device;
> >  }
> > 
> > -static void vfio_device_release(struct kref *kref)
> > -{
> > -	struct vfio_device *device = container_of(kref,
> > -						  struct vfio_device, kref);
> > -	struct vfio_group *group = device->group;
> > -
> > -	list_del(&device->group_next);
> > -	group->dev_counter--;
> > -	mutex_unlock(&group->device_lock);
> > -
> > -	dev_set_drvdata(device->dev, NULL);
> > -
> > -	kfree(device);
> > -
> > -	/* vfio_del_group_dev may be waiting for this device */
> > -	wake_up(&vfio.release_q);
> > -}
> > -
> >  /* Device reference always implies a group reference */
> >  void vfio_device_put(struct vfio_device *device)
> >  {
> > -	struct vfio_group *group = device->group;
> > -	kref_put_mutex(&device->kref, vfio_device_release, &group-  
> > >device_lock);  
> > +	if (refcount_dec_and_test(&device->refcount))
> > +		complete(&device->comp);
> >  }
> >  EXPORT_SYMBOL_GPL(vfio_device_put);
> > 
> > -static void vfio_device_get(struct vfio_device *device)
> > +static bool vfio_device_try_get(struct vfio_device *device)
> >  {
> > -	kref_get(&device->kref);
> > +	return refcount_inc_not_zero(&device->refcount);
> >  }
> > 
> >  static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
> > @@ -598,8 +581,7 @@ static struct vfio_device
> > *vfio_group_get_device(struct vfio_group *group,
> > 
> >  	mutex_lock(&group->device_lock);
> >  	list_for_each_entry(device, &group->device_list, group_next) {
> > -		if (device->dev == dev) {
> > -			vfio_device_get(device);
> > +		if (device->dev == dev && vfio_device_try_get(device)) {
> >  			mutex_unlock(&group->device_lock);
> >  			return device;
> >  		}
> > @@ -883,9 +865,8 @@ static struct vfio_device
> > *vfio_device_get_from_name(struct vfio_group *group,
> >  			ret = !strcmp(dev_name(it->dev), buf);
> >  		}
> > 
> > -		if (ret) {
> > +		if (ret && vfio_device_try_get(it)) {
> >  			device = it;
> > -			vfio_device_get(device);
> >  			break;
> >  		}
> >  	}
> > @@ -908,13 +889,13 @@ EXPORT_SYMBOL_GPL(vfio_device_data);
> >   * removed.  Open file descriptors for the device... */
> >  void *vfio_del_group_dev(struct device *dev)
> >  {
> > -	DEFINE_WAIT_FUNC(wait, woken_wake_function);
> >  	struct vfio_device *device = dev_get_drvdata(dev);
> >  	struct vfio_group *group = device->group;
> >  	void *device_data = device->device_data;
> >  	struct vfio_unbound_dev *unbound;
> >  	unsigned int i = 0;
> >  	bool interrupted = false;
> > +	long rc;
> > 
> >  	/*
> >  	 * When the device is removed from the group, the group suddenly
> > @@ -935,32 +916,18 @@ void *vfio_del_group_dev(struct device *dev)
> >  	WARN_ON(!unbound);
> > 
> >  	vfio_device_put(device);
> > -
> > -	/*
> > -	 * If the device is still present in the group after the above
> > -	 * 'put', then it is in use and we need to request it from the
> > -	 * bus driver.  The driver may in turn need to request the
> > -	 * device from the user.  We send the request on an arbitrary
> > -	 * interval with counter to allow the driver to take escalating
> > -	 * measures to release the device if it has the ability to do so.
> > -	 */  
> 
> Above comment still makes sense even with this patch. What about
> keeping it? otherwise:

The comment is not exactly correct after this code change either; the
device will always be present in the group after this 'put'.  Instead,
the completion now indicates the reference count has reached zero.  If
it's worthwhile to keep more context to the request callback, perhaps:

	/*
	 * If there are still outstanding device references, such as
	 * from the device being in use, periodically kick the optional
	 * device request callback while waiting.
	 */

It's also a little obvious that's what we're doing here even without
the comment.  Thanks,

Alex
 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> 
> > -	add_wait_queue(&vfio.release_q, &wait);
> > -
> > -	do {
> > -		device = vfio_group_get_device(group, dev);
> > -		if (!device)
> > -			break;
> > -
> > +	rc = try_wait_for_completion(&device->comp);
> > +	while (rc <= 0) {
> >  		if (device->ops->request)
> >  			device->ops->request(device_data, i++);
> > 
> > -		vfio_device_put(device);
> > -
> >  		if (interrupted) {
> > -			wait_woken(&wait, TASK_UNINTERRUPTIBLE, HZ *
> > 10);
> > +			rc = wait_for_completion_timeout(&device->comp,
> > +							 HZ * 10);
> >  		} else {
> > -			wait_woken(&wait, TASK_INTERRUPTIBLE, HZ * 10);
> > -			if (signal_pending(current)) {
> > +			rc = wait_for_completion_interruptible_timeout(
> > +				&device->comp, HZ * 10);
> > +			if (rc < 0) {
> >  				interrupted = true;
> >  				dev_warn(dev,
> >  					 "Device is currently in use, task"
> > @@ -969,10 +936,13 @@ void *vfio_del_group_dev(struct device *dev)
> >  					 current->comm,
> > task_pid_nr(current));
> >  			}
> >  		}
> > +	}
> > 
> > -	} while (1);
> > +	mutex_lock(&group->device_lock);
> > +	list_del(&device->group_next);
> > +	group->dev_counter--;
> > +	mutex_unlock(&group->device_lock);
> > 
> > -	remove_wait_queue(&vfio.release_q, &wait);
> >  	/*
> >  	 * In order to support multiple devices per group, devices can be
> >  	 * plucked from the group while other devices in the group are still
> > @@ -992,6 +962,8 @@ void *vfio_del_group_dev(struct device *dev)
> > 
> >  	/* Matches the get in vfio_group_create_device() */
> >  	vfio_group_put(group);
> > +	dev_set_drvdata(dev, NULL);
> > +	kfree(device);
> > 
> >  	return device_data;
> >  }
> > @@ -2362,7 +2334,6 @@ static int __init vfio_init(void)
> >  	mutex_init(&vfio.iommu_drivers_lock);
> >  	INIT_LIST_HEAD(&vfio.group_list);
> >  	INIT_LIST_HEAD(&vfio.iommu_drivers_list);
> > -	init_waitqueue_head(&vfio.release_q);
> > 
> >  	ret = misc_register(&vfio_dev);
> >  	if (ret) {
> > --
> > 2.30.2  
> 
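
For readers following the lifetime change in this patch, here is a hedged
userspace analogue of the "count blockers, sleep on a completion" idiom.
It is a sketch under assumptions, not the kernel implementation: C11
atomics stand in for refcount_t, a mutex plus condition variable stands
in for struct completion, the timeout and request callback are left out,
and every fake_* name is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal stand-in for a completion: a flag guarded by a mutex. */
struct fake_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

struct fake_device {
	atomic_uint refcount;
	struct fake_completion comp;
};

void fake_device_init(struct fake_device *dev)
{
	atomic_init(&dev->refcount, 1);	/* initial reference */
	pthread_mutex_init(&dev->comp.lock, NULL);
	pthread_cond_init(&dev->comp.cond, NULL);
	dev->comp.done = false;
}

/* Analogue of refcount_inc_not_zero(): fails once the count is zero. */
bool fake_device_try_get(struct fake_device *dev)
{
	unsigned int old = atomic_load(&dev->refcount);

	while (old != 0) {
		/* On failure, old is reloaded and the loop retries. */
		if (atomic_compare_exchange_weak(&dev->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* The last put signals whoever is waiting for the device to go idle. */
void fake_device_put(struct fake_device *dev)
{
	if (atomic_fetch_sub(&dev->refcount, 1) == 1) {
		pthread_mutex_lock(&dev->comp.lock);
		dev->comp.done = true;
		pthread_cond_broadcast(&dev->comp.cond);
		pthread_mutex_unlock(&dev->comp.lock);
	}
}

/* Drop the initial reference, then sleep until all users are gone. */
void fake_device_wait_unused(struct fake_device *dev)
{
	fake_device_put(dev);
	pthread_mutex_lock(&dev->comp.lock);
	while (!dev->comp.done)
		pthread_cond_wait(&dev->comp.cond, &dev->comp.lock);
	pthread_mutex_unlock(&dev->comp.lock);
}
```

Because try_get refuses to resurrect a zero count, the object can stay on
a lookup list while the waiter is finishing teardown, which is the
property the commit message relies on when it drops kref_put_mutex().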


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops
  2021-03-16 12:25   ` Cornelia Huck
@ 2021-03-16 21:13     ` Alex Williamson
  2021-03-16 23:12       ` Jason Gunthorpe
  0 siblings, 1 reply; 82+ messages in thread
From: Alex Williamson @ 2021-03-16 21:13 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Gunthorpe, Jonathan Corbet, kvm, linux-doc, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, Liu Yi L

On Tue, 16 Mar 2021 13:25:59 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri, 12 Mar 2021 20:55:55 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > This makes the struct vfio_pci_device part of the public interface so it

s/_pci//

> > can be used with container_of and so forth, as is typical for a Linux
> > > subsystem.
> > 
> > This is the first step to bring some type-safety to the vfio interface by
> > allowing the replacement of 'void *' and 'struct device *' inputs with a
> > simple and clear 'struct vfio_pci_device *'

s/_pci//

> > 
> > For now the self-allocating vfio_add_group_dev() interface is kept so each
> > user can be updated as a separate patch.
> > 
> > The expected usage pattern is
> > 
> >   driver core probe() function:
> >      my_device = kzalloc(sizeof(*mydevice));
> >      vfio_init_group_dev(&my_device->vdev, dev, ops, mydevice);
> >      /* other driver specific prep */
> >      vfio_register_group_dev(&my_device->vdev);
> >      dev_set_drvdata(my_device);
> > 
> >   driver core remove() function:
> >      my_device = dev_get_drvdata(dev);
> >      vfio_unregister_group_dev(&my_device->vdev);
> >      /* other driver specific tear down */
> >      kfree(my_device);
> > 
> > Allowing the driver to be able to use the drvdata and vifo_device to go  
> 
> s/vifo_device/vfio_device/
> 
> > to/from its own data.
> > 
> > The pattern also makes it clear that vfio_register_group_dev() must be
> > last in the sequence, as once it is called the core code can immediately
> > start calling ops. The init/register gap is provided to allow for the
> > driver to do setup before ops can be called and thus avoid races.
> > 
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >  Documentation/driver-api/vfio.rst |  31 ++++----
> >  drivers/vfio/vfio.c               | 123 ++++++++++++++++--------------
> >  include/linux/vfio.h              |  16 ++++
> >  3 files changed, 98 insertions(+), 72 deletions(-)
> > 
> > diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
> > index f1a4d3c3ba0bb1..d3a02300913a7f 100644
> > --- a/Documentation/driver-api/vfio.rst
> > +++ b/Documentation/driver-api/vfio.rst
> > @@ -249,18 +249,23 @@ VFIO bus driver API
> >  
> >  VFIO bus drivers, such as vfio-pci make use of only a few interfaces
> >  into VFIO core.  When devices are bound and unbound to the driver,
> > -the driver should call vfio_add_group_dev() and vfio_del_group_dev()
> > -respectively::
> > -
> > -	extern int vfio_add_group_dev(struct device *dev,
> > -				      const struct vfio_device_ops *ops,
> > -				      void *device_data);
> > -
> > -	extern void *vfio_del_group_dev(struct device *dev);
> > -
> > -vfio_add_group_dev() indicates to the core to begin tracking the
> > -iommu_group of the specified dev and register the dev as owned by
> > -a VFIO bus driver.  The driver provides an ops structure for callbacks
> > +the driver should call vfio_register_group_dev() and
> > +vfio_unregister_group_dev() respectively::
> > +
> > +	void vfio_init_group_dev(struct vfio_device *device,
> > +				struct device *dev,
> > +				const struct vfio_device_ops *ops,
> > +				void *device_data);
> > +	int vfio_register_group_dev(struct vfio_device *device);
> > +	void vfio_unregister_group_dev(struct vfio_device *device);
> > +
> > +The driver should embed the vfio_device in its own structure and call
> > +vfio_init_group_dev() to pre-configure it before going to registration.  
> 
> s/it/that structure/ (I guess?)

That seems less clear, actually: is the referent of "that structure" the
"vfio_device" or "its own structure"?  The phrasing somewhat suggests the
latter.  s/it/the vfio_device structure/ seems excessively verbose.  I
think "it" is probably sufficient here.  Thanks,

Alex

 
> > +vfio_register_group_dev() indicates to the core to begin tracking the
> > +iommu_group of the specified dev and register the dev as owned by a VFIO bus
> > +driver. Once vfio_register_group_dev() returns it is possible for userspace to
> > +start accessing the driver, thus the driver should ensure it is completely
> > +ready before calling it. The driver provides an ops structure for callbacks
> >  similar to a file operations structure::
> >  
> >  	struct vfio_device_ops {  
> 
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
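
The embedding pattern from the commit message can be illustrated with a
small userspace sketch. Everything here is an assumption made for the
example: fake_vfio_device, my_device, my_probe() and my_remove() are
hypothetical, and the "registered" flag merely marks where the real
register and unregister calls would sit in the sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace equivalent of the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the core's device structure. */
struct fake_vfio_device {
	int registered;
};

/* The driver structure embeds the core structure by value. */
struct my_device {
	int private_state;
	struct fake_vfio_device vdev;
};

/* Core-to-driver direction: pointer arithmetic only, no lookup. */
struct my_device *to_my_device(struct fake_vfio_device *vdev)
{
	return container_of(vdev, struct my_device, vdev);
}

/* probe(): allocate, do driver prep, and only then "register". */
struct my_device *my_probe(void)
{
	struct my_device *my = calloc(1, sizeof(*my));

	if (!my)
		return NULL;
	my->private_state = 42;		/* driver-specific preparation */
	my->vdev.registered = 1;	/* where registration would happen */
	return my;
}

/* remove(): "unregister" first, then tear down and free. */
void my_remove(struct my_device *my)
{
	my->vdev.registered = 0;	/* where unregistration would happen */
	free(my);
}
```

The driver-to-core direction is just &my->vdev, so neither conversion
needs a 'void *' or a lookup by 'struct device *'.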


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:55 ` [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
  2021-03-16 16:22   ` Cornelia Huck
@ 2021-03-16 21:33   ` Alex Williamson
  2021-03-16 21:45     ` Jason Gunthorpe
  2021-03-18 13:40   ` Auger Eric
  2 siblings, 1 reply; 82+ messages in thread
From: Alex Williamson @ 2021-03-16 21:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Cornelia Huck, Eric Auger, kvm, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, 12 Mar 2021 20:55:56 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> platform already allocates a struct vfio_platform_device with exactly
> the same lifetime as vfio_device, switch to the new API and embed
> vfio_device in vfio_platform_device.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/platform/vfio_amba.c             |  8 ++++---
>  drivers/vfio/platform/vfio_platform.c         | 21 ++++++++---------
>  drivers/vfio/platform/vfio_platform_common.c  | 23 +++++++------------
>  drivers/vfio/platform/vfio_platform_private.h |  5 ++--
>  4 files changed, 26 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
> index 3626c21501017e..f970eb2a999f29 100644
> --- a/drivers/vfio/platform/vfio_amba.c
> +++ b/drivers/vfio/platform/vfio_amba.c
> @@ -66,16 +66,18 @@ static int vfio_amba_probe(struct amba_device *adev, const struct amba_id *id)
>  	if (ret) {
>  		kfree(vdev->name);
>  		kfree(vdev);
> +		return ret;
>  	}
>  
> -	return ret;
> +	dev_set_drvdata(&adev->dev, vdev);
> +	return 0;
>  }
>  
>  static void vfio_amba_remove(struct amba_device *adev)
>  {
> -	struct vfio_platform_device *vdev =
> -		vfio_platform_remove_common(&adev->dev);
> +	struct vfio_platform_device *vdev = dev_get_drvdata(&adev->dev);
>  
> +	vfio_platform_remove_common(vdev);
>  	kfree(vdev->name);
>  	kfree(vdev);
>  }
> diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> index 9fb6818cea12cb..f7b3f64ecc7f6c 100644
> --- a/drivers/vfio/platform/vfio_platform.c
> +++ b/drivers/vfio/platform/vfio_platform.c
> @@ -54,23 +54,22 @@ static int vfio_platform_probe(struct platform_device *pdev)
>  	vdev->reset_required = reset_required;
>  
>  	ret = vfio_platform_probe_common(vdev, &pdev->dev);
> -	if (ret)
> +	if (ret) {
>  		kfree(vdev);
> -
> -	return ret;
> +		return ret;
> +	}
> +	dev_set_drvdata(&pdev->dev, vdev);
> +	return 0;
>  }
>  
>  static int vfio_platform_remove(struct platform_device *pdev)
>  {
> -	struct vfio_platform_device *vdev;
> -
> -	vdev = vfio_platform_remove_common(&pdev->dev);
> -	if (vdev) {
> -		kfree(vdev);
> -		return 0;
> -	}
> +	struct vfio_platform_device *vdev = dev_get_drvdata(&pdev->dev);
>  
> -	return -EINVAL;
> +	vfio_platform_remove_common(vdev);
> +	kfree(vdev->name);


We don't own that, so we must not free it; _probe set this via:

        vdev->name = pdev->name;

Thanks,
Alex
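
The ownership difference can be sketched in userspace C. This is a hedged
illustration with hypothetical names (fake_parent, fake_vdev and the
*_style_* helpers), not the driver code: one probe allocates the name and
therefore frees it on remove, the other borrows the parent's name and
must leave it alone.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical parent device that owns its own name storage. */
struct fake_parent {
	char name[16];
};

struct fake_vdev {
	const char *name;
};

/* Allocating style: the driver creates the name, so it owns it. */
struct fake_vdev *alloc_style_probe(unsigned int id)
{
	struct fake_vdev *vdev = calloc(1, sizeof(*vdev));
	char *name;

	if (!vdev)
		return NULL;
	name = malloc(32);
	if (!name) {
		free(vdev);
		return NULL;
	}
	snprintf(name, 32, "vfio-sketch-%08x", id);
	vdev->name = name;
	return vdev;
}

void alloc_style_remove(struct fake_vdev *vdev)
{
	free((void *)vdev->name);	/* allocated in probe, freed here */
	free(vdev);
}

/* Borrowing style: the name points into the parent device. */
struct fake_vdev *borrow_style_probe(const struct fake_parent *parent)
{
	struct fake_vdev *vdev = calloc(1, sizeof(*vdev));

	if (!vdev)
		return NULL;
	vdev->name = parent->name;
	return vdev;
}

void borrow_style_remove(struct fake_vdev *vdev)
{
	/* No free of vdev->name: it belongs to the parent. */
	free(vdev);
}
```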


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev
  2021-03-16 21:33   ` Alex Williamson
@ 2021-03-16 21:45     ` Jason Gunthorpe
  0 siblings, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-16 21:45 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Cornelia Huck, Eric Auger, kvm, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, Mar 16, 2021 at 03:33:55PM -0600, Alex Williamson wrote:

> >  static int vfio_platform_remove(struct platform_device *pdev)
> >  {
> > -	struct vfio_platform_device *vdev;
> > -
> > -	vdev = vfio_platform_remove_common(&pdev->dev);
> > -	if (vdev) {
> > -		kfree(vdev);
> > -		return 0;
> > -	}
> > +	struct vfio_platform_device *vdev = dev_get_drvdata(&pdev->dev);
> >  
> > -	return -EINVAL;
> > +	vfio_platform_remove_common(vdev);
> > +	kfree(vdev->name);
> 
> 
> We don't own that to free it, _probe set this via:
> 
>         vdev->name = pdev->name;

Gah, yes, this is a copy&pasto mistake from the amba code

Thanks,
Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe()
  2021-03-16 13:20     ` Jason Gunthorpe
@ 2021-03-16 22:27       ` Alex Williamson
  2021-03-17  0:56         ` Tian, Kevin
  0 siblings, 1 reply; 82+ messages in thread
From: Alex Williamson @ 2021-03-16 22:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Tian, Kevin, kvm, Raj, Ashok, Christian Ehrhardt, Cornelia Huck,
	Williams, Dan J, Daniel Vetter, Eric Auger, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Tue, 16 Mar 2021 10:20:58 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Mar 16, 2021 at 08:04:55AM +0000, Tian, Kevin wrote:
> > > @@ -2060,15 +2056,20 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  		vfio_pci_set_power_state(vdev, PCI_D3hot);
> > >  	}
> > > 
> > > -	return ret;
> > > +	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> > > +	if (ret)
> > > +		goto out_power;
> > > +	return 0;
> > > 
> > > +out_power:
> > > +	if (!disable_idle_d3)
> > > +		vfio_pci_set_power_state(vdev, PCI_D0);  
> > 
> > Just curious whether the power state must be recovered upon failure here.
> > From the comment several lines above, the power state is set to an unknown
> > state before doing D3 transaction. From this point it looks fine if leaving the
> > device in D3 since there is no expected state to be recovered?  
> 
> I don't know, this is what the remove function does, so I can't see a
> reason why remove should do it but not here.

I'm not sure it matters in either case, we're just trying to be most
similar to expected driver behavior.  pci_enable_device() puts the
device in D0 but pci_disable_device() doesn't touch the power state, so
the device would typically be released from a PCI driver in D0 afaict.
Thanks,

Alex
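
A hedged sketch of the unwind being discussed, in userspace C with
hypothetical names (fake_pci_vdev, fake_register(), fake_probe(),
fake_remove()). It only models the control flow: the probe error path
restores the same power state that remove restores, so a device is
handed back in D0 either way.

```c
#include <assert.h>

enum fake_power { FAKE_D0, FAKE_D3HOT };

struct fake_pci_vdev {
	enum fake_power power;
	int registered;
};

/* Hypothetical registration step; fails with fail_with if non-zero. */
int fake_register(struct fake_pci_vdev *vdev, int fail_with)
{
	if (fail_with)
		return fail_with;
	vdev->registered = 1;
	return 0;
}

int fake_probe(struct fake_pci_vdev *vdev, int disable_idle_d3, int fail_with)
{
	int ret;

	if (!disable_idle_d3)
		vdev->power = FAKE_D3HOT;	/* idle the unused device */

	ret = fake_register(vdev, fail_with);
	if (ret)
		goto out_power;
	return 0;

out_power:
	/* Same restore step that fake_remove() performs. */
	if (!disable_idle_d3)
		vdev->power = FAKE_D0;
	return ret;
}

void fake_remove(struct fake_pci_vdev *vdev, int disable_idle_d3)
{
	vdev->registered = 0;
	if (!disable_idle_d3)
		vdev->power = FAKE_D0;
}
```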


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 10/14] vfio/mdev: Use vfio_init/register/unregister_group_dev
  2021-03-16  8:09   ` Tian, Kevin
@ 2021-03-16 22:51     ` Alex Williamson
  2021-03-16 23:19     ` Jason Gunthorpe
  1 sibling, 0 replies; 82+ messages in thread
From: Alex Williamson @ 2021-03-16 22:51 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Jason Gunthorpe, Cornelia Huck, kvm, Kirti Wankhede, Raj, Ashok,
	Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu, Yi L

On Tue, 16 Mar 2021 08:09:19 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, March 13, 2021 8:56 AM
> > 
> > mdev gets little benefit because it doesn't actually do anything, however
> > it is the last user, so move the code here for now.  
> 
> and indicate that vfio_add/del_group_dev is removed in this patch.
> 
> > 
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >  drivers/vfio/mdev/vfio_mdev.c | 24 +++++++++++++++++++--
> >  drivers/vfio/vfio.c           | 39 ++---------------------------------
> >  include/linux/vfio.h          |  5 -----
> >  3 files changed, 24 insertions(+), 44 deletions(-)
> > 
> > diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> > index b52eea128549ee..4469aaf31b56cb 100644
> > --- a/drivers/vfio/mdev/vfio_mdev.c
> > +++ b/drivers/vfio/mdev/vfio_mdev.c
> > @@ -21,6 +21,10 @@
> >  #define DRIVER_AUTHOR   "NVIDIA Corporation"
> >  #define DRIVER_DESC     "VFIO based driver for Mediated device"
> > 
> > +struct mdev_vfio_device {
> > +	struct vfio_device vdev;
> > +};  
> 
> following other vfio_XXX_device convention, what about calling it
> vfio_mdev_device? otherwise,


Or, why actually create this structure at all?  _probe and _remove
could just use a vfio_device.  Thanks,

Alex

> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> 
> > +
> >  static int vfio_mdev_open(void *device_data)
> >  {
> >  	struct mdev_device *mdev = device_data;
> > @@ -124,13 +128,29 @@ static const struct vfio_device_ops
> > vfio_mdev_dev_ops = {
> >  static int vfio_mdev_probe(struct device *dev)
> >  {
> >  	struct mdev_device *mdev = to_mdev_device(dev);
> > +	struct mdev_vfio_device *mvdev;
> > +	int ret;
> > 
> > -	return vfio_add_group_dev(dev, &vfio_mdev_dev_ops, mdev);
> > +	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
> > +	if (!mvdev)
> > +		return -ENOMEM;
> > +
> > +	vfio_init_group_dev(&mvdev->vdev, &mdev->dev,
> > &vfio_mdev_dev_ops, mdev);
> > +	ret = vfio_register_group_dev(&mvdev->vdev);
> > +	if (ret) {
> > +		kfree(mvdev);
> > +		return ret;
> > +	}
> > +	dev_set_drvdata(&mdev->dev, mvdev);
> > +	return 0;
> >  }
> > 
> >  static void vfio_mdev_remove(struct device *dev)
> >  {
> > -	vfio_del_group_dev(dev);
> > +	struct mdev_vfio_device *mvdev = dev_get_drvdata(dev);
> > +
> > +	vfio_unregister_group_dev(&mvdev->vdev);
> > +	kfree(mvdev);
> >  }
> > 
> >  static struct mdev_driver vfio_mdev_driver = {
> > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > index cfa06ae3b9018b..2d6d7cc1d1ebf9 100644
> > --- a/drivers/vfio/vfio.c
> > +++ b/drivers/vfio/vfio.c
> > @@ -99,8 +99,8 @@
> > MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE,
> > no-IOMMU mode.  Thi
> >  /*
> >   * vfio_iommu_group_{get,put} are only intended for VFIO bus driver probe
> >   * and remove functions, any use cases other than acquiring the first
> > - * reference for the purpose of calling vfio_add_group_dev() or removing
> > - * that symmetric reference after vfio_del_group_dev() should use the raw
> > > + * reference for the purpose of calling vfio_register_group_dev() or removing
> > > + * that symmetric reference after vfio_unregister_group_dev() should use the raw
> >   * iommu_group_{get,put} functions.  In particular, vfio_iommu_group_put()
> >   * removes the device from the dummy group and cannot be nested.
> >   */
> > @@ -799,29 +799,6 @@ int vfio_register_group_dev(struct vfio_device
> > *device)
> >  }
> >  EXPORT_SYMBOL_GPL(vfio_register_group_dev);
> > 
> > -int vfio_add_group_dev(struct device *dev, const struct vfio_device_ops *ops,
> > -		       void *device_data)
> > -{
> > -	struct vfio_device *device;
> > -	int ret;
> > -
> > -	device = kzalloc(sizeof(*device), GFP_KERNEL);
> > -	if (!device)
> > -		return -ENOMEM;
> > -
> > -	vfio_init_group_dev(device, dev, ops, device_data);
> > -	ret = vfio_register_group_dev(device);
> > -	if (ret)
> > -		goto err_kfree;
> > -	dev_set_drvdata(dev, device);
> > -	return 0;
> > -
> > -err_kfree:
> > -	kfree(device);
> > -	return ret;
> > -}
> > -EXPORT_SYMBOL_GPL(vfio_add_group_dev);
> > -
> >  /**
> >   * Get a reference to the vfio_device for a device.  Even if the
> >   * caller thinks they own the device, they could be racing with a
> > @@ -962,18 +939,6 @@ void vfio_unregister_group_dev(struct vfio_device *device)
> >  }
> >  EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
> > 
> > -void *vfio_del_group_dev(struct device *dev)
> > -{
> > -	struct vfio_device *device = dev_get_drvdata(dev);
> > -	void *device_data = device->device_data;
> > -
> > -	vfio_unregister_group_dev(device);
> > -	dev_set_drvdata(dev, NULL);
> > -	kfree(device);
> > -	return device_data;
> > -}
> > -EXPORT_SYMBOL_GPL(vfio_del_group_dev);
> > -
> >  /**
> >   * VFIO base fd, /dev/vfio/vfio
> >   */
> > diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> > index ad8b579d67d34a..4995faf51efeae 100644
> > --- a/include/linux/vfio.h
> > +++ b/include/linux/vfio.h
> > @@ -63,11 +63,6 @@ extern void vfio_iommu_group_put(struct iommu_group *group, struct device *dev);
> >  void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
> >  			 const struct vfio_device_ops *ops, void *device_data);
> >  int vfio_register_group_dev(struct vfio_device *device);
> > -extern int vfio_add_group_dev(struct device *dev,
> > -			      const struct vfio_device_ops *ops,
> > -			      void *device_data);
> > -
> > -extern void *vfio_del_group_dev(struct device *dev);
> >  void vfio_unregister_group_dev(struct vfio_device *device);
> >  extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
> >  extern void vfio_device_put(struct vfio_device *device);
> > --
> > 2.30.2  
> 


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline
  2021-03-13  0:56 ` [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline Jason Gunthorpe
  2021-03-16  8:10   ` Tian, Kevin
@ 2021-03-16 22:55   ` Alex Williamson
  2021-03-16 23:20     ` Jason Gunthorpe
  2021-03-17 10:36   ` Cornelia Huck
  2 siblings, 1 reply; 82+ messages in thread
From: Alex Williamson @ 2021-03-16 22:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Cornelia Huck, kvm, Kirti Wankhede, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, 12 Mar 2021 20:56:03 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> The macro wrongly uses 'dev' as both the macro argument and the member
> name, which means it fails compilation if any caller uses a word other
> than 'dev' as the single argument. Fix this defect by making it into
> proper static inline, which is more clear and typesafe anyhow.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/mdev/mdev_private.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
> index 7d922950caaf3c..74c2e541146999 100644
> --- a/drivers/vfio/mdev/mdev_private.h
> +++ b/drivers/vfio/mdev/mdev_private.h
> @@ -35,7 +35,10 @@ struct mdev_device {
>  	bool active;
>  };
>  
> -#define to_mdev_device(dev)	container_of(dev, struct mdev_device, dev)
> +static inline struct mdev_device *to_mdev_device(struct device *dev)
> +{
> +	return container_of(dev, struct mdev_device, dev);
> +}
>  #define dev_is_mdev(d)		((d)->bus == &mdev_bus_type)
>  
>  struct mdev_type {

Fixes: 99e3123e3d72 ("vfio-mdev: Make mdev_device private and abstract interfaces")


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions
  2021-03-16 13:02   ` Max Gurtovoy
@ 2021-03-16 23:04     ` Jason Gunthorpe
  0 siblings, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-16 23:04 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Cornelia Huck, kvm, Alex Williamson, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Tarun Gupta

On Tue, Mar 16, 2021 at 03:02:40PM +0200, Max Gurtovoy wrote:
> 
> On 3/13/2021 2:55 AM, Jason Gunthorpe wrote:
> > vfio_pci_probe() is quite complicated, with optional VF and VGA sub
> > components. Move these into clear init/uninit functions and have a linear
> > flow in probe/remove.
> > 
> > This fixes a few little buglets:
> >   - vfio_pci_remove() is in the wrong order, vga_client_register() removes
> >     a notifier and is after kfree(vdev), but the notifier refers to vdev,
> >     so it can use after free in a race.
> >   - vga_client_register() can fail but was ignored
> > 
> > Organize things so destruction order is the reverse of creation order.
> > 
> > Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >   drivers/vfio/pci/vfio_pci.c | 116 +++++++++++++++++++++++-------------
> >   1 file changed, 74 insertions(+), 42 deletions(-)
> > 
> > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> > index 65e7e6b44578c2..f95b58376156a0 100644
> > > --- a/drivers/vfio/pci/vfio_pci.c
> > +++ b/drivers/vfio/pci/vfio_pci.c
> > @@ -1922,6 +1922,68 @@ static int vfio_pci_bus_notifier(struct notifier_block *nb,
> >   	return 0;
> >   }
> > +static int vfio_pci_vf_init(struct vfio_pci_device *vdev)
> > +{
> > +	struct pci_dev *pdev = vdev->pdev;
> > +	int ret;
> > +
> > +	if (!pdev->is_physfn)
> > +		return 0;
> > +
> > +	vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
> > +	if (!vdev->vf_token)
> > +		return -ENOMEM;
> > +
> > +	mutex_init(&vdev->vf_token->lock);
> > +	uuid_gen(&vdev->vf_token->uuid);
> > +
> > +	vdev->nb.notifier_call = vfio_pci_bus_notifier;
> > +	ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
> > +	if (ret) {
> > +		kfree(vdev->vf_token);
> 
> you can consider "mutex_destroy(&vdev->vf_token->lock);" like you use in the
> uninit function.

The value in doing mutex_destroy is that it triggers a useful
debugging check that the mutex is not locked while being destructed.

In this case it is impossible for the mutex to be locked because the
pointer hasn't left the local stack.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group
  2021-03-16  7:33   ` Tian, Kevin
@ 2021-03-16 23:07     ` Jason Gunthorpe
  2021-03-17  0:47       ` Tian, Kevin
  0 siblings, 1 reply; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-16 23:07 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Cornelia Huck, kvm, Raj, Ashok, Williams, Dan J,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, Mar 16, 2021 at 07:33:55AM +0000, Tian, Kevin wrote:

> > It is tricky to see, but the get at the start of vfio_del_group_dev() is
> > actually pairing with the put hidden inside vfio_device_put() a few lines
> > below.
> 
> I feel that the put inside vfio_device_put was meant to pair with the get in
> vfio_group_create_device before this patch is applied. Because
> vfio_device_put may drop the last reference to the group, vfio_del_group_dev
> then issues its own get to hold the reference until the put at the end of the
> func.

Here I am talking about how this patch removes 3 gets and 2 puts -
which should be a red flag. The reason it is OK is because the 3rd
extra removed get is pairing with the put hidden inside another put.

> > @@ -1008,6 +990,7 @@ void *vfio_del_group_dev(struct device *dev)
> >  	if (list_empty(&group->device_list))
> >  		wait_event(group->container_q, !group->container);
> > 
> > +	/* Matches the get in vfio_group_create_device() */
> 
> There is no get there now.

It is referring to this comment:

/* Our reference on group is moved to the device */

The get is a move in this case

A later patch deletes the function, and this becomes perfectly clear.

Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device
  2021-03-16 20:24     ` Alex Williamson
@ 2021-03-16 23:08       ` Jason Gunthorpe
  2021-03-17  8:12       ` Cornelia Huck
  1 sibling, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-16 23:08 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Tian, Kevin, Cornelia Huck, kvm, Raj, Ashok, Williams, Dan J,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, Mar 16, 2021 at 02:24:54PM -0600, Alex Williamson wrote:
> > > @@ -935,32 +916,18 @@ void *vfio_del_group_dev(struct device *dev)
> > >  	WARN_ON(!unbound);
> > > 
> > >  	vfio_device_put(device);
> > > -
> > > -	/*
> > > -	 * If the device is still present in the group after the above
> > > -	 * 'put', then it is in use and we need to request it from the
> > > -	 * bus driver.  The driver may in turn need to request the
> > > -	 * device from the user.  We send the request on an arbitrary
> > > -	 * interval with counter to allow the driver to take escalating
> > > -	 * measures to release the device if it has the ability to do so.
> > > -	 */  
> > 
> > Above comment still makes sense even with this patch. What about
> > keeping it? otherwise:
> 
> The comment is not exactly correct after this code change either, the
> device will always be present in the group after this 'put'.  Instead,
> the completion now indicates the reference count has reached zero.  If
> it's worthwhile to keep more context to the request callback, perhaps:
> 
> 	/*
> 	 * If there are still outstanding device references, such as
> 	 * from the device being in use, periodically kick the optional
> 	 * device request callback while waiting.
> 	 */
> 
> It's also a little obvious that's what we're doing here even without
> the comment.  Thanks,

Indeed, that is the explanation why I dropped it.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops
  2021-03-16 21:13     ` Alex Williamson
@ 2021-03-16 23:12       ` Jason Gunthorpe
  0 siblings, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-16 23:12 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Cornelia Huck, Jonathan Corbet, kvm, linux-doc, Raj, Ashok,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, Liu Yi L

On Tue, Mar 16, 2021 at 03:13:06PM -0600, Alex Williamson wrote:

> > > +	void vfio_init_group_dev(struct vfio_device *device,
> > > +				struct device *dev,
> > > +				const struct vfio_device_ops *ops,
> > > +				void *device_data);
> > > +	int vfio_register_group_dev(struct vfio_device *device);
> > > +	void vfio_unregister_group_dev(struct vfio_device *device);
> > > +
> > > +The driver should embed the vfio_device in its own structure and call
> > > +vfio_init_group_dev() to pre-configure it before going to registration.  
> > 
> > s/it/that structure/ (I guess?)
> 
> Seems less clear actually, is the object of "that structure" the
> "vfio_device" or "its own structure".  Phrasing somewhat suggests the
> latter.  s/it/the vfio_device structure/ seems excessively verbose.  I
> think "it" is probably sufficient here.  Thanks,

Right, it says directly above that vfio_init_group_dev() accepts a
vfio_device so I doubt anyone will be confused for long on what "it"
refers to.

I got the other language fixes, thanks.

Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 10/14] vfio/mdev: Use vfio_init/register/unregister_group_dev
  2021-03-16  8:09   ` Tian, Kevin
  2021-03-16 22:51     ` Alex Williamson
@ 2021-03-16 23:19     ` Jason Gunthorpe
  1 sibling, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-16 23:19 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Cornelia Huck, kvm, Kirti Wankhede, Raj, Ashok,
	Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu, Yi L

On Tue, Mar 16, 2021 at 08:09:19AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, March 13, 2021 8:56 AM
> > 
> > mdev gets little benefit because it doesn't actually do anything, however
> > it is the last user, so move the code here for now.
> 
> and indicate that vfio_add/del_group_dev is removed in this patch.

The "move the code here" (referring to the deleted functions) was
intended to cover that. I added some words:

    mdev gets little benefit because it doesn't actually do anything, however
    it is the last user, so move the vfio_init/register/unregister_group_dev()
    code here for now.
    
> > diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> > index b52eea128549ee..4469aaf31b56cb 100644
> > --- a/drivers/vfio/mdev/vfio_mdev.c
> > +++ b/drivers/vfio/mdev/vfio_mdev.c
> > @@ -21,6 +21,10 @@
> >  #define DRIVER_AUTHOR   "NVIDIA Corporation"
> >  #define DRIVER_DESC     "VFIO based driver for Mediated device"
> > 
> > +struct mdev_vfio_device {
> > +	struct vfio_device vdev;
> > +};
> 
> following other vfio_XXX_device convention, what about calling it
> vfio_mdev_device? otherwise,

Right, but let's delete it as Alex suggests.

Jason

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline
  2021-03-16 22:55   ` Alex Williamson
@ 2021-03-16 23:20     ` Jason Gunthorpe
  0 siblings, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-16 23:20 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Cornelia Huck, kvm, Kirti Wankhede, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, Mar 16, 2021 at 04:55:27PM -0600, Alex Williamson wrote:
> On Fri, 12 Mar 2021 20:56:03 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > The macro wrongly uses 'dev' as both the macro argument and the member
> > name, which means it fails compilation if any caller uses a word other
> > than 'dev' as the single argument. Fix this defect by making it into
> > proper static inline, which is more clear and typesafe anyhow.
> > 
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >  drivers/vfio/mdev/mdev_private.h | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
> > index 7d922950caaf3c..74c2e541146999 100644
> > --- a/drivers/vfio/mdev/mdev_private.h
> > +++ b/drivers/vfio/mdev/mdev_private.h
> > @@ -35,7 +35,10 @@ struct mdev_device {
> >  	bool active;
> >  };
> >  
> > -#define to_mdev_device(dev)	container_of(dev, struct mdev_device, dev)
> > +static inline struct mdev_device *to_mdev_device(struct device *dev)
> > +{
> > +	return container_of(dev, struct mdev_device, dev);
> > +}
> >  #define dev_is_mdev(d)		((d)->bus == &mdev_bus_type)
> >  
> >  struct mdev_type {
> 
> Fixes: 99e3123e3d72 ("vfio-mdev: Make mdev_device private and abstract interfaces")

Ok, but it isn't a bug until the next patch that adds new callers for
to_mdev_device()

Jason
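
To illustrate the defect described in the commit message, here is a minimal
userspace sketch. It is not the kernel code: container_of() is a simplified
stand-in built on offsetof(), and the two structures are hypothetical
miniatures of the real ones. mdev_is_active() is an invented caller whose
argument is deliberately not named 'dev'.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature versions of the kernel structures. */
struct device {
	int id;
};

struct mdev_device {
	int active;
	struct device dev;
};

/*
 * The old macro form was:
 *
 *   #define to_mdev_device(dev) container_of(dev, struct mdev_device, dev)
 *
 * The preprocessor substitutes every 'dev' token in the body, including
 * the member name, so to_mdev_device(d) expands to
 * container_of(d, struct mdev_device, d) and fails to build because
 * struct mdev_device has no member called 'd'.
 */

/* A function parameter is not substituted into the member name. */
static inline struct mdev_device *to_mdev_device(struct device *dev)
{
	return container_of(dev, struct mdev_device, dev);
}

/* Hypothetical caller using a variable name other than 'dev'. */
static inline int mdev_is_active(struct device *d)
{
	return to_mdev_device(d)->active;
}
```

With the static inline, the argument also gets checked against
'struct device *', which the macro form did not enforce by itself in this
simplified container_of().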

^ permalink raw reply	[flat|nested] 82+ messages in thread

* RE: [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group
  2021-03-16 23:07     ` Jason Gunthorpe
@ 2021-03-17  0:47       ` Tian, Kevin
  2021-03-19 13:58         ` Jason Gunthorpe
  0 siblings, 1 reply; 82+ messages in thread
From: Tian, Kevin @ 2021-03-17  0:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Cornelia Huck, kvm, Raj, Ashok, Williams, Dan J,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, March 17, 2021 7:08 AM
> 
> On Tue, Mar 16, 2021 at 07:33:55AM +0000, Tian, Kevin wrote:
> 
> > > It is tricky to see, but the get at the start of vfio_del_group_dev() is
> > > actually pairing with the put hidden inside vfio_device_put() a few lines
> > > below.
> >
> > > I feel that the put inside vfio_device_put was meant to pair with the get in
> > > vfio_group_create_device before this patch is applied. Because
> > > vfio_device_put may drop the last reference to the group, vfio_del_group_dev
> > > then issues its own get to hold the reference until the put at the end of the
> > > func.
> 
> Here I am talking about how this patch removes 3 gets and 2 puts -
> which should be a red flag. The reason it is OK is because the 3rd
> extra removed get is pairing with the put hidden inside another put.

Fine. We are just looking at it from different angles.

> 
> > > @@ -1008,6 +990,7 @@ void *vfio_del_group_dev(struct device *dev)
> > >  	if (list_empty(&group->device_list))
> > >  		wait_event(group->container_q, !group->container);
> > >
> > > +	/* Matches the get in vfio_group_create_device() */
> >
> > There is no get there now.
> 
> It is referring to this comment:
> 
> /* Our reference on group is moved to the device */
> 
> The get is a move in this case
> 
> Later delete the function and this becomes perfectly clear
> 

It looks like the above comment is not updated after vfio_group_create_device
is removed in patch 03.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 82+ messages in thread

* RE: [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops
  2021-03-16 13:34     ` Jason Gunthorpe
@ 2021-03-17  0:55       ` Tian, Kevin
  0 siblings, 0 replies; 82+ messages in thread
From: Tian, Kevin @ 2021-03-17  0:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Cornelia Huck, Jonathan Corbet, kvm, linux-doc,
	Raj, Ashok, Williams, Dan J, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu, Yi L

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, March 16, 2021 9:34 PM
> 
> On Tue, Mar 16, 2021 at 07:55:11AM +0000, Tian, Kevin wrote:
> 
> > > +void *vfio_del_group_dev(struct device *dev)
> > > +{
> > > +	struct vfio_device *device = dev_get_drvdata(dev);
> > > +	void *device_data = device->device_data;
> > > +
> > > +	vfio_unregister_group_dev(device);
> > >  	dev_set_drvdata(dev, NULL);
> >
> > Move to vfio_unregister_group_dev? In the cover letter you mentioned
> > that drvdata is managed by the driver but removed from the core.
> 
> "removed from the core" means the core code doesn't touch drvdata at
> all.
> 
> > Looks it's also the rule obeyed by the following patches.
> 
> The dev_set_drvdata(NULL) on remove is mostly cargo-cult nonsense. The
> driver core sets it to null immediately after the remove function
> returns, so to add another set needs a very strong reason.
> 
> It is only left here temporarily, the last patch deletes it.
> 

Ah, I didn't realize dev_set_drvdata(NULL) is nonsense here. I just saw
that no place clears it after this series.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* RE: [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe()
  2021-03-16 22:27       ` Alex Williamson
@ 2021-03-17  0:56         ` Tian, Kevin
  0 siblings, 0 replies; 82+ messages in thread
From: Tian, Kevin @ 2021-03-17  0:56 UTC (permalink / raw)
  To: Alex Williamson, Jason Gunthorpe
  Cc: kvm, Raj, Ashok, Christian Ehrhardt, Cornelia Huck, Williams,
	Dan J, Daniel Vetter, Eric Auger, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, March 17, 2021 6:27 AM
> 
> On Tue, 16 Mar 2021 10:20:58 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Tue, Mar 16, 2021 at 08:04:55AM +0000, Tian, Kevin wrote:
> > > > @@ -2060,15 +2056,20 @@ static int vfio_pci_probe(struct pci_dev
> *pdev,
> > > > const struct pci_device_id *id)
> > > >  		vfio_pci_set_power_state(vdev, PCI_D3hot);
> > > >  	}
> > > >
> > > > -	return ret;
> > > > +	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> > > > +	if (ret)
> > > > +		goto out_power;
> > > > +	return 0;
> > > >
> > > > +out_power:
> > > > +	if (!disable_idle_d3)
> > > > +		vfio_pci_set_power_state(vdev, PCI_D0);
> > >
> > > > Just curious whether the power state must be recovered upon failure here.
> > > > From the comment several lines above, the power state is set to an unknown
> > > > state before doing D3 transaction. From this point it looks fine if leaving
> > > > the device in D3 since there is no expected state to be recovered?
> >
> > I don't know, this is what the remove function does, so I can't see a
> > reason why remove should do it but not here.
> 
> I'm not sure it matters in either case, we're just trying to be most
> similar to expected driver behavior.  pci_enable_device() puts the
> device in D0 but pci_disable_device() doesn't touch the power state, so
> the device would typically be released from a PCI driver in D0 afaict.
> Thanks,
> 

OK. Then,

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device
  2021-03-16 20:24     ` Alex Williamson
  2021-03-16 23:08       ` Jason Gunthorpe
@ 2021-03-17  8:12       ` Cornelia Huck
  2021-03-23 13:06         ` Jason Gunthorpe
  1 sibling, 1 reply; 82+ messages in thread
From: Cornelia Huck @ 2021-03-17  8:12 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Tian, Kevin, Jason Gunthorpe, kvm, Raj, Ashok, Williams, Dan J,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Tue, 16 Mar 2021 14:24:54 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Tue, 16 Mar 2021 07:38:09 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
> 
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Saturday, March 13, 2021 8:56 AM
> > > 
> > > The vfio_device is using a 'sleep until all refs go to zero' pattern for
> > > its lifetime, but it is indirectly coded by repeatedly scanning the group
> > > list waiting for the device to be removed on its own.
> > > 
> > > Switch this around to be a direct representation, use a refcount to count
> > > the number of places that are blocking destruction and sleep directly on a
> > > completion until that counter goes to zero. kfree the device after other
> > > accesses have been excluded in vfio_del_group_dev(). This is a fairly
> > > common Linux idiom.
> > > 
> > > Due to this we can now remove kref_put_mutex(), which is very rarely used
> > > in the kernel. Here it is being used to prevent a zero ref device from
> > > being seen in the group list. Instead allow the zero ref device to
> > > continue to exist in the device_list and use refcount_inc_not_zero() to
> > > exclude it once refs go to zero.
> > > 
> > > This patch is organized so the next patch will be able to alter the API to
> > > allow drivers to provide the kfree.
> > > 
> > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > ---
> > >  drivers/vfio/vfio.c | 79 ++++++++++++++-------------------------------
> > >  1 file changed, 25 insertions(+), 54 deletions(-)
> > > 

> > > @@ -935,32 +916,18 @@ void *vfio_del_group_dev(struct device *dev)
> > >  	WARN_ON(!unbound);
> > > 
> > >  	vfio_device_put(device);
> > > -
> > > -	/*
> > > -	 * If the device is still present in the group after the above
> > > -	 * 'put', then it is in use and we need to request it from the
> > > -	 * bus driver.  The driver may in turn need to request the
> > > -	 * device from the user.  We send the request on an arbitrary
> > > -	 * interval with counter to allow the driver to take escalating
> > > -	 * measures to release the device if it has the ability to do so.
> > > -	 */    
> > 
> > Above comment still makes sense even with this patch. What about
> > keeping it? otherwise:  
> 
> The comment is not exactly correct after this code change either, the
> device will always be present in the group after this 'put'.  Instead,
> the completion now indicates the reference count has reached zero.  If
> it's worthwhile to keep more context to the request callback, perhaps:
> 
> 	/*
> 	 * If there are still outstanding device references, such as
> 	 * from the device being in use, periodically kick the optional
> 	 * device request callback while waiting.
> 	 */

I like that comment; I don't think it hurts to be a bit verbose here.

> 
> It's also a little obvious that's what we're doing here even without
> the comment.  Thanks,
> 
> Alex
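
The 'sleep until all refs go to zero' idiom from the commit message can be
modelled in userspace roughly as below. This is only a sketch under loud
assumptions: every toy_* name is invented, C11 atomics stand in for the
kernel's refcount_t, and a plain flag stands in for the completion that the
unregister path would really sleep on.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model; the kernel uses refcount_t and struct completion. */
struct toy_device {
	atomic_uint refcount;   /* number of holders blocking destruction */
	atomic_bool comp_done;  /* stands in for complete()/wait_for_completion() */
};

static inline void toy_device_init(struct toy_device *d)
{
	atomic_init(&d->refcount, 1);      /* registration holds the first ref */
	atomic_init(&d->comp_done, false);
}

/* Same idea as refcount_inc_not_zero(): never revive a zero-ref object. */
static inline bool toy_device_try_get(struct toy_device *d)
{
	unsigned int old = atomic_load(&d->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&d->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the final put signals the waiter. */
static inline void toy_device_put(struct toy_device *d)
{
	if (atomic_fetch_sub(&d->refcount, 1) == 1)
		atomic_store(&d->comp_done, true);
}

/* True once the unregister path may stop waiting and free the object. */
static inline bool toy_device_released(struct toy_device *d)
{
	return atomic_load(&d->comp_done);
}
```

In this model the device may stay on a lookup list with a zero refcount,
because toy_device_try_get() refuses to hand it out again. That is the role
the commit message assigns to refcount_inc_not_zero().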


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe()
  2021-03-13  0:56 ` [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe() Jason Gunthorpe
                     ` (2 preceding siblings ...)
  2021-03-16 11:28   ` Max Gurtovoy
@ 2021-03-17 10:32   ` Cornelia Huck
  2021-03-18 16:50   ` Auger Eric
  4 siblings, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-17 10:32 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kvm, Alex Williamson, Raj, Ashok, Christian Ehrhardt,
	Dan Williams, Daniel Vetter, Eric Auger, Christoph Hellwig,
	Kevin Tian, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Fri, 12 Mar 2021 20:56:00 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> vfio_add_group_dev() must be called only after all of the private data in
> vdev is fully setup and ready, otherwise there could be races with user
> space instantiating a device file descriptor and starting to call ops.
> 
> For instance vfio_pci_reflck_attach() sets vdev->reflck and
> vfio_pci_open(), called by fops open, unconditionally derefs it, which
> will crash if things get out of order.
> 
> Fixes: cc20d7999000 ("vfio/pci: Introduce VF token")
> Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release")
> Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state")
> Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/pci/vfio_pci.c | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
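
The ordering rule in the commit message (finish all private setup, publish
the device last, and unwind in reverse on failure) might be sketched as
follows. All toy_* names and the fail_register knob are hypothetical; the
real probe function has more steps and uses the VFIO registration calls.

```c
#include <stdbool.h>

/* Hypothetical model of a device with one piece of private state. */
struct toy_dev {
	bool reflck_attached;  /* private state that the open path relies on */
	bool registered;       /* visible to user space once true */
	bool fail_register;    /* knob to force the error path in this sketch */
};

static inline int toy_attach_reflck(struct toy_dev *d)
{
	d->reflck_attached = true;
	return 0;
}

static inline void toy_detach_reflck(struct toy_dev *d)
{
	d->reflck_attached = false;
}

static inline int toy_register(struct toy_dev *d)
{
	if (d->fail_register)
		return -1;
	d->registered = true;
	return 0;
}

static inline int toy_probe(struct toy_dev *d)
{
	int ret;

	/* Set up everything the ops will dereference first. */
	ret = toy_attach_reflck(d);
	if (ret)
		return ret;

	/* Publish last: after this, open may run at any time. */
	ret = toy_register(d);
	if (ret)
		goto out_reflck;
	return 0;

out_reflck:
	toy_detach_reflck(d);
	return ret;
}

/* Destruction is the reverse of creation: unpublish, then tear down. */
static inline void toy_remove(struct toy_dev *d)
{
	d->registered = false;
	toy_detach_reflck(d);
}
```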


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:56 ` [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
  2021-03-16  8:06   ` Tian, Kevin
@ 2021-03-17 10:33   ` Cornelia Huck
  2021-03-18 13:43   ` Auger Eric
  2 siblings, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-17 10:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, kvm, Raj, Ashok, Dan Williams, Daniel Vetter,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta,
	Liu Yi L

On Fri, 12 Mar 2021 20:56:01 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> pci already allocates a struct vfio_pci_device with exactly the same
> lifetime as vfio_device, switch to the new API and embed vfio_device in
> vfio_pci_device.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/pci/vfio_pci.c         | 10 +++++-----
>  drivers/vfio/pci/vfio_pci_private.h |  1 +
>  2 files changed, 6 insertions(+), 5 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 10/14] vfio/mdev: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:56 ` [PATCH v2 10/14] vfio/mdev: " Jason Gunthorpe
  2021-03-16  8:09   ` Tian, Kevin
@ 2021-03-17 10:36   ` Cornelia Huck
  1 sibling, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-17 10:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, kvm, Kirti Wankhede, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta, Liu Yi L

On Fri, 12 Mar 2021 20:56:02 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> mdev gets little benefit because it doesn't actually do anything, however
> it is the last user, so move the code here for now.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/mdev/vfio_mdev.c | 24 +++++++++++++++++++--
>  drivers/vfio/vfio.c           | 39 ++---------------------------------
>  include/linux/vfio.h          |  5 -----
>  3 files changed, 24 insertions(+), 44 deletions(-)

With switching to a bare vfio_device:

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline
  2021-03-13  0:56 ` [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline Jason Gunthorpe
  2021-03-16  8:10   ` Tian, Kevin
  2021-03-16 22:55   ` Alex Williamson
@ 2021-03-17 10:36   ` Cornelia Huck
  2 siblings, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-17 10:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, kvm, Kirti Wankhede, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, 12 Mar 2021 20:56:03 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> The macro wrongly uses 'dev' as both the macro argument and the member
> name, which means it fails compilation if any caller uses a word other
> than 'dev' as the single argument. Fix this defect by making it into
> proper static inline, which is more clear and typesafe anyhow.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/mdev/mdev_private.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v2 12/14] vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *'
  2021-03-13  0:56 ` [PATCH v2 12/14] vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *' Jason Gunthorpe
  2021-03-15  8:58   ` Christoph Hellwig
@ 2021-03-17 11:33   ` Cornelia Huck
  1 sibling, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-17 11:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Jonathan Corbet, Diana Craciun, Eric Auger, kvm,
	Kirti Wankhede, linux-doc, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, 12 Mar 2021 20:56:04 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This is the standard kernel pattern, the ops associated with a struct get
> the struct pointer in for typesafety. The expected design is to use
> container_of to cleanly go from the subsystem level type to the driver
> level type without having any type erasure in a void *.
> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  Documentation/driver-api/vfio.rst            | 18 ++++----
>  drivers/vfio/fsl-mc/vfio_fsl_mc.c            | 36 +++++++++------
>  drivers/vfio/mdev/vfio_mdev.c                | 33 +++++++-------
>  drivers/vfio/pci/vfio_pci.c                  | 47 ++++++++++++--------
>  drivers/vfio/platform/vfio_platform_common.c | 33 ++++++++------
>  drivers/vfio/vfio.c                          | 20 ++++-----
>  include/linux/vfio.h                         | 16 +++----
>  7 files changed, 117 insertions(+), 86 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
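
For readers less familiar with the idiom the commit message describes, here
is a minimal userspace sketch of ops that receive the typed struct pointer
and recover the driver-level type with container_of. The names
core_device, core_device_ops and my_driver_device are hypothetical
stand-ins rather than the VFIO definitions, and container_of is redefined
locally because the kernel macro is not available in userspace.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct core_device;

/* Hypothetical ops table: callbacks take the typed core pointer, not void *. */
struct core_device_ops {
	int (*open)(struct core_device *core);
};

/* Hypothetical subsystem-level type, standing in for struct vfio_device. */
struct core_device {
	const struct core_device_ops *ops;
};

/* Hypothetical driver-level type that embeds the subsystem-level one. */
struct my_driver_device {
	int open_count;
	struct core_device vdev;
};

/* The driver goes from the subsystem type to its own type without a void *. */
static int my_open(struct core_device *core)
{
	struct my_driver_device *my =
		container_of(core, struct my_driver_device, vdev);

	return ++my->open_count;
}

static const struct core_device_ops my_ops = { .open = my_open };

/* Calls the op twice through the core pointer and returns the final count. */
int demo_container_of(void)
{
	struct my_driver_device my = {
		.open_count = 0,
		.vdev = { .ops = &my_ops },
	};

	my.vdev.ops->open(&my.vdev);
	return my.vdev.ops->open(&my.vdev);
}
```

The embedded member is deliberately not the first field, so the pointer
arithmetic done by container_of is actually exercised.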



* Re: [PATCH v2 13/14] vfio/pci: Replace uses of vfio_device_data() with container_of
  2021-03-13  0:56 ` [PATCH v2 13/14] vfio/pci: Replace uses of vfio_device_data() with container_of Jason Gunthorpe
  2021-03-16  8:20   ` Tian, Kevin
@ 2021-03-17 12:06   ` Cornelia Huck
  1 sibling, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-17 12:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, kvm, Raj, Ashok, Dan Williams, Daniel Vetter,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

On Fri, 12 Mar 2021 20:56:05 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This tidies a few confused places that think they can have a refcount on
> the vfio_device but the device_data could be NULL, that isn't possible by
> design.
> 
> Most of the change falls out when struct vfio_devices is updated to just
> store the struct vfio_pci_device itself. This wasn't possible before
> because there was no easy way to get from the 'struct vfio_pci_device' to
> the 'struct vfio_device' to put back the refcount.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/pci/vfio_pci.c | 67 +++++++++++++------------------------
>  1 file changed, 24 insertions(+), 43 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API
  2021-03-13  0:56 ` [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API Jason Gunthorpe
  2021-03-16  8:22   ` Tian, Kevin
@ 2021-03-17 12:08   ` Cornelia Huck
  2021-03-17 23:24   ` Max Gurtovoy
  2 siblings, 0 replies; 82+ messages in thread
From: Cornelia Huck @ 2021-03-17 12:08 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Jonathan Corbet, Diana Craciun, Eric Auger, kvm,
	Kirti Wankhede, linux-doc, Raj, Ashok, Dan Williams,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Fri, 12 Mar 2021 20:56:06 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> There are no longer any users, so it can go away. Everything is using
> container_of now.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  Documentation/driver-api/vfio.rst            |  3 +--
>  drivers/vfio/fsl-mc/vfio_fsl_mc.c            |  5 +++--
>  drivers/vfio/mdev/vfio_mdev.c                |  2 +-
>  drivers/vfio/pci/vfio_pci.c                  |  2 +-
>  drivers/vfio/platform/vfio_platform_common.c |  2 +-
>  drivers/vfio/vfio.c                          | 12 +-----------
>  include/linux/vfio.h                         |  4 +---
>  7 files changed, 9 insertions(+), 21 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



* Re: [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
  2021-03-13  0:55 ` [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe() Jason Gunthorpe
                     ` (2 preceding siblings ...)
  2021-03-16 16:28   ` Cornelia Huck
@ 2021-03-17 16:36   ` Diana Craciun OSS
  2021-03-17 22:59     ` Jason Gunthorpe
  3 siblings, 1 reply; 82+ messages in thread
From: Diana Craciun OSS @ 2021-03-17 16:36 UTC (permalink / raw)
  To: Jason Gunthorpe, Cornelia Huck, kvm
  Cc: Alex Williamson, Raj, Ashok, Bharat Bhushan, Dan Williams,
	Daniel Vetter, Eric Auger, Christoph Hellwig, Leon Romanovsky,
	Max Gurtovoy, Tarun Gupta, Laurentiu Tudor

Hi,

Thanks for finding this!

I tested the series and currently the binding to vfio fails. The reason
is that it is assumed that the object scan is done after
vfio_add_group_dev(). But at this point the vdev structure is completely
initialized.

I'll add some more context.

There are two types of FSL MC devices:
- a DPRC device
- regular devices

A DPRC is a kind of container for the other devices. The DPRC VFIO
device scans for all the existing devices in the container and
triggers the probe function for those devices. However, some pieces
of code need to be protected by a lock, and that lock is created by
the vfio_fsl_mc_reflck_attach() function. This function searches for
the DPRC vdev (the one with the physical device) in the vfio group, so
the "parent" device must have been added to the group before the
child devices are probed.
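
To make the ordering constraint concrete, here is a small userspace model
of it. It is only a sketch under simplifying assumptions: model_dev,
model_group and the helper functions are hypothetical, the "lock" is
reduced to an integer id, and none of this is the actual vfio-fsl-mc code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GROUP_DEVS 4

/* Hypothetical, simplified device: lock_id models the shared reflck. */
struct model_dev {
	int is_container;         /* models a DPRC */
	int lock_id;              /* 0 means no lock attached yet */
	struct model_dev *parent; /* NULL for the container itself */
};

/* Hypothetical group: just the list of devices registered so far. */
struct model_group {
	struct model_dev *devs[MAX_GROUP_DEVS];
	size_t ndevs;
};

static int group_contains(const struct model_group *g,
			  const struct model_dev *d)
{
	for (size_t i = 0; i < g->ndevs; i++)
		if (g->devs[i] == d)
			return 1;
	return 0;
}

/* A child can only share the lock if its parent is findable in the group. */
static int attach_lock(const struct model_group *g, struct model_dev *d)
{
	if (d->is_container) {
		d->lock_id = 1;
		return 0;
	}
	if (!d->parent || !group_contains(g, d->parent))
		return -1; /* models the failed bind */
	d->lock_id = d->parent->lock_id;
	return 0;
}

static int register_dev(struct model_group *g, struct model_dev *d)
{
	if (g->ndevs == MAX_GROUP_DEVS)
		return -1;
	g->devs[g->ndevs++] = d;
	return 0;
}

/* Child probe, as triggered by the container scan. */
static int probe_child(struct model_group *g, struct model_dev *d)
{
	if (attach_lock(g, d))
		return -1;
	return register_dev(g, d);
}

/* scan_first != 0 models scanning before the parent is in the group. */
int demo_probe_order(int scan_first)
{
	struct model_group g = { .ndevs = 0 };
	struct model_dev parent = { .is_container = 1 };
	struct model_dev child = { .parent = &parent };

	attach_lock(&g, &parent);
	if (scan_first)
		return probe_child(&g, &child);

	register_dev(&g, &parent);
	return probe_child(&g, &child);
}
```

With scan_first set the child cannot find its parent and the probe fails;
with the parent registered first the child picks up the parent's lock.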


I made some changes on top of this series, and this is what they look
like. I hope I am not doing anything that violates the way VFIO is
designed.

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 3af3ca59478f..9b4c9356515a 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -578,22 +578,32 @@ static int vfio_fsl_mc_init_device(struct vfio_fsl_mc_device *vdev)
                 goto out_nc_unreg;
         }

-       ret = dprc_scan_container(mc_dev, false);
-       if (ret) {
-               dev_err(&mc_dev->dev, "VFIO_FSL_MC: Container scanning failed (%d)\n", ret);
-               goto out_dprc_cleanup;
-       }
-
         return 0;

-out_dprc_cleanup:
-       dprc_remove_devices(mc_dev, NULL, 0);
-       dprc_cleanup(mc_dev);
  out_nc_unreg:
         bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
         return ret;
  }

+static int vfio_fsl_mc_scan_container(struct vfio_fsl_mc_device *vdev)
+{
+       struct fsl_mc_device *mc_dev = vdev->mc_dev;
+       int ret;
+
+       /* non dprc devices do not scan for other devices */
+       if (is_fsl_mc_bus_dprc(mc_dev)) {
+               ret = dprc_scan_container(mc_dev, false);
+               if (ret) {
+                       dev_err(&mc_dev->dev, "VFIO_FSL_MC: Container scanning failed (%d)\n", ret);
+                       dprc_remove_devices(mc_dev, NULL, 0);
+                       return ret;
+               }
+       }
+
+       return 0;
+}
+
+
  static void vfio_fsl_uninit_device(struct vfio_fsl_mc_device *vdev)
  {
         struct fsl_mc_device *mc_dev = vdev->mc_dev;
@@ -642,9 +652,16 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
                 dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
                 goto out_device;
         }
+
+       ret = vfio_fsl_mc_scan_container(vdev);
+       if (ret)
+               goto out_group_dev;
+
         dev_set_drvdata(dev, vdev);
         return 0;

+out_group_dev:
+       vfio_unregister_group_dev(&vdev->vdev);
  out_device:
         vfio_fsl_uninit_device(vdev);
  out_reflck:


Thanks,
Diana

On 3/13/2021 2:55 AM, Jason Gunthorpe wrote:
> vfio_add_group_dev() must be called only after all of the private data in
> vdev is fully setup and ready, otherwise there could be races with user
> space instantiating a device file descriptor and starting to call ops.
> 
> For instance vfio_fsl_mc_reflck_attach() sets vdev->reflck and
> vfio_fsl_mc_open(), called by fops open, unconditionally derefs it, which
> will crash if things get out of order.
> 
> This driver started life with the right sequence, but three commits added
> stuff after vfio_add_group_dev().
> 
> Fixes: 2e0d29561f59 ("vfio/fsl-mc: Add irq infrastructure for fsl-mc devices")
> Fixes: f2ba7e8c947b ("vfio/fsl-mc: Added lock support in preparation for interrupt handling")
> Fixes: 704f5082d845 ("vfio/fsl-mc: Scan DPRC objects on vfio-fsl-mc driver bind")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/vfio/fsl-mc/vfio_fsl_mc.c | 43 ++++++++++++++++---------------
>   1 file changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> index f27e25112c4037..881849723b4dfb 100644
> --- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
> @@ -582,11 +582,21 @@ static int vfio_fsl_mc_init_device(struct vfio_fsl_mc_device *vdev)
>   	dprc_cleanup(mc_dev);
>   out_nc_unreg:
>   	bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
> -	vdev->nb.notifier_call = NULL;
> -
>   	return ret;
>   }
>   
> +static void vfio_fsl_uninit_device(struct vfio_fsl_mc_device *vdev)
> +{
> +	struct fsl_mc_device *mc_dev = vdev->mc_dev;
> +
> +	if (!is_fsl_mc_bus_dprc(mc_dev))
> +		return;
> +
> +	dprc_remove_devices(mc_dev, NULL, 0);
> +	dprc_cleanup(mc_dev);
> +	bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
> +}
> +
>   static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
>   {
>   	struct iommu_group *group;
> @@ -607,29 +617,27 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
>   	}
>   
>   	vdev->mc_dev = mc_dev;
> -
> -	ret = vfio_add_group_dev(dev, &vfio_fsl_mc_ops, vdev);
> -	if (ret) {
> -		dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
> -		goto out_group_put;
> -	}
> +	mutex_init(&vdev->igate);
>   
>   	ret = vfio_fsl_mc_reflck_attach(vdev);
>   	if (ret)
> -		goto out_group_dev;
> +		goto out_group_put;
>   
>   	ret = vfio_fsl_mc_init_device(vdev);
>   	if (ret)
>   		goto out_reflck;
>   
> -	mutex_init(&vdev->igate);
> -
> +	ret = vfio_add_group_dev(dev, &vfio_fsl_mc_ops, vdev);
> +	if (ret) {
> +		dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
> +		goto out_device;
> +	}
>   	return 0;
>   
> +out_device:
> +	vfio_fsl_uninit_device(vdev);
>   out_reflck:
>   	vfio_fsl_mc_reflck_put(vdev->reflck);
> -out_group_dev:
> -	vfio_del_group_dev(dev);
>   out_group_put:
>   	vfio_iommu_group_put(group, dev);
>   	return ret;
> @@ -646,16 +654,9 @@ static int vfio_fsl_mc_remove(struct fsl_mc_device *mc_dev)
>   
>   	mutex_destroy(&vdev->igate);
>   
> +	vfio_fsl_uninit_device(vdev);
>   	vfio_fsl_mc_reflck_put(vdev->reflck);
>   
> -	if (is_fsl_mc_bus_dprc(mc_dev)) {
> -		dprc_remove_devices(mc_dev, NULL, 0);
> -		dprc_cleanup(mc_dev);
> -	}
> -
> -	if (vdev->nb.notifier_call)
> -		bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
> -
>   	vfio_iommu_group_put(mc_dev->dev.iommu_group, dev);
>   
>   	return 0;
> 



* Re: [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
  2021-03-17 16:36   ` Diana Craciun OSS
@ 2021-03-17 22:59     ` Jason Gunthorpe
  0 siblings, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-17 22:59 UTC (permalink / raw)
  To: Diana Craciun OSS
  Cc: Cornelia Huck, kvm, Alex Williamson, Raj, Ashok, Bharat Bhushan,
	Dan Williams, Daniel Vetter, Eric Auger, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Laurentiu Tudor

On Wed, Mar 17, 2021 at 06:36:09PM +0200, Diana Craciun OSS wrote:
> Hi,
> 
> Thanks for finding this!
> 
> I tested the series and currently the binding to vfio fails. The reason is
> that it is assumed that the objects scan is done after vfio_add_group_dev.
> But at this point the vdev structure is completely initialized.
>
> I'll add some more context.
> 
> There are two types of FSL MC devices:
> - a DPRC device
> - regular devices
> 
> A DPRC is some kind of container of the other devices. The DPRC VFIO device
> is scanning for all the existing devices in the container and triggers the
> probe function for those devices.

Oh. It ends up recursively calling probe() under the same stack frame?
I don't feel good about that.

> However, there are some pieces of code
> that needs to be protected by a lock, lock that is created by
> vfio_fsl_mc_reflck_attach() function. This function is searching for the
> DPRC vdev (having the physical device) in the vfio group, so the "parent"
> device should have been added in the group before the child devices are
> probed.

Yes, I understood this part, but I didn't think it could be invoked
recursively from vfio_fsl_mc_init_device() :(

> I did some changes on top of these series and this is how they look like. I
> hope that I do not do something that violates the way the VFIO is designed.

Well, it is "ok" in that this is only about the reflck so it doesn't
appear to break the core's assumptions, but I don't like it at all.

I also have a later patch that revises the reflck search I now see I
will have to throw out.

I think it would be better to find the reflck entirely internally to
the driver than involving both the vfio and driver core in the
search. I will try to write that later

For now, this solution seems OK, I will fold it in, thanks

Jason


* Re: [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API
  2021-03-13  0:56 ` [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API Jason Gunthorpe
  2021-03-16  8:22   ` Tian, Kevin
  2021-03-17 12:08   ` Cornelia Huck
@ 2021-03-17 23:24   ` Max Gurtovoy
  2 siblings, 0 replies; 82+ messages in thread
From: Max Gurtovoy @ 2021-03-17 23:24 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, Jonathan Corbet,
	Diana Craciun, Eric Auger, kvm, Kirti Wankhede, linux-doc
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Tarun Gupta


On 3/13/2021 2:56 AM, Jason Gunthorpe wrote:
> There are no longer any users, so it can go away. Everything is using
> container_of now.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   Documentation/driver-api/vfio.rst            |  3 +--
>   drivers/vfio/fsl-mc/vfio_fsl_mc.c            |  5 +++--
>   drivers/vfio/mdev/vfio_mdev.c                |  2 +-
>   drivers/vfio/pci/vfio_pci.c                  |  2 +-
>   drivers/vfio/platform/vfio_platform_common.c |  2 +-
>   drivers/vfio/vfio.c                          | 12 +-----------
>   include/linux/vfio.h                         |  4 +---
>   7 files changed, 9 insertions(+), 21 deletions(-)


Looks good,

Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>



* Re: [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group
  2021-03-13  0:55 ` [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group Jason Gunthorpe
                     ` (2 preceding siblings ...)
  2021-03-16 11:59   ` Cornelia Huck
@ 2021-03-18  9:32   ` Auger Eric
  3 siblings, 0 replies; 82+ messages in thread
From: Auger Eric @ 2021-03-18  9:32 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

Hi Jason,

On 3/13/21 1:55 AM, Jason Gunthorpe wrote:
> The vfio_device->group value has a get obtained during
> vfio_add_group_dev() which gets moved from the stack to vfio_device->group
> in vfio_group_create_device().
> 
> The reference remains until we reach the end of vfio_del_group_dev() when
> it is put back.
> 
> Thus anything that already has a kref on the vfio_device is guaranteed a
> valid group pointer. Remove all the extra reference traffic.
> 
> It is tricky to see, but the get at the start of vfio_del_group_dev() is
> actually pairing with the put hidden inside vfio_device_put() a few lines
> below.
> 
> A later patch merges vfio_group_create_device() into vfio_add_group_dev()
> which makes the ownership and error flow on the create side easier to
> follow.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
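
The reference hand-over described in the commit message can be modelled in
a few lines of userspace C. This is only an illustrative sketch:
model_group and model_device are hypothetical stand-ins, and the reference
count is a plain integer rather than a kref.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the group and device; refs is a plain counter. */
struct model_group { int refs; };
struct model_device { struct model_group *group; };

static void group_get(struct model_group *g) { g->refs++; }
static void group_put(struct model_group *g) { g->refs--; }

/* The caller's group reference is moved into the device: no extra get here. */
static void device_create(struct model_device *d, struct model_group *g)
{
	d->group = g;
}

/* The moved reference is put back exactly once, at the end of deletion. */
static void device_delete(struct model_device *d)
{
	struct model_group *g = d->group;

	d->group = NULL;
	group_put(g);
}

/* Returns the group refcount while the device exists, or after deletion. */
int demo_group_refs(int after_delete)
{
	struct model_group g = { .refs = 0 };
	struct model_device d;

	group_get(&g);         /* get obtained on the stack during "add" */
	device_create(&d, &g); /* ownership moves to d.group */
	if (!after_delete)
		return g.refs;

	device_delete(&d);
	return g.refs;
}
```

One get and one put bracket the whole device lifetime, so anything that
can reach the device can rely on d.group without taking its own reference.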

> ---
>  drivers/vfio/vfio.c | 21 ++-------------------
>  1 file changed, 2 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 38779e6fd80cb4..15d8e678e5563a 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -546,14 +546,12 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
>  
>  	kref_init(&device->kref);
>  	device->dev = dev;
> +	/* Our reference on group is moved to the device */
>  	device->group = group;
>  	device->ops = ops;
>  	device->device_data = device_data;
>  	dev_set_drvdata(dev, device);
>  
> -	/* No need to get group_lock, caller has group reference */
> -	vfio_group_get(group);
> -
>  	mutex_lock(&group->device_lock);
>  	list_add(&device->group_next, &group->device_list);
>  	group->dev_counter++;
> @@ -585,13 +583,11 @@ void vfio_device_put(struct vfio_device *device)
>  {
>  	struct vfio_group *group = device->group;
>  	kref_put_mutex(&device->kref, vfio_device_release, &group->device_lock);
> -	vfio_group_put(group);
>  }
>  EXPORT_SYMBOL_GPL(vfio_device_put);
>  
>  static void vfio_device_get(struct vfio_device *device)
>  {
> -	vfio_group_get(device->group);
>  	kref_get(&device->kref);
>  }
>  
> @@ -841,14 +837,6 @@ int vfio_add_group_dev(struct device *dev,
>  		vfio_group_put(group);
>  		return PTR_ERR(device);
>  	}
> -
> -	/*
> -	 * Drop all but the vfio_device reference.  The vfio_device holds
> -	 * a reference to the vfio_group, which holds a reference to the
> -	 * iommu_group.
> -	 */
> -	vfio_group_put(group);
> -
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(vfio_add_group_dev);
> @@ -928,12 +916,6 @@ void *vfio_del_group_dev(struct device *dev)
>  	unsigned int i = 0;
>  	bool interrupted = false;
>  
> -	/*
> -	 * The group exists so long as we have a device reference.  Get
> -	 * a group reference and use it to scan for the device going away.
> -	 */
> -	vfio_group_get(group);
> -
>  	/*
>  	 * When the device is removed from the group, the group suddenly
>  	 * becomes non-viable; the device has a driver (until the unbind
> @@ -1008,6 +990,7 @@ void *vfio_del_group_dev(struct device *dev)
>  	if (list_empty(&group->device_list))
>  		wait_event(group->container_q, !group->container);
>  
> +	/* Matches the get in vfio_group_create_device() */
>  	vfio_group_put(group);
>  
>  	return device_data;
> 



* Re: [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device
  2021-03-13  0:55 ` [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device Jason Gunthorpe
  2021-03-16  7:38   ` Tian, Kevin
@ 2021-03-18 13:10   ` Auger Eric
  1 sibling, 0 replies; 82+ messages in thread
From: Auger Eric @ 2021-03-18 13:10 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

Hi,
On 3/13/21 1:55 AM, Jason Gunthorpe wrote:
> The vfio_device is using a 'sleep until all refs go to zero' pattern for
> its lifetime, but it is indirectly coded by repeatedly scanning the group
> list waiting for the device to be removed on its own.
> 
> Switch this around to be a direct representation, use a refcount to count
> the number of places that are blocking destruction and sleep directly on a
> completion until that counter goes to zero. kfree the device after other
> accesses have been excluded in vfio_del_group_dev(). This is a fairly
> common Linux idiom.
> 
> Due to this we can now remove kref_put_mutex(), which is very rarely used
> in the kernel. Here it is being used to prevent a zero ref device from
> being seen in the group list. Instead allow the zero ref device to
> continue to exist in the device_list and use refcount_inc_not_zero() to
> exclude it once refs go to zero.
> 
> This patch is organized so the next patch will be able to alter the API to
> allow drivers to provide the kfree.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
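
As a rough illustration of the lifetime pattern in the commit message,
the sketch below models "get only if not zero" and "signal when the count
reaches zero" in single-threaded userspace C. It assumes C11 atomics in
place of refcount_t and a boolean flag in place of struct completion;
model_device and its helpers are hypothetical names, not the vfio.c code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device: completed stands in for a struct completion. */
struct model_device {
	atomic_int refcount;
	bool completed;
};

static void model_init(struct model_device *d)
{
	atomic_init(&d->refcount, 1);
	d->completed = false;
}

/* Like refcount_inc_not_zero(): never revive a count that has hit zero. */
static bool model_try_get(struct model_device *d)
{
	int old = atomic_load(&d->refcount);

	while (old != 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&d->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Like refcount_dec_and_test() followed by complete(). */
static void model_put(struct model_device *d)
{
	if (atomic_fetch_sub(&d->refcount, 1) == 1)
		d->completed = true;
}

/* Walks one open/remove/close sequence; returns 0 if every step behaves. */
int demo_lifetime(void)
{
	struct model_device d;

	model_init(&d);
	if (!model_try_get(&d)) /* a user takes a reference: count is 2 */
		return -1;
	model_put(&d);          /* remove path drops the initial reference */
	if (d.completed)        /* still in use, so the remover would sleep */
		return -2;
	model_put(&d);          /* last user goes away: count hits zero */
	if (!d.completed)
		return -3;
	if (model_try_get(&d))  /* a zero-ref device cannot be picked up */
		return -4;
	return 0;
}
```

The last check is the point of try_get: the device may stay on the list
with a zero count, but a lookup can no longer obtain it.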

> ---
>  drivers/vfio/vfio.c | 79 ++++++++++++++-------------------------------
>  1 file changed, 25 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 15d8e678e5563a..32660e8a69ae20 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -46,7 +46,6 @@ static struct vfio {
>  	struct mutex			group_lock;
>  	struct cdev			group_cdev;
>  	dev_t				group_devt;
> -	wait_queue_head_t		release_q;
>  } vfio;
>  
>  struct vfio_iommu_driver {
> @@ -91,7 +90,8 @@ struct vfio_group {
>  };
>  
>  struct vfio_device {
> -	struct kref			kref;
> +	refcount_t			refcount;
> +	struct completion		comp;
>  	struct device			*dev;
>  	const struct vfio_device_ops	*ops;
>  	struct vfio_group		*group;
> @@ -544,7 +544,8 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
>  	if (!device)
>  		return ERR_PTR(-ENOMEM);
>  
> -	kref_init(&device->kref);
> +	refcount_set(&device->refcount, 1);
> +	init_completion(&device->comp);
>  	device->dev = dev;
>  	/* Our reference on group is moved to the device */
>  	device->group = group;
> @@ -560,35 +561,17 @@ struct vfio_device *vfio_group_create_device(struct vfio_group *group,
>  	return device;
>  }
>  
> -static void vfio_device_release(struct kref *kref)
> -{
> -	struct vfio_device *device = container_of(kref,
> -						  struct vfio_device, kref);
> -	struct vfio_group *group = device->group;
> -
> -	list_del(&device->group_next);
> -	group->dev_counter--;
> -	mutex_unlock(&group->device_lock);
> -
> -	dev_set_drvdata(device->dev, NULL);
> -
> -	kfree(device);
> -
> -	/* vfio_del_group_dev may be waiting for this device */
> -	wake_up(&vfio.release_q);
> -}
> -
>  /* Device reference always implies a group reference */
>  void vfio_device_put(struct vfio_device *device)
>  {
> -	struct vfio_group *group = device->group;
> -	kref_put_mutex(&device->kref, vfio_device_release, &group->device_lock);
> +	if (refcount_dec_and_test(&device->refcount))
> +		complete(&device->comp);
>  }
>  EXPORT_SYMBOL_GPL(vfio_device_put);
>  
> -static void vfio_device_get(struct vfio_device *device)
> +static bool vfio_device_try_get(struct vfio_device *device)
>  {
> -	kref_get(&device->kref);
> +	return refcount_inc_not_zero(&device->refcount);
>  }
>  
>  static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
> @@ -598,8 +581,7 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
>  
>  	mutex_lock(&group->device_lock);
>  	list_for_each_entry(device, &group->device_list, group_next) {
> -		if (device->dev == dev) {
> -			vfio_device_get(device);
> +		if (device->dev == dev && vfio_device_try_get(device)) {
>  			mutex_unlock(&group->device_lock);
>  			return device;
>  		}
> @@ -883,9 +865,8 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
>  			ret = !strcmp(dev_name(it->dev), buf);
>  		}
>  
> -		if (ret) {
> +		if (ret && vfio_device_try_get(it)) {
>  			device = it;
> -			vfio_device_get(device);
>  			break;
>  		}
>  	}
> @@ -908,13 +889,13 @@ EXPORT_SYMBOL_GPL(vfio_device_data);
>   * removed.  Open file descriptors for the device... */
>  void *vfio_del_group_dev(struct device *dev)
>  {
> -	DEFINE_WAIT_FUNC(wait, woken_wake_function);
>  	struct vfio_device *device = dev_get_drvdata(dev);
>  	struct vfio_group *group = device->group;
>  	void *device_data = device->device_data;
>  	struct vfio_unbound_dev *unbound;
>  	unsigned int i = 0;
>  	bool interrupted = false;
> +	long rc;
>  
>  	/*
>  	 * When the device is removed from the group, the group suddenly
> @@ -935,32 +916,18 @@ void *vfio_del_group_dev(struct device *dev)
>  	WARN_ON(!unbound);
>  
>  	vfio_device_put(device);
> -
> -	/*
> -	 * If the device is still present in the group after the above
> -	 * 'put', then it is in use and we need to request it from the
> -	 * bus driver.  The driver may in turn need to request the
> -	 * device from the user.  We send the request on an arbitrary
> -	 * interval with counter to allow the driver to take escalating
> -	 * measures to release the device if it has the ability to do so.
> -	 */
> -	add_wait_queue(&vfio.release_q, &wait);
> -
> -	do {
> -		device = vfio_group_get_device(group, dev);
> -		if (!device)
> -			break;
> -
> +	rc = try_wait_for_completion(&device->comp);
> +	while (rc <= 0) {
>  		if (device->ops->request)
>  			device->ops->request(device_data, i++);
>  
> -		vfio_device_put(device);
> -
>  		if (interrupted) {
> -			wait_woken(&wait, TASK_UNINTERRUPTIBLE, HZ * 10);
> +			rc = wait_for_completion_timeout(&device->comp,
> +							 HZ * 10);
>  		} else {
> -			wait_woken(&wait, TASK_INTERRUPTIBLE, HZ * 10);
> -			if (signal_pending(current)) {
> +			rc = wait_for_completion_interruptible_timeout(
> +				&device->comp, HZ * 10);
> +			if (rc < 0) {
>  				interrupted = true;
>  				dev_warn(dev,
>  					 "Device is currently in use, task"
> @@ -969,10 +936,13 @@ void *vfio_del_group_dev(struct device *dev)
>  					 current->comm, task_pid_nr(current));
>  			}
>  		}
> +	}
>  
> -	} while (1);
> +	mutex_lock(&group->device_lock);
> +	list_del(&device->group_next);
> +	group->dev_counter--;
> +	mutex_unlock(&group->device_lock);
>  
> -	remove_wait_queue(&vfio.release_q, &wait);
>  	/*
>  	 * In order to support multiple devices per group, devices can be
>  	 * plucked from the group while other devices in the group are still
> @@ -992,6 +962,8 @@ void *vfio_del_group_dev(struct device *dev)
>  
>  	/* Matches the get in vfio_group_create_device() */
>  	vfio_group_put(group);
> +	dev_set_drvdata(dev, NULL);
> +	kfree(device);
>  
>  	return device_data;
>  }
> @@ -2362,7 +2334,6 @@ static int __init vfio_init(void)
>  	mutex_init(&vfio.iommu_drivers_lock);
>  	INIT_LIST_HEAD(&vfio.group_list);
>  	INIT_LIST_HEAD(&vfio.iommu_drivers_list);
> -	init_waitqueue_head(&vfio.release_q);
>  
>  	ret = misc_register(&vfio_dev);
>  	if (ret) {
> 



* Re: [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops
  2021-03-13  0:55 ` [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops Jason Gunthorpe
                     ` (2 preceding siblings ...)
  2021-03-16 12:54   ` Max Gurtovoy
@ 2021-03-18 13:18   ` Auger Eric
  3 siblings, 0 replies; 82+ messages in thread
From: Auger Eric @ 2021-03-18 13:18 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, Jonathan Corbet,
	kvm, linux-doc
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu Yi L

Hi,
On 3/13/21 1:55 AM, Jason Gunthorpe wrote:
> This makes the struct vfio_pci_device part of the public interface so it
> can be used with container_of and so forth, as is typical for a Linux
> subsystem.
> 
> This is the first step to bring some type-safety to the vfio interface by
> allowing the replacement of 'void *' and 'struct device *' inputs with a
> simple and clear 'struct vfio_pci_device *'
> 
> For now the self-allocating vfio_add_group_dev() interface is kept so each
> user can be updated as a separate patch.
> 
> The expected usage pattern is
> 
>   driver core probe() function:
>      my_device = kzalloc(sizeof(*my_device), GFP_KERNEL);
>      vfio_init_group_dev(&my_device->vdev, dev, ops, my_device);
>      /* other driver specific prep */
>      vfio_register_group_dev(&my_device->vdev);
>      dev_set_drvdata(dev, my_device);
> 
>   driver core remove() function:
>      my_device = dev_get_drvdata(dev);
>      vfio_unregister_group_dev(&my_device->vdev);
>      /* other driver specific tear down */
>      kfree(my_device);
> 
> This allows the driver to use the drvdata and vfio_device to go
> to/from its own data.
> 
> The pattern also makes it clear that vfio_register_group_dev() must be
> last in the sequence, as once it is called the core code can immediately
> start calling ops. The init/register gap is provided to allow for the
> driver to do setup before ops can be called and thus avoid races.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
With the previously mentioned commit message and comment fixes,

Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
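
The init/register split in the expected usage pattern can be sketched in
userspace as follows. This is a mock under stated assumptions:
model_core_dev, model_init_dev(), model_register_dev() and
model_unregister_dev() are hypothetical stand-ins for the vfio_device
calls named in the commit message, and calloc()/free() stand in for
kzalloc()/kfree().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_core_dev;

struct model_ops {
	int (*open)(struct model_core_dev *core);
};

/* Hypothetical core object; registered marks when ops may be called. */
struct model_core_dev {
	const struct model_ops *ops;
	int registered;
};

/* Hypothetical driver structure embedding the core object. */
struct my_device {
	int ready;
	struct model_core_dev vdev;
};

static void model_init_dev(struct model_core_dev *core,
			   const struct model_ops *ops)
{
	core->ops = ops;
	core->registered = 0;
}

static int model_register_dev(struct model_core_dev *core)
{
	core->registered = 1; /* from here on ops may be invoked */
	return 0;
}

static void model_unregister_dev(struct model_core_dev *core)
{
	core->registered = 0;
}

/* Fails if it is ever called before the driver finished its setup. */
static int my_open(struct model_core_dev *core)
{
	struct my_device *my = container_of(core, struct my_device, vdev);

	return my->ready ? 0 : -1;
}

static const struct model_ops my_ops = { .open = my_open };

static struct my_device *my_probe(void)
{
	struct my_device *my = calloc(1, sizeof(*my));

	if (!my)
		return NULL;
	model_init_dev(&my->vdev, &my_ops);
	my->ready = 1; /* driver-specific prep in the init/register gap */
	if (model_register_dev(&my->vdev)) { /* registration comes last */
		free(my);
		return NULL;
	}
	return my;
}

static void my_remove(struct my_device *my)
{
	model_unregister_dev(&my->vdev);
	free(my); /* the driver, not the core, frees the memory */
}

/* Probes, calls open through the core pointer, removes; 0 on success. */
int demo_init_register(void)
{
	struct my_device *my = my_probe();
	int ret;

	if (!my)
		return -1;
	ret = my->vdev.registered ? my->vdev.ops->open(&my->vdev) : -1;
	my_remove(my);
	return ret;
}
```

Because registration is the final step of my_probe(), the open callback
can never observe a half-initialized my_device in this model.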

> ---
>  Documentation/driver-api/vfio.rst |  31 ++++----
>  drivers/vfio/vfio.c               | 123 ++++++++++++++++--------------
>  include/linux/vfio.h              |  16 ++++
>  3 files changed, 98 insertions(+), 72 deletions(-)
> 
> diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
> index f1a4d3c3ba0bb1..d3a02300913a7f 100644
> --- a/Documentation/driver-api/vfio.rst
> +++ b/Documentation/driver-api/vfio.rst
> @@ -249,18 +249,23 @@ VFIO bus driver API
>  
>  VFIO bus drivers, such as vfio-pci make use of only a few interfaces
>  into VFIO core.  When devices are bound and unbound to the driver,
> -the driver should call vfio_add_group_dev() and vfio_del_group_dev()
> -respectively::
> -
> -	extern int vfio_add_group_dev(struct device *dev,
> -				      const struct vfio_device_ops *ops,
> -				      void *device_data);
> -
> -	extern void *vfio_del_group_dev(struct device *dev);
> -
> -vfio_add_group_dev() indicates to the core to begin tracking the
> -iommu_group of the specified dev and register the dev as owned by
> -a VFIO bus driver.  The driver provides an ops structure for callbacks
> +the driver should call vfio_register_group_dev() and
> +vfio_unregister_group_dev() respectively::
> +
> +	void vfio_init_group_dev(struct vfio_device *device,
> +				struct device *dev,
> +				const struct vfio_device_ops *ops,
> +				void *device_data);
> +	int vfio_register_group_dev(struct vfio_device *device);
> +	void vfio_unregister_group_dev(struct vfio_device *device);
> +
> +The driver should embed the vfio_device in its own structure and call
> +vfio_init_group_dev() to pre-configure it before going to registration.
> +vfio_register_group_dev() indicates to the core to begin tracking the
> +iommu_group of the specified dev and register the dev as owned by a VFIO bus
> +driver. Once vfio_register_group_dev() returns it is possible for userspace to
> +start accessing the driver, thus the driver should ensure it is completely
> +ready before calling it. The driver provides an ops structure for callbacks
>  similar to a file operations structure::
>  
>  	struct vfio_device_ops {
> @@ -276,7 +281,7 @@ similar to a file operations structure::
>  	};
>  
>  Each function is passed the device_data that was originally registered
> -in the vfio_add_group_dev() call above.  This allows the bus driver
> +in the vfio_register_group_dev() call above.  This allows the bus driver
>  an easy place to store its opaque, private data.  The open/release
>  callbacks are issued when a new file descriptor is created for a
>  device (via VFIO_GROUP_GET_DEVICE_FD).  The ioctl interface provides
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 32660e8a69ae20..cfa06ae3b9018b 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -89,16 +89,6 @@ struct vfio_group {
>  	struct blocking_notifier_head	notifier;
>  };
>  
> -struct vfio_device {
> -	refcount_t			refcount;
> -	struct completion		comp;
> -	struct device			*dev;
> -	const struct vfio_device_ops	*ops;
> -	struct vfio_group		*group;
> -	struct list_head		group_next;
> -	void				*device_data;
> -};
> -
>  #ifdef CONFIG_VFIO_NOIOMMU
>  static bool noiommu __read_mostly;
>  module_param_named(enable_unsafe_noiommu_mode,
> @@ -532,35 +522,6 @@ static struct vfio_group *vfio_group_get_from_dev(struct device *dev)
>  /**
>   * Device objects - create, release, get, put, search
>   */
> -static
> -struct vfio_device *vfio_group_create_device(struct vfio_group *group,
> -					     struct device *dev,
> -					     const struct vfio_device_ops *ops,
> -					     void *device_data)
> -{
> -	struct vfio_device *device;
> -
> -	device = kzalloc(sizeof(*device), GFP_KERNEL);
> -	if (!device)
> -		return ERR_PTR(-ENOMEM);
> -
> -	refcount_set(&device->refcount, 1);
> -	init_completion(&device->comp);
> -	device->dev = dev;
> -	/* Our reference on group is moved to the device */
> -	device->group = group;
> -	device->ops = ops;
> -	device->device_data = device_data;
> -	dev_set_drvdata(dev, device);
> -
> -	mutex_lock(&group->device_lock);
> -	list_add(&device->group_next, &group->device_list);
> -	group->dev_counter++;
> -	mutex_unlock(&group->device_lock);
> -
> -	return device;
> -}
> -
>  /* Device reference always implies a group reference */
>  void vfio_device_put(struct vfio_device *device)
>  {
> @@ -779,14 +740,23 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
>  /**
>   * VFIO driver API
>   */
> -int vfio_add_group_dev(struct device *dev,
> -		       const struct vfio_device_ops *ops, void *device_data)
> +void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
> +			 const struct vfio_device_ops *ops, void *device_data)
> +{
> +	init_completion(&device->comp);
> +	device->dev = dev;
> +	device->ops = ops;
> +	device->device_data = device_data;
> +}
> +EXPORT_SYMBOL_GPL(vfio_init_group_dev);
> +
> +int vfio_register_group_dev(struct vfio_device *device)
>  {
> +	struct vfio_device *existing_device;
>  	struct iommu_group *iommu_group;
>  	struct vfio_group *group;
> -	struct vfio_device *device;
>  
> -	iommu_group = iommu_group_get(dev);
> +	iommu_group = iommu_group_get(device->dev);
>  	if (!iommu_group)
>  		return -EINVAL;
>  
> @@ -805,21 +775,50 @@ int vfio_add_group_dev(struct device *dev,
>  		iommu_group_put(iommu_group);
>  	}
>  
> -	device = vfio_group_get_device(group, dev);
> -	if (device) {
> -		dev_WARN(dev, "Device already exists on group %d\n",
> +	existing_device = vfio_group_get_device(group, device->dev);
> +	if (existing_device) {
> +		dev_WARN(device->dev, "Device already exists on group %d\n",
>  			 iommu_group_id(iommu_group));
> -		vfio_device_put(device);
> +		vfio_device_put(existing_device);
>  		vfio_group_put(group);
>  		return -EBUSY;
>  	}
>  
> -	device = vfio_group_create_device(group, dev, ops, device_data);
> -	if (IS_ERR(device)) {
> -		vfio_group_put(group);
> -		return PTR_ERR(device);
> -	}
> +	/* Our reference on group is moved to the device */
> +	device->group = group;
> +
> +	/* Refcounting can't start until the driver calls register */
> +	refcount_set(&device->refcount, 1);
> +
> +	mutex_lock(&group->device_lock);
> +	list_add(&device->group_next, &group->device_list);
> +	group->dev_counter++;
> +	mutex_unlock(&group->device_lock);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(vfio_register_group_dev);
> +
> +int vfio_add_group_dev(struct device *dev, const struct vfio_device_ops *ops,
> +		       void *device_data)
> +{
> +	struct vfio_device *device;
> +	int ret;
> +
> +	device = kzalloc(sizeof(*device), GFP_KERNEL);
> +	if (!device)
> +		return -ENOMEM;
> +
> +	vfio_init_group_dev(device, dev, ops, device_data);
> +	ret = vfio_register_group_dev(device);
> +	if (ret)
> +		goto err_kfree;
> +	dev_set_drvdata(dev, device);
>  	return 0;
> +
> +err_kfree:
> +	kfree(device);
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(vfio_add_group_dev);
>  
> @@ -887,11 +886,9 @@ EXPORT_SYMBOL_GPL(vfio_device_data);
>  /*
>   * Decrement the device reference count and wait for the device to be
>   * removed.  Open file descriptors for the device... */
> -void *vfio_del_group_dev(struct device *dev)
> +void vfio_unregister_group_dev(struct vfio_device *device)
>  {
> -	struct vfio_device *device = dev_get_drvdata(dev);
>  	struct vfio_group *group = device->group;
> -	void *device_data = device->device_data;
>  	struct vfio_unbound_dev *unbound;
>  	unsigned int i = 0;
>  	bool interrupted = false;
> @@ -908,7 +905,7 @@ void *vfio_del_group_dev(struct device *dev)
>  	 */
>  	unbound = kzalloc(sizeof(*unbound), GFP_KERNEL);
>  	if (unbound) {
> -		unbound->dev = dev;
> +		unbound->dev = device->dev;
>  		mutex_lock(&group->unbound_lock);
>  		list_add(&unbound->unbound_next, &group->unbound_list);
>  		mutex_unlock(&group->unbound_lock);
> @@ -919,7 +916,7 @@ void *vfio_del_group_dev(struct device *dev)
>  	rc = try_wait_for_completion(&device->comp);
>  	while (rc <= 0) {
>  		if (device->ops->request)
> -			device->ops->request(device_data, i++);
> +			device->ops->request(device->device_data, i++);
>  
>  		if (interrupted) {
>  			rc = wait_for_completion_timeout(&device->comp,
> @@ -929,7 +926,7 @@ void *vfio_del_group_dev(struct device *dev)
>  				&device->comp, HZ * 10);
>  			if (rc < 0) {
>  				interrupted = true;
> -				dev_warn(dev,
> +				dev_warn(device->dev,
>  					 "Device is currently in use, task"
>  					 " \"%s\" (%d) "
>  					 "blocked until device is released",
> @@ -962,9 +959,17 @@ void *vfio_del_group_dev(struct device *dev)
>  
>  	/* Matches the get in vfio_group_create_device() */
>  	vfio_group_put(group);
> +}
> +EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
> +
> +void *vfio_del_group_dev(struct device *dev)
> +{
> +	struct vfio_device *device = dev_get_drvdata(dev);
> +	void *device_data = device->device_data;
> +
> +	vfio_unregister_group_dev(device);
>  	dev_set_drvdata(dev, NULL);
>  	kfree(device);
> -
>  	return device_data;
>  }
>  EXPORT_SYMBOL_GPL(vfio_del_group_dev);
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index b7e18bde5aa8b3..ad8b579d67d34a 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -15,6 +15,18 @@
>  #include <linux/poll.h>
>  #include <uapi/linux/vfio.h>
>  
> +struct vfio_device {
> +	struct device *dev;
> +	const struct vfio_device_ops *ops;
> +	struct vfio_group *group;
> +
> +	/* Members below here are private, not for driver use */
> +	refcount_t refcount;
> +	struct completion comp;
> +	struct list_head group_next;
> +	void *device_data;
> +};
> +
>  /**
>   * struct vfio_device_ops - VFIO bus driver device callbacks
>   *
> @@ -48,11 +60,15 @@ struct vfio_device_ops {
>  extern struct iommu_group *vfio_iommu_group_get(struct device *dev);
>  extern void vfio_iommu_group_put(struct iommu_group *group, struct device *dev);
>  
> +void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
> +			 const struct vfio_device_ops *ops, void *device_data);
> +int vfio_register_group_dev(struct vfio_device *device);
>  extern int vfio_add_group_dev(struct device *dev,
>  			      const struct vfio_device_ops *ops,
>  			      void *device_data);
>  
>  extern void *vfio_del_group_dev(struct device *dev);
> +void vfio_unregister_group_dev(struct vfio_device *device);
>  extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
>  extern void vfio_device_put(struct vfio_device *device);
>  extern void *vfio_device_data(struct vfio_device *device);
> 
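
For readers following the init/register split in the diff above, here is a
minimal userspace sketch of the two-phase pattern. It is not kernel code:
core_init(), core_register(), core_unregister() and struct drv_dev are
hypothetical stand-ins for vfio_init_group_dev(), vfio_register_group_dev(),
vfio_unregister_group_dev() and a driver structure. The simulate_busy flag is
an assumption that mimics the -EBUSY path.

```c
#include <errno.h>
#include <stdlib.h>

/* Reduced stand-in for the core object; fields are illustrative only. */
struct core_dev {
	const void *ops;
	int refcount;
	int registered;
};

/* Phase 1: cannot fail, allocates nothing, does not publish the object. */
static void core_init(struct core_dev *c, const void *ops)
{
	c->ops = ops;
	c->refcount = 0;
	c->registered = 0;
}

/* Phase 2: may fail; refcounting only starts once this succeeds. */
static int core_register(struct core_dev *c, int simulate_busy)
{
	if (simulate_busy)
		return -EBUSY;
	c->refcount = 1;
	c->registered = 1;
	return 0;
}

static void core_unregister(struct core_dev *c)
{
	c->registered = 0;
	c->refcount = 0;
}

/* Hypothetical driver structure that owns the memory for the core object. */
struct drv_dev {
	struct core_dev core;
	int ready;
};

static struct drv_dev *drv_probe(int simulate_busy, int *err)
{
	struct drv_dev *d = calloc(1, sizeof(*d));

	if (!d) {
		*err = -ENOMEM;
		return NULL;
	}
	core_init(&d->core, NULL);
	d->ready = 1;		/* finish all private setup before publishing */
	*err = core_register(&d->core, simulate_busy);
	if (*err) {
		free(d);	/* the caller, not the core, frees on failure */
		return NULL;
	}
	return d;
}

static void drv_remove(struct drv_dev *d)
{
	core_unregister(&d->core);
	free(d);
}
```

The point of the split is that the driver allocates and frees the memory,
so the core never has to kzalloc()/kfree() on the driver's behalf.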



* Re: [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:55 ` [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
  2021-03-16 16:22   ` Cornelia Huck
  2021-03-16 21:33   ` Alex Williamson
@ 2021-03-18 13:40   ` Auger Eric
  2 siblings, 0 replies; 82+ messages in thread
From: Auger Eric @ 2021-03-18 13:40 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

Hi,

On 3/13/21 1:55 AM, Jason Gunthorpe wrote:
> platform already allocates a struct vfio_platform_device with exactly
> the same lifetime as vfio_device, switch to the new API and embed
> vfio_device in vfio_platform_device.

Without "kfree(vdev->name);" pointed out by Alex,

Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>


Thanks

Eric

> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/platform/vfio_amba.c             |  8 ++++---
>  drivers/vfio/platform/vfio_platform.c         | 21 ++++++++---------
>  drivers/vfio/platform/vfio_platform_common.c  | 23 +++++++------------
>  drivers/vfio/platform/vfio_platform_private.h |  5 ++--
>  4 files changed, 26 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
> index 3626c21501017e..f970eb2a999f29 100644
> --- a/drivers/vfio/platform/vfio_amba.c
> +++ b/drivers/vfio/platform/vfio_amba.c
> @@ -66,16 +66,18 @@ static int vfio_amba_probe(struct amba_device *adev, const struct amba_id *id)
>  	if (ret) {
>  		kfree(vdev->name);
>  		kfree(vdev);
> +		return ret;
>  	}
>  
> -	return ret;
> +	dev_set_drvdata(&adev->dev, vdev);
> +	return 0;
>  }
>  
>  static void vfio_amba_remove(struct amba_device *adev)
>  {
> -	struct vfio_platform_device *vdev =
> -		vfio_platform_remove_common(&adev->dev);
> +	struct vfio_platform_device *vdev = dev_get_drvdata(&adev->dev);
>  
> +	vfio_platform_remove_common(vdev);
>  	kfree(vdev->name);
>  	kfree(vdev);
>  }
> diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
> index 9fb6818cea12cb..f7b3f64ecc7f6c 100644
> --- a/drivers/vfio/platform/vfio_platform.c
> +++ b/drivers/vfio/platform/vfio_platform.c
> @@ -54,23 +54,22 @@ static int vfio_platform_probe(struct platform_device *pdev)
>  	vdev->reset_required = reset_required;
>  
>  	ret = vfio_platform_probe_common(vdev, &pdev->dev);
> -	if (ret)
> +	if (ret) {
>  		kfree(vdev);
> -
> -	return ret;
> +		return ret;
> +	}
> +	dev_set_drvdata(&pdev->dev, vdev);
> +	return 0;
>  }
>  
>  static int vfio_platform_remove(struct platform_device *pdev)
>  {
> -	struct vfio_platform_device *vdev;
> -
> -	vdev = vfio_platform_remove_common(&pdev->dev);
> -	if (vdev) {
> -		kfree(vdev);
> -		return 0;
> -	}
> +	struct vfio_platform_device *vdev = dev_get_drvdata(&pdev->dev);
>  
> -	return -EINVAL;
> +	vfio_platform_remove_common(vdev);
> +	kfree(vdev->name);
> +	kfree(vdev);
> +	return 0;
>  }
>  
>  static struct platform_driver vfio_platform_driver = {
> diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
> index fb4b385191f288..6eb749250ee41c 100644
> --- a/drivers/vfio/platform/vfio_platform_common.c
> +++ b/drivers/vfio/platform/vfio_platform_common.c
> @@ -659,8 +659,7 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
>  	struct iommu_group *group;
>  	int ret;
>  
> -	if (!vdev)
> -		return -EINVAL;
> +	vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops, vdev);
>  
>  	ret = vfio_platform_acpi_probe(vdev, dev);
>  	if (ret)
> @@ -685,13 +684,13 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
>  		goto put_reset;
>  	}
>  
> -	ret = vfio_add_group_dev(dev, &vfio_platform_ops, vdev);
> +	ret = vfio_register_group_dev(&vdev->vdev);
>  	if (ret)
>  		goto put_iommu;
>  
>  	mutex_init(&vdev->igate);
>  
> -	pm_runtime_enable(vdev->device);
> +	pm_runtime_enable(dev);
>  	return 0;
>  
>  put_iommu:
> @@ -702,19 +701,13 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
>  }
>  EXPORT_SYMBOL_GPL(vfio_platform_probe_common);
>  
> -struct vfio_platform_device *vfio_platform_remove_common(struct device *dev)
> +void vfio_platform_remove_common(struct vfio_platform_device *vdev)
>  {
> -	struct vfio_platform_device *vdev;
> -
> -	vdev = vfio_del_group_dev(dev);
> +	vfio_unregister_group_dev(&vdev->vdev);
>  
> -	if (vdev) {
> -		pm_runtime_disable(vdev->device);
> -		vfio_platform_put_reset(vdev);
> -		vfio_iommu_group_put(dev->iommu_group, dev);
> -	}
> -
> -	return vdev;
> +	pm_runtime_disable(vdev->device);
> +	vfio_platform_put_reset(vdev);
> +	vfio_iommu_group_put(vdev->vdev.dev->iommu_group, vdev->vdev.dev);
>  }
>  EXPORT_SYMBOL_GPL(vfio_platform_remove_common);
>  
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index 289089910643ac..a5ba82c8cbc354 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -9,6 +9,7 @@
>  
>  #include <linux/types.h>
>  #include <linux/interrupt.h>
> +#include <linux/vfio.h>
>  
>  #define VFIO_PLATFORM_OFFSET_SHIFT   40
>  #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
> @@ -42,6 +43,7 @@ struct vfio_platform_region {
>  };
>  
>  struct vfio_platform_device {
> +	struct vfio_device		vdev;
>  	struct vfio_platform_region	*regions;
>  	u32				num_regions;
>  	struct vfio_platform_irq	*irqs;
> @@ -80,8 +82,7 @@ struct vfio_platform_reset_node {
>  
>  extern int vfio_platform_probe_common(struct vfio_platform_device *vdev,
>  				      struct device *dev);
> -extern struct vfio_platform_device *vfio_platform_remove_common
> -				     (struct device *dev);
> +void vfio_platform_remove_common(struct vfio_platform_device *vdev);
>  
>  extern int vfio_platform_irq_init(struct vfio_platform_device *vdev);
>  extern void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev);
> 
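
The embedding done above for vfio_platform_device is what makes the
container_of() conversion mentioned in the cover letter possible. A small
userspace sketch, with assumptions called out: the container_of() macro here
is a simplified stand-in without the kernel's type checking, struct
vfio_device is reduced to one field, and struct my_vfio_device, to_my_dev()
and my_open() are hypothetical names.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-in for the core structure. */
struct vfio_device {
	const char *name;
};

/* Hypothetical driver structure embedding the core structure. */
struct my_vfio_device {
	struct vfio_device vdev;
	int private_state;
};

/* Core pointer to driver pointer: pure pointer arithmetic, no lookup. */
static struct my_vfio_device *to_my_dev(struct vfio_device *core)
{
	return container_of(core, struct my_vfio_device, vdev);
}

/* Hypothetical callback that is handed only the core pointer. */
static int my_open(struct vfio_device *core)
{
	return to_my_dev(core)->private_state;
}
```

The reverse direction is simply &mydev->vdev, so neither direction needs a
'void *device_data' field.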



* Re: [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev
  2021-03-13  0:56 ` [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
  2021-03-16  8:06   ` Tian, Kevin
  2021-03-17 10:33   ` Cornelia Huck
@ 2021-03-18 13:43   ` Auger Eric
  2 siblings, 0 replies; 82+ messages in thread
From: Auger Eric @ 2021-03-18 13:43 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson, Cornelia Huck, kvm
  Cc: Raj, Ashok, Dan Williams, Daniel Vetter, Christoph Hellwig,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta, Liu Yi L

Hi,

On 3/13/21 1:56 AM, Jason Gunthorpe wrote:
> pci already allocates a struct vfio_pci_device with exactly the same
> lifetime as vfio_device, switch to the new API and embed vfio_device in
> vfio_pci_device.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  drivers/vfio/pci/vfio_pci.c         | 10 +++++-----
>  drivers/vfio/pci/vfio_pci_private.h |  1 +
>  2 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 0e7682e7a0b478..a0ac20a499cf6c 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -2019,6 +2019,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		goto out_group_put;
>  	}
>  
> +	vfio_init_group_dev(&vdev->vdev, &pdev->dev, &vfio_pci_ops, vdev);
>  	vdev->pdev = pdev;
>  	vdev->irq_type = VFIO_PCI_NUM_IRQS;
>  	mutex_init(&vdev->igate);
> @@ -2056,9 +2057,10 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		vfio_pci_set_power_state(vdev, PCI_D3hot);
>  	}
>  
> -	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> +	ret = vfio_register_group_dev(&vdev->vdev);
>  	if (ret)
>  		goto out_power;
> +	dev_set_drvdata(&pdev->dev, vdev);
>  	return 0;
>  
>  out_power:
> @@ -2078,13 +2080,11 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  
>  static void vfio_pci_remove(struct pci_dev *pdev)
>  {
> -	struct vfio_pci_device *vdev;
> +	struct vfio_pci_device *vdev = dev_get_drvdata(&pdev->dev);
>  
>  	pci_disable_sriov(pdev);
>  
> -	vdev = vfio_del_group_dev(&pdev->dev);
> -	if (!vdev)
> -		return;
> +	vfio_unregister_group_dev(&vdev->vdev);
>  
>  	vfio_pci_vf_uninit(vdev);
>  	vfio_pci_reflck_put(vdev->reflck);
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index 9cd1882a05af69..8755a0febd054a 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -100,6 +100,7 @@ struct vfio_pci_mmap_vma {
>  };
>  
>  struct vfio_pci_device {
> +	struct vfio_device	vdev;
>  	struct pci_dev		*pdev;
>  	void __iomem		*barmap[PCI_STD_NUM_BARS];
>  	bool			bar_mmap_supported[PCI_STD_NUM_BARS];
> 



* Re: [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions
  2021-03-13  0:55 ` [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions Jason Gunthorpe
                     ` (3 preceding siblings ...)
  2021-03-16 16:51   ` Cornelia Huck
@ 2021-03-18 16:34   ` Auger Eric
  4 siblings, 0 replies; 82+ messages in thread
From: Auger Eric @ 2021-03-18 16:34 UTC (permalink / raw)
  To: Jason Gunthorpe, Cornelia Huck, kvm
  Cc: Alex Williamson, Raj, Ashok, Dan Williams, Daniel Vetter,
	Christoph Hellwig, Leon Romanovsky, Max Gurtovoy, Tarun Gupta

Hi,
On 3/13/21 1:55 AM, Jason Gunthorpe wrote:
> vfio_pci_probe() is quite complicated, with optional VF and VGA sub
> components. Move these into clear init/uninit functions and have a linear
> flow in probe/remove.
> 
> This fixes a few little buglets:
>  - vfio_pci_remove() is in the wrong order, vga_client_register() removes
>    a notifier and is after kfree(vdev), but the notifier refers to vdev,
>    so it can use after free in a race.
>  - vga_client_register() can fail but was ignored
> 
> Organize things so destruction order is the reverse of creation order.
> 
> Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric

> ---
>  drivers/vfio/pci/vfio_pci.c | 116 +++++++++++++++++++++++-------------
>  1 file changed, 74 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 65e7e6b44578c2..f95b58376156a0 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1922,6 +1922,68 @@ static int vfio_pci_bus_notifier(struct notifier_block *nb,
>  	return 0;
>  }
>  
> +static int vfio_pci_vf_init(struct vfio_pci_device *vdev)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	int ret;
> +
> +	if (!pdev->is_physfn)
> +		return 0;
> +
> +	vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
> +	if (!vdev->vf_token)
> +		return -ENOMEM;
> +
> +	mutex_init(&vdev->vf_token->lock);
> +	uuid_gen(&vdev->vf_token->uuid);
> +
> +	vdev->nb.notifier_call = vfio_pci_bus_notifier;
> +	ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
> +	if (ret) {
> +		kfree(vdev->vf_token);
> +		return ret;
> +	}
> +	return 0;
> +}
> +
> +static void vfio_pci_vf_uninit(struct vfio_pci_device *vdev)
> +{
> +	if (!vdev->vf_token)
> +		return;
> +
> +	bus_unregister_notifier(&pci_bus_type, &vdev->nb);
> +	WARN_ON(vdev->vf_token->users);
> +	mutex_destroy(&vdev->vf_token->lock);
> +	kfree(vdev->vf_token);
> +}
> +
> +static int vfio_pci_vga_init(struct vfio_pci_device *vdev)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	int ret;
> +
> +	if (!vfio_pci_is_vga(pdev))
> +		return 0;
> +
> +	ret = vga_client_register(pdev, vdev, NULL, vfio_pci_set_vga_decode);
> +	if (ret)
> +		return ret;
> +	vga_set_legacy_decoding(pdev, vfio_pci_set_vga_decode(vdev, false));
> +	return 0;
> +}
> +
> +static void vfio_pci_vga_uninit(struct vfio_pci_device *vdev)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +
> +	if (!vfio_pci_is_vga(pdev))
> +		return;
> +	vga_client_register(pdev, NULL, NULL, NULL);
> +	vga_set_legacy_decoding(pdev, VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM |
> +					      VGA_RSRC_LEGACY_IO |
> +					      VGA_RSRC_LEGACY_MEM);
> +}
> +
>  static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct vfio_pci_device *vdev;
> @@ -1975,28 +2037,12 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	ret = vfio_pci_reflck_attach(vdev);
>  	if (ret)
>  		goto out_del_group_dev;
> -
> -	if (pdev->is_physfn) {
> -		vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
> -		if (!vdev->vf_token) {
> -			ret = -ENOMEM;
> -			goto out_reflck;
> -		}
> -
> -		mutex_init(&vdev->vf_token->lock);
> -		uuid_gen(&vdev->vf_token->uuid);
> -
> -		vdev->nb.notifier_call = vfio_pci_bus_notifier;
> -		ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
> -		if (ret)
> -			goto out_vf_token;
> -	}
> -
> -	if (vfio_pci_is_vga(pdev)) {
> -		vga_client_register(pdev, vdev, NULL, vfio_pci_set_vga_decode);
> -		vga_set_legacy_decoding(pdev,
> -					vfio_pci_set_vga_decode(vdev, false));
> -	}
> +	ret = vfio_pci_vf_init(vdev);
> +	if (ret)
> +		goto out_reflck;
> +	ret = vfio_pci_vga_init(vdev);
> +	if (ret)
> +		goto out_vf;
>  
>  	vfio_pci_probe_power_state(vdev);
>  
> @@ -2016,8 +2062,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  
>  	return ret;
>  
> -out_vf_token:
> -	kfree(vdev->vf_token);
> +out_vf:
> +	vfio_pci_vf_uninit(vdev);
>  out_reflck:
>  	vfio_pci_reflck_put(vdev->reflck);
>  out_del_group_dev:
> @@ -2039,33 +2085,19 @@ static void vfio_pci_remove(struct pci_dev *pdev)
>  	if (!vdev)
>  		return;
>  
> -	if (vdev->vf_token) {
> -		WARN_ON(vdev->vf_token->users);
> -		mutex_destroy(&vdev->vf_token->lock);
> -		kfree(vdev->vf_token);
> -	}
> -
> -	if (vdev->nb.notifier_call)
> -		bus_unregister_notifier(&pci_bus_type, &vdev->nb);
> -
> +	vfio_pci_vf_uninit(vdev);
>  	vfio_pci_reflck_put(vdev->reflck);
> +	vfio_pci_vga_uninit(vdev);
>  
>  	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);
> -	kfree(vdev->region);
> -	mutex_destroy(&vdev->ioeventfds_lock);
>  
>  	if (!disable_idle_d3)
>  		vfio_pci_set_power_state(vdev, PCI_D0);
>  
> +	mutex_destroy(&vdev->ioeventfds_lock);
> +	kfree(vdev->region);
>  	kfree(vdev->pm_save);
>  	kfree(vdev);
> -
> -	if (vfio_pci_is_vga(pdev)) {
> -		vga_client_register(pdev, NULL, NULL, NULL);
> -		vga_set_legacy_decoding(pdev,
> -				VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM |
> -				VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM);
> -	}
>  }
>  
>  static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
> 
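
The "destruction order is the reverse of creation order" rule from the
commit message can be illustrated with a small userspace sketch of goto
unwinding. Everything here is hypothetical: step(), probe() and remove_all()
only append letters to a trace buffer so the ordering can be inspected, with
upper case for setup and lower case for teardown.

```c
#include <string.h>

static char trace[64];

static void note(const char *s)
{
	strcat(trace, s);
}

/* One setup step; records its name unless told to fail. */
static int step(const char *name, int fail)
{
	if (fail)
		return -1;
	note(name);
	return 0;
}

/* fail_at selects which step (1..3) fails; 0 means every step succeeds. */
static int probe(int fail_at)
{
	int ret;

	trace[0] = '\0';
	ret = step("A", fail_at == 1);
	if (ret)
		goto out;
	ret = step("B", fail_at == 2);
	if (ret)
		goto out_a;
	ret = step("C", fail_at == 3);
	if (ret)
		goto out_b;
	return 0;

out_b:
	note("b");	/* undo B */
out_a:
	note("a");	/* undo A */
out:
	return ret;
}

/* Full teardown mirrors probe(): last created, first destroyed. */
static void remove_all(void)
{
	note("c");
	note("b");
	note("a");
}
```

With this shape a step that fails never has to undo itself, and each label
undoes exactly one earlier step, which is the structure the init/uninit
helpers in the patch allow.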



* Re: [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe()
  2021-03-13  0:56 ` [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe() Jason Gunthorpe
                     ` (3 preceding siblings ...)
  2021-03-17 10:32   ` Cornelia Huck
@ 2021-03-18 16:50   ` Auger Eric
  4 siblings, 0 replies; 82+ messages in thread
From: Auger Eric @ 2021-03-18 16:50 UTC (permalink / raw)
  To: Jason Gunthorpe, kvm
  Cc: Alex Williamson, Raj, Ashok, Christian Ehrhardt, Cornelia Huck,
	Dan Williams, Daniel Vetter, Christoph Hellwig, Kevin Tian,
	Leon Romanovsky, Max Gurtovoy, Tarun Gupta

Hi Jason,

On 3/13/21 1:56 AM, Jason Gunthorpe wrote:
> vfio_add_group_dev() must be called only after all of the private data in
> vdev is fully set up and ready, otherwise there could be races with user
> space instantiating a device file descriptor and starting to call ops.
> 
> For instance vfio_pci_reflck_attach() sets vdev->reflck and
> vfio_pci_open(), called by fops open, unconditionally derefs it, which
> will crash if things get out of order.
>
> Fixes: cc20d7999000 ("vfio/pci: Introduce VF token")
> Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release")
> Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state")
> Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
> ---
>  drivers/vfio/pci/vfio_pci.c | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index f95b58376156a0..0e7682e7a0b478 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -2030,13 +2030,9 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	INIT_LIST_HEAD(&vdev->vma_list);
>  	init_rwsem(&vdev->memory_lock);
>  
> -	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> -	if (ret)
> -		goto out_free;
> -
>  	ret = vfio_pci_reflck_attach(vdev);
>  	if (ret)
> -		goto out_del_group_dev;
> +		goto out_free;
>  	ret = vfio_pci_vf_init(vdev);
>  	if (ret)
>  		goto out_reflck;
> @@ -2060,15 +2056,20 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		vfio_pci_set_power_state(vdev, PCI_D3hot);
>  	}
>  
> -	return ret;
> +	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> +	if (ret)
> +		goto out_power;
> +	return 0;
>  
> +out_power:
> +	if (!disable_idle_d3)
> +		vfio_pci_set_power_state(vdev, PCI_D0);
>  out_vf:
>  	vfio_pci_vf_uninit(vdev);
>  out_reflck:
>  	vfio_pci_reflck_put(vdev->reflck);
> -out_del_group_dev:
> -	vfio_del_group_dev(&pdev->dev);
>  out_free:
> +	kfree(vdev->pm_save);
>  	kfree(vdev);
>  out_group_put:
>  	vfio_iommu_group_put(group, &pdev->dev);
> 



* Re: [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group
  2021-03-17  0:47       ` Tian, Kevin
@ 2021-03-19 13:58         ` Jason Gunthorpe
  0 siblings, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-19 13:58 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Cornelia Huck, kvm, Raj, Ashok, Williams, Dan J,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Wed, Mar 17, 2021 at 12:47:16AM +0000, Tian, Kevin wrote:

> > /* Our reference on group is moved to the device */
> > 
> > The get is a move in this case
> > 
> > Later delete the function and this becomes perfectly clear
> 
> Looks above comment is not updated after vfio_group_create_device 
> is removed in patch03.

Oops, that hunk got lost during some rebase I think, I fixed it

Thanks,
Jason


* Re: [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device
  2021-03-17  8:12       ` Cornelia Huck
@ 2021-03-23 13:06         ` Jason Gunthorpe
  0 siblings, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2021-03-23 13:06 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Alex Williamson, Tian, Kevin, kvm, Raj, Ashok, Williams, Dan J,
	Daniel Vetter, Christoph Hellwig, Leon Romanovsky, Max Gurtovoy,
	Tarun Gupta

On Wed, Mar 17, 2021 at 09:12:44AM +0100, Cornelia Huck wrote:
> On Tue, 16 Mar 2021 14:24:54 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Tue, 16 Mar 2021 07:38:09 +0000
> > "Tian, Kevin" <kevin.tian@intel.com> wrote:
> > 
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Saturday, March 13, 2021 8:56 AM
> > > > 
> > > > The vfio_device is using a 'sleep until all refs go to zero' pattern for
> > > > its lifetime, but it is indirectly coded by repeatedly scanning the group
> > > > list waiting for the device to be removed on its own.
> > > > 
> > > > Switch this around to be a direct representation, use a refcount to count
> > > > the number of places that are blocking destruction and sleep directly on a
> > > > completion until that counter goes to zero. kfree the device after other
> > > > accesses have been excluded in vfio_del_group_dev(). This is a fairly
> > > > common Linux idiom.
> > > > 
> > > > Due to this we can now remove kref_put_mutex(), which is very rarely used
> > > > in the kernel. Here it is being used to prevent a zero ref device from
> > > > being seen in the group list. Instead allow the zero ref device to
> > > > continue to exist in the device_list and use refcount_inc_not_zero() to
> > > > exclude it once refs go to zero.
> > > > 
> > > > This patch is organized so the next patch will be able to alter the API to
> > > > allow drivers to provide the kfree.
> > > > 
> > > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > >  drivers/vfio/vfio.c | 79 ++++++++++++++-------------------------------
> > > >  1 file changed, 25 insertions(+), 54 deletions(-)
> > > > 
> 
> > > > @@ -935,32 +916,18 @@ void *vfio_del_group_dev(struct device *dev)
> > > >  	WARN_ON(!unbound);
> > > > 
> > > >  	vfio_device_put(device);
> > > > -
> > > > -	/*
> > > > -	 * If the device is still present in the group after the above
> > > > -	 * 'put', then it is in use and we need to request it from the
> > > > -	 * bus driver.  The driver may in turn need to request the
> > > > -	 * device from the user.  We send the request on an arbitrary
> > > > -	 * interval with counter to allow the driver to take escalating
> > > > -	 * measures to release the device if it has the ability to do so.
> > > > -	 */    
> > > 
> > > Above comment still makes sense even with this patch. What about
> > > keeping it? otherwise:  
> > 
> > The comment is not exactly correct after this code change either, the
> > device will always be present in the group after this 'put'.  Instead,
> > the completion now indicates the reference count has reached zero.  If
> > it's worthwhile to keep more context to the request callback, perhaps:
> > 
> > 	/*
> > 	 * If there are still outstanding device references, such as
> > 	 * from the device being in use, periodically kick the optional
> > 	 * device request callback while waiting.
> > 	 */
> 
> I like that comment; I don't think it hurts to be a bit verbose here.

I would prefer the comment explain why the driver should return from
request with refs held and what it is supposed to do on later
calls. This loop mechanism is strange, I didn't look at what the
drivers implement under this.

I don't see this approach in other places that are able to disconnect
their HW drivers from the uAPI (in RDMA land we call this
disassociation)

Jason
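
The lifetime scheme described in the quoted commit message (a refcount that
counts everything blocking destruction, plus refcount_inc_not_zero() so a
zero-ref device left on the list cannot be picked up again) can be sketched
in userspace with C11 atomics. This is an illustration under assumptions,
not the vfio.c code: struct dev, dev_try_get() and dev_put() are hypothetical
names, and the 'done' flag stands in for the completion that the real remove
path sleeps on.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct dev {
	atomic_int refcount;	/* number of users blocking destruction */
	atomic_bool done;	/* stand-in for the kernel completion */
};

/* Take a reference only if the count has not already dropped to zero. */
static bool dev_try_get(struct dev *d)
{
	int old = atomic_load(&d->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&d->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the final put signals the waiter. */
static void dev_put(struct dev *d)
{
	if (atomic_fetch_sub(&d->refcount, 1) == 1)
		atomic_store(&d->done, true);
}
```

In this sketch the unregister path would drop the initial reference with
dev_put() and then wait for 'done'; only after that is it safe to free the
object, because dev_try_get() refuses any new user once the count is zero.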


end of thread, other threads:[~2021-03-23 13:06 UTC | newest]

Thread overview: 82+ messages
2021-03-13  0:55 [PATCH v2 00/14] Embed struct vfio_device in all sub-structures Jason Gunthorpe
2021-03-13  0:55 ` [PATCH v2 01/14] vfio: Remove extra put/gets around vfio_device->group Jason Gunthorpe
2021-03-16  7:33   ` Tian, Kevin
2021-03-16 23:07     ` Jason Gunthorpe
2021-03-17  0:47       ` Tian, Kevin
2021-03-19 13:58         ` Jason Gunthorpe
2021-03-16 11:15   ` Max Gurtovoy
2021-03-16 11:59   ` Cornelia Huck
2021-03-18  9:32   ` Auger Eric
2021-03-13  0:55 ` [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device Jason Gunthorpe
2021-03-16  7:38   ` Tian, Kevin
2021-03-16 12:10     ` Cornelia Huck
2021-03-16 20:24     ` Alex Williamson
2021-03-16 23:08       ` Jason Gunthorpe
2021-03-17  8:12       ` Cornelia Huck
2021-03-23 13:06         ` Jason Gunthorpe
2021-03-18 13:10   ` Auger Eric
2021-03-13  0:55 ` [PATCH v2 03/14] vfio: Split creation of a vfio_device into init and register ops Jason Gunthorpe
2021-03-16  7:55   ` Tian, Kevin
2021-03-16 13:34     ` Jason Gunthorpe
2021-03-17  0:55       ` Tian, Kevin
2021-03-16 12:25   ` Cornelia Huck
2021-03-16 21:13     ` Alex Williamson
2021-03-16 23:12       ` Jason Gunthorpe
2021-03-16 12:54   ` Max Gurtovoy
2021-03-18 13:18   ` Auger Eric
2021-03-13  0:55 ` [PATCH v2 04/14] vfio/platform: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
2021-03-16 16:22   ` Cornelia Huck
2021-03-16 21:33   ` Alex Williamson
2021-03-16 21:45     ` Jason Gunthorpe
2021-03-18 13:40   ` Auger Eric
2021-03-13  0:55 ` [PATCH v2 05/14] vfio/fsl-mc: Re-order vfio_fsl_mc_probe() Jason Gunthorpe
2021-03-15  8:44   ` Christoph Hellwig
2021-03-16  9:16   ` Diana Craciun OSS
2021-03-16 16:28   ` Cornelia Huck
2021-03-17 16:36   ` Diana Craciun OSS
2021-03-17 22:59     ` Jason Gunthorpe
2021-03-13  0:55 ` [PATCH v2 06/14] vfio/fsl-mc: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
2021-03-15  8:44   ` Christoph Hellwig
2021-03-16 16:43   ` Cornelia Huck
2021-03-13  0:55 ` [PATCH v2 07/14] vfio/pci: Move VGA and VF initialization to functions Jason Gunthorpe
2021-03-15  8:45   ` Christoph Hellwig
2021-03-15 23:07     ` Jason Gunthorpe
2021-03-16  6:27       ` Christoph Hellwig
2021-03-16  7:57   ` Tian, Kevin
2021-03-16 13:02   ` Max Gurtovoy
2021-03-16 23:04     ` Jason Gunthorpe
2021-03-16 16:51   ` Cornelia Huck
2021-03-18 16:34   ` Auger Eric
2021-03-13  0:56 ` [PATCH v2 08/14] vfio/pci: Re-order vfio_pci_probe() Jason Gunthorpe
2021-03-15  8:46   ` Christoph Hellwig
2021-03-16  8:04   ` Tian, Kevin
2021-03-16 13:20     ` Jason Gunthorpe
2021-03-16 22:27       ` Alex Williamson
2021-03-17  0:56         ` Tian, Kevin
2021-03-16 11:28   ` Max Gurtovoy
2021-03-17 10:32   ` Cornelia Huck
2021-03-18 16:50   ` Auger Eric
2021-03-13  0:56 ` [PATCH v2 09/14] vfio/pci: Use vfio_init/register/unregister_group_dev Jason Gunthorpe
2021-03-16  8:06   ` Tian, Kevin
2021-03-17 10:33   ` Cornelia Huck
2021-03-18 13:43   ` Auger Eric
2021-03-13  0:56 ` [PATCH v2 10/14] vfio/mdev: " Jason Gunthorpe
2021-03-16  8:09   ` Tian, Kevin
2021-03-16 22:51     ` Alex Williamson
2021-03-16 23:19     ` Jason Gunthorpe
2021-03-17 10:36   ` Cornelia Huck
2021-03-13  0:56 ` [PATCH v2 11/14] vfio/mdev: Make to_mdev_device() into a static inline Jason Gunthorpe
2021-03-16  8:10   ` Tian, Kevin
2021-03-16 22:55   ` Alex Williamson
2021-03-16 23:20     ` Jason Gunthorpe
2021-03-17 10:36   ` Cornelia Huck
2021-03-13  0:56 ` [PATCH v2 12/14] vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *' Jason Gunthorpe
2021-03-15  8:58   ` Christoph Hellwig
2021-03-17 11:33   ` Cornelia Huck
2021-03-13  0:56 ` [PATCH v2 13/14] vfio/pci: Replace uses of vfio_device_data() with container_of Jason Gunthorpe
2021-03-16  8:20   ` Tian, Kevin
2021-03-17 12:06   ` Cornelia Huck
2021-03-13  0:56 ` [PATCH v2 14/14] vfio: Remove device_data from the vfio bus driver API Jason Gunthorpe
2021-03-16  8:22   ` Tian, Kevin
2021-03-17 12:08   ` Cornelia Huck
2021-03-17 23:24   ` Max Gurtovoy
