All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
@ 2021-08-29 19:23 ` kernel test robot
  0 siblings, 0 replies; 23+ messages in thread
From: kernel test robot @ 2021-08-29 19:23 UTC (permalink / raw)
  Cc: kbuild-all, llvm

[-- Attachment #1: Type: text/plain, Size: 1938 bytes --]

In-Reply-To: <20210824144649.1488190-8-hch@lst.de>
References: <20210824144649.1488190-8-hch@lst.de>
TO: Christoph Hellwig <hch@lst.de>

Hi Christoph,

I love your patch! Perhaps something to improve:

[auto build test WARNING on vfio/next]
[cannot apply to linus/master v5.14-rc7 next-20210827]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Christoph-Hellwig/vfio-Move-vfio_iommu_group_get-to-vfio_register_group_dev/20210824-234703
base:   https://github.com/awilliam/linux-vfio.git next
config: x86_64-randconfig-a001-20210829 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 510e106fa8635e7f9c51c896180b971de6309b2f)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/710b535c0e5df66eae20c4a7f4868b8c29ce4376
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Christoph-Hellwig/vfio-Move-vfio_iommu_group_get-to-vfio_register_group_dev/20210824-234703
        git checkout 710b535c0e5df66eae20c4a7f4868b8c29ce4376
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/vfio/vfio.o: warning: objtool: __vfio_register_dev()+0x129: unreachable instruction

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 37090 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
@ 2021-08-29 19:23 ` kernel test robot
  0 siblings, 0 replies; 23+ messages in thread
From: kernel test robot @ 2021-08-29 19:23 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1979 bytes --]

In-Reply-To: <20210824144649.1488190-8-hch@lst.de>
References: <20210824144649.1488190-8-hch@lst.de>
TO: Christoph Hellwig <hch@lst.de>

Hi Christoph,

I love your patch! Perhaps something to improve:

[auto build test WARNING on vfio/next]
[cannot apply to linus/master v5.14-rc7 next-20210827]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Christoph-Hellwig/vfio-Move-vfio_iommu_group_get-to-vfio_register_group_dev/20210824-234703
base:   https://github.com/awilliam/linux-vfio.git next
config: x86_64-randconfig-a001-20210829 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 510e106fa8635e7f9c51c896180b971de6309b2f)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/710b535c0e5df66eae20c4a7f4868b8c29ce4376
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Christoph-Hellwig/vfio-Move-vfio_iommu_group_get-to-vfio_register_group_dev/20210824-234703
        git checkout 710b535c0e5df66eae20c4a7f4868b8c29ce4376
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/vfio/vfio.o: warning: objtool: __vfio_register_dev()+0x129: unreachable instruction

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 37090 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-09-13  7:15 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
@ 2021-09-14  2:23   ` Tian, Kevin
  0 siblings, 0 replies; 23+ messages in thread
From: Tian, Kevin @ 2021-09-14  2:23 UTC (permalink / raw)
  To: Christoph Hellwig, Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Xu, Terrence, kvm

> From: Christoph Hellwig <hch@lst.de>
> Sent: Monday, September 13, 2021 3:16 PM
> 
> Reuse the logic in vfio_noiommu_group_alloc to allocate a fake
> single-device iommu group for mediated devices by factoring out a common
> function, and replacing the noiommu boolean field in struct vfio_group
> with an enum to distinguish the three different kinds of groups.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  drivers/s390/crypto/vfio_ap_ops.c |  2 +-
>  drivers/vfio/mdev/mdev_driver.c   | 45 ++-------------
>  drivers/vfio/mdev/vfio_mdev.c     |  2 +-
>  drivers/vfio/vfio.c               | 92 ++++++++++++++++++++++---------
>  include/linux/vfio.h              |  1 +
>  samples/vfio-mdev/mbochs.c        |  2 +-
>  samples/vfio-mdev/mdpy.c          |  2 +-
>  samples/vfio-mdev/mtty.c          |  2 +-
>  8 files changed, 76 insertions(+), 72 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
> b/drivers/s390/crypto/vfio_ap_ops.c
> index 118939a7729a1e..24755d1aedd5b0 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -351,7 +351,7 @@ static int vfio_ap_mdev_probe(struct mdev_device
> *mdev)
>  	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
>  	mutex_unlock(&matrix_dev->lock);
> 
> -	ret = vfio_register_group_dev(&matrix_mdev->vdev);
> +	ret = vfio_register_emulated_iommu_dev(&matrix_mdev->vdev);
>  	if (ret)
>  		goto err_list;
>  	dev_set_drvdata(&mdev->dev, matrix_mdev);
> diff --git a/drivers/vfio/mdev/mdev_driver.c
> b/drivers/vfio/mdev/mdev_driver.c
> index e2cb1ff56f6c9b..7927ed4f1711f2 100644
> --- a/drivers/vfio/mdev/mdev_driver.c
> +++ b/drivers/vfio/mdev/mdev_driver.c
> @@ -13,60 +13,23 @@
> 
>  #include "mdev_private.h"
> 
> -static int mdev_attach_iommu(struct mdev_device *mdev)
> -{
> -	int ret;
> -	struct iommu_group *group;
> -
> -	group = iommu_group_alloc();
> -	if (IS_ERR(group))
> -		return PTR_ERR(group);
> -
> -	ret = iommu_group_add_device(group, &mdev->dev);
> -	if (!ret)
> -		dev_info(&mdev->dev, "MDEV: group_id = %d\n",
> -			 iommu_group_id(group));
> -
> -	iommu_group_put(group);
> -	return ret;
> -}
> -
> -static void mdev_detach_iommu(struct mdev_device *mdev)
> -{
> -	iommu_group_remove_device(&mdev->dev);
> -	dev_info(&mdev->dev, "MDEV: detaching iommu\n");
> -}
> -
>  static int mdev_probe(struct device *dev)
>  {
>  	struct mdev_driver *drv =
>  		container_of(dev->driver, struct mdev_driver, driver);
> -	struct mdev_device *mdev = to_mdev_device(dev);
> -	int ret;
> 
> -	ret = mdev_attach_iommu(mdev);
> -	if (ret)
> -		return ret;
> -
> -	if (drv->probe) {
> -		ret = drv->probe(mdev);
> -		if (ret)
> -			mdev_detach_iommu(mdev);
> -	}
> -
> -	return ret;
> +	if (!drv->probe)
> +		return 0;
> +	return drv->probe(to_mdev_device(dev));
>  }
> 
>  static void mdev_remove(struct device *dev)
>  {
>  	struct mdev_driver *drv =
>  		container_of(dev->driver, struct mdev_driver, driver);
> -	struct mdev_device *mdev = to_mdev_device(dev);
> 
>  	if (drv->remove)
> -		drv->remove(mdev);
> -
> -	mdev_detach_iommu(mdev);
> +		drv->remove(to_mdev_device(dev));
>  }
> 
>  static int mdev_match(struct device *dev, struct device_driver *drv)
> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> index 7a9883048216e7..a90e24b0c851d3 100644
> --- a/drivers/vfio/mdev/vfio_mdev.c
> +++ b/drivers/vfio/mdev/vfio_mdev.c
> @@ -119,7 +119,7 @@ static int vfio_mdev_probe(struct mdev_device
> *mdev)
>  		return -ENOMEM;
> 
>  	vfio_init_group_dev(vdev, &mdev->dev, &vfio_mdev_dev_ops);
> -	ret = vfio_register_group_dev(vdev);
> +	ret = vfio_register_emulated_iommu_dev(vdev);
>  	if (ret)
>  		goto out_uninit;
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 23eaebd2e28cd9..2508c8c3984091 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -67,6 +67,30 @@ struct vfio_unbound_dev {
>  	struct list_head		unbound_next;
>  };
> 
> +enum vfio_group_type {
> +	/*
> +	 * Physical device with IOMMU backing.
> +	 */
> +	VFIO_IOMMU,
> +
> +	/*
> +	 * Virtual device without IOMMU backing. The VFIO core fakes up an
> +	 * iommu_group as the iommu_group sysfs interface is part of the
> +	 * userspace ABI.  The user of these devices must not be able to
> +	 * directly trigger unmediated DMA.
> +	 */
> +	VFIO_EMULATED_IOMMU,
> +
> +	/*
> +	 * Physical device without IOMMU backing. The VFIO core fakes up an
> +	 * iommu_group as the iommu_group sysfs interface is part of the
> +	 * userspace ABI.  Users can trigger unmediated DMA by the device,
> +	 * usage is highly dangerous, requires an explicit opt-in and will
> +	 * taint the kernel.
> +	 */
> +	VFIO_NO_IOMMU,
> +};
> +
>  struct vfio_group {
>  	struct kref			kref;
>  	int				minor;
> @@ -83,7 +107,7 @@ struct vfio_group {
>  	struct mutex			unbound_lock;
>  	atomic_t			opened;
>  	wait_queue_head_t		container_q;
> -	bool				noiommu;
> +	enum vfio_group_type		type;
>  	unsigned int			dev_counter;
>  	struct kvm			*kvm;
>  	struct blocking_notifier_head	notifier;
> @@ -336,7 +360,7 @@ static void vfio_group_unlock_and_free(struct
> vfio_group *group)
>   * Group objects - create, release, get, put, search
>   */
>  static struct vfio_group *vfio_create_group(struct iommu_group
> *iommu_group,
> -		bool noiommu)
> +		enum vfio_group_type type)
>  {
>  	struct vfio_group *group, *tmp;
>  	struct device *dev;
> @@ -355,7 +379,7 @@ static struct vfio_group *vfio_create_group(struct
> iommu_group *iommu_group,
>  	atomic_set(&group->opened, 0);
>  	init_waitqueue_head(&group->container_q);
>  	group->iommu_group = iommu_group;
> -	group->noiommu = noiommu;
> +	group->type = type;
>  	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
> 
>  	group->nb.notifier_call = vfio_iommu_group_notifier;
> @@ -391,8 +415,8 @@ static struct vfio_group *vfio_create_group(struct
> iommu_group *iommu_group,
>  	}
> 
>  	dev = device_create(vfio.class, NULL,
> -			    MKDEV(MAJOR(vfio.group_devt), minor),
> -			    group, "%s%d", group->noiommu ? "noiommu-" :
> "",
> +			    MKDEV(MAJOR(vfio.group_devt), minor), group,
> "%s%d",
> +			    group->type == VFIO_NO_IOMMU ? "noiommu-" :
> "",
>  			    iommu_group_id(iommu_group));
>  	if (IS_ERR(dev)) {
>  		vfio_free_group_minor(minor);
> @@ -778,8 +802,8 @@ void vfio_uninit_group_dev(struct vfio_device
> *device)
>  }
>  EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
> 
> -#ifdef CONFIG_VFIO_NOIOMMU
> -static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
> +static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
> +		enum vfio_group_type type)
>  {
>  	struct iommu_group *iommu_group;
>  	struct vfio_group *group;
> @@ -794,7 +818,7 @@ static struct vfio_group
> *vfio_noiommu_group_alloc(struct device *dev)
>  	if (ret)
>  		goto out_put_group;
> 
> -	group = vfio_create_group(iommu_group, true);
> +	group = vfio_create_group(iommu_group, type);
>  	if (IS_ERR(group)) {
>  		ret = PTR_ERR(group);
>  		goto out_remove_device;
> @@ -808,7 +832,6 @@ static struct vfio_group
> *vfio_noiommu_group_alloc(struct device *dev)
>  	iommu_group_put(iommu_group);
>  	return ERR_PTR(ret);
>  }
> -#endif
> 
>  static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
>  {
> @@ -824,7 +847,7 @@ static struct vfio_group
> *vfio_group_find_or_alloc(struct device *dev)
>  		 * bus.  Taint the kernel because we're about to give a DMA
>  		 * capable device to a user without IOMMU protection.
>  		 */
> -		group = vfio_noiommu_group_alloc(dev);
> +		group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
>  		if (!IS_ERR(group)) {
>  			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
>  			dev_warn(dev, "Adding kernel taint for vfio-
> noiommu group on device\n");
> @@ -841,7 +864,7 @@ static struct vfio_group
> *vfio_group_find_or_alloc(struct device *dev)
>  		goto out_put;
> 
>  	/* a newly created vfio_group keeps the reference. */
> -	group = vfio_create_group(iommu_group, false);
> +	group = vfio_create_group(iommu_group, VFIO_IOMMU);
>  	if (IS_ERR(group))
>  		goto out_put;
>  	return group;
> @@ -851,10 +874,13 @@ static struct vfio_group
> *vfio_group_find_or_alloc(struct device *dev)
>  	return group;
>  }
> 
> -int vfio_register_group_dev(struct vfio_device *device)
> +static int __vfio_register_dev(struct vfio_device *device,
> +		struct vfio_group *group)
>  {
>  	struct vfio_device *existing_device;
> -	struct vfio_group *group;
> +
> +	if (IS_ERR(group))
> +		return PTR_ERR(group);
> 
>  	/*
>  	 * If the driver doesn't specify a set then the device is added to a
> @@ -863,16 +889,13 @@ int vfio_register_group_dev(struct vfio_device
> *device)
>  	if (!device->dev_set)
>  		vfio_assign_device_set(device, device);
> 
> -	group = vfio_group_find_or_alloc(device->dev);
> -	if (IS_ERR(group))
> -		return PTR_ERR(group);
> -
>  	existing_device = vfio_group_get_device(group, device->dev);
>  	if (existing_device) {
>  		dev_WARN(device->dev, "Device already exists on
> group %d\n",
>  			 iommu_group_id(group->iommu_group));
>  		vfio_device_put(existing_device);
> -		if (group->noiommu)
> +		if (group->type == VFIO_NO_IOMMU ||
> +		    group->type == VFIO_EMULATED_IOMMU)
>  			iommu_group_remove_device(device->dev);
>  		vfio_group_put(group);
>  		return -EBUSY;
> @@ -891,8 +914,25 @@ int vfio_register_group_dev(struct vfio_device
> *device)
> 
>  	return 0;
>  }
> +
> +int vfio_register_group_dev(struct vfio_device *device)
> +{
> +	return __vfio_register_dev(device,
> +		vfio_group_find_or_alloc(device->dev));
> +}
>  EXPORT_SYMBOL_GPL(vfio_register_group_dev);
> 
> +/*
> + * Register a virtual device without IOMMU backing.  The user of this
> + * device must not be able to directly trigger unmediated DMA.
> + */
> +int vfio_register_emulated_iommu_dev(struct vfio_device *device)
> +{
> +	return __vfio_register_dev(device,
> +		vfio_noiommu_group_alloc(device->dev,
> VFIO_EMULATED_IOMMU));
> +}
> +EXPORT_SYMBOL_GPL(vfio_register_emulated_iommu_dev);
> +
>  /**
>   * Get a reference to the vfio_device for a device.  Even if the
>   * caller thinks they own the device, they could be racing with a
> @@ -1019,7 +1059,7 @@ void vfio_unregister_group_dev(struct vfio_device
> *device)
>  	if (list_empty(&group->device_list))
>  		wait_event(group->container_q, !group->container);
> 
> -	if (group->noiommu)
> +	if (group->type == VFIO_NO_IOMMU || group->type ==
> VFIO_EMULATED_IOMMU)
>  		iommu_group_remove_device(device->dev);
> 
>  	/* Matches the get in vfio_register_group_dev() */
> @@ -1368,7 +1408,7 @@ static int vfio_group_set_container(struct
> vfio_group *group, int container_fd)
>  	if (atomic_read(&group->container_users))
>  		return -EINVAL;
> 
> -	if (group->noiommu && !capable(CAP_SYS_RAWIO))
> +	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
>  		return -EPERM;
> 
>  	f = fdget(container_fd);
> @@ -1388,7 +1428,7 @@ static int vfio_group_set_container(struct
> vfio_group *group, int container_fd)
> 
>  	/* Real groups and fake groups cannot mix */
>  	if (!list_empty(&container->group_list) &&
> -	    container->noiommu != group->noiommu) {
> +	    container->noiommu != (group->type == VFIO_NO_IOMMU)) {
>  		ret = -EPERM;
>  		goto unlock_out;
>  	}
> @@ -1402,7 +1442,7 @@ static int vfio_group_set_container(struct
> vfio_group *group, int container_fd)
>  	}
> 
>  	group->container = container;
> -	container->noiommu = group->noiommu;
> +	container->noiommu = (group->type == VFIO_NO_IOMMU);
>  	list_add(&group->container_next, &container->group_list);
> 
>  	/* Get a reference on the container and mark a user within the group
> */
> @@ -1426,7 +1466,7 @@ static int vfio_group_add_container_user(struct
> vfio_group *group)
>  	if (!atomic_inc_not_zero(&group->container_users))
>  		return -EINVAL;
> 
> -	if (group->noiommu) {
> +	if (group->type == VFIO_NO_IOMMU) {
>  		atomic_dec(&group->container_users);
>  		return -EPERM;
>  	}
> @@ -1451,7 +1491,7 @@ static int vfio_group_get_device_fd(struct
> vfio_group *group, char *buf)
>  	    !group->container->iommu_driver || !vfio_group_viable(group))
>  		return -EINVAL;
> 
> -	if (group->noiommu && !capable(CAP_SYS_RAWIO))
> +	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
>  		return -EPERM;
> 
>  	device = vfio_device_get_from_name(group, buf);
> @@ -1498,7 +1538,7 @@ static int vfio_group_get_device_fd(struct
> vfio_group *group, char *buf)
> 
>  	fd_install(fdno, filep);
> 
> -	if (group->noiommu)
> +	if (group->type == VFIO_NO_IOMMU)
>  		dev_warn(device->dev, "vfio-noiommu device opened by
> user "
>  			 "(%s:%d)\n", current->comm, task_pid_nr(current));
>  	return fdno;
> @@ -1594,7 +1634,7 @@ static int vfio_group_fops_open(struct inode
> *inode, struct file *filep)
>  	if (!group)
>  		return -ENODEV;
> 
> -	if (group->noiommu && !capable(CAP_SYS_RAWIO)) {
> +	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
> {
>  		vfio_group_put(group);
>  		return -EPERM;
>  	}
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index f7083c2fd0d099..bbe29300862649 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -75,6 +75,7 @@ void vfio_init_group_dev(struct vfio_device *device,
> struct device *dev,
>  			 const struct vfio_device_ops *ops);
>  void vfio_uninit_group_dev(struct vfio_device *device);
>  int vfio_register_group_dev(struct vfio_device *device);
> +int vfio_register_emulated_iommu_dev(struct vfio_device *device);
>  void vfio_unregister_group_dev(struct vfio_device *device);
>  extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
>  extern void vfio_device_put(struct vfio_device *device);
> diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> index c313ab4d1f4e4e..cd41bec5fdeb39 100644
> --- a/samples/vfio-mdev/mbochs.c
> +++ b/samples/vfio-mdev/mbochs.c
> @@ -553,7 +553,7 @@ static int mbochs_probe(struct mdev_device *mdev)
>  	mbochs_create_config_space(mdev_state);
>  	mbochs_reset(mdev_state);
> 
> -	ret = vfio_register_group_dev(&mdev_state->vdev);
> +	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
>  	if (ret)
>  		goto err_mem;
>  	dev_set_drvdata(&mdev->dev, mdev_state);
> diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> index 8d1a80a0722aa9..fe5d43e797b6d3 100644
> --- a/samples/vfio-mdev/mdpy.c
> +++ b/samples/vfio-mdev/mdpy.c
> @@ -258,7 +258,7 @@ static int mdpy_probe(struct mdev_device *mdev)
> 
>  	mdpy_count++;
> 
> -	ret = vfio_register_group_dev(&mdev_state->vdev);
> +	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
>  	if (ret)
>  		goto err_mem;
>  	dev_set_drvdata(&mdev->dev, mdev_state);
> diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> index 5983cdb16e3d1d..a0e1a469bd47af 100644
> --- a/samples/vfio-mdev/mtty.c
> +++ b/samples/vfio-mdev/mtty.c
> @@ -741,7 +741,7 @@ static int mtty_probe(struct mdev_device *mdev)
> 
>  	mtty_create_config_space(mdev_state);
> 
> -	ret = vfio_register_group_dev(&mdev_state->vdev);
> +	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
>  	if (ret)
>  		goto err_vconfig;
>  	dev_set_drvdata(&mdev->dev, mdev_state);
> --
> 2.30.2


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
@ 2021-09-13  7:15 ` Christoph Hellwig
  2021-09-14  2:23   ` Tian, Kevin
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2021-09-13  7:15 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, Terrence Xu, kvm

Reuse the logic in vfio_noiommu_group_alloc to allocate a fake
single-device iommu group for mediated devices by factoring out a common
function, and replacing the noiommu boolean field in struct vfio_group
with an enum to distinguish the three different kinds of groups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/s390/crypto/vfio_ap_ops.c |  2 +-
 drivers/vfio/mdev/mdev_driver.c   | 45 ++-------------
 drivers/vfio/mdev/vfio_mdev.c     |  2 +-
 drivers/vfio/vfio.c               | 92 ++++++++++++++++++++++---------
 include/linux/vfio.h              |  1 +
 samples/vfio-mdev/mbochs.c        |  2 +-
 samples/vfio-mdev/mdpy.c          |  2 +-
 samples/vfio-mdev/mtty.c          |  2 +-
 8 files changed, 76 insertions(+), 72 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 118939a7729a1e..24755d1aedd5b0 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -351,7 +351,7 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev)
 	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
 	mutex_unlock(&matrix_dev->lock);
 
-	ret = vfio_register_group_dev(&matrix_mdev->vdev);
+	ret = vfio_register_emulated_iommu_dev(&matrix_mdev->vdev);
 	if (ret)
 		goto err_list;
 	dev_set_drvdata(&mdev->dev, matrix_mdev);
diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
index e2cb1ff56f6c9b..7927ed4f1711f2 100644
--- a/drivers/vfio/mdev/mdev_driver.c
+++ b/drivers/vfio/mdev/mdev_driver.c
@@ -13,60 +13,23 @@
 
 #include "mdev_private.h"
 
-static int mdev_attach_iommu(struct mdev_device *mdev)
-{
-	int ret;
-	struct iommu_group *group;
-
-	group = iommu_group_alloc();
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
-	ret = iommu_group_add_device(group, &mdev->dev);
-	if (!ret)
-		dev_info(&mdev->dev, "MDEV: group_id = %d\n",
-			 iommu_group_id(group));
-
-	iommu_group_put(group);
-	return ret;
-}
-
-static void mdev_detach_iommu(struct mdev_device *mdev)
-{
-	iommu_group_remove_device(&mdev->dev);
-	dev_info(&mdev->dev, "MDEV: detaching iommu\n");
-}
-
 static int mdev_probe(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
-	int ret;
 
-	ret = mdev_attach_iommu(mdev);
-	if (ret)
-		return ret;
-
-	if (drv->probe) {
-		ret = drv->probe(mdev);
-		if (ret)
-			mdev_detach_iommu(mdev);
-	}
-
-	return ret;
+	if (!drv->probe)
+		return 0;
+	return drv->probe(to_mdev_device(dev));
 }
 
 static void mdev_remove(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
 
 	if (drv->remove)
-		drv->remove(mdev);
-
-	mdev_detach_iommu(mdev);
+		drv->remove(to_mdev_device(dev));
 }
 
 static int mdev_match(struct device *dev, struct device_driver *drv)
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 7a9883048216e7..a90e24b0c851d3 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -119,7 +119,7 @@ static int vfio_mdev_probe(struct mdev_device *mdev)
 		return -ENOMEM;
 
 	vfio_init_group_dev(vdev, &mdev->dev, &vfio_mdev_dev_ops);
-	ret = vfio_register_group_dev(vdev);
+	ret = vfio_register_emulated_iommu_dev(vdev);
 	if (ret)
 		goto out_uninit;
 
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 23eaebd2e28cd9..2508c8c3984091 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -67,6 +67,30 @@ struct vfio_unbound_dev {
 	struct list_head		unbound_next;
 };
 
+enum vfio_group_type {
+	/*
+	 * Physical device with IOMMU backing.
+	 */
+	VFIO_IOMMU,
+
+	/*
+	 * Virtual device without IOMMU backing. The VFIO core fakes up an
+	 * iommu_group as the iommu_group sysfs interface is part of the
+	 * userspace ABI.  The user of these devices must not be able to
+	 * directly trigger unmediated DMA.
+	 */
+	VFIO_EMULATED_IOMMU,
+
+	/*
+	 * Physical device without IOMMU backing. The VFIO core fakes up an
+	 * iommu_group as the iommu_group sysfs interface is part of the
+	 * userspace ABI.  Users can trigger unmediated DMA by the device,
+	 * usage is highly dangerous, requires an explicit opt-in and will
+	 * taint the kernel.
+	 */
+	VFIO_NO_IOMMU,
+};
+
 struct vfio_group {
 	struct kref			kref;
 	int				minor;
@@ -83,7 +107,7 @@ struct vfio_group {
 	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
-	bool				noiommu;
+	enum vfio_group_type		type;
 	unsigned int			dev_counter;
 	struct kvm			*kvm;
 	struct blocking_notifier_head	notifier;
@@ -336,7 +360,7 @@ static void vfio_group_unlock_and_free(struct vfio_group *group)
  * Group objects - create, release, get, put, search
  */
 static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
-		bool noiommu)
+		enum vfio_group_type type)
 {
 	struct vfio_group *group, *tmp;
 	struct device *dev;
@@ -355,7 +379,7 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	atomic_set(&group->opened, 0);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
-	group->noiommu = noiommu;
+	group->type = type;
 	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	group->nb.notifier_call = vfio_iommu_group_notifier;
@@ -391,8 +415,8 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	}
 
 	dev = device_create(vfio.class, NULL,
-			    MKDEV(MAJOR(vfio.group_devt), minor),
-			    group, "%s%d", group->noiommu ? "noiommu-" : "",
+			    MKDEV(MAJOR(vfio.group_devt), minor), group, "%s%d",
+			    group->type == VFIO_NO_IOMMU ? "noiommu-" : "",
 			    iommu_group_id(iommu_group));
 	if (IS_ERR(dev)) {
 		vfio_free_group_minor(minor);
@@ -778,8 +802,8 @@ void vfio_uninit_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
 
-#ifdef CONFIG_VFIO_NOIOMMU
-static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
+static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
+		enum vfio_group_type type)
 {
 	struct iommu_group *iommu_group;
 	struct vfio_group *group;
@@ -794,7 +818,7 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	if (ret)
 		goto out_put_group;
 
-	group = vfio_create_group(iommu_group, true);
+	group = vfio_create_group(iommu_group, type);
 	if (IS_ERR(group)) {
 		ret = PTR_ERR(group);
 		goto out_remove_device;
@@ -808,7 +832,6 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	iommu_group_put(iommu_group);
 	return ERR_PTR(ret);
 }
-#endif
 
 static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 {
@@ -824,7 +847,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		 * bus.  Taint the kernel because we're about to give a DMA
 		 * capable device to a user without IOMMU protection.
 		 */
-		group = vfio_noiommu_group_alloc(dev);
+		group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
 		if (!IS_ERR(group)) {
 			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
 			dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
@@ -841,7 +864,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		goto out_put;
 
 	/* a newly created vfio_group keeps the reference. */
-	group = vfio_create_group(iommu_group, false);
+	group = vfio_create_group(iommu_group, VFIO_IOMMU);
 	if (IS_ERR(group))
 		goto out_put;
 	return group;
@@ -851,10 +874,13 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	return group;
 }
 
-int vfio_register_group_dev(struct vfio_device *device)
+static int __vfio_register_dev(struct vfio_device *device,
+		struct vfio_group *group)
 {
 	struct vfio_device *existing_device;
-	struct vfio_group *group;
+
+	if (IS_ERR(group))
+		return PTR_ERR(group);
 
 	/*
 	 * If the driver doesn't specify a set then the device is added to a
@@ -863,16 +889,13 @@ int vfio_register_group_dev(struct vfio_device *device)
 	if (!device->dev_set)
 		vfio_assign_device_set(device, device);
 
-	group = vfio_group_find_or_alloc(device->dev);
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
 	existing_device = vfio_group_get_device(group, device->dev);
 	if (existing_device) {
 		dev_WARN(device->dev, "Device already exists on group %d\n",
 			 iommu_group_id(group->iommu_group));
 		vfio_device_put(existing_device);
-		if (group->noiommu)
+		if (group->type == VFIO_NO_IOMMU ||
+		    group->type == VFIO_EMULATED_IOMMU)
 			iommu_group_remove_device(device->dev);
 		vfio_group_put(group);
 		return -EBUSY;
@@ -891,8 +914,25 @@ int vfio_register_group_dev(struct vfio_device *device)
 
 	return 0;
 }
+
+int vfio_register_group_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_group_find_or_alloc(device->dev));
+}
 EXPORT_SYMBOL_GPL(vfio_register_group_dev);
 
+/*
+ * Register a virtual device without IOMMU backing.  The user of this
+ * device must not be able to directly trigger unmediated DMA.
+ */
+int vfio_register_emulated_iommu_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_noiommu_group_alloc(device->dev, VFIO_EMULATED_IOMMU));
+}
+EXPORT_SYMBOL_GPL(vfio_register_emulated_iommu_dev);
+
 /**
  * Get a reference to the vfio_device for a device.  Even if the
  * caller thinks they own the device, they could be racing with a
@@ -1019,7 +1059,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	if (list_empty(&group->device_list))
 		wait_event(group->container_q, !group->container);
 
-	if (group->noiommu)
+	if (group->type == VFIO_NO_IOMMU || group->type == VFIO_EMULATED_IOMMU)
 		iommu_group_remove_device(device->dev);
 
 	/* Matches the get in vfio_register_group_dev() */
@@ -1368,7 +1408,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	if (atomic_read(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	f = fdget(container_fd);
@@ -1388,7 +1428,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 
 	/* Real groups and fake groups cannot mix */
 	if (!list_empty(&container->group_list) &&
-	    container->noiommu != group->noiommu) {
+	    container->noiommu != (group->type == VFIO_NO_IOMMU)) {
 		ret = -EPERM;
 		goto unlock_out;
 	}
@@ -1402,7 +1442,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	}
 
 	group->container = container;
-	container->noiommu = group->noiommu;
+	container->noiommu = (group->type == VFIO_NO_IOMMU);
 	list_add(&group->container_next, &container->group_list);
 
 	/* Get a reference on the container and mark a user within the group */
@@ -1426,7 +1466,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 	if (!atomic_inc_not_zero(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu) {
+	if (group->type == VFIO_NO_IOMMU) {
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
@@ -1451,7 +1491,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	    !group->container->iommu_driver || !vfio_group_viable(group))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	device = vfio_device_get_from_name(group, buf);
@@ -1498,7 +1538,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 
 	fd_install(fdno, filep);
 
-	if (group->noiommu)
+	if (group->type == VFIO_NO_IOMMU)
 		dev_warn(device->dev, "vfio-noiommu device opened by user "
 			 "(%s:%d)\n", current->comm, task_pid_nr(current));
 	return fdno;
@@ -1594,7 +1634,7 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep)
 	if (!group)
 		return -ENODEV;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO)) {
+	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO)) {
 		vfio_group_put(group);
 		return -EPERM;
 	}
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index f7083c2fd0d099..bbe29300862649 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -75,6 +75,7 @@ void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 			 const struct vfio_device_ops *ops);
 void vfio_uninit_group_dev(struct vfio_device *device);
 int vfio_register_group_dev(struct vfio_device *device);
+int vfio_register_emulated_iommu_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
 extern void vfio_device_put(struct vfio_device *device);
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index c313ab4d1f4e4e..cd41bec5fdeb39 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -553,7 +553,7 @@ static int mbochs_probe(struct mdev_device *mdev)
 	mbochs_create_config_space(mdev_state);
 	mbochs_reset(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 8d1a80a0722aa9..fe5d43e797b6d3 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -258,7 +258,7 @@ static int mdpy_probe(struct mdev_device *mdev)
 
 	mdpy_count++;
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 5983cdb16e3d1d..a0e1a469bd47af 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -741,7 +741,7 @@ static int mtty_probe(struct mdev_device *mdev)
 
 	mtty_create_config_space(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_vconfig;
 	dev_set_drvdata(&mdev->dev, mdev_state);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-27  2:17     ` Tian, Kevin
@ 2021-08-27  5:42       ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2021-08-27  5:42 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Christoph Hellwig, Diana Craciun, Cornelia Huck,
	Kirti Wankhede, Eric Auger, Jason Gunthorpe, kvm,
	Jason Gunthorpe

On Fri, Aug 27, 2021 at 02:17:14AM +0000, Tian, Kevin wrote:
> > flags suggests to me a bit field, but we can't have EMULATED|NOIOMMU.
> > Should this be an enum iommu_type?
> > 
> 
> and I wonder whether we should also define a type (VFIO_IOMMU)
> for the remaining groups which are backed by IOMMU. This can be
> set implicitly when the caller doesn't specify a specific type when
> creating the group. It's not checked in any place now, but it might
> be helpful for readability and diagnostic purpose? or change the
> name to noiommu_flags then no confusion on its scope...

If we move to a type field we should definitively define a value for
the normal case as well.  The advantage of the flags is that we can
also use it for other needs it it arises.  An emum OTOH would be
easier to follow.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-26 22:59   ` Alex Williamson
@ 2021-08-27  2:17     ` Tian, Kevin
  2021-08-27  5:42       ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Tian, Kevin @ 2021-08-27  2:17 UTC (permalink / raw)
  To: Alex Williamson, Christoph Hellwig
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm, Jason Gunthorpe

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Friday, August 27, 2021 6:59 AM
> 
> On Thu, 26 Aug 2021 15:34:17 +0200
> Christoph Hellwig <hch@lst.de> wrote:
> > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > index 94c5e18a05e0d0..467432379b91ef 100644
> > --- a/drivers/vfio/vfio.c
> > +++ b/drivers/vfio/vfio.c
> > @@ -67,6 +67,21 @@ struct vfio_unbound_dev {
> >  	struct list_head		unbound_next;
> >  };
> >
> > +/*
> > + * Virtual device without IOMMU backing. The VFIO core fakes up an
> iommu_group
> > + * as the iommu_group sysfs interface is part of the userspace ABI.  The
> user
> > + * of these devices must not be able to directly trigger unmediated DMA.
> > + */
> > +#define VFIO_EMULATED_IOMMU	(1 << 0)
> > +
> > +/*
> > + * Physical device without IOMMU backing. The VFIO core fakes up an
> iommu_group
> > + * as the iommu_group sysfs interface is part of the userspace ABI.  Users
> can
> > + * trigger unmediated DMA by the device, usage is highly dangerous,
> requires
> > + * an explicit opt-in and will taint the kernel.
> > + */
> > +#define VFIO_NO_IOMMU		(1 << 1)
> > +
> >  struct vfio_group {
> >  	struct kref			kref;
> >  	int				minor;
> > @@ -83,7 +98,7 @@ struct vfio_group {
> >  	struct mutex			unbound_lock;
> >  	atomic_t			opened;
> >  	wait_queue_head_t		container_q;
> > -	bool				noiommu;
> > +	unsigned int			flags;
> 
> flags suggests to me a bit field, but we can't have EMULATED|NOIOMMU.
> Should this be an enum iommu_type?
> 

and I wonder whether we should also define a type (VFIO_IOMMU)
for the remaining groups which are backed by IOMMU. This can be
set implicitly when the caller doesn't specify a specific type when
creating the group. It's not checked in any place now, but it might
be helpful for readability and diagnostic purpose? or change the
name to noiommu_flags then no confusion on its scope...

Thanks
Kevin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-26 13:34 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
  2021-08-26 17:24   ` Alex Williamson
@ 2021-08-26 22:59   ` Alex Williamson
  2021-08-27  2:17     ` Tian, Kevin
  1 sibling, 1 reply; 23+ messages in thread
From: Alex Williamson @ 2021-08-26 22:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm, Jason Gunthorpe

On Thu, 26 Aug 2021 15:34:17 +0200
Christoph Hellwig <hch@lst.de> wrote:
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 94c5e18a05e0d0..467432379b91ef 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -67,6 +67,21 @@ struct vfio_unbound_dev {
>  	struct list_head		unbound_next;
>  };
>  
> +/*
> + * Virtual device without IOMMU backing. The VFIO core fakes up an iommu_group
> + * as the iommu_group sysfs interface is part of the userspace ABI.  The user
> + * of these devices must not be able to directly trigger unmediated DMA.
> + */
> +#define VFIO_EMULATED_IOMMU	(1 << 0)
> +
> +/*
> + * Physical device without IOMMU backing. The VFIO core fakes up an iommu_group
> + * as the iommu_group sysfs interface is part of the userspace ABI.  Users can
> + * trigger unmediated DMA by the device, usage is highly dangerous, requires
> + * an explicit opt-in and will taint the kernel.
> + */
> +#define VFIO_NO_IOMMU		(1 << 1)
> +
>  struct vfio_group {
>  	struct kref			kref;
>  	int				minor;
> @@ -83,7 +98,7 @@ struct vfio_group {
>  	struct mutex			unbound_lock;
>  	atomic_t			opened;
>  	wait_queue_head_t		container_q;
> -	bool				noiommu;
> +	unsigned int			flags;

flags suggests to me a bit field, but we can't have EMULATED|NOIOMMU.
Should this be an enum iommu_type?

Note that my next branch now includes the vfio-ap changes, so respins
can include the change Jason noted directly.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-26 13:34 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
@ 2021-08-26 17:24   ` Alex Williamson
  2021-08-26 22:59   ` Alex Williamson
  1 sibling, 0 replies; 23+ messages in thread
From: Alex Williamson @ 2021-08-26 17:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm, Jason Gunthorpe, Tony Krowiak

[Cc +Tony]

On Thu, 26 Aug 2021 15:34:17 +0200
Christoph Hellwig <hch@lst.de> wrote:

> Reuse the logic in vfio_noiommu_group_alloc to allocate a fake
> single-device iommu group for mediated devices.  A new function is
> exposed to create vfio_device for this emulated case and the noiommu
> boolean field in struct vfio_group is replaced with a set of flags so
> that devices with an emulated IOMMU can be distinguished from those
> with no IOMMU at all.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/mdev/mdev_driver.c | 46 ++----------------
>  drivers/vfio/mdev/vfio_mdev.c   |  2 +-
>  drivers/vfio/vfio.c             | 82 ++++++++++++++++++++++-----------
>  include/linux/vfio.h            |  1 +
>  samples/vfio-mdev/mbochs.c      |  2 +-
>  samples/vfio-mdev/mdpy.c        |  2 +-
>  samples/vfio-mdev/mtty.c        |  2 +-
>  7 files changed, 65 insertions(+), 72 deletions(-)

As Jason suggested, I'll apply this after
0-v4-0203a4ab0596+f7-vfio_ap_jgg@nvidia.com and roll in the following:

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 2347808fa3e4..f04fe1278f99 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -350,7 +350,7 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev)
 	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
 	mutex_unlock(&matrix_dev->lock);
 
-	ret = vfio_register_group_dev(&matrix_mdev->vdev);
+	ret = vfio_register_emulated_iommu_dev(&matrix_mdev->vdev);
 	if (ret)
 		goto err_list;
 	dev_set_drvdata(&mdev->dev, matrix_mdev);

Shout if this is incorrect.  Thanks,

Alex


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-26 13:34 cleanup vfio iommu_group creation v4 Christoph Hellwig
@ 2021-08-26 13:34 ` Christoph Hellwig
  2021-08-26 17:24   ` Alex Williamson
  2021-08-26 22:59   ` Alex Williamson
  0 siblings, 2 replies; 23+ messages in thread
From: Christoph Hellwig @ 2021-08-26 13:34 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm, Jason Gunthorpe

Reuse the logic in vfio_noiommu_group_alloc to allocate a fake
single-device iommu group for mediated devices.  A new function is
exposed to create vfio_device for this emulated case and the noiommu
boolean field in struct vfio_group is replaced with a set of flags so
that devices with an emulated IOMMU can be distinguished from those
with no IOMMU at all.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/mdev/mdev_driver.c | 46 ++----------------
 drivers/vfio/mdev/vfio_mdev.c   |  2 +-
 drivers/vfio/vfio.c             | 82 ++++++++++++++++++++++-----------
 include/linux/vfio.h            |  1 +
 samples/vfio-mdev/mbochs.c      |  2 +-
 samples/vfio-mdev/mdpy.c        |  2 +-
 samples/vfio-mdev/mtty.c        |  2 +-
 7 files changed, 65 insertions(+), 72 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
index c368ec824e2b5c..14b9ab17426838 100644
--- a/drivers/vfio/mdev/mdev_driver.c
+++ b/drivers/vfio/mdev/mdev_driver.c
@@ -13,61 +13,23 @@
 
 #include "mdev_private.h"
 
-static int mdev_attach_iommu(struct mdev_device *mdev)
-{
-	int ret;
-	struct iommu_group *group;
-
-	group = iommu_group_alloc();
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
-	ret = iommu_group_add_device(group, &mdev->dev);
-	if (!ret)
-		dev_info(&mdev->dev, "MDEV: group_id = %d\n",
-			 iommu_group_id(group));
-
-	iommu_group_put(group);
-	return ret;
-}
-
-static void mdev_detach_iommu(struct mdev_device *mdev)
-{
-	iommu_group_remove_device(&mdev->dev);
-	dev_info(&mdev->dev, "MDEV: detaching iommu\n");
-}
-
 static int mdev_probe(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
-	int ret;
-
-	ret = mdev_attach_iommu(mdev);
-	if (ret)
-		return ret;
 
-	if (drv->probe) {
-		ret = drv->probe(mdev);
-		if (ret)
-			mdev_detach_iommu(mdev);
-	}
-
-	return ret;
+	if (!drv->probe)
+		return 0;
+	return drv->probe(to_mdev_device(dev));
 }
 
 static int mdev_remove(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
 
 	if (drv->remove)
-		drv->remove(mdev);
-
-	mdev_detach_iommu(mdev);
-
+		drv->remove(to_mdev_device(dev));
 	return 0;
 }
 
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 7a9883048216e7..a90e24b0c851d3 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -119,7 +119,7 @@ static int vfio_mdev_probe(struct mdev_device *mdev)
 		return -ENOMEM;
 
 	vfio_init_group_dev(vdev, &mdev->dev, &vfio_mdev_dev_ops);
-	ret = vfio_register_group_dev(vdev);
+	ret = vfio_register_emulated_iommu_dev(vdev);
 	if (ret)
 		goto out_uninit;
 
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 94c5e18a05e0d0..467432379b91ef 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -67,6 +67,21 @@ struct vfio_unbound_dev {
 	struct list_head		unbound_next;
 };
 
+/*
+ * Virtual device without IOMMU backing. The VFIO core fakes up an iommu_group
+ * as the iommu_group sysfs interface is part of the userspace ABI.  The user
+ * of these devices must not be able to directly trigger unmediated DMA.
+ */
+#define VFIO_EMULATED_IOMMU	(1 << 0)
+
+/*
+ * Physical device without IOMMU backing. The VFIO core fakes up an iommu_group
+ * as the iommu_group sysfs interface is part of the userspace ABI.  Users can
+ * trigger unmediated DMA by the device, usage is highly dangerous, requires
+ * an explicit opt-in and will taint the kernel.
+ */
+#define VFIO_NO_IOMMU		(1 << 1)
+
 struct vfio_group {
 	struct kref			kref;
 	int				minor;
@@ -83,7 +98,7 @@ struct vfio_group {
 	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
-	bool				noiommu;
+	unsigned int			flags;
 	unsigned int			dev_counter;
 	struct kvm			*kvm;
 	struct blocking_notifier_head	notifier;
@@ -336,7 +351,7 @@ static void vfio_group_unlock_and_free(struct vfio_group *group)
  * Group objects - create, release, get, put, search
  */
 static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
-		bool noiommu)
+		unsigned int flags)
 {
 	struct vfio_group *group, *tmp;
 	struct device *dev;
@@ -355,7 +370,7 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	atomic_set(&group->opened, 0);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
-	group->noiommu = noiommu;
+	group->flags = flags;
 	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	group->nb.notifier_call = vfio_iommu_group_notifier;
@@ -391,8 +406,8 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	}
 
 	dev = device_create(vfio.class, NULL,
-			    MKDEV(MAJOR(vfio.group_devt), minor),
-			    group, "%s%d", group->noiommu ? "noiommu-" : "",
+			    MKDEV(MAJOR(vfio.group_devt), minor), group, "%s%d",
+			    (group->flags & VFIO_NO_IOMMU) ? "noiommu-" : "",
 			    iommu_group_id(iommu_group));
 	if (IS_ERR(dev)) {
 		vfio_free_group_minor(minor);
@@ -778,8 +793,8 @@ void vfio_uninit_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
 
-#ifdef CONFIG_VFIO_NOIOMMU
-static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
+static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
+		unsigned int flags)
 {
 	struct iommu_group *iommu_group;
 	struct vfio_group *group;
@@ -794,7 +809,7 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	if (ret)
 		goto out_put_group;
 
-	group = vfio_create_group(iommu_group, true);
+	group = vfio_create_group(iommu_group, flags);
 	if (IS_ERR(group)) {
 		ret = PTR_ERR(group);
 		goto out_remove_device;
@@ -808,7 +823,6 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	iommu_group_put(iommu_group);
 	return ERR_PTR(ret);
 }
-#endif
 
 static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 {
@@ -824,7 +838,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		 * bus.  Taint the kernel because we're about to give a DMA
 		 * capable device to a user without IOMMU protection.
 		 */
-		group = vfio_noiommu_group_alloc(dev);
+		group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
 		if (group) {
 			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
 			dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
@@ -841,7 +855,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		goto out_put;
 
 	/* a newly created vfio_group keeps the reference. */
-	group = vfio_create_group(iommu_group, false);
+	group = vfio_create_group(iommu_group, 0);
 	if (IS_ERR(group))
 		goto out_put;
 	return group;
@@ -851,10 +865,13 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	return group;
 }
 
-int vfio_register_group_dev(struct vfio_device *device)
+static int __vfio_register_dev(struct vfio_device *device,
+		struct vfio_group *group)
 {
 	struct vfio_device *existing_device;
-	struct vfio_group *group;
+
+	if (IS_ERR(group))
+		return PTR_ERR(group);
 
 	/*
 	 * If the driver doesn't specify a set then the device is added to a
@@ -863,16 +880,12 @@ int vfio_register_group_dev(struct vfio_device *device)
 	if (!device->dev_set)
 		vfio_assign_device_set(device, device);
 
-	group = vfio_group_find_or_alloc(device->dev);
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
 	existing_device = vfio_group_get_device(group, device->dev);
 	if (existing_device) {
 		dev_WARN(device->dev, "Device already exists on group %d\n",
 			 iommu_group_id(group->iommu_group));
 		vfio_device_put(existing_device);
-		if (group->noiommu)
+		if (group->flags & (VFIO_NO_IOMMU | VFIO_EMULATED_IOMMU))
 			iommu_group_remove_device(device->dev);
 		vfio_group_put(group);
 		return -EBUSY;
@@ -891,8 +904,25 @@ int vfio_register_group_dev(struct vfio_device *device)
 
 	return 0;
 }
+
+int vfio_register_group_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_group_find_or_alloc(device->dev));
+}
 EXPORT_SYMBOL_GPL(vfio_register_group_dev);
 
+/*
+ * Register a virtual device without IOMMU backing.  The user of this
+ * device must not be able to directly trigger unmediated DMA.
+ */
+int vfio_register_emulated_iommu_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_noiommu_group_alloc(device->dev, VFIO_EMULATED_IOMMU));
+}
+EXPORT_SYMBOL_GPL(vfio_register_emulated_iommu_dev);
+
 /**
  * Get a reference to the vfio_device for a device.  Even if the
  * caller thinks they own the device, they could be racing with a
@@ -1019,7 +1049,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	if (list_empty(&group->device_list))
 		wait_event(group->container_q, !group->container);
 
-	if (group->noiommu)
+	if (group->flags & (VFIO_NO_IOMMU | VFIO_EMULATED_IOMMU))
 		iommu_group_remove_device(device->dev);
 
 	/* Matches the get in vfio_register_group_dev() */
@@ -1368,7 +1398,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	if (atomic_read(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if ((group->flags & VFIO_NO_IOMMU) && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	f = fdget(container_fd);
@@ -1388,7 +1418,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 
 	/* Real groups and fake groups cannot mix */
 	if (!list_empty(&container->group_list) &&
-	    container->noiommu != group->noiommu) {
+	    container->noiommu != (group->flags & VFIO_NO_IOMMU)) {
 		ret = -EPERM;
 		goto unlock_out;
 	}
@@ -1402,7 +1432,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	}
 
 	group->container = container;
-	container->noiommu = group->noiommu;
+	container->noiommu = (group->flags & VFIO_NO_IOMMU);
 	list_add(&group->container_next, &container->group_list);
 
 	/* Get a reference on the container and mark a user within the group */
@@ -1426,7 +1456,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 	if (!atomic_inc_not_zero(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu) {
+	if (group->flags & VFIO_NO_IOMMU) {
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
@@ -1451,7 +1481,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	    !group->container->iommu_driver || !vfio_group_viable(group))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if ((group->flags & VFIO_NO_IOMMU) && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	device = vfio_device_get_from_name(group, buf);
@@ -1498,7 +1528,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 
 	fd_install(fdno, filep);
 
-	if (group->noiommu)
+	if (group->flags & VFIO_NO_IOMMU)
 		dev_warn(device->dev, "vfio-noiommu device opened by user "
 			 "(%s:%d)\n", current->comm, task_pid_nr(current));
 	return fdno;
@@ -1594,7 +1624,7 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep)
 	if (!group)
 		return -ENODEV;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO)) {
+	if ((group->flags & VFIO_NO_IOMMU) && !capable(CAP_SYS_RAWIO)) {
 		vfio_group_put(group);
 		return -EPERM;
 	}
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index f7083c2fd0d099..bbe29300862649 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -75,6 +75,7 @@ void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 			 const struct vfio_device_ops *ops);
 void vfio_uninit_group_dev(struct vfio_device *device);
 int vfio_register_group_dev(struct vfio_device *device);
+int vfio_register_emulated_iommu_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
 extern void vfio_device_put(struct vfio_device *device);
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index c313ab4d1f4e4e..cd41bec5fdeb39 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -553,7 +553,7 @@ static int mbochs_probe(struct mdev_device *mdev)
 	mbochs_create_config_space(mdev_state);
 	mbochs_reset(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 8d1a80a0722aa9..fe5d43e797b6d3 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -258,7 +258,7 @@ static int mdpy_probe(struct mdev_device *mdev)
 
 	mdpy_count++;
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 5983cdb16e3d1d..a0e1a469bd47af 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -741,7 +741,7 @@ static int mtty_probe(struct mdev_device *mdev)
 
 	mtty_create_config_space(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_vconfig;
 	dev_set_drvdata(&mdev->dev, mdev_state);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* RE: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25 16:19 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
  2021-08-25 22:31   ` Jason Gunthorpe
@ 2021-08-26  3:10   ` Tian, Kevin
  1 sibling, 0 replies; 23+ messages in thread
From: Tian, Kevin @ 2021-08-26  3:10 UTC (permalink / raw)
  To: Christoph Hellwig, Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm

> From: Christoph Hellwig <hch@lst.de>
> Sent: Thursday, August 26, 2021 12:19 AM
> 
[...]
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 71e0d3c4f1ac08..6bdfcb9264458c 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -67,6 +67,9 @@ struct vfio_unbound_dev {
>  	struct list_head		unbound_next;
>  };
> 
> +#define VFIO_EMULATED_IOMMU	(1 << 0)
> +#define VFIO_NO_IOMMU		(1 << 1)

it'd be helpful to have some comment here, as emulated iommu
is easy to be mixed with virtual iommu. At least here so-called iommu
doesn't refer to the exact iommu hardware. It's more about the
capability of DMA isolation and how it is implemented.

[...]
> @@ -391,8 +394,8 @@ static struct vfio_group *vfio_create_group(struct
> iommu_group *iommu_group,
>  	}
> 
>  	dev = device_create(vfio.class, NULL,
> -			    MKDEV(MAJOR(vfio.group_devt), minor),
> -			    group, "%s%d", group->noiommu ? "noiommu-" :
> "",
> +			    MKDEV(MAJOR(vfio.group_devt), minor), group,
> "%s%d",
> +			    (group->flags & VFIO_NO_IOMMU) ? "noiommu-" :
> "",

what about introducing a group_no_iommu(group), given many duplicated
checks in this patch?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25 16:19 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
@ 2021-08-25 22:31   ` Jason Gunthorpe
  2021-08-26  3:10   ` Tian, Kevin
  1 sibling, 0 replies; 23+ messages in thread
From: Jason Gunthorpe @ 2021-08-25 22:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, kvm

On Wed, Aug 25, 2021 at 06:19:08PM +0200, Christoph Hellwig wrote:
> Reuse the logic in vfio_noiommu_group_alloc to allocate a fake
> single-device iommu group for mediated devices.  A new function is
> exposed to create vfio_device for this emulated case and the noiommu
> boolean field in struct vfio_group is replaced with a set of flags so
> that devices with an emulated IOMMU can be distinguished from those
> with no IOMMU at all.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/vfio/mdev/mdev_driver.c | 46 ++--------------------
>  drivers/vfio/mdev/vfio_mdev.c   |  2 +-
>  drivers/vfio/vfio.c             | 70 +++++++++++++++++++++------------
>  include/linux/vfio.h            |  1 +
>  samples/vfio-mdev/mbochs.c      |  2 +-
>  samples/vfio-mdev/mdpy.c        |  2 +-
>  samples/vfio-mdev/mtty.c        |  2 +-
>  7 files changed, 53 insertions(+), 72 deletions(-)

This conflicts with the AP mdev conversion, it will need to change to
vfio_register_emulated_iommu_dev() like the samples did. Alex, I suggest
you take this patch after the AP mdev patch and add the one line fix up.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25 16:19 cleanup vfio iommu_group creation v3 Christoph Hellwig
@ 2021-08-25 16:19 ` Christoph Hellwig
  2021-08-25 22:31   ` Jason Gunthorpe
  2021-08-26  3:10   ` Tian, Kevin
  0 siblings, 2 replies; 23+ messages in thread
From: Christoph Hellwig @ 2021-08-25 16:19 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm

Reuse the logic in vfio_noiommu_group_alloc to allocate a fake
single-device iommu group for mediated devices.  A new function is
exposed to create vfio_device for this emulated case and the noiommu
boolean field in struct vfio_group is replaced with a set of flags so
that devices with an emulated IOMMU can be distinguished from those
with no IOMMU at all.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/vfio/mdev/mdev_driver.c | 46 ++--------------------
 drivers/vfio/mdev/vfio_mdev.c   |  2 +-
 drivers/vfio/vfio.c             | 70 +++++++++++++++++++++------------
 include/linux/vfio.h            |  1 +
 samples/vfio-mdev/mbochs.c      |  2 +-
 samples/vfio-mdev/mdpy.c        |  2 +-
 samples/vfio-mdev/mtty.c        |  2 +-
 7 files changed, 53 insertions(+), 72 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
index c368ec824e2b5c..14b9ab17426838 100644
--- a/drivers/vfio/mdev/mdev_driver.c
+++ b/drivers/vfio/mdev/mdev_driver.c
@@ -13,61 +13,23 @@
 
 #include "mdev_private.h"
 
-static int mdev_attach_iommu(struct mdev_device *mdev)
-{
-	int ret;
-	struct iommu_group *group;
-
-	group = iommu_group_alloc();
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
-	ret = iommu_group_add_device(group, &mdev->dev);
-	if (!ret)
-		dev_info(&mdev->dev, "MDEV: group_id = %d\n",
-			 iommu_group_id(group));
-
-	iommu_group_put(group);
-	return ret;
-}
-
-static void mdev_detach_iommu(struct mdev_device *mdev)
-{
-	iommu_group_remove_device(&mdev->dev);
-	dev_info(&mdev->dev, "MDEV: detaching iommu\n");
-}
-
 static int mdev_probe(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
-	int ret;
-
-	ret = mdev_attach_iommu(mdev);
-	if (ret)
-		return ret;
 
-	if (drv->probe) {
-		ret = drv->probe(mdev);
-		if (ret)
-			mdev_detach_iommu(mdev);
-	}
-
-	return ret;
+	if (!drv->probe)
+		return 0;
+	return drv->probe(to_mdev_device(dev));
 }
 
 static int mdev_remove(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
 
 	if (drv->remove)
-		drv->remove(mdev);
-
-	mdev_detach_iommu(mdev);
-
+		drv->remove(to_mdev_device(dev));
 	return 0;
 }
 
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 7a9883048216e7..a90e24b0c851d3 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -119,7 +119,7 @@ static int vfio_mdev_probe(struct mdev_device *mdev)
 		return -ENOMEM;
 
 	vfio_init_group_dev(vdev, &mdev->dev, &vfio_mdev_dev_ops);
-	ret = vfio_register_group_dev(vdev);
+	ret = vfio_register_emulated_iommu_dev(vdev);
 	if (ret)
 		goto out_uninit;
 
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 71e0d3c4f1ac08..6bdfcb9264458c 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -67,6 +67,9 @@ struct vfio_unbound_dev {
 	struct list_head		unbound_next;
 };
 
+#define VFIO_EMULATED_IOMMU	(1 << 0)
+#define VFIO_NO_IOMMU		(1 << 1)
+
 struct vfio_group {
 	struct kref			kref;
 	int				minor;
@@ -83,7 +86,7 @@ struct vfio_group {
 	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
-	bool				noiommu;
+	unsigned int			flags;
 	unsigned int			dev_counter;
 	struct kvm			*kvm;
 	struct blocking_notifier_head	notifier;
@@ -336,7 +339,7 @@ static void vfio_group_unlock_and_free(struct vfio_group *group)
  * Group objects - create, release, get, put, search
  */
 static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
-		bool noiommu)
+		unsigned int flags)
 {
 	struct vfio_group *group, *tmp;
 	struct device *dev;
@@ -355,7 +358,7 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	atomic_set(&group->opened, 0);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
-	group->noiommu = noiommu;
+	group->flags = flags;
 	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	group->nb.notifier_call = vfio_iommu_group_notifier;
@@ -391,8 +394,8 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	}
 
 	dev = device_create(vfio.class, NULL,
-			    MKDEV(MAJOR(vfio.group_devt), minor),
-			    group, "%s%d", group->noiommu ? "noiommu-" : "",
+			    MKDEV(MAJOR(vfio.group_devt), minor), group, "%s%d",
+			    (group->flags & VFIO_NO_IOMMU) ? "noiommu-" : "",
 			    iommu_group_id(iommu_group));
 	if (IS_ERR(dev)) {
 		vfio_free_group_minor(minor);
@@ -778,8 +781,8 @@ void vfio_uninit_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
 
-#ifdef CONFIG_VFIO_NOIOMMU
-static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
+static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
+		unsigned int flags)
 {
 	struct iommu_group *iommu_group;
 	struct vfio_group *group;
@@ -794,7 +797,7 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	if (ret)
 		goto out_put_group;
 
-	group = vfio_create_group(iommu_group, true);
+	group = vfio_create_group(iommu_group, flags);
 	if (IS_ERR(group)) {
 		ret = PTR_ERR(group);
 		goto out_remove_device;
@@ -808,7 +811,6 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	iommu_group_put(iommu_group);
 	return ERR_PTR(ret);
 }
-#endif
 
 static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 {
@@ -824,7 +826,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		 * bus.  Taint the kernel because we're about to give a DMA
 		 * capable device to a user without IOMMU protection.
 		 */
-		group = vfio_noiommu_group_alloc(dev);
+		group = vfio_noiommu_group_alloc(dev, VFIO_NO_IOMMU);
 		if (group) {
 			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
 			dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
@@ -841,7 +843,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		goto out_put;
 
 	/* a newly created vfio_group keeps the reference. */
-	group = vfio_create_group(iommu_group, false);
+	group = vfio_create_group(iommu_group, 0);
 	if (IS_ERR(group))
 		goto out_put;
 	return group;
@@ -851,10 +853,13 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	return group;
 }
 
-int vfio_register_group_dev(struct vfio_device *device)
+static int __vfio_register_dev(struct vfio_device *device,
+		struct vfio_group *group)
 {
 	struct vfio_device *existing_device;
-	struct vfio_group *group;
+
+	if (IS_ERR(group))
+		return PTR_ERR(group);
 
 	/*
 	 * If the driver doesn't specify a set then the device is added to a
@@ -863,16 +868,12 @@ int vfio_register_group_dev(struct vfio_device *device)
 	if (!device->dev_set)
 		vfio_assign_device_set(device, device);
 
-	group = vfio_group_find_or_alloc(device->dev);
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
 	existing_device = vfio_group_get_device(group, device->dev);
 	if (existing_device) {
 		dev_WARN(device->dev, "Device already exists on group %d\n",
 			 iommu_group_id(group->iommu_group));
 		vfio_device_put(existing_device);
-		if (group->noiommu)
+		if (group->flags & (VFIO_NO_IOMMU | VFIO_EMULATED_IOMMU))
 			iommu_group_remove_device(device->dev);
 		iommu_group_put(group->iommu_group);
 		return -EBUSY;
@@ -891,8 +892,25 @@ int vfio_register_group_dev(struct vfio_device *device)
 
 	return 0;
 }
+
+int vfio_register_group_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_group_find_or_alloc(device->dev));
+}
 EXPORT_SYMBOL_GPL(vfio_register_group_dev);
 
+/*
+ * Register a virtual device without IOMMU backing.  The user of this
+ * device must not be able to directly trigger unmediated DMA.
+ */
+int vfio_register_emulated_iommu_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_noiommu_group_alloc(device->dev, VFIO_EMULATED_IOMMU));
+}
+EXPORT_SYMBOL_GPL(vfio_register_emulated_iommu_dev);
+
 /**
  * Get a reference to the vfio_device for a device.  Even if the
  * caller thinks they own the device, they could be racing with a
@@ -1019,7 +1037,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	if (list_empty(&group->device_list))
 		wait_event(group->container_q, !group->container);
 
-	if (group->noiommu)
+	if (group->flags & (VFIO_NO_IOMMU | VFIO_EMULATED_IOMMU))
 		iommu_group_remove_device(device->dev);
 	iommu_group_put(group->iommu_group);
 }
@@ -1366,7 +1384,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	if (atomic_read(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if ((group->flags & VFIO_NO_IOMMU) && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	f = fdget(container_fd);
@@ -1386,7 +1404,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 
 	/* Real groups and fake groups cannot mix */
 	if (!list_empty(&container->group_list) &&
-	    container->noiommu != group->noiommu) {
+	    container->noiommu != (group->flags & VFIO_NO_IOMMU)) {
 		ret = -EPERM;
 		goto unlock_out;
 	}
@@ -1400,7 +1418,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	}
 
 	group->container = container;
-	container->noiommu = group->noiommu;
+	container->noiommu = (group->flags & VFIO_NO_IOMMU);
 	list_add(&group->container_next, &container->group_list);
 
 	/* Get a reference on the container and mark a user within the group */
@@ -1424,7 +1442,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 	if (!atomic_inc_not_zero(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu) {
+	if (group->flags & VFIO_NO_IOMMU) {
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
@@ -1449,7 +1467,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	    !group->container->iommu_driver || !vfio_group_viable(group))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if ((group->flags & VFIO_NO_IOMMU) && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	device = vfio_device_get_from_name(group, buf);
@@ -1496,7 +1514,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 
 	fd_install(fdno, filep);
 
-	if (group->noiommu)
+	if (group->flags & VFIO_NO_IOMMU)
 		dev_warn(device->dev, "vfio-noiommu device opened by user "
 			 "(%s:%d)\n", current->comm, task_pid_nr(current));
 	return fdno;
@@ -1592,7 +1610,7 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep)
 	if (!group)
 		return -ENODEV;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO)) {
+	if ((group->flags & VFIO_NO_IOMMU) && !capable(CAP_SYS_RAWIO)) {
 		vfio_group_put(group);
 		return -EPERM;
 	}
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index f7083c2fd0d099..bbe29300862649 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -75,6 +75,7 @@ void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 			 const struct vfio_device_ops *ops);
 void vfio_uninit_group_dev(struct vfio_device *device);
 int vfio_register_group_dev(struct vfio_device *device);
+int vfio_register_emulated_iommu_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
 extern void vfio_device_put(struct vfio_device *device);
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index c313ab4d1f4e4e..cd41bec5fdeb39 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -553,7 +553,7 @@ static int mbochs_probe(struct mdev_device *mdev)
 	mbochs_create_config_space(mdev_state);
 	mbochs_reset(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 8d1a80a0722aa9..fe5d43e797b6d3 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -258,7 +258,7 @@ static int mdpy_probe(struct mdev_device *mdev)
 
 	mdpy_count++;
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 5983cdb16e3d1d..a0e1a469bd47af 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -741,7 +741,7 @@ static int mtty_probe(struct mdev_device *mdev)
 
 	mtty_create_config_space(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_emulated_iommu_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_vconfig;
 	dev_set_drvdata(&mdev->dev, mdev_state);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25 12:50                 ` Christoph Hellwig
@ 2021-08-25 12:50                   ` Jason Gunthorpe
  0 siblings, 0 replies; 23+ messages in thread
From: Jason Gunthorpe @ 2021-08-25 12:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, kvm

On Wed, Aug 25, 2021 at 02:50:22PM +0200, Christoph Hellwig wrote:
> On Wed, Aug 25, 2021 at 09:45:38AM -0300, Jason Gunthorpe wrote:
> > I think so, in context of a group or iommu function it kind of makes
> > sense emulated would refer to the page table/dma process
> > 
> > The driver entry point can have the extra words
> >  vfio_register_emulated_dma_dev()
> 
> Not my preference, but I could live with it.  Or maybe emulated_iommu?

Sure

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25 12:45               ` Jason Gunthorpe
@ 2021-08-25 12:50                 ` Christoph Hellwig
  2021-08-25 12:50                   ` Jason Gunthorpe
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2021-08-25 12:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Diana Craciun, Cornelia Huck,
	Kirti Wankhede, Eric Auger, kvm

On Wed, Aug 25, 2021 at 09:45:38AM -0300, Jason Gunthorpe wrote:
> I think so, in context of a group or iommu function it kind of makes
> sense emulated would refer to the page table/dma process
> 
> The driver entry point can have the extra words
>  vfio_register_emulated_dma_dev()

Not my preference, but I could live with it.  Or maybe emulated_iommu?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25 12:37             ` Christoph Hellwig
@ 2021-08-25 12:45               ` Jason Gunthorpe
  2021-08-25 12:50                 ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Jason Gunthorpe @ 2021-08-25 12:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, kvm

On Wed, Aug 25, 2021 at 02:37:42PM +0200, Christoph Hellwig wrote:
> On Wed, Aug 25, 2021 at 09:34:54AM -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 25, 2021 at 02:24:00PM +0200, Christoph Hellwig wrote:
> > > On Wed, Aug 25, 2021 at 09:21:44AM -0300, Jason Gunthorpe wrote:
> > > > This feature is about creating a device that is not connected to a HW
> > > > IO page table (at least by the VFIO iommu code) but the IO page table
> > > > is held in software and accessed by the VFIO driver through the pin
> > > > API.
> > > > 
> > > > virtual_iommu is somewhat overloaded with the idea of a vIOMMU created
> > > > by qemu and stuffed into a guest..
> > > > 
> > > > "domainless" might work but I also find it confusing that the iommu
> > > > code uses the word domain to refer to a HW IO page table :\
> > > > 
> > > > Maybe "sw io page table" ?
> > > 
> > > Or simply emulated?  At least looking at i915 there is very little
> > > direct connection to the actual hardware, and while I don't understand
> > > them fully the s390 driver look similar.  And the samples are completely
> > > faked up anyway.
> > 
> > Emulated IO page table works for me!
> 
> Hmm, that is a full sentence for the comment that needs to be added.
> Are you fine with just s/mediated/emulated/ for the symbol names or
> do you want something more specific?

I think so, in context of a group or iommu function it kind of makes
sense emulated would refer to the page table/dma process

The driver entry point can have the extra words
 vfio_register_emulated_dma_dev()

Seems to read well?

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25 12:34           ` Jason Gunthorpe
@ 2021-08-25 12:37             ` Christoph Hellwig
  2021-08-25 12:45               ` Jason Gunthorpe
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2021-08-25 12:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Diana Craciun, Cornelia Huck,
	Kirti Wankhede, Eric Auger, kvm

On Wed, Aug 25, 2021 at 09:34:54AM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 25, 2021 at 02:24:00PM +0200, Christoph Hellwig wrote:
> > On Wed, Aug 25, 2021 at 09:21:44AM -0300, Jason Gunthorpe wrote:
> > > This feature is about creating a device that is not connected to a HW
> > > IO page table (at least by the VFIO iommu code) but the IO page table
> > > is held in software and accessed by the VFIO driver through the pin
> > > API.
> > > 
> > > virtual_iommu is somewhat overloaded with the idea of a vIOMMU created
> > > by qemu and stuffed into a guest..
> > > 
> > > "domainless" might work but I also find it confusing that the iommu
> > > code uses the word domain to refer to a HW IO page table :\
> > > 
> > > Maybe "sw io page table" ?
> > 
> > Or simply emulated?  At least looking at i915 there is very little
> > direct connection to the actual hardware, and while I don't understand
> > them fully the s390 driver look similar.  And the samples are completely
> > faked up anyway.
> 
> Emulated IO page table works for me!

Hmm, that is a full sentence for the comment that needs to be added.
Are you fine with just s/mediated/emulated/ for the symbol names or
do you want something more specific?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25 12:24         ` Christoph Hellwig
@ 2021-08-25 12:34           ` Jason Gunthorpe
  2021-08-25 12:37             ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Jason Gunthorpe @ 2021-08-25 12:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, kvm

On Wed, Aug 25, 2021 at 02:24:00PM +0200, Christoph Hellwig wrote:
> On Wed, Aug 25, 2021 at 09:21:44AM -0300, Jason Gunthorpe wrote:
> > This feature is about creating a device that is not connected to a HW
> > IO page table (at least by the VFIO iommu code) but the IO page table
> > is held in software and accessed by the VFIO driver through the pin
> > API.
> > 
> > virtual_iommu is somewhat overloaded with the idea of a vIOMMU created
> > by qemu and stuffed into a guest..
> > 
> > "domainless" might work but I also find it confusing that the iommu
> > code uses the word domain to refer to a HW IO page table :\
> > 
> > Maybe "sw io page table" ?
> 
> Or simply emulated?  At least looking at i915 there is very little
> direct connection to the actual hardware, and while I don't understand
> them fully the s390 driver look similar.  And the samples are completely
> faked up anyway.

Emulated IO page table works for me!

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25 12:21       ` Jason Gunthorpe
@ 2021-08-25 12:24         ` Christoph Hellwig
  2021-08-25 12:34           ` Jason Gunthorpe
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2021-08-25 12:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Diana Craciun, Cornelia Huck,
	Kirti Wankhede, Eric Auger, kvm

On Wed, Aug 25, 2021 at 09:21:44AM -0300, Jason Gunthorpe wrote:
> This feature is about creating a device that is not connected to a HW
> IO page table (at least by the VFIO iommu code) but the IO page table
> is held in software and accessed by the VFIO driver through the pin
> API.
> 
> virtual_iommu is somewhat overloaded with the idea of a vIOMMU created
> by qemu and stuffed into a guest..
> 
> "domainless" might work but I also find it confusing that the iommu
> code uses the word domain to refer to a HW IO page table :\
> 
> Maybe "sw io page table" ?

Or simply emulated?  At least looking at i915 there is very little
direct connection to the actual hardware, and while I don't understand
them fully the s390 driver look similar.  And the samples are completely
faked up anyway.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25  5:32     ` Christoph Hellwig
@ 2021-08-25 12:21       ` Jason Gunthorpe
  2021-08-25 12:24         ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Jason Gunthorpe @ 2021-08-25 12:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, kvm

On Wed, Aug 25, 2021 at 07:32:37AM +0200, Christoph Hellwig wrote:
> On Tue, Aug 24, 2021 at 09:19:16PM -0300, Jason Gunthorpe wrote:
> > The mechanism looks fine, but I think the core code is much clearer if
> > the name is not 'mediated' but 'sw_iommu' or something that implies
> > the group is running with a software page table. mediated has become
> > so overloaded in this code.
> 
> I thought that was sort of the definition of mediated - there needs to
> ben entify that "mediates" access so that a user of this interface
> can't trigger undmediated DMA to arbitrary addresses.  My other choice
> that I used for a while was "virtual".  sw_iommu sounds a little clumsy.

vfio mediated is really some toolbox of different features that a
driver can use to build what it wants.

For instance Intel is looking at this concept of a mediated device
that uses PASID for the IOMMU. It would have a real IOMMU, use real
IOMMU page tables and in this language it would create some PASID
group, not a "sw_iommu" group.

This feature is about creating a device that is not connected to a HW
IO page table (at least by the VFIO iommu code) but the IO page table
is held in software and accessed by the VFIO driver through the pin
API.

virtual_iommu is somewhat overloaded with the idea of a vIOMMU created
by qemu and stuffed into a guest..

"domainless" might work but I also find it confusing that the iommu
code uses the word domain to refer to a HW IO page table :\

Maybe "sw io page table" ?

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-25  0:19   ` Jason Gunthorpe
@ 2021-08-25  5:32     ` Christoph Hellwig
  2021-08-25 12:21       ` Jason Gunthorpe
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2021-08-25  5:32 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Alex Williamson, Diana Craciun, Cornelia Huck,
	Kirti Wankhede, Eric Auger, kvm

On Tue, Aug 24, 2021 at 09:19:16PM -0300, Jason Gunthorpe wrote:
> The mechanism looks fine, but I think the core code is much clearer if
> the name is not 'mediated' but 'sw_iommu' or something that implies
> the group is running with a software page table. mediated has become
> so overloaded in this code.

I thought that was sort of the definition of mediated - there needs to
ben entify that "mediates" access so that a user of this interface
can't trigger undmediated DMA to arbitrary addresses.  My other choice
that I used for a while was "virtual".  sw_iommu sounds a little clumsy.

I guess either way we need some documentation describing it along the
above lines.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-24 14:46 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
@ 2021-08-25  0:19   ` Jason Gunthorpe
  2021-08-25  5:32     ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Jason Gunthorpe @ 2021-08-25  0:19 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alex Williamson, Diana Craciun, Cornelia Huck, Kirti Wankhede,
	Eric Auger, kvm

On Tue, Aug 24, 2021 at 04:46:42PM +0200, Christoph Hellwig wrote:

> +int vfio_register_group_dev(struct vfio_device *device)
> +{
> +	return __vfio_register_dev(device,
> +		vfio_group_find_or_alloc(device->dev));
> +}
>  EXPORT_SYMBOL_GPL(vfio_register_group_dev);
>  
> +int vfio_register_mediated_dev(struct vfio_device *device)
> +{
> +	return __vfio_register_dev(device,
> +		vfio_noiommu_group_alloc(device->dev, VFIO_MEDIATED));
> +}
> +EXPORT_SYMBOL_GPL(vfio_register_mediated_dev);

The mechanism looks fine, but I think the core code is much clearer if
the name is not 'mediated' but 'sw_iommu' or something that implies
the group is running with a software page table. mediated has become
so overloaded in this code.

Really what this flag is doing is putting the group into a state where
the vfio_*pin_pages APIs are available once the vfio_device is opened.

So it really becomes an API family where a driver will call
vfio_regsiter_sw_iommu_dev(), then use the vfio_pin/unpin_pages API
set.

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-24 14:46 cleanup vfio iommu_group creation v2 Christoph Hellwig
@ 2021-08-24 14:46 ` Christoph Hellwig
  2021-08-25  0:19   ` Jason Gunthorpe
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2021-08-24 14:46 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm

Reuse the logic in vfio_noiommu_group_alloc to allocate a single-device
iommu group for mediated device.  For this a new
vfio_register_mediated_dev is added that calls vfio_noiommu_group_alloc.
The noiommu boolean in struct vfio_group is replaced with a set of flags
to that mediated devices can be distinguished from noiommu devices.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/vfio/mdev/mdev_driver.c | 46 ++---------------------
 drivers/vfio/mdev/vfio_mdev.c   |  2 +-
 drivers/vfio/vfio.c             | 66 ++++++++++++++++++++-------------
 include/linux/vfio.h            |  1 +
 samples/vfio-mdev/mbochs.c      |  2 +-
 samples/vfio-mdev/mdpy.c        |  2 +-
 samples/vfio-mdev/mtty.c        |  2 +-
 7 files changed, 49 insertions(+), 72 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
index c368ec824e2b5c..14b9ab17426838 100644
--- a/drivers/vfio/mdev/mdev_driver.c
+++ b/drivers/vfio/mdev/mdev_driver.c
@@ -13,61 +13,23 @@
 
 #include "mdev_private.h"
 
-static int mdev_attach_iommu(struct mdev_device *mdev)
-{
-	int ret;
-	struct iommu_group *group;
-
-	group = iommu_group_alloc();
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
-	ret = iommu_group_add_device(group, &mdev->dev);
-	if (!ret)
-		dev_info(&mdev->dev, "MDEV: group_id = %d\n",
-			 iommu_group_id(group));
-
-	iommu_group_put(group);
-	return ret;
-}
-
-static void mdev_detach_iommu(struct mdev_device *mdev)
-{
-	iommu_group_remove_device(&mdev->dev);
-	dev_info(&mdev->dev, "MDEV: detaching iommu\n");
-}
-
 static int mdev_probe(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
-	int ret;
-
-	ret = mdev_attach_iommu(mdev);
-	if (ret)
-		return ret;
 
-	if (drv->probe) {
-		ret = drv->probe(mdev);
-		if (ret)
-			mdev_detach_iommu(mdev);
-	}
-
-	return ret;
+	if (!drv->probe)
+		return 0;
+	return drv->probe(to_mdev_device(dev));
 }
 
 static int mdev_remove(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
 
 	if (drv->remove)
-		drv->remove(mdev);
-
-	mdev_detach_iommu(mdev);
-
+		drv->remove(to_mdev_device(dev));
 	return 0;
 }
 
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 7a9883048216e7..7bab075173145b 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -119,7 +119,7 @@ static int vfio_mdev_probe(struct mdev_device *mdev)
 		return -ENOMEM;
 
 	vfio_init_group_dev(vdev, &mdev->dev, &vfio_mdev_dev_ops);
-	ret = vfio_register_group_dev(vdev);
+	ret = vfio_register_mediated_dev(vdev);
 	if (ret)
 		goto out_uninit;
 
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index f44532fed5bb4f..45987dc897179e 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -67,6 +67,9 @@ struct vfio_unbound_dev {
 	struct list_head		unbound_next;
 };
 
+#define VFIO_MEDIATED	(1 << 0)
+#define VFIO_NOIOMMU	(1 << 1)
+
 struct vfio_group {
 	struct kref			kref;
 	int				minor;
@@ -83,7 +86,7 @@ struct vfio_group {
 	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
-	bool				noiommu;
+	unsigned int			flags;
 	unsigned int			dev_counter;
 	struct kvm			*kvm;
 	struct blocking_notifier_head	notifier;
@@ -336,7 +339,7 @@ static void vfio_group_unlock_and_free(struct vfio_group *group)
  * Group objects - create, release, get, put, search
  */
 static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
-		bool noiommu)
+		unsigned int flags)
 {
 	struct vfio_group *group, *tmp;
 	struct device *dev;
@@ -355,7 +358,7 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	atomic_set(&group->opened, 0);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
-	group->noiommu = noiommu;
+	group->flags = flags;
 	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	group->nb.notifier_call = vfio_iommu_group_notifier;
@@ -391,8 +394,8 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	}
 
 	dev = device_create(vfio.class, NULL,
-			    MKDEV(MAJOR(vfio.group_devt), minor),
-			    group, "%s%d", group->noiommu ? "noiommu-" : "",
+			    MKDEV(MAJOR(vfio.group_devt), minor), group, "%s%d",
+			    (group->flags & VFIO_NOIOMMU) ? "noiommu-" : "",
 			    iommu_group_id(iommu_group));
 	if (IS_ERR(dev)) {
 		vfio_free_group_minor(minor);
@@ -778,8 +781,8 @@ void vfio_uninit_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
 
-#ifdef CONFIG_VFIO_NOIOMMU
-static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
+static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
+		unsigned int flags)
 {
 	struct iommu_group *iommu_group;
 	struct vfio_group *group;
@@ -794,7 +797,7 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	if (ret)
 		goto out_put_group;
 
-	group = vfio_create_group(iommu_group, true);
+	group = vfio_create_group(iommu_group, flags);
 	if (IS_ERR(group)) {
 		ret = PTR_ERR(group);
 		goto out_remove_device;
@@ -808,7 +811,6 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	iommu_group_put(iommu_group);
 	return ERR_PTR(ret);
 }
-#endif
 
 static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 {
@@ -824,7 +826,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		 * bus.  Taint the kernel because we're about to give a DMA
 		 * capable device to a user without IOMMU protection.
 		 */
-		group = vfio_noiommu_group_alloc(dev);
+		group = vfio_noiommu_group_alloc(dev, VFIO_NOIOMMU);
 		if (group) {
 			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
 			dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
@@ -841,7 +843,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	 */
 	group = vfio_group_get_from_iommu(iommu_group);
 	if (!group) {
-		group = vfio_create_group(iommu_group, false);
+		group = vfio_create_group(iommu_group, 0);
 		if (!IS_ERR(group))
 			return group;
 	}
@@ -850,10 +852,13 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	return group;
 }
 
-int vfio_register_group_dev(struct vfio_device *device)
+static int __vfio_register_dev(struct vfio_device *device,
+		struct vfio_group *group)
 {
 	struct vfio_device *existing_device;
-	struct vfio_group *group;
+
+	if (IS_ERR(group))
+		return PTR_ERR(group);
 
 	/*
 	 * If the driver doesn't specify a set then the device is added to a
@@ -862,16 +867,12 @@ int vfio_register_group_dev(struct vfio_device *device)
 	if (!device->dev_set)
 		vfio_assign_device_set(device, device);
 
-	group = vfio_group_find_or_alloc(device->dev);
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
 	existing_device = vfio_group_get_device(group, device->dev);
 	if (existing_device) {
 		dev_WARN(device->dev, "Device already exists on group %d\n",
 			 iommu_group_id(group->iommu_group));
 		vfio_device_put(existing_device);
-		if (group->noiommu)
+		if (group->flags & (VFIO_NOIOMMU | VFIO_MEDIATED))
 			iommu_group_remove_device(device->dev);
 		iommu_group_put(group->iommu_group);
 		return -EBUSY;
@@ -890,8 +891,21 @@ int vfio_register_group_dev(struct vfio_device *device)
 
 	return 0;
 }
+
+int vfio_register_group_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_group_find_or_alloc(device->dev));
+}
 EXPORT_SYMBOL_GPL(vfio_register_group_dev);
 
+int vfio_register_mediated_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_noiommu_group_alloc(device->dev, VFIO_MEDIATED));
+}
+EXPORT_SYMBOL_GPL(vfio_register_mediated_dev);
+
 /**
  * Get a reference to the vfio_device for a device.  Even if the
  * caller thinks they own the device, they could be racing with a
@@ -1018,7 +1032,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	if (list_empty(&group->device_list))
 		wait_event(group->container_q, !group->container);
 
-	if (group->noiommu)
+	if (group->flags & (VFIO_NOIOMMU | VFIO_MEDIATED))
 		iommu_group_remove_device(device->dev);
 	iommu_group_put(group->iommu_group);
 }
@@ -1365,7 +1379,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	if (atomic_read(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if ((group->flags & VFIO_NOIOMMU) && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	f = fdget(container_fd);
@@ -1385,7 +1399,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 
 	/* Real groups and fake groups cannot mix */
 	if (!list_empty(&container->group_list) &&
-	    container->noiommu != group->noiommu) {
+	    container->noiommu != (group->flags & VFIO_NOIOMMU)) {
 		ret = -EPERM;
 		goto unlock_out;
 	}
@@ -1399,7 +1413,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	}
 
 	group->container = container;
-	container->noiommu = group->noiommu;
+	container->noiommu = (group->flags & VFIO_NOIOMMU);
 	list_add(&group->container_next, &container->group_list);
 
 	/* Get a reference on the container and mark a user within the group */
@@ -1423,7 +1437,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 	if (!atomic_inc_not_zero(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu) {
+	if (group->flags & VFIO_NOIOMMU) {
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
@@ -1448,7 +1462,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	    !group->container->iommu_driver || !vfio_group_viable(group))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if ((group->flags & VFIO_NOIOMMU) && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	device = vfio_device_get_from_name(group, buf);
@@ -1495,7 +1509,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 
 	fd_install(fdno, filep);
 
-	if (group->noiommu)
+	if (group->flags & VFIO_NOIOMMU)
 		dev_warn(device->dev, "vfio-noiommu device opened by user "
 			 "(%s:%d)\n", current->comm, task_pid_nr(current));
 	return fdno;
@@ -1591,7 +1605,7 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep)
 	if (!group)
 		return -ENODEV;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO)) {
+	if ((group->flags & VFIO_NOIOMMU) && !capable(CAP_SYS_RAWIO)) {
 		vfio_group_put(group);
 		return -EPERM;
 	}
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index f7083c2fd0d099..2ffbb2d02c9346 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -75,6 +75,7 @@ void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 			 const struct vfio_device_ops *ops);
 void vfio_uninit_group_dev(struct vfio_device *device);
 int vfio_register_group_dev(struct vfio_device *device);
+int vfio_register_mediated_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
 extern void vfio_device_put(struct vfio_device *device);
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index c313ab4d1f4e4e..f59218e03dd1b4 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -553,7 +553,7 @@ static int mbochs_probe(struct mdev_device *mdev)
 	mbochs_create_config_space(mdev_state);
 	mbochs_reset(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_mediated_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 8d1a80a0722aa9..4df380014ad914 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -258,7 +258,7 @@ static int mdpy_probe(struct mdev_device *mdev)
 
 	mdpy_count++;
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_mediated_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 5983cdb16e3d1d..ef9f48bfe8d04f 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -741,7 +741,7 @@ static int mtty_probe(struct mdev_device *mdev)
 
 	mtty_create_config_space(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_mediated_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_vconfig;
 	dev_set_drvdata(&mdev->dev, mdev_state);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices
  2021-08-11 15:14 cleanup vfio iommu_group creation Christoph Hellwig
@ 2021-08-11 15:14 ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2021-08-11 15:14 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Diana Craciun, Cornelia Huck, Kirti Wankhede, Eric Auger,
	Jason Gunthorpe, kvm

Reuse the logic in vfio_noiommu_group_alloc to allocate a single-device
iommu group for mediated device.  For this a new
vfio_register_mediated_dev is added that calls vfio_noiommu_group_alloc.
The noiommu boolean in struct vfio_group is replaced with a set of flags
to that mediated devices can be distinguished from noiommu devices.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/vfio/mdev/mdev_driver.c | 46 ++---------------------
 drivers/vfio/mdev/vfio_mdev.c   |  2 +-
 drivers/vfio/vfio.c             | 66 ++++++++++++++++++++-------------
 include/linux/vfio.h            |  1 +
 samples/vfio-mdev/mbochs.c      |  2 +-
 samples/vfio-mdev/mdpy.c        |  2 +-
 samples/vfio-mdev/mtty.c        |  2 +-
 7 files changed, 49 insertions(+), 72 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
index c368ec824e2b5c..14b9ab17426838 100644
--- a/drivers/vfio/mdev/mdev_driver.c
+++ b/drivers/vfio/mdev/mdev_driver.c
@@ -13,61 +13,23 @@
 
 #include "mdev_private.h"
 
-static int mdev_attach_iommu(struct mdev_device *mdev)
-{
-	int ret;
-	struct iommu_group *group;
-
-	group = iommu_group_alloc();
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
-	ret = iommu_group_add_device(group, &mdev->dev);
-	if (!ret)
-		dev_info(&mdev->dev, "MDEV: group_id = %d\n",
-			 iommu_group_id(group));
-
-	iommu_group_put(group);
-	return ret;
-}
-
-static void mdev_detach_iommu(struct mdev_device *mdev)
-{
-	iommu_group_remove_device(&mdev->dev);
-	dev_info(&mdev->dev, "MDEV: detaching iommu\n");
-}
-
 static int mdev_probe(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
-	int ret;
-
-	ret = mdev_attach_iommu(mdev);
-	if (ret)
-		return ret;
 
-	if (drv->probe) {
-		ret = drv->probe(mdev);
-		if (ret)
-			mdev_detach_iommu(mdev);
-	}
-
-	return ret;
+	if (!drv->probe)
+		return 0;
+	return drv->probe(to_mdev_device(dev));
 }
 
 static int mdev_remove(struct device *dev)
 {
 	struct mdev_driver *drv =
 		container_of(dev->driver, struct mdev_driver, driver);
-	struct mdev_device *mdev = to_mdev_device(dev);
 
 	if (drv->remove)
-		drv->remove(mdev);
-
-	mdev_detach_iommu(mdev);
-
+		drv->remove(to_mdev_device(dev));
 	return 0;
 }
 
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 7a9883048216e7..7bab075173145b 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -119,7 +119,7 @@ static int vfio_mdev_probe(struct mdev_device *mdev)
 		return -ENOMEM;
 
 	vfio_init_group_dev(vdev, &mdev->dev, &vfio_mdev_dev_ops);
-	ret = vfio_register_group_dev(vdev);
+	ret = vfio_register_mediated_dev(vdev);
 	if (ret)
 		goto out_uninit;
 
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 964e53c483baac..8d225038b04802 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -67,6 +67,9 @@ struct vfio_unbound_dev {
 	struct list_head		unbound_next;
 };
 
+#define VFIO_MEDIATED	(1 << 0)
+#define VFIO_NOIOMMU	(1 << 1)
+
 struct vfio_group {
 	struct kref			kref;
 	int				minor;
@@ -83,7 +86,7 @@ struct vfio_group {
 	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
-	bool				noiommu;
+	unsigned int			flags;
 	unsigned int			dev_counter;
 	struct kvm			*kvm;
 	struct blocking_notifier_head	notifier;
@@ -336,7 +339,7 @@ static void vfio_group_unlock_and_free(struct vfio_group *group)
  * Group objects - create, release, get, put, search
  */
 static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
-		bool noiommu)
+		unsigned int flags)
 {
 	struct vfio_group *group, *tmp;
 	struct device *dev;
@@ -355,7 +358,7 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	atomic_set(&group->opened, 0);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
-	group->noiommu = noiommu;
+	group->flags = flags;
 	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 
 	group->nb.notifier_call = vfio_iommu_group_notifier;
@@ -391,8 +394,8 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 	}
 
 	dev = device_create(vfio.class, NULL,
-			    MKDEV(MAJOR(vfio.group_devt), minor),
-			    group, "%s%d", group->noiommu ? "noiommu-" : "",
+			    MKDEV(MAJOR(vfio.group_devt), minor), group, "%s%d",
+			    (group->flags & VFIO_NOIOMMU) ? "noiommu-" : "",
 			    iommu_group_id(iommu_group));
 	if (IS_ERR(dev)) {
 		vfio_free_group_minor(minor);
@@ -778,8 +781,8 @@ void vfio_uninit_group_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_uninit_group_dev);
 
-#ifdef CONFIG_VFIO_NOIOMMU
-static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
+static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev,
+		unsigned int flags)
 {
 	struct iommu_group *iommu_group;
 	struct vfio_group *group;
@@ -794,7 +797,7 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	if (ret)
 		goto out_put_group;
 
-	group = vfio_create_group(iommu_group, true);
+	group = vfio_create_group(iommu_group, flags);
 	if (IS_ERR(group)) {
 		ret = PTR_ERR(group);
 		goto out_remove_device;
@@ -808,7 +811,6 @@ static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev)
 	iommu_group_put(iommu_group);
 	return ERR_PTR(ret);
 }
-#endif
 
 static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 {
@@ -827,7 +829,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 		if (noiommu && !iommu_present(dev->bus)) {
 			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
 			dev_warn(dev, "Adding kernel taint for vfio-noiommu group on device\n");
-			return vfio_noiommu_group_alloc(dev);
+			return vfio_noiommu_group_alloc(dev, VFIO_NOIOMMU);
 		}
 #endif
 		return ERR_PTR(-EINVAL);
@@ -839,7 +841,7 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	 */
 	group = vfio_group_get_from_iommu(iommu_group);
 	if (!group) {
-		group = vfio_create_group(iommu_group, false);
+		group = vfio_create_group(iommu_group, 0);
 		if (!IS_ERR(group))
 			return group;
 	}
@@ -848,10 +850,13 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev)
 	return group;
 }
 
-int vfio_register_group_dev(struct vfio_device *device)
+static int __vfio_register_dev(struct vfio_device *device,
+		struct vfio_group *group)
 {
 	struct vfio_device *existing_device;
-	struct vfio_group *group;
+
+	if (IS_ERR(group))
+		return PTR_ERR(group);
 
 	/*
 	 * If the driver doesn't specify a set then the device is added to a
@@ -860,16 +865,12 @@ int vfio_register_group_dev(struct vfio_device *device)
 	if (!device->dev_set)
 		vfio_assign_device_set(device, device);
 
-	group = vfio_group_find_or_alloc(device->dev);
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
 	existing_device = vfio_group_get_device(group, device->dev);
 	if (existing_device) {
 		dev_WARN(device->dev, "Device already exists on group %d\n",
 			 iommu_group_id(group->iommu_group));
 		vfio_device_put(existing_device);
-		if (group->noiommu)
+		if (group->flags & (VFIO_NOIOMMU | VFIO_MEDIATED))
 			iommu_group_remove_device(device->dev);
 		iommu_group_put(group->iommu_group);
 		return -EBUSY;
@@ -888,8 +889,21 @@ int vfio_register_group_dev(struct vfio_device *device)
 
 	return 0;
 }
+
+int vfio_register_group_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_group_find_or_alloc(device->dev));
+}
 EXPORT_SYMBOL_GPL(vfio_register_group_dev);
 
+int vfio_register_mediated_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device,
+		vfio_noiommu_group_alloc(device->dev, VFIO_MEDIATED));
+}
+EXPORT_SYMBOL_GPL(vfio_register_mediated_dev);
+
 /**
  * Get a reference to the vfio_device for a device.  Even if the
  * caller thinks they own the device, they could be racing with a
@@ -1016,7 +1030,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	if (list_empty(&group->device_list))
 		wait_event(group->container_q, !group->container);
 
-	if (group->noiommu)
+	if (group->flags & (VFIO_NOIOMMU | VFIO_MEDIATED))
 		iommu_group_remove_device(device->dev);
 	iommu_group_put(group->iommu_group);
 }
@@ -1363,7 +1377,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	if (atomic_read(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if ((group->flags & VFIO_NOIOMMU) && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	f = fdget(container_fd);
@@ -1383,7 +1397,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 
 	/* Real groups and fake groups cannot mix */
 	if (!list_empty(&container->group_list) &&
-	    container->noiommu != group->noiommu) {
+	    container->noiommu != (group->flags & VFIO_NOIOMMU)) {
 		ret = -EPERM;
 		goto unlock_out;
 	}
@@ -1397,7 +1411,7 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	}
 
 	group->container = container;
-	container->noiommu = group->noiommu;
+	container->noiommu = (group->flags & VFIO_NOIOMMU);
 	list_add(&group->container_next, &container->group_list);
 
 	/* Get a reference on the container and mark a user within the group */
@@ -1421,7 +1435,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 	if (!atomic_inc_not_zero(&group->container_users))
 		return -EINVAL;
 
-	if (group->noiommu) {
+	if (group->flags & VFIO_NOIOMMU) {
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
@@ -1446,7 +1460,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	    !group->container->iommu_driver || !vfio_group_viable(group))
 		return -EINVAL;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO))
+	if ((group->flags & VFIO_NOIOMMU) && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
 
 	device = vfio_device_get_from_name(group, buf);
@@ -1493,7 +1507,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 
 	fd_install(fdno, filep);
 
-	if (group->noiommu)
+	if (group->flags & VFIO_NOIOMMU)
 		dev_warn(device->dev, "vfio-noiommu device opened by user "
 			 "(%s:%d)\n", current->comm, task_pid_nr(current));
 	return fdno;
@@ -1589,7 +1603,7 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep)
 	if (!group)
 		return -ENODEV;
 
-	if (group->noiommu && !capable(CAP_SYS_RAWIO)) {
+	if ((group->flags & VFIO_NOIOMMU) && !capable(CAP_SYS_RAWIO)) {
 		vfio_group_put(group);
 		return -EPERM;
 	}
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index f7083c2fd0d099..2ffbb2d02c9346 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -75,6 +75,7 @@ void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
 			 const struct vfio_device_ops *ops);
 void vfio_uninit_group_dev(struct vfio_device *device);
 int vfio_register_group_dev(struct vfio_device *device);
+int vfio_register_mediated_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
 extern void vfio_device_put(struct vfio_device *device);
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index c313ab4d1f4e4e..f59218e03dd1b4 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -553,7 +553,7 @@ static int mbochs_probe(struct mdev_device *mdev)
 	mbochs_create_config_space(mdev_state);
 	mbochs_reset(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_mediated_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 8d1a80a0722aa9..4df380014ad914 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -258,7 +258,7 @@ static int mdpy_probe(struct mdev_device *mdev)
 
 	mdpy_count++;
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_mediated_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
 	dev_set_drvdata(&mdev->dev, mdev_state);
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 5983cdb16e3d1d..ef9f48bfe8d04f 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -741,7 +741,7 @@ static int mtty_probe(struct mdev_device *mdev)
 
 	mtty_create_config_space(mdev_state);
 
-	ret = vfio_register_group_dev(&mdev_state->vdev);
+	ret = vfio_register_mediated_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_vconfig;
 	dev_set_drvdata(&mdev->dev, mdev_state);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-09-14  2:23 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-29 19:23 [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices kernel test robot
2021-08-29 19:23 ` kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2021-09-13  7:15 cleanup vfio iommu_group creation v5 Christoph Hellwig
2021-09-13  7:15 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
2021-09-14  2:23   ` Tian, Kevin
2021-08-26 13:34 cleanup vfio iommu_group creation v4 Christoph Hellwig
2021-08-26 13:34 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
2021-08-26 17:24   ` Alex Williamson
2021-08-26 22:59   ` Alex Williamson
2021-08-27  2:17     ` Tian, Kevin
2021-08-27  5:42       ` Christoph Hellwig
2021-08-25 16:19 cleanup vfio iommu_group creation v3 Christoph Hellwig
2021-08-25 16:19 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
2021-08-25 22:31   ` Jason Gunthorpe
2021-08-26  3:10   ` Tian, Kevin
2021-08-24 14:46 cleanup vfio iommu_group creation v2 Christoph Hellwig
2021-08-24 14:46 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig
2021-08-25  0:19   ` Jason Gunthorpe
2021-08-25  5:32     ` Christoph Hellwig
2021-08-25 12:21       ` Jason Gunthorpe
2021-08-25 12:24         ` Christoph Hellwig
2021-08-25 12:34           ` Jason Gunthorpe
2021-08-25 12:37             ` Christoph Hellwig
2021-08-25 12:45               ` Jason Gunthorpe
2021-08-25 12:50                 ` Christoph Hellwig
2021-08-25 12:50                   ` Jason Gunthorpe
2021-08-11 15:14 cleanup vfio iommu_group creation Christoph Hellwig
2021-08-11 15:14 ` [PATCH 07/14] vfio: simplify iommu group allocation for mediated devices Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.