All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/11] Connect VFIO to IOMMUFD
@ 2022-11-16 21:05 Jason Gunthorpe
  2022-11-16 21:05 ` [PATCH v3 01/11] vfio: Move vfio_device driver open/close code to a function Jason Gunthorpe
                   ` (11 more replies)
  0 siblings, 12 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

This series provides an alternative container layer for VFIO implemented
using iommufd. This is optional, if CONFIG_IOMMUFD is not set then it will
not be compiled in.

At this point iommufd can be injected by passing in a iommfd FD to
VFIO_GROUP_SET_CONTAINER which will use the VFIO compat layer in iommufd
to obtain the compat IOAS and then connect up all the VFIO drivers as
appropriate.

This is temporary stopping point, a following series will provide a way to
directly open a VFIO device FD and directly connect it to IOMMUFD using
native ioctls that can expose the IOMMUFD features like hwpt, future
vPASID and dynamic attachment.

This series, in compat mode, has passed all the qemu tests we have
available, including the test suites for the Intel GVT mdev. Aside from
the temporary limitation with P2P memory this is belived to be fully
compatible with VFIO.

This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd

It requires the iommufd series:

https://lore.kernel.org/r/0-v5-4001c2997bd0+30c-iommufd_jgg@nvidia.com

v3:
 - Fix iommufd_attached to be only used in the vfio_iommufd_physical_*
   funcs
 - Always check for iommufd before invoking a iommufd function
 - Fix mismatch between vfio_pin_pages and iommufd_access when the IOVA
   is not aligned. Resolves problems on S390
v2: https://lore.kernel.org/r/0-v2-65016290f146+33e-vfio_iommufd_jgg@nvidia.com
 - Rebase to v6.1-rc3, v4 iommufd series
 - Fixup comments and commit messages from list remarks
 - Fix leaking of the iommufd for mdevs
 - New patch to fix vfio modaliases when vfio container is disabled
 - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened
   to signal that iommufd is providing this
v1: https://lore.kernel.org/r/0-v1-4991695894d8+211-vfio_iommufd_jgg@nvidia.com

Jason Gunthorpe (11):
  vfio: Move vfio_device driver open/close code to a function
  vfio: Move vfio_device_assign_container() into
    vfio_device_first_open()
  vfio: Rename vfio_device_assign/unassign_container()
  vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for
    vfio_file_enforced_coherent()
  vfio-iommufd: Allow iommufd to be used in place of a container fd
  vfio-iommufd: Support iommufd for physical VFIO devices
  vfio-iommufd: Support iommufd for emulated VFIO devices
  vfio: Move container related MODULE_ALIAS statements into container.c
  vfio: Make vfio_container optionally compiled
  iommufd: Allow iommufd to supply /dev/vfio/vfio

 drivers/gpu/drm/i915/gvt/kvmgt.c              |   3 +
 drivers/iommu/iommufd/Kconfig                 |  12 +
 drivers/iommu/iommufd/main.c                  |  36 ++
 drivers/s390/cio/vfio_ccw_ops.c               |   3 +
 drivers/s390/crypto/vfio_ap_ops.c             |   3 +
 drivers/vfio/Kconfig                          |  36 +-
 drivers/vfio/Makefile                         |   5 +-
 drivers/vfio/container.c                      | 141 ++-----
 drivers/vfio/fsl-mc/vfio_fsl_mc.c             |   3 +
 drivers/vfio/iommufd.c                        | 161 ++++++++
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    |   6 +
 drivers/vfio/pci/mlx5/main.c                  |   3 +
 drivers/vfio/pci/vfio_pci.c                   |   3 +
 drivers/vfio/platform/vfio_amba.c             |   3 +
 drivers/vfio/platform/vfio_platform.c         |   3 +
 drivers/vfio/vfio.h                           | 100 ++++-
 drivers/vfio/vfio_iommu_type1.c               |   5 +-
 drivers/vfio/vfio_main.c                      | 348 ++++++++++++++----
 include/linux/vfio.h                          |  39 ++
 19 files changed, 714 insertions(+), 199 deletions(-)
 create mode 100644 drivers/vfio/iommufd.c


base-commit: 9d367dc905dd278614aaf601afb28e511b82fb3b
-- 
2.38.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3 01/11] vfio: Move vfio_device driver open/close code to a function
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-16 21:05 ` [PATCH v3 02/11] vfio: Move vfio_device_assign_container() into vfio_device_first_open() Jason Gunthorpe
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

This error unwind is getting complicated. Move all the code into two
pair'd function. The functions should be called when the open_count == 1
after incrementing/before decrementing.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio_main.c | 95 ++++++++++++++++++++++------------------
 1 file changed, 53 insertions(+), 42 deletions(-)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 2d168793d4e1ce..2e8346d13c16ca 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -734,6 +734,51 @@ bool vfio_assert_device_open(struct vfio_device *device)
 	return !WARN_ON_ONCE(!READ_ONCE(device->open_count));
 }
 
+static int vfio_device_first_open(struct vfio_device *device)
+{
+	int ret;
+
+	lockdep_assert_held(&device->dev_set->lock);
+
+	if (!try_module_get(device->dev->driver->owner))
+		return -ENODEV;
+
+	/*
+	 * Here we pass the KVM pointer with the group under the lock.  If the
+	 * device driver will use it, it must obtain a reference and release it
+	 * during close_device.
+	 */
+	mutex_lock(&device->group->group_lock);
+	device->kvm = device->group->kvm;
+	if (device->ops->open_device) {
+		ret = device->ops->open_device(device);
+		if (ret)
+			goto err_module_put;
+	}
+	vfio_device_container_register(device);
+	mutex_unlock(&device->group->group_lock);
+	return 0;
+
+err_module_put:
+	device->kvm = NULL;
+	mutex_unlock(&device->group->group_lock);
+	module_put(device->dev->driver->owner);
+	return ret;
+}
+
+static void vfio_device_last_close(struct vfio_device *device)
+{
+	lockdep_assert_held(&device->dev_set->lock);
+
+	mutex_lock(&device->group->group_lock);
+	vfio_device_container_unregister(device);
+	if (device->ops->close_device)
+		device->ops->close_device(device);
+	device->kvm = NULL;
+	mutex_unlock(&device->group->group_lock);
+	module_put(device->dev->driver->owner);
+}
+
 static struct file *vfio_device_open(struct vfio_device *device)
 {
 	struct file *filep;
@@ -745,29 +790,12 @@ static struct file *vfio_device_open(struct vfio_device *device)
 	if (ret)
 		return ERR_PTR(ret);
 
-	if (!try_module_get(device->dev->driver->owner)) {
-		ret = -ENODEV;
-		goto err_unassign_container;
-	}
-
 	mutex_lock(&device->dev_set->lock);
 	device->open_count++;
 	if (device->open_count == 1) {
-		/*
-		 * Here we pass the KVM pointer with the group under the read
-		 * lock.  If the device driver will use it, it must obtain a
-		 * reference and release it during close_device.
-		 */
-		mutex_lock(&device->group->group_lock);
-		device->kvm = device->group->kvm;
-
-		if (device->ops->open_device) {
-			ret = device->ops->open_device(device);
-			if (ret)
-				goto err_undo_count;
-		}
-		vfio_device_container_register(device);
-		mutex_unlock(&device->group->group_lock);
+		ret = vfio_device_first_open(device);
+		if (ret)
+			goto err_unassign_container;
 	}
 	mutex_unlock(&device->dev_set->lock);
 
@@ -800,20 +828,11 @@ static struct file *vfio_device_open(struct vfio_device *device)
 
 err_close_device:
 	mutex_lock(&device->dev_set->lock);
-	mutex_lock(&device->group->group_lock);
-	if (device->open_count == 1 && device->ops->close_device) {
-		device->ops->close_device(device);
-
-		vfio_device_container_unregister(device);
-	}
-err_undo_count:
-	mutex_unlock(&device->group->group_lock);
+	if (device->open_count == 1)
+		vfio_device_last_close(device);
+err_unassign_container:
 	device->open_count--;
-	if (device->open_count == 0 && device->kvm)
-		device->kvm = NULL;
 	mutex_unlock(&device->dev_set->lock);
-	module_put(device->dev->driver->owner);
-err_unassign_container:
 	vfio_device_unassign_container(device);
 	return ERR_PTR(ret);
 }
@@ -1016,19 +1035,11 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 
 	mutex_lock(&device->dev_set->lock);
 	vfio_assert_device_open(device);
-	mutex_lock(&device->group->group_lock);
-	if (device->open_count == 1 && device->ops->close_device)
-		device->ops->close_device(device);
-
-	vfio_device_container_unregister(device);
-	mutex_unlock(&device->group->group_lock);
+	if (device->open_count == 1)
+		vfio_device_last_close(device);
 	device->open_count--;
-	if (device->open_count == 0)
-		device->kvm = NULL;
 	mutex_unlock(&device->dev_set->lock);
 
-	module_put(device->dev->driver->owner);
-
 	vfio_device_unassign_container(device);
 
 	vfio_device_put_registration(device);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 02/11] vfio: Move vfio_device_assign_container() into vfio_device_first_open()
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
  2022-11-16 21:05 ` [PATCH v3 01/11] vfio: Move vfio_device driver open/close code to a function Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-16 21:05 ` [PATCH v3 03/11] vfio: Rename vfio_device_assign/unassign_container() Jason Gunthorpe
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

The only thing this function does is assert the group has an assigned
container and incrs refcounts.

The overall model we have is that once a container_users refcount is
incremented it cannot be de-assigned from the group -
vfio_group_ioctl_unset_container() will fail and the group FD cannot be
closed.

Thus we do not need to check this on every device FD open, just the
first. Reorganize the code so that only the first open and last close
manages the container.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/container.c |  4 ++--
 drivers/vfio/vfio_main.c | 24 +++++++++++-------------
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/vfio/container.c b/drivers/vfio/container.c
index d74164abbf401d..dd79a66ec62cad 100644
--- a/drivers/vfio/container.c
+++ b/drivers/vfio/container.c
@@ -531,11 +531,11 @@ int vfio_device_assign_container(struct vfio_device *device)
 
 void vfio_device_unassign_container(struct vfio_device *device)
 {
-	mutex_lock(&device->group->group_lock);
+	lockdep_assert_held_write(&device->group->group_lock);
+
 	WARN_ON(device->group->container_users <= 1);
 	device->group->container_users--;
 	fput(device->group->opened_file);
-	mutex_unlock(&device->group->group_lock);
 }
 
 /*
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 2e8346d13c16ca..717c7f404feeea 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -749,18 +749,24 @@ static int vfio_device_first_open(struct vfio_device *device)
 	 * during close_device.
 	 */
 	mutex_lock(&device->group->group_lock);
+	ret = vfio_device_assign_container(device);
+	if (ret)
+		goto err_module_put;
+
 	device->kvm = device->group->kvm;
 	if (device->ops->open_device) {
 		ret = device->ops->open_device(device);
 		if (ret)
-			goto err_module_put;
+			goto err_container;
 	}
 	vfio_device_container_register(device);
 	mutex_unlock(&device->group->group_lock);
 	return 0;
 
-err_module_put:
+err_container:
 	device->kvm = NULL;
+	vfio_device_unassign_container(device);
+err_module_put:
 	mutex_unlock(&device->group->group_lock);
 	module_put(device->dev->driver->owner);
 	return ret;
@@ -775,6 +781,7 @@ static void vfio_device_last_close(struct vfio_device *device)
 	if (device->ops->close_device)
 		device->ops->close_device(device);
 	device->kvm = NULL;
+	vfio_device_unassign_container(device);
 	mutex_unlock(&device->group->group_lock);
 	module_put(device->dev->driver->owner);
 }
@@ -784,18 +791,12 @@ static struct file *vfio_device_open(struct vfio_device *device)
 	struct file *filep;
 	int ret;
 
-	mutex_lock(&device->group->group_lock);
-	ret = vfio_device_assign_container(device);
-	mutex_unlock(&device->group->group_lock);
-	if (ret)
-		return ERR_PTR(ret);
-
 	mutex_lock(&device->dev_set->lock);
 	device->open_count++;
 	if (device->open_count == 1) {
 		ret = vfio_device_first_open(device);
 		if (ret)
-			goto err_unassign_container;
+			goto err_unlock;
 	}
 	mutex_unlock(&device->dev_set->lock);
 
@@ -830,10 +831,9 @@ static struct file *vfio_device_open(struct vfio_device *device)
 	mutex_lock(&device->dev_set->lock);
 	if (device->open_count == 1)
 		vfio_device_last_close(device);
-err_unassign_container:
+err_unlock:
 	device->open_count--;
 	mutex_unlock(&device->dev_set->lock);
-	vfio_device_unassign_container(device);
 	return ERR_PTR(ret);
 }
 
@@ -1040,8 +1040,6 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 	device->open_count--;
 	mutex_unlock(&device->dev_set->lock);
 
-	vfio_device_unassign_container(device);
-
 	vfio_device_put_registration(device);
 
 	return 0;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 03/11] vfio: Rename vfio_device_assign/unassign_container()
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
  2022-11-16 21:05 ` [PATCH v3 01/11] vfio: Move vfio_device driver open/close code to a function Jason Gunthorpe
  2022-11-16 21:05 ` [PATCH v3 02/11] vfio: Move vfio_device_assign_container() into vfio_device_first_open() Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-16 21:05 ` [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c Jason Gunthorpe
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

These functions don't really assign anything anymore, they just increment
some refcounts and do a sanity check. Call them
vfio_group_[un]use_container()

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/container.c | 14 ++++++--------
 drivers/vfio/vfio.h      |  4 ++--
 drivers/vfio/vfio_main.c |  6 +++---
 3 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/vfio/container.c b/drivers/vfio/container.c
index dd79a66ec62cad..499777930b08fa 100644
--- a/drivers/vfio/container.c
+++ b/drivers/vfio/container.c
@@ -511,10 +511,8 @@ void vfio_group_detach_container(struct vfio_group *group)
 	vfio_container_put(container);
 }
 
-int vfio_device_assign_container(struct vfio_device *device)
+int vfio_group_use_container(struct vfio_group *group)
 {
-	struct vfio_group *group = device->group;
-
 	lockdep_assert_held(&group->group_lock);
 
 	if (!group->container || !group->container->iommu_driver ||
@@ -529,13 +527,13 @@ int vfio_device_assign_container(struct vfio_device *device)
 	return 0;
 }
 
-void vfio_device_unassign_container(struct vfio_device *device)
+void vfio_group_unuse_container(struct vfio_group *group)
 {
-	lockdep_assert_held_write(&device->group->group_lock);
+	lockdep_assert_held(&group->group_lock);
 
-	WARN_ON(device->group->container_users <= 1);
-	device->group->container_users--;
-	fput(device->group->opened_file);
+	WARN_ON(group->container_users <= 1);
+	group->container_users--;
+	fput(group->opened_file);
 }
 
 /*
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index bcad54bbab08c4..f95f4925b83bbd 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -112,8 +112,8 @@ void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops);
 bool vfio_assert_device_open(struct vfio_device *device);
 
 struct vfio_container *vfio_container_from_file(struct file *filep);
-int vfio_device_assign_container(struct vfio_device *device);
-void vfio_device_unassign_container(struct vfio_device *device);
+int vfio_group_use_container(struct vfio_group *group);
+void vfio_group_unuse_container(struct vfio_group *group);
 int vfio_container_attach_group(struct vfio_container *container,
 				struct vfio_group *group);
 void vfio_group_detach_container(struct vfio_group *group);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 717c7f404feeea..8c2dcb481ae10b 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -749,7 +749,7 @@ static int vfio_device_first_open(struct vfio_device *device)
 	 * during close_device.
 	 */
 	mutex_lock(&device->group->group_lock);
-	ret = vfio_device_assign_container(device);
+	ret = vfio_group_use_container(device->group);
 	if (ret)
 		goto err_module_put;
 
@@ -765,7 +765,7 @@ static int vfio_device_first_open(struct vfio_device *device)
 
 err_container:
 	device->kvm = NULL;
-	vfio_device_unassign_container(device);
+	vfio_group_unuse_container(device->group);
 err_module_put:
 	mutex_unlock(&device->group->group_lock);
 	module_put(device->dev->driver->owner);
@@ -781,7 +781,7 @@ static void vfio_device_last_close(struct vfio_device *device)
 	if (device->ops->close_device)
 		device->ops->close_device(device);
 	device->kvm = NULL;
-	vfio_device_unassign_container(device);
+	vfio_group_unuse_container(device->group);
 	mutex_unlock(&device->group->group_lock);
 	module_put(device->dev->driver->owner);
 }
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2022-11-16 21:05 ` [PATCH v3 03/11] vfio: Rename vfio_device_assign/unassign_container() Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-17 20:14   ` Alex Williamson
  2022-11-16 21:05 ` [PATCH v3 05/11] vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent() Jason Gunthorpe
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

This legacy module knob has become uAPI, when set on the vfio_iommu_type1
it disables some security protections in the iommu drivers. Move the
storage for this knob to vfio_main.c so that iommufd can access it too.

The may need enhancing as we learn more about how necessary
allow_unsafe_interrupts is in the current state of the world. If vfio
container is disabled then this option will not be available to the user.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio.h             | 2 ++
 drivers/vfio/vfio_iommu_type1.c | 5 ++---
 drivers/vfio/vfio_main.c        | 3 +++
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index f95f4925b83bbd..54e5a8e0834ccb 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -130,4 +130,6 @@ extern bool vfio_noiommu __read_mostly;
 enum { vfio_noiommu = false };
 #endif
 
+extern bool vfio_allow_unsafe_interrupts;
+
 #endif
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 23c24fe98c00d4..186e33a006d314 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -44,9 +44,8 @@
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
 #define DRIVER_DESC     "Type1 IOMMU driver for VFIO"
 
-static bool allow_unsafe_interrupts;
 module_param_named(allow_unsafe_interrupts,
-		   allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
+		   vfio_allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(allow_unsafe_interrupts,
 		 "Enable VFIO IOMMU support for on platforms without interrupt remapping support.");
 
@@ -2282,7 +2281,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		    iommu_group_for_each_dev(iommu_group, (void *)IOMMU_CAP_INTR_REMAP,
 					     vfio_iommu_device_capable);
 
-	if (!allow_unsafe_interrupts && !msi_remap) {
+	if (!vfio_allow_unsafe_interrupts && !msi_remap) {
 		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
 		       __func__);
 		ret = -EPERM;
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 8c2dcb481ae10b..e1fec1db6a3c93 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -51,6 +51,9 @@ static struct vfio {
 	struct ida			device_ida;
 } vfio;
 
+bool vfio_allow_unsafe_interrupts;
+EXPORT_SYMBOL_GPL(vfio_allow_unsafe_interrupts);
+
 static DEFINE_XARRAY(vfio_device_set_xa);
 static const struct file_operations vfio_group_fops;
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 05/11] vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent()
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2022-11-16 21:05 ` [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-16 21:05 ` [PATCH v3 06/11] vfio-iommufd: Allow iommufd to be used in place of a container fd Jason Gunthorpe
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

iommufd doesn't establish the iommu_domains until after the device FD is
opened, even if the container has been set. This design is part of moving
away from the group centric iommu APIs.

This is fine, except that the normal sequence of establishing the kvm
wbinvd won't work:

   group = open("/dev/vfio/XX")
   ioctl(group, VFIO_GROUP_SET_CONTAINER)
   ioctl(kvm, KVM_DEV_VFIO_GROUP_ADD)
   ioctl(group, VFIO_GROUP_GET_DEVICE_FD)

As the domains don't start existing until GET_DEVICE_FD. Further,
GET_DEVICE_FD requires that KVM_DEV_VFIO_GROUP_ADD already be done as that
is what sets the group->kvm and thus device->kvm for the driver to use
during open.

Now that we have device centric cap ops and the new
IOMMU_CAP_ENFORCE_CACHE_COHERENCY we know what the iommu_domain will be
capable of without having to create it. Use this to compute
vfio_file_enforced_coherent() and resolve the ordering problems.

VFIO always tries to upgrade domains to enforce cache coherency, it never
attaches a device that supports enforce cache coherency to a less capable
domain, so the cap test is a sufficient proxy for the ultimate
outcome. iommufd also ensures that devices that set the cap will be
connected to enforcing domains.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/container.c |  5 +++--
 drivers/vfio/vfio.h      |  2 --
 drivers/vfio/vfio_main.c | 29 ++++++++++++++++-------------
 3 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/vfio/container.c b/drivers/vfio/container.c
index 499777930b08fa..d97747dfb05d02 100644
--- a/drivers/vfio/container.c
+++ b/drivers/vfio/container.c
@@ -188,8 +188,9 @@ void vfio_device_container_unregister(struct vfio_device *device)
 			device->group->container->iommu_data, device);
 }
 
-long vfio_container_ioctl_check_extension(struct vfio_container *container,
-					  unsigned long arg)
+static long
+vfio_container_ioctl_check_extension(struct vfio_container *container,
+				     unsigned long arg)
 {
 	struct vfio_iommu_driver *driver;
 	long ret = 0;
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 54e5a8e0834ccb..247590334e14b0 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -119,8 +119,6 @@ int vfio_container_attach_group(struct vfio_container *container,
 void vfio_group_detach_container(struct vfio_group *group);
 void vfio_device_container_register(struct vfio_device *device);
 void vfio_device_container_unregister(struct vfio_device *device);
-long vfio_container_ioctl_check_extension(struct vfio_container *container,
-					  unsigned long arg);
 int __init vfio_container_init(void);
 void vfio_container_cleanup(void);
 
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index e1fec1db6a3c93..5c0e810f8b4d08 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -1625,24 +1625,27 @@ EXPORT_SYMBOL_GPL(vfio_file_is_group);
 bool vfio_file_enforced_coherent(struct file *file)
 {
 	struct vfio_group *group = file->private_data;
-	bool ret;
+	struct vfio_device *device;
+	bool ret = true;
 
 	if (!vfio_file_is_group(file))
 		return true;
 
-	mutex_lock(&group->group_lock);
-	if (group->container) {
-		ret = vfio_container_ioctl_check_extension(group->container,
-							   VFIO_DMA_CC_IOMMU);
-	} else {
-		/*
-		 * Since the coherency state is determined only once a container
-		 * is attached the user must do so before they can prove they
-		 * have permission.
-		 */
-		ret = true;
+	/*
+	 * If the device does not have IOMMU_CAP_ENFORCE_CACHE_COHERENCY then
+	 * any domain later attached to it will also not support it. If the cap
+	 * is set then the iommu_domain eventually attached to the device/group
+	 * must use a domain with enforce_cache_coherency().
+	 */
+	mutex_lock(&group->device_lock);
+	list_for_each_entry(device, &group->device_list, group_next) {
+		if (!device_iommu_capable(device->dev,
+					  IOMMU_CAP_ENFORCE_CACHE_COHERENCY)) {
+			ret = false;
+			break;
+		}
 	}
-	mutex_unlock(&group->group_lock);
+	mutex_unlock(&group->device_lock);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 06/11] vfio-iommufd: Allow iommufd to be used in place of a container fd
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2022-11-16 21:05 ` [PATCH v3 05/11] vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent() Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-16 23:31   ` Alex Williamson
  2022-11-16 21:05 ` [PATCH v3 07/11] vfio-iommufd: Support iommufd for physical VFIO devices Jason Gunthorpe
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

This makes VFIO_GROUP_SET_CONTAINER accept both a vfio container FD and an
iommufd.

In iommufd mode an IOAS will exist after the SET_CONTAINER, but it will
not be attached to any groups.

For VFIO this means that the VFIO_GROUP_GET_STATUS and
VFIO_GROUP_FLAGS_VIABLE works subtly differently. With the container FD
the iommu_group_claim_dma_owner() is done during SET_CONTAINER but for
IOMMUFD this is done during VFIO_GROUP_GET_DEVICE_FD. Meaning that
VFIO_GROUP_FLAGS_VIABLE could be set but GET_DEVICE_FD will fail due to
viability.

As GET_DEVICE_FD can fail for many reasons already this is not expected to
be a meaningful difference.

Reorganize the tests for if the group has an assigned container or iommu
into a vfio_group_has_iommu() function and consolidate all the duplicated
WARN_ON's etc related to this.

Call container functions only if a container is actually present on the
group.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/Kconfig     |  1 +
 drivers/vfio/container.c |  7 +++-
 drivers/vfio/vfio.h      |  2 +
 drivers/vfio/vfio_main.c | 86 +++++++++++++++++++++++++++++++++-------
 4 files changed, 80 insertions(+), 16 deletions(-)

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 86c381ceb9a1e9..1118d322eec97d 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -2,6 +2,7 @@
 menuconfig VFIO
 	tristate "VFIO Non-Privileged userspace driver framework"
 	select IOMMU_API
+	depends on IOMMUFD || !IOMMUFD
 	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
 	select INTERVAL_TREE
 	help
diff --git a/drivers/vfio/container.c b/drivers/vfio/container.c
index d97747dfb05d02..8772dad6808539 100644
--- a/drivers/vfio/container.c
+++ b/drivers/vfio/container.c
@@ -516,8 +516,11 @@ int vfio_group_use_container(struct vfio_group *group)
 {
 	lockdep_assert_held(&group->group_lock);
 
-	if (!group->container || !group->container->iommu_driver ||
-	    WARN_ON(!group->container_users))
+	/*
+	 * The container fd has been assigned with VFIO_GROUP_SET_CONTAINER but
+	 * VFIO_SET_IOMMU hasn't been done yet.
+	 */
+	if (!group->container->iommu_driver)
 		return -EINVAL;
 
 	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 247590334e14b0..985e13d52989ca 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -10,6 +10,7 @@
 #include <linux/cdev.h>
 #include <linux/module.h>
 
+struct iommufd_ctx;
 struct iommu_group;
 struct vfio_device;
 struct vfio_container;
@@ -60,6 +61,7 @@ struct vfio_group {
 	struct kvm			*kvm;
 	struct file			*opened_file;
 	struct blocking_notifier_head	notifier;
+	struct iommufd_ctx		*iommufd;
 };
 
 /* events for the backend driver notify callback */
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 5c0e810f8b4d08..8c124290ce9f0d 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -35,6 +35,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/interval_tree.h>
 #include <linux/iova_bitmap.h>
+#include <linux/iommufd.h>
 #include "vfio.h"
 
 #define DRIVER_VERSION	"0.3"
@@ -665,6 +666,16 @@ EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
 /*
  * VFIO Group fd, /dev/vfio/$GROUP
  */
+static bool vfio_group_has_iommu(struct vfio_group *group)
+{
+	lockdep_assert_held(&group->group_lock);
+	if (!group->container)
+		WARN_ON(group->container_users);
+	else
+		WARN_ON(!group->container_users);
+	return group->container || group->iommufd;
+}
+
 /*
  * VFIO_GROUP_UNSET_CONTAINER should fail if there are other users or
  * if there was no container to unset.  Since the ioctl is called on
@@ -676,15 +687,21 @@ static int vfio_group_ioctl_unset_container(struct vfio_group *group)
 	int ret = 0;
 
 	mutex_lock(&group->group_lock);
-	if (!group->container) {
+	if (!vfio_group_has_iommu(group)) {
 		ret = -EINVAL;
 		goto out_unlock;
 	}
-	if (group->container_users != 1) {
-		ret = -EBUSY;
-		goto out_unlock;
+	if (group->container) {
+		if (group->container_users != 1) {
+			ret = -EBUSY;
+			goto out_unlock;
+		}
+		vfio_group_detach_container(group);
+	}
+	if (group->iommufd) {
+		iommufd_ctx_put(group->iommufd);
+		group->iommufd = NULL;
 	}
-	vfio_group_detach_container(group);
 
 out_unlock:
 	mutex_unlock(&group->group_lock);
@@ -695,6 +712,7 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
 					  int __user *arg)
 {
 	struct vfio_container *container;
+	struct iommufd_ctx *iommufd;
 	struct fd f;
 	int ret;
 	int fd;
@@ -707,7 +725,7 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
 		return -EBADF;
 
 	mutex_lock(&group->group_lock);
-	if (group->container || WARN_ON(group->container_users)) {
+	if (vfio_group_has_iommu(group)) {
 		ret = -EINVAL;
 		goto out_unlock;
 	}
@@ -717,12 +735,28 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
 	}
 
 	container = vfio_container_from_file(f.file);
-	ret = -EINVAL;
 	if (container) {
 		ret = vfio_container_attach_group(container, group);
 		goto out_unlock;
 	}
 
+	iommufd = iommufd_ctx_from_file(f.file);
+	if (!IS_ERR(iommufd)) {
+		u32 ioas_id;
+
+		ret = iommufd_vfio_compat_ioas_id(iommufd, &ioas_id);
+		if (ret) {
+			iommufd_ctx_put(group->iommufd);
+			goto out_unlock;
+		}
+
+		group->iommufd = iommufd;
+		goto out_unlock;
+	}
+
+	/* The FD passed is not recognized. */
+	ret = -EBADFD;
+
 out_unlock:
 	mutex_unlock(&group->group_lock);
 	fdput(f);
@@ -752,9 +786,16 @@ static int vfio_device_first_open(struct vfio_device *device)
 	 * during close_device.
 	 */
 	mutex_lock(&device->group->group_lock);
-	ret = vfio_group_use_container(device->group);
-	if (ret)
+	if (!vfio_group_has_iommu(device->group)) {
+		ret = -EINVAL;
 		goto err_module_put;
+	}
+
+	if (device->group->container) {
+		ret = vfio_group_use_container(device->group);
+		if (ret)
+			goto err_module_put;
+	}
 
 	device->kvm = device->group->kvm;
 	if (device->ops->open_device) {
@@ -762,13 +803,15 @@ static int vfio_device_first_open(struct vfio_device *device)
 		if (ret)
 			goto err_container;
 	}
-	vfio_device_container_register(device);
+	if (device->group->container)
+		vfio_device_container_register(device);
 	mutex_unlock(&device->group->group_lock);
 	return 0;
 
 err_container:
 	device->kvm = NULL;
-	vfio_group_unuse_container(device->group);
+	if (device->group->container)
+		vfio_group_unuse_container(device->group);
 err_module_put:
 	mutex_unlock(&device->group->group_lock);
 	module_put(device->dev->driver->owner);
@@ -780,11 +823,13 @@ static void vfio_device_last_close(struct vfio_device *device)
 	lockdep_assert_held(&device->dev_set->lock);
 
 	mutex_lock(&device->group->group_lock);
-	vfio_device_container_unregister(device);
+	if (device->group->container)
+		vfio_device_container_unregister(device);
 	if (device->ops->close_device)
 		device->ops->close_device(device);
 	device->kvm = NULL;
-	vfio_group_unuse_container(device->group);
+	if (device->group->container)
+		vfio_group_unuse_container(device->group);
 	mutex_unlock(&device->group->group_lock);
 	module_put(device->dev->driver->owner);
 }
@@ -900,7 +945,14 @@ static int vfio_group_ioctl_get_status(struct vfio_group *group,
 		return -ENODEV;
 	}
 
-	if (group->container)
+	/*
+	 * With the container FD the iommu_group_claim_dma_owner() is done
+	 * during SET_CONTAINER but for IOMMFD this is done during
+	 * VFIO_GROUP_GET_DEVICE_FD. Meaning that with iommufd
+	 * VFIO_GROUP_FLAGS_VIABLE could be set but GET_DEVICE_FD will fail due
+	 * to viability.
+	 */
+	if (group->container || group->iommufd)
 		status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
 				VFIO_GROUP_FLAGS_VIABLE;
 	else if (!iommu_group_dma_owner_claimed(group->iommu_group))
@@ -983,6 +1035,10 @@ static int vfio_group_fops_release(struct inode *inode, struct file *filep)
 	WARN_ON(group->notifier.head);
 	if (group->container)
 		vfio_group_detach_container(group);
+	if (group->iommufd) {
+		iommufd_ctx_put(group->iommufd);
+		group->iommufd = NULL;
+	}
 	group->opened_file = NULL;
 	mutex_unlock(&group->group_lock);
 	return 0;
@@ -1881,6 +1937,8 @@ static void __exit vfio_cleanup(void)
 module_init(vfio_init);
 module_exit(vfio_cleanup);
 
+MODULE_IMPORT_NS(IOMMUFD);
+MODULE_IMPORT_NS(IOMMUFD_VFIO);
 MODULE_VERSION(DRIVER_VERSION);
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR(DRIVER_AUTHOR);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 07/11] vfio-iommufd: Support iommufd for physical VFIO devices
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2022-11-16 21:05 ` [PATCH v3 06/11] vfio-iommufd: Allow iommufd to be used in place of a container fd Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-18  1:30   ` Tian, Kevin
  2022-11-16 21:05 ` [PATCH v3 08/11] vfio-iommufd: Support iommufd for emulated " Jason Gunthorpe
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

This creates the iommufd_device for the physical VFIO drivers. These are
all the drivers that are calling vfio_register_group_dev() and expect the
type1 code to setup a real iommu_domain against their parent struct
device.

The design gives the driver a choice in how it gets connected to iommufd
by providing bind_iommufd/unbind_iommufd/attach_ioas callbacks to
implement as required. The core code provides three default callbacks for
physical mode using a real iommu_domain. This is suitable for drivers
using vfio_register_group_dev()

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/Makefile                         |   1 +
 drivers/vfio/fsl-mc/vfio_fsl_mc.c             |   3 +
 drivers/vfio/iommufd.c                        | 103 ++++++++++++++++++
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    |   6 +
 drivers/vfio/pci/mlx5/main.c                  |   3 +
 drivers/vfio/pci/vfio_pci.c                   |   3 +
 drivers/vfio/platform/vfio_amba.c             |   3 +
 drivers/vfio/platform/vfio_platform.c         |   3 +
 drivers/vfio/vfio.h                           |  15 +++
 drivers/vfio/vfio_main.c                      |  15 ++-
 include/linux/vfio.h                          |  25 +++++
 11 files changed, 178 insertions(+), 2 deletions(-)
 create mode 100644 drivers/vfio/iommufd.c

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index b693a1169286f8..3863922529ef20 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_VFIO) += vfio.o
 vfio-y += vfio_main.o \
 	  iova_bitmap.o \
 	  container.o
+vfio-$(CONFIG_IOMMUFD) += iommufd.o
 
 obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index b16874e913e4f5..5cd4bb47644039 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -592,6 +592,9 @@ static const struct vfio_device_ops vfio_fsl_mc_ops = {
 	.read		= vfio_fsl_mc_read,
 	.write		= vfio_fsl_mc_write,
 	.mmap		= vfio_fsl_mc_mmap,
+	.bind_iommufd	= vfio_iommufd_physical_bind,
+	.unbind_iommufd	= vfio_iommufd_physical_unbind,
+	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
 };
 
 static struct fsl_mc_driver vfio_fsl_mc_driver = {
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
new file mode 100644
index 00000000000000..6d6452072b8dba
--- /dev/null
+++ b/drivers/vfio/iommufd.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
+ */
+#include <linux/vfio.h>
+#include <linux/iommufd.h>
+
+#include "vfio.h"
+
+MODULE_IMPORT_NS(IOMMUFD);
+MODULE_IMPORT_NS(IOMMUFD_VFIO);
+
+int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx)
+{
+	u32 ioas_id;
+	u32 device_id;
+	int ret;
+
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	/*
+	 * If the driver doesn't provide this op then it means the device does
+	 * not do DMA at all. So nothing to do.
+	 */
+	if (!vdev->ops->bind_iommufd)
+		return 0;
+
+	ret = vdev->ops->bind_iommufd(vdev, ictx, &device_id);
+	if (ret)
+		return ret;
+
+	ret = iommufd_vfio_compat_ioas_id(ictx, &ioas_id);
+	if (ret)
+		goto err_unbind;
+	ret = vdev->ops->attach_ioas(vdev, &ioas_id);
+	if (ret)
+		goto err_unbind;
+
+	/*
+	 * The legacy path has no way to return the device id or the selected
+	 * pt_id
+	 */
+	return 0;
+
+err_unbind:
+	if (vdev->ops->unbind_iommufd)
+		vdev->ops->unbind_iommufd(vdev);
+	return ret;
+}
+
+void vfio_iommufd_unbind(struct vfio_device *vdev)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (vdev->ops->unbind_iommufd)
+		vdev->ops->unbind_iommufd(vdev);
+}
+
+/*
+ * The physical standard ops mean that the iommufd_device is bound to the
+ * physical device vdev->dev that was provided to vfio_init_group_dev(). Drivers
+ * using this ops set should call vfio_register_group_dev()
+ */
+int vfio_iommufd_physical_bind(struct vfio_device *vdev,
+			       struct iommufd_ctx *ictx, u32 *out_device_id)
+{
+	struct iommufd_device *idev;
+
+	idev = iommufd_device_bind(ictx, vdev->dev, out_device_id);
+	if (IS_ERR(idev))
+		return PTR_ERR(idev);
+	vdev->iommufd_device = idev;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_bind);
+
+void vfio_iommufd_physical_unbind(struct vfio_device *vdev)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (vdev->iommufd_attached) {
+		iommufd_device_detach(vdev->iommufd_device);
+		vdev->iommufd_attached = false;
+	}
+	iommufd_device_unbind(vdev->iommufd_device);
+	vdev->iommufd_device = NULL;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_unbind);
+
+int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
+{
+	unsigned int flags = 0;
+	int rc;
+
+	if (vfio_allow_unsafe_interrupts)
+		flags |= IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT;
+	rc = iommufd_device_attach(vdev->iommufd_device, pt_id, flags);
+	if (rc)
+		return rc;
+	vdev->iommufd_attached = true;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_attach_ioas);
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 39eeca18a0f7c8..40019b11c5a969 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -1246,6 +1246,9 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
 	.mmap = hisi_acc_vfio_pci_mmap,
 	.request = vfio_pci_core_request,
 	.match = vfio_pci_core_match,
+	.bind_iommufd = vfio_iommufd_physical_bind,
+	.unbind_iommufd = vfio_iommufd_physical_unbind,
+	.attach_ioas = vfio_iommufd_physical_attach_ioas,
 };
 
 static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
@@ -1261,6 +1264,9 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
 	.mmap = vfio_pci_core_mmap,
 	.request = vfio_pci_core_request,
 	.match = vfio_pci_core_match,
+	.bind_iommufd = vfio_iommufd_physical_bind,
+	.unbind_iommufd = vfio_iommufd_physical_unbind,
+	.attach_ioas = vfio_iommufd_physical_attach_ioas,
 };
 
 static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index fd6ccb8454a24a..32d1f38d351e7e 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -623,6 +623,9 @@ static const struct vfio_device_ops mlx5vf_pci_ops = {
 	.mmap = vfio_pci_core_mmap,
 	.request = vfio_pci_core_request,
 	.match = vfio_pci_core_match,
+	.bind_iommufd = vfio_iommufd_physical_bind,
+	.unbind_iommufd = vfio_iommufd_physical_unbind,
+	.attach_ioas = vfio_iommufd_physical_attach_ioas,
 };
 
 static int mlx5vf_pci_probe(struct pci_dev *pdev,
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 1d4919edfbde48..29091ee2e9849b 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -138,6 +138,9 @@ static const struct vfio_device_ops vfio_pci_ops = {
 	.mmap		= vfio_pci_core_mmap,
 	.request	= vfio_pci_core_request,
 	.match		= vfio_pci_core_match,
+	.bind_iommufd	= vfio_iommufd_physical_bind,
+	.unbind_iommufd	= vfio_iommufd_physical_unbind,
+	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
 };
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index eaea63e5294c58..5a046098d0bdf4 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -117,6 +117,9 @@ static const struct vfio_device_ops vfio_amba_ops = {
 	.read		= vfio_platform_read,
 	.write		= vfio_platform_write,
 	.mmap		= vfio_platform_mmap,
+	.bind_iommufd	= vfio_iommufd_physical_bind,
+	.unbind_iommufd	= vfio_iommufd_physical_unbind,
+	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
 };
 
 static const struct amba_id pl330_ids[] = {
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 82cedcebfd9022..b87c3b70878341 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -106,6 +106,9 @@ static const struct vfio_device_ops vfio_platform_ops = {
 	.read		= vfio_platform_read,
 	.write		= vfio_platform_write,
 	.mmap		= vfio_platform_mmap,
+	.bind_iommufd	= vfio_iommufd_physical_bind,
+	.unbind_iommufd	= vfio_iommufd_physical_unbind,
+	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
 };
 
 static struct platform_driver vfio_platform_driver = {
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 985e13d52989ca..809f2e8523968e 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -124,6 +124,21 @@ void vfio_device_container_unregister(struct vfio_device *device);
 int __init vfio_container_init(void);
 void vfio_container_cleanup(void);
 
+#if IS_ENABLED(CONFIG_IOMMUFD)
+int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx);
+void vfio_iommufd_unbind(struct vfio_device *device);
+#else
+static inline int vfio_iommufd_bind(struct vfio_device *device,
+				    struct iommufd_ctx *ictx)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void vfio_iommufd_unbind(struct vfio_device *device)
+{
+}
+#endif
+
 #ifdef CONFIG_VFIO_NOIOMMU
 extern bool vfio_noiommu __read_mostly;
 #else
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 8c124290ce9f0d..e76ffa3ecebfb0 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -528,6 +528,11 @@ static int __vfio_register_dev(struct vfio_device *device,
 	if (IS_ERR(group))
 		return PTR_ERR(group);
 
+	if (WARN_ON(device->ops->bind_iommufd &&
+		    (!device->ops->unbind_iommufd ||
+		     !device->ops->attach_ioas)))
+		return -EINVAL;
+
 	/*
 	 * If the driver doesn't specify a set then the device is added to a
 	 * singleton set just for itself.
@@ -795,6 +800,10 @@ static int vfio_device_first_open(struct vfio_device *device)
 		ret = vfio_group_use_container(device->group);
 		if (ret)
 			goto err_module_put;
+	} else if (device->group->iommufd) {
+		ret = vfio_iommufd_bind(device, device->group->iommufd);
+		if (ret)
+			goto err_module_put;
 	}
 
 	device->kvm = device->group->kvm;
@@ -812,6 +821,8 @@ static int vfio_device_first_open(struct vfio_device *device)
 	device->kvm = NULL;
 	if (device->group->container)
 		vfio_group_unuse_container(device->group);
+	else if (device->group->iommufd)
+		vfio_iommufd_unbind(device);
 err_module_put:
 	mutex_unlock(&device->group->group_lock);
 	module_put(device->dev->driver->owner);
@@ -830,6 +841,8 @@ static void vfio_device_last_close(struct vfio_device *device)
 	device->kvm = NULL;
 	if (device->group->container)
 		vfio_group_unuse_container(device->group);
+	else if (device->group->iommufd)
+		vfio_iommufd_unbind(device);
 	mutex_unlock(&device->group->group_lock);
 	module_put(device->dev->driver->owner);
 }
@@ -1937,8 +1950,6 @@ static void __exit vfio_cleanup(void)
 module_init(vfio_init);
 module_exit(vfio_cleanup);
 
-MODULE_IMPORT_NS(IOMMUFD);
-MODULE_IMPORT_NS(IOMMUFD_VFIO);
 MODULE_VERSION(DRIVER_VERSION);
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR(DRIVER_AUTHOR);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index e7cebeb875dd1a..a7fc4d747dc226 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -17,6 +17,8 @@
 #include <linux/iova_bitmap.h>
 
 struct kvm;
+struct iommufd_ctx;
+struct iommufd_device;
 
 /*
  * VFIO devices can be placed in a set, this allows all devices to share this
@@ -54,6 +56,10 @@ struct vfio_device {
 	struct completion comp;
 	struct list_head group_next;
 	struct list_head iommu_entry;
+#if IS_ENABLED(CONFIG_IOMMUFD)
+	struct iommufd_device *iommufd_device;
+	bool iommufd_attached;
+#endif
 };
 
 /**
@@ -80,6 +86,10 @@ struct vfio_device_ops {
 	char	*name;
 	int	(*init)(struct vfio_device *vdev);
 	void	(*release)(struct vfio_device *vdev);
+	int	(*bind_iommufd)(struct vfio_device *vdev,
+				struct iommufd_ctx *ictx, u32 *out_device_id);
+	void	(*unbind_iommufd)(struct vfio_device *vdev);
+	int	(*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
 	int	(*open_device)(struct vfio_device *vdev);
 	void	(*close_device)(struct vfio_device *vdev);
 	ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
@@ -96,6 +106,21 @@ struct vfio_device_ops {
 				  void __user *arg, size_t argsz);
 };
 
+#if IS_ENABLED(CONFIG_IOMMUFD)
+int vfio_iommufd_physical_bind(struct vfio_device *vdev,
+			       struct iommufd_ctx *ictx, u32 *out_device_id);
+void vfio_iommufd_physical_unbind(struct vfio_device *vdev);
+int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
+#else
+#define vfio_iommufd_physical_bind                                      \
+	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx,   \
+		  u32 *out_device_id)) NULL)
+#define vfio_iommufd_physical_unbind \
+	((void (*)(struct vfio_device *vdev)) NULL)
+#define vfio_iommufd_physical_attach_ioas \
+	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
+#endif
+
 /**
  * @migration_set_state: Optional callback to change the migration state for
  *         devices that support migration. It's mandatory for
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 08/11] vfio-iommufd: Support iommufd for emulated VFIO devices
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2022-11-16 21:05 ` [PATCH v3 07/11] vfio-iommufd: Support iommufd for physical VFIO devices Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-18  2:03   ` Tian, Kevin
  2022-11-16 21:05 ` [PATCH v3 09/11] vfio: Move container related MODULE_ALIAS statements into container.c Jason Gunthorpe
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

Emulated VFIO devices are calling vfio_register_emulated_iommu_dev() and
consist of all the mdev drivers.

Like the physical drivers, support for iommufd is provided by the driver
supplying the correct standard ops. Provide ops from the core that
duplicate what vfio_register_emulated_iommu_dev() does.

Emulated drivers are where it is more likely to see variation in the
iommfd support ops. For instance IDXD will probably need to setup both a
iommfd_device context linked to a PASID and an iommufd_access context to
support all their mdev operations.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/gpu/drm/i915/gvt/kvmgt.c  |   3 +
 drivers/s390/cio/vfio_ccw_ops.c   |   3 +
 drivers/s390/crypto/vfio_ap_ops.c |   3 +
 drivers/vfio/container.c          | 110 +++++----------------------
 drivers/vfio/iommufd.c            |  58 ++++++++++++++
 drivers/vfio/vfio.h               |  10 ++-
 drivers/vfio/vfio_main.c          | 122 +++++++++++++++++++++++++++++-
 include/linux/vfio.h              |  14 ++++
 8 files changed, 229 insertions(+), 94 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 7a45e5360caf2d..579b230a0f58d9 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1474,6 +1474,9 @@ static const struct vfio_device_ops intel_vgpu_dev_ops = {
 	.mmap		= intel_vgpu_mmap,
 	.ioctl		= intel_vgpu_ioctl,
 	.dma_unmap	= intel_vgpu_dma_unmap,
+	.bind_iommufd	= vfio_iommufd_emulated_bind,
+	.unbind_iommufd = vfio_iommufd_emulated_unbind,
+	.attach_ioas	= vfio_iommufd_emulated_attach_ioas,
 };
 
 static int intel_vgpu_probe(struct mdev_device *mdev)
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 6ae4d012d80084..560453d99c24fc 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -588,6 +588,9 @@ static const struct vfio_device_ops vfio_ccw_dev_ops = {
 	.ioctl = vfio_ccw_mdev_ioctl,
 	.request = vfio_ccw_mdev_request,
 	.dma_unmap = vfio_ccw_dma_unmap,
+	.bind_iommufd = vfio_iommufd_emulated_bind,
+	.unbind_iommufd = vfio_iommufd_emulated_unbind,
+	.attach_ioas = vfio_iommufd_emulated_attach_ioas,
 };
 
 struct mdev_driver vfio_ccw_mdev_driver = {
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 0b4cc8c597ae67..bb7776d207924f 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1789,6 +1789,9 @@ static const struct vfio_device_ops vfio_ap_matrix_dev_ops = {
 	.close_device = vfio_ap_mdev_close_device,
 	.ioctl = vfio_ap_mdev_ioctl,
 	.dma_unmap = vfio_ap_mdev_dma_unmap,
+	.bind_iommufd = vfio_iommufd_emulated_bind,
+	.unbind_iommufd = vfio_iommufd_emulated_unbind,
+	.attach_ioas = vfio_iommufd_emulated_attach_ioas,
 };
 
 static struct mdev_driver vfio_ap_matrix_driver = {
diff --git a/drivers/vfio/container.c b/drivers/vfio/container.c
index 8772dad6808539..7f3961fd4b5aac 100644
--- a/drivers/vfio/container.c
+++ b/drivers/vfio/container.c
@@ -540,113 +540,41 @@ void vfio_group_unuse_container(struct vfio_group *group)
 	fput(group->opened_file);
 }
 
-/*
- * Pin contiguous user pages and return their associated host pages for local
- * domain only.
- * @device [in]  : device
- * @iova [in]    : starting IOVA of user pages to be pinned.
- * @npage [in]   : count of pages to be pinned.  This count should not
- *		   be greater than VFIO_PIN_PAGES_MAX_ENTRIES.
- * @prot [in]    : protection flags
- * @pages[out]   : array of host pages
- * Return error or number of pages pinned.
- *
- * A driver may only call this function if the vfio_device was created
- * by vfio_register_emulated_iommu_dev().
- */
-int vfio_pin_pages(struct vfio_device *device, dma_addr_t iova,
-		   int npage, int prot, struct page **pages)
+int vfio_container_pin_pages(struct vfio_container *container,
+			     struct iommu_group *iommu_group, dma_addr_t iova,
+			     int npage, int prot, struct page **pages)
 {
-	struct vfio_container *container;
-	struct vfio_group *group = device->group;
-	struct vfio_iommu_driver *driver;
-	int ret;
-
-	if (!pages || !npage || !vfio_assert_device_open(device))
-		return -EINVAL;
+	struct vfio_iommu_driver *driver = container->iommu_driver;
 
 	if (npage > VFIO_PIN_PAGES_MAX_ENTRIES)
 		return -E2BIG;
 
-	/* group->container cannot change while a vfio device is open */
-	container = group->container;
-	driver = container->iommu_driver;
-	if (likely(driver && driver->ops->pin_pages))
-		ret = driver->ops->pin_pages(container->iommu_data,
-					     group->iommu_group, iova,
-					     npage, prot, pages);
-	else
-		ret = -ENOTTY;
-
-	return ret;
+	if (unlikely(!driver || !driver->ops->pin_pages))
+		return -ENOTTY;
+	return driver->ops->pin_pages(container->iommu_data, iommu_group, iova,
+				      npage, prot, pages);
 }
-EXPORT_SYMBOL(vfio_pin_pages);
 
-/*
- * Unpin contiguous host pages for local domain only.
- * @device [in]  : device
- * @iova [in]    : starting address of user pages to be unpinned.
- * @npage [in]   : count of pages to be unpinned.  This count should not
- *                 be greater than VFIO_PIN_PAGES_MAX_ENTRIES.
- */
-void vfio_unpin_pages(struct vfio_device *device, dma_addr_t iova, int npage)
+void vfio_container_unpin_pages(struct vfio_container *container,
+				dma_addr_t iova, int npage)
 {
-	struct vfio_container *container;
-	struct vfio_iommu_driver *driver;
-
 	if (WARN_ON(npage <= 0 || npage > VFIO_PIN_PAGES_MAX_ENTRIES))
 		return;
 
-	if (WARN_ON(!vfio_assert_device_open(device)))
-		return;
-
-	/* group->container cannot change while a vfio device is open */
-	container = device->group->container;
-	driver = container->iommu_driver;
-
-	driver->ops->unpin_pages(container->iommu_data, iova, npage);
+	container->iommu_driver->ops->unpin_pages(container->iommu_data, iova,
+						  npage);
 }
-EXPORT_SYMBOL(vfio_unpin_pages);
 
-/*
- * This interface allows the CPUs to perform some sort of virtual DMA on
- * behalf of the device.
- *
- * CPUs read/write from/into a range of IOVAs pointing to user space memory
- * into/from a kernel buffer.
- *
- * As the read/write of user space memory is conducted via the CPUs and is
- * not a real device DMA, it is not necessary to pin the user space memory.
- *
- * @device [in]		: VFIO device
- * @iova [in]		: base IOVA of a user space buffer
- * @data [in]		: pointer to kernel buffer
- * @len [in]		: kernel buffer length
- * @write		: indicate read or write
- * Return error code on failure or 0 on success.
- */
-int vfio_dma_rw(struct vfio_device *device, dma_addr_t iova, void *data,
-		size_t len, bool write)
+int vfio_container_dma_rw(struct vfio_container *container, dma_addr_t iova,
+			  void *data, size_t len, bool write)
 {
-	struct vfio_container *container;
-	struct vfio_iommu_driver *driver;
-	int ret = 0;
-
-	if (!data || len <= 0 || !vfio_assert_device_open(device))
-		return -EINVAL;
-
-	/* group->container cannot change while a vfio device is open */
-	container = device->group->container;
-	driver = container->iommu_driver;
+	struct vfio_iommu_driver *driver = container->iommu_driver;
 
-	if (likely(driver && driver->ops->dma_rw))
-		ret = driver->ops->dma_rw(container->iommu_data,
-					  iova, data, len, write);
-	else
-		ret = -ENOTTY;
-	return ret;
+	if (unlikely(!driver || !driver->ops->dma_rw))
+		return -ENOTTY;
+	return driver->ops->dma_rw(container->iommu_data, iova, data, len,
+				   write);
 }
-EXPORT_SYMBOL(vfio_dma_rw);
 
 int __init vfio_container_init(void)
 {
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index 6d6452072b8dba..dad8935d71f7f0 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -101,3 +101,61 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_attach_ioas);
+
+/*
+ * The emulated standard ops mean that vfio_device is going to use the
+ * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this
+ * ops set should call vfio_register_emulated_iommu_dev().
+ */
+
+static void vfio_emulated_unmap(void *data, unsigned long iova,
+				unsigned long length)
+{
+	struct vfio_device *vdev = data;
+
+	vdev->ops->dma_unmap(vdev, iova, length);
+}
+
+static const struct iommufd_access_ops vfio_user_ops = {
+	.needs_pin_pages = 1,
+	.unmap = vfio_emulated_unmap,
+};
+
+int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
+			       struct iommufd_ctx *ictx, u32 *out_device_id)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	vdev->iommufd_ictx = ictx;
+	iommufd_ctx_get(ictx);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_bind);
+
+void vfio_iommufd_emulated_unbind(struct vfio_device *vdev)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (vdev->iommufd_access) {
+		iommufd_access_destroy(vdev->iommufd_access);
+		vdev->iommufd_access = NULL;
+	}
+	iommufd_ctx_put(vdev->iommufd_ictx);
+	vdev->iommufd_ictx = NULL;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_unbind);
+
+int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
+{
+	struct iommufd_access *user;
+
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	user = iommufd_access_create(vdev->iommufd_ictx, *pt_id, &vfio_user_ops,
+				     vdev);
+	if (IS_ERR(user))
+		return PTR_ERR(user);
+	vdev->iommufd_access = user;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_attach_ioas);
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 809f2e8523968e..d57a08afb5cf5c 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -111,8 +111,6 @@ struct vfio_iommu_driver {
 int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
 void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops);
 
-bool vfio_assert_device_open(struct vfio_device *device);
-
 struct vfio_container *vfio_container_from_file(struct file *filep);
 int vfio_group_use_container(struct vfio_group *group);
 void vfio_group_unuse_container(struct vfio_group *group);
@@ -121,6 +119,14 @@ int vfio_container_attach_group(struct vfio_container *container,
 void vfio_group_detach_container(struct vfio_group *group);
 void vfio_device_container_register(struct vfio_device *device);
 void vfio_device_container_unregister(struct vfio_device *device);
+int vfio_container_pin_pages(struct vfio_container *container,
+			     struct iommu_group *iommu_group, dma_addr_t iova,
+			     int npage, int prot, struct page **pages);
+void vfio_container_unpin_pages(struct vfio_container *container,
+				dma_addr_t iova, int npage);
+int vfio_container_dma_rw(struct vfio_container *container, dma_addr_t iova,
+			  void *data, size_t len, bool write);
+
 int __init vfio_container_init(void);
 void vfio_container_cleanup(void);
 
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index e76ffa3ecebfb0..6195d7ad5b5479 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -771,7 +771,7 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
 static const struct file_operations vfio_device_fops;
 
 /* true if the vfio_device has open_device() called but not close_device() */
-bool vfio_assert_device_open(struct vfio_device *device)
+static bool vfio_assert_device_open(struct vfio_device *device)
 {
 	return !WARN_ON_ONCE(!READ_ONCE(device->open_count));
 }
@@ -1877,6 +1877,126 @@ int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr, int num_irqs,
 }
 EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
 
+/*
+ * Pin contiguous user pages and return their associated host pages for local
+ * domain only.
+ * @device [in]  : device
+ * @iova [in]    : starting IOVA of user pages to be pinned.
+ * @npage [in]   : count of pages to be pinned.  This count should not
+ *		   be greater than VFIO_PIN_PAGES_MAX_ENTRIES.
+ * @prot [in]    : protection flags
+ * @pages[out]   : array of host pages
+ * Return error or number of pages pinned.
+ *
+ * A driver may only call this function if the vfio_device was created
+ * by vfio_register_emulated_iommu_dev() due to vfio_container_pin_pages().
+ */
+int vfio_pin_pages(struct vfio_device *device, dma_addr_t iova,
+		   int npage, int prot, struct page **pages)
+{
+	/* group->container cannot change while a vfio device is open */
+	if (!pages || !npage || WARN_ON(!vfio_assert_device_open(device)))
+		return -EINVAL;
+	if (device->group->container)
+		return vfio_container_pin_pages(device->group->container,
+						device->group->iommu_group,
+						iova, npage, prot, pages);
+	if (device->iommufd_access) {
+		int ret;
+
+		if (iova > ULONG_MAX)
+			return -EINVAL;
+		/*
+		 * VFIO ignores the sub page offset, npages is from the start of
+		 * a PAGE_SIZE chunk of IOVA. The caller is expected to recover
+		 * the sub page offset by doing:
+		 *     pages[0] + (iova % PAGE_SIZE)
+		 */
+		ret = iommufd_access_pin_pages(
+			device->iommufd_access, ALIGN_DOWN(iova, PAGE_SIZE),
+			npage * PAGE_SIZE, pages,
+			(prot & IOMMU_WRITE) ? IOMMUFD_ACCESS_RW_WRITE : 0);
+		if (ret)
+			return ret;
+		return npage;
+	}
+	return -EINVAL;
+}
+EXPORT_SYMBOL(vfio_pin_pages);
+
+/*
+ * Unpin contiguous host pages for local domain only.
+ * @device [in]  : device
+ * @iova [in]    : starting address of user pages to be unpinned.
+ * @npage [in]   : count of pages to be unpinned.  This count should not
+ *                 be greater than VFIO_PIN_PAGES_MAX_ENTRIES.
+ */
+void vfio_unpin_pages(struct vfio_device *device, dma_addr_t iova, int npage)
+{
+	if (WARN_ON(!vfio_assert_device_open(device)))
+		return;
+
+	if (device->group->container) {
+		vfio_container_unpin_pages(device->group->container, iova,
+					   npage);
+		return;
+	}
+	if (device->iommufd_access) {
+		if (WARN_ON(iova > ULONG_MAX))
+			return;
+		iommufd_access_unpin_pages(device->iommufd_access,
+					   ALIGN_DOWN(iova, PAGE_SIZE),
+					   npage * PAGE_SIZE);
+		return;
+	}
+}
+EXPORT_SYMBOL(vfio_unpin_pages);
+
+/*
+ * This interface allows the CPUs to perform some sort of virtual DMA on
+ * behalf of the device.
+ *
+ * CPUs read/write from/into a range of IOVAs pointing to user space memory
+ * into/from a kernel buffer.
+ *
+ * As the read/write of user space memory is conducted via the CPUs and is
+ * not a real device DMA, it is not necessary to pin the user space memory.
+ *
+ * @device [in]		: VFIO device
+ * @iova [in]		: base IOVA of a user space buffer
+ * @data [in]		: pointer to kernel buffer
+ * @len [in]		: kernel buffer length
+ * @write		: indicate read or write
+ * Return error code on failure or 0 on success.
+ */
+int vfio_dma_rw(struct vfio_device *device, dma_addr_t iova, void *data,
+		size_t len, bool write)
+{
+	if (!data || len <= 0 || !vfio_assert_device_open(device))
+		return -EINVAL;
+
+	if (device->group->container)
+		return vfio_container_dma_rw(device->group->container, iova,
+					     data, len, write);
+
+	if (device->iommufd_access) {
+		unsigned int flags = 0;
+
+		if (iova > ULONG_MAX)
+			return -EINVAL;
+
+		/* VFIO historically tries to auto-detect a kthread */
+		if (!current->mm)
+			flags |= IOMMUFD_ACCESS_RW_KTHREAD;
+		if (write)
+			flags |= IOMMUFD_ACCESS_RW_WRITE;
+		return iommufd_access_rw(device->iommufd_access, iova, data,
+					 len, flags);
+	}
+	return -EINVAL;
+}
+EXPORT_SYMBOL(vfio_dma_rw);
+
 /*
  * Module/class support
  */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index a7fc4d747dc226..d5f84f98c0fa8f 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -19,6 +19,7 @@
 struct kvm;
 struct iommufd_ctx;
 struct iommufd_device;
+struct iommufd_access;
 
 /*
  * VFIO devices can be placed in a set, this allows all devices to share this
@@ -56,8 +57,10 @@ struct vfio_device {
 	struct completion comp;
 	struct list_head group_next;
 	struct list_head iommu_entry;
+	struct iommufd_access *iommufd_access;
 #if IS_ENABLED(CONFIG_IOMMUFD)
 	struct iommufd_device *iommufd_device;
+	struct iommufd_ctx *iommufd_ictx;
 	bool iommufd_attached;
 #endif
 };
@@ -111,6 +114,10 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev);
 int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
+int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
+			       struct iommufd_ctx *ictx, u32 *out_device_id);
+void vfio_iommufd_emulated_unbind(struct vfio_device *vdev);
+int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 #else
 #define vfio_iommufd_physical_bind                                      \
 	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx,   \
@@ -119,6 +126,13 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 	((void (*)(struct vfio_device *vdev)) NULL)
 #define vfio_iommufd_physical_attach_ioas \
 	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
+#define vfio_iommufd_emulated_bind                                      \
+	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx,   \
+		  u32 *out_device_id)) NULL)
+#define vfio_iommufd_emulated_unbind \
+	((void (*)(struct vfio_device *vdev)) NULL)
+#define vfio_iommufd_emulated_attach_ioas \
+	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
 #endif
 
 /**
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 09/11] vfio: Move container related MODULE_ALIAS statements into container.c
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2022-11-16 21:05 ` [PATCH v3 08/11] vfio-iommufd: Support iommufd for emulated " Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-16 21:05 ` [PATCH v3 10/11] vfio: Make vfio_container optionally compiled Jason Gunthorpe
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

The miscdev is in container.c, so should these related MODULE_ALIAS
statements. This is necessary for the next patch to be able to fully
disable /dev/vfio/vfio.

Fixes: cdc71fe4ecbf ("vfio: Move container code into drivers/vfio/container.c")
Reported-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/container.c | 3 +++
 drivers/vfio/vfio_main.c | 2 --
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/container.c b/drivers/vfio/container.c
index 7f3961fd4b5aac..6b362d97d68220 100644
--- a/drivers/vfio/container.c
+++ b/drivers/vfio/container.c
@@ -608,3 +608,6 @@ void vfio_container_cleanup(void)
 	misc_deregister(&vfio_dev);
 	mutex_destroy(&vfio.iommu_drivers_lock);
 }
+
+MODULE_ALIAS_MISCDEV(VFIO_MINOR);
+MODULE_ALIAS("devname:vfio/vfio");
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 6195d7ad5b5479..f3c48b8c45627d 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -2074,6 +2074,4 @@ MODULE_VERSION(DRIVER_VERSION);
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR(DRIVER_AUTHOR);
 MODULE_DESCRIPTION(DRIVER_DESC);
-MODULE_ALIAS_MISCDEV(VFIO_MINOR);
-MODULE_ALIAS("devname:vfio/vfio");
 MODULE_SOFTDEP("post: vfio_iommu_type1 vfio_iommu_spapr_tce");
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 10/11] vfio: Make vfio_container optionally compiled
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
                   ` (8 preceding siblings ...)
  2022-11-16 21:05 ` [PATCH v3 09/11] vfio: Move container related MODULE_ALIAS statements into container.c Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-18  2:05   ` Tian, Kevin
  2022-11-16 21:05 ` [PATCH v3 11/11] iommufd: Allow iommufd to supply /dev/vfio/vfio Jason Gunthorpe
  2022-11-23  2:44 ` [PATCH v3 00/11] Connect VFIO to IOMMUFD Yi Liu
  11 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

Add a kconfig CONFIG_VFIO_CONTAINER that controls compiling the container
code. If 'n' then only iommufd will provide the container service. All the
support for vfio iommu drivers, including type1, will not be built.

This allows a compilation check that no inappropriate dependencies between
the device/group and container have been created.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/Kconfig  | 35 +++++++++++++++--------
 drivers/vfio/Makefile |  4 +--
 drivers/vfio/vfio.h   | 65 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+), 13 deletions(-)

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 1118d322eec97d..286c1663bd7564 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -3,8 +3,8 @@ menuconfig VFIO
 	tristate "VFIO Non-Privileged userspace driver framework"
 	select IOMMU_API
 	depends on IOMMUFD || !IOMMUFD
-	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
 	select INTERVAL_TREE
+	select VFIO_CONTAINER if IOMMUFD=n
 	help
 	  VFIO provides a framework for secure userspace device drivers.
 	  See Documentation/driver-api/vfio.rst for more details.
@@ -12,6 +12,18 @@ menuconfig VFIO
 	  If you don't know what to do here, say N.
 
 if VFIO
+config VFIO_CONTAINER
+	bool "Support for the VFIO container /dev/vfio/vfio"
+	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
+	default y
+	help
+	  The VFIO container is the classic interface to VFIO for establishing
+	  IOMMU mappings. If N is selected here then IOMMUFD must be used to
+	  manage the mappings.
+
+	  Unless testing IOMMUFD say Y here.
+
+if VFIO_CONTAINER
 config VFIO_IOMMU_TYPE1
 	tristate
 	default n
@@ -21,16 +33,6 @@ config VFIO_IOMMU_SPAPR_TCE
 	depends on SPAPR_TCE_IOMMU
 	default VFIO
 
-config VFIO_SPAPR_EEH
-	tristate
-	depends on EEH && VFIO_IOMMU_SPAPR_TCE
-	default VFIO
-
-config VFIO_VIRQFD
-	tristate
-	select EVENTFD
-	default n
-
 config VFIO_NOIOMMU
 	bool "VFIO No-IOMMU support"
 	help
@@ -44,6 +46,17 @@ config VFIO_NOIOMMU
 	  this mode since there is no IOMMU to provide DMA translation.
 
 	  If you don't know what to do here, say N.
+endif
+
+config VFIO_SPAPR_EEH
+	tristate
+	depends on EEH && VFIO_IOMMU_SPAPR_TCE
+	default VFIO
+
+config VFIO_VIRQFD
+	tristate
+	select EVENTFD
+	default n
 
 source "drivers/vfio/pci/Kconfig"
 source "drivers/vfio/platform/Kconfig"
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 3863922529ef20..b953517dc70f99 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -4,9 +4,9 @@ vfio_virqfd-y := virqfd.o
 obj-$(CONFIG_VFIO) += vfio.o
 
 vfio-y += vfio_main.o \
-	  iova_bitmap.o \
-	  container.o
+	  iova_bitmap.o
 vfio-$(CONFIG_IOMMUFD) += iommufd.o
+vfio-$(CONFIG_VFIO_CONTAINER) += container.o
 
 obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index d57a08afb5cf5c..3378714a746274 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -55,7 +55,9 @@ struct vfio_group {
 	struct list_head		device_list;
 	struct mutex			device_lock;
 	struct list_head		vfio_next;
+#if IS_ENABLED(CONFIG_VFIO_CONTAINER)
 	struct list_head		container_next;
+#endif
 	enum vfio_group_type		type;
 	struct mutex			group_lock;
 	struct kvm			*kvm;
@@ -64,6 +66,7 @@ struct vfio_group {
 	struct iommufd_ctx		*iommufd;
 };
 
+#if IS_ENABLED(CONFIG_VFIO_CONTAINER)
 /* events for the backend driver notify callback */
 enum vfio_iommu_notify_type {
 	VFIO_IOMMU_CONTAINER_CLOSE = 0,
@@ -129,6 +132,68 @@ int vfio_container_dma_rw(struct vfio_container *container, dma_addr_t iova,
 
 int __init vfio_container_init(void);
 void vfio_container_cleanup(void);
+#else
+static inline struct vfio_container *
+vfio_container_from_file(struct file *filep)
+{
+	return NULL;
+}
+
+static inline int vfio_group_use_container(struct vfio_group *group)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void vfio_group_unuse_container(struct vfio_group *group)
+{
+}
+
+static inline int vfio_container_attach_group(struct vfio_container *container,
+					      struct vfio_group *group)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void vfio_group_detach_container(struct vfio_group *group)
+{
+}
+
+static inline void vfio_device_container_register(struct vfio_device *device)
+{
+}
+
+static inline void vfio_device_container_unregister(struct vfio_device *device)
+{
+}
+
+static inline int vfio_container_pin_pages(struct vfio_container *container,
+					   struct iommu_group *iommu_group,
+					   dma_addr_t iova, int npage, int prot,
+					   struct page **pages)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void vfio_container_unpin_pages(struct vfio_container *container,
+					      dma_addr_t iova, int npage)
+{
+}
+
+static inline int vfio_container_dma_rw(struct vfio_container *container,
+					dma_addr_t iova, void *data, size_t len,
+					bool write)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int vfio_container_init(void)
+{
+	return 0;
+}
+static inline void vfio_container_cleanup(void)
+{
+}
+#endif
 
 #if IS_ENABLED(CONFIG_IOMMUFD)
 int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 11/11] iommufd: Allow iommufd to supply /dev/vfio/vfio
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
                   ` (9 preceding siblings ...)
  2022-11-16 21:05 ` [PATCH v3 10/11] vfio: Make vfio_container optionally compiled Jason Gunthorpe
@ 2022-11-16 21:05 ` Jason Gunthorpe
  2022-11-17 20:34   ` Alex Williamson
  2022-11-23  2:44 ` [PATCH v3 00/11] Connect VFIO to IOMMUFD Yi Liu
  11 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-16 21:05 UTC (permalink / raw)
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

If the VFIO container is compiled out, give a kconfig option for iommufd
to provide the miscdev node with the same name and permissions as vfio
uses.

The compatibility node supports the same ioctls as VFIO and automatically
enables the VFIO compatible pinned page accounting mode.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/Kconfig | 12 ++++++++++++
 drivers/iommu/iommufd/main.c  | 36 +++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/drivers/iommu/iommufd/Kconfig b/drivers/iommu/iommufd/Kconfig
index 399a2edeaef6de..f387f803dc6f7f 100644
--- a/drivers/iommu/iommufd/Kconfig
+++ b/drivers/iommu/iommufd/Kconfig
@@ -12,6 +12,18 @@ config IOMMUFD
 	  If you don't know what to do here, say N.
 
 if IOMMUFD
+config IOMMUFD_VFIO_CONTAINER
+	bool "IOMMUFD provides the VFIO container /dev/vfio/vfio"
+	depends on VFIO && !VFIO_CONTAINER
+	default VFIO && !VFIO_CONTAINER
+	help
+	  IOMMUFD will provide /dev/vfio/vfio instead of VFIO. This relies on
+	  IOMMUFD providing compatibility emulation to give the same ioctls.
+	  It provides an option to build a kernel with legacy VFIO components
+	  removed.
+
+	  Unless testing IOMMUFD say N here.
+
 config IOMMUFD_TEST
 	bool "IOMMU Userspace API Test support"
 	depends on RUNTIME_TESTING_MENU
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index ab3fa05f38505d..1eeb326f74f005 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -18,6 +18,7 @@
 #include <uapi/linux/iommufd.h>
 #include <linux/iommufd.h>
 
+#include "io_pagetable.h"
 #include "iommufd_private.h"
 #include "iommufd_test.h"
 
@@ -25,6 +26,7 @@ struct iommufd_object_ops {
 	void (*destroy)(struct iommufd_object *obj);
 };
 static const struct iommufd_object_ops iommufd_object_ops[];
+static struct miscdevice vfio_misc_dev;
 
 struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
 					     size_t size,
@@ -170,6 +172,16 @@ static int iommufd_fops_open(struct inode *inode, struct file *filp)
 	if (!ictx)
 		return -ENOMEM;
 
+	/*
+	 * For compatibility with VFIO when /dev/vfio/vfio is opened we default
+	 * to the same rlimit accounting as vfio uses.
+	 */
+	if (IS_ENABLED(CONFIG_IOMMUFD_VFIO_CONTAINER) &&
+	    filp->private_data == &vfio_misc_dev) {
+		ictx->account_mode = IOPT_PAGES_ACCOUNT_MM;
+		pr_info_once("IOMMUFD is providing /dev/vfio/vfio, not VFIO.\n");
+	}
+
 	xa_init_flags(&ictx->objects, XA_FLAGS_ALLOC1 | XA_FLAGS_ACCOUNT);
 	ictx->file = filp;
 	filp->private_data = ictx;
@@ -395,6 +407,15 @@ static struct miscdevice iommu_misc_dev = {
 	.mode = 0660,
 };
 
+
+static struct miscdevice vfio_misc_dev = {
+	.minor = VFIO_MINOR,
+	.name = "vfio",
+	.fops = &iommufd_fops,
+	.nodename = "vfio/vfio",
+	.mode = 0666,
+};
+
 static int __init iommufd_init(void)
 {
 	int ret;
@@ -402,18 +423,33 @@ static int __init iommufd_init(void)
 	ret = misc_register(&iommu_misc_dev);
 	if (ret)
 		return ret;
+
+	if (IS_ENABLED(CONFIG_IOMMUFD_VFIO_CONTAINER)) {
+		ret = misc_register(&vfio_misc_dev);
+		if (ret)
+			goto err_misc;
+	}
 	iommufd_test_init();
 	return 0;
+err_misc:
+	misc_deregister(&iommu_misc_dev);
+	return ret;
 }
 
 static void __exit iommufd_exit(void)
 {
 	iommufd_test_exit();
+	if (IS_ENABLED(CONFIG_IOMMUFD_VFIO_CONTAINER))
+		misc_deregister(&vfio_misc_dev);
 	misc_deregister(&iommu_misc_dev);
 }
 
 module_init(iommufd_init);
 module_exit(iommufd_exit);
 
+#if IS_ENABLED(CONFIG_IOMMUFD_VFIO_CONTAINER)
+MODULE_ALIAS_MISCDEV(VFIO_MINOR);
+MODULE_ALIAS("devname:vfio/vfio");
+#endif
 MODULE_DESCRIPTION("I/O Address Space Management for passthrough devices");
 MODULE_LICENSE("GPL");
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 06/11] vfio-iommufd: Allow iommufd to be used in place of a container fd
  2022-11-16 21:05 ` [PATCH v3 06/11] vfio-iommufd: Allow iommufd to be used in place of a container fd Jason Gunthorpe
@ 2022-11-16 23:31   ` Alex Williamson
  2022-11-17  0:20     ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2022-11-16 23:31 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Wed, 16 Nov 2022 17:05:31 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This makes VFIO_GROUP_SET_CONTAINER accept both a vfio container FD and an
> iommufd.
> 
> In iommufd mode an IOAS will exist after the SET_CONTAINER, but it will
> not be attached to any groups.
> 
> For VFIO this means that the VFIO_GROUP_GET_STATUS and
> VFIO_GROUP_FLAGS_VIABLE works subtly differently. With the container FD
> the iommu_group_claim_dma_owner() is done during SET_CONTAINER but for
> IOMMUFD this is done during VFIO_GROUP_GET_DEVICE_FD. Meaning that
> VFIO_GROUP_FLAGS_VIABLE could be set but GET_DEVICE_FD will fail due to
> viability.
> 
> As GET_DEVICE_FD can fail for many reasons already this is not expected to
> be a meaningful difference.
> 
> Reorganize the tests for if the group has an assigned container or iommu
> into a vfio_group_has_iommu() function and consolidate all the duplicated
> WARN_ON's etc related to this.
> 
> Call container functions only if a container is actually present on the
> group.
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Yi Liu <yi.l.liu@intel.com>
> Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Tested-by: Yu He <yu.he@intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/Kconfig     |  1 +
>  drivers/vfio/container.c |  7 +++-
>  drivers/vfio/vfio.h      |  2 +
>  drivers/vfio/vfio_main.c | 86 +++++++++++++++++++++++++++++++++-------
>  4 files changed, 80 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> index 86c381ceb9a1e9..1118d322eec97d 100644
> --- a/drivers/vfio/Kconfig
> +++ b/drivers/vfio/Kconfig
> @@ -2,6 +2,7 @@
>  menuconfig VFIO
>  	tristate "VFIO Non-Privileged userspace driver framework"
>  	select IOMMU_API
> +	depends on IOMMUFD || !IOMMUFD
>  	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
>  	select INTERVAL_TREE
>  	help
> diff --git a/drivers/vfio/container.c b/drivers/vfio/container.c
> index d97747dfb05d02..8772dad6808539 100644
> --- a/drivers/vfio/container.c
> +++ b/drivers/vfio/container.c
> @@ -516,8 +516,11 @@ int vfio_group_use_container(struct vfio_group *group)
>  {
>  	lockdep_assert_held(&group->group_lock);
>  
> -	if (!group->container || !group->container->iommu_driver ||
> -	    WARN_ON(!group->container_users))
> +	/*
> +	 * The container fd has been assigned with VFIO_GROUP_SET_CONTAINER but
> +	 * VFIO_SET_IOMMU hasn't been done yet.
> +	 */
> +	if (!group->container->iommu_driver)
>  		return -EINVAL;
>  
>  	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> index 247590334e14b0..985e13d52989ca 100644
> --- a/drivers/vfio/vfio.h
> +++ b/drivers/vfio/vfio.h
> @@ -10,6 +10,7 @@
>  #include <linux/cdev.h>
>  #include <linux/module.h>
>  
> +struct iommufd_ctx;
>  struct iommu_group;
>  struct vfio_device;
>  struct vfio_container;
> @@ -60,6 +61,7 @@ struct vfio_group {
>  	struct kvm			*kvm;
>  	struct file			*opened_file;
>  	struct blocking_notifier_head	notifier;
> +	struct iommufd_ctx		*iommufd;
>  };
>  
>  /* events for the backend driver notify callback */
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index 5c0e810f8b4d08..8c124290ce9f0d 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -35,6 +35,7 @@
>  #include <linux/pm_runtime.h>
>  #include <linux/interval_tree.h>
>  #include <linux/iova_bitmap.h>
> +#include <linux/iommufd.h>
>  #include "vfio.h"
>  
>  #define DRIVER_VERSION	"0.3"
> @@ -665,6 +666,16 @@ EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
>  /*
>   * VFIO Group fd, /dev/vfio/$GROUP
>   */
> +static bool vfio_group_has_iommu(struct vfio_group *group)
> +{
> +	lockdep_assert_held(&group->group_lock);
> +	if (!group->container)
> +		WARN_ON(group->container_users);
> +	else
> +		WARN_ON(!group->container_users);

I think this is just carrying forward the WARN_ON that gets replaced in
set_container, but I don't really see how this bit of paranoia is ever
a possibility.  If it is, a comment would be good, and perhaps simplify
to:

	WARN_ON(group->container ^ group->container_users);


> +	return group->container || group->iommufd;
> +}
> +
>  /*
>   * VFIO_GROUP_UNSET_CONTAINER should fail if there are other users or
>   * if there was no container to unset.  Since the ioctl is called on
[snip]
> @@ -900,7 +945,14 @@ static int vfio_group_ioctl_get_status(struct vfio_group *group,
>  		return -ENODEV;
>  	}
>  
> -	if (group->container)
> +	/*
> +	 * With the container FD the iommu_group_claim_dma_owner() is done
> +	 * during SET_CONTAINER but for IOMMFD this is done during
> +	 * VFIO_GROUP_GET_DEVICE_FD. Meaning that with iommufd
> +	 * VFIO_GROUP_FLAGS_VIABLE could be set but GET_DEVICE_FD will fail due
> +	 * to viability.
> +	 */
> +	if (group->container || group->iommufd)

Why didn't this use the vfio_group_has_iommu() helper?  This is only
skipping the paranoia checks, which aren't currently obvious to me
anyway.  Thanks,

Alex

>  		status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
>  				VFIO_GROUP_FLAGS_VIABLE;
>  	else if (!iommu_group_dma_owner_claimed(group->iommu_group))


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 06/11] vfio-iommufd: Allow iommufd to be used in place of a container fd
  2022-11-16 23:31   ` Alex Williamson
@ 2022-11-17  0:20     ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-17  0:20 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Wed, Nov 16, 2022 at 04:31:33PM -0700, Alex Williamson wrote:
> > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> > index 5c0e810f8b4d08..8c124290ce9f0d 100644
> > --- a/drivers/vfio/vfio_main.c
> > +++ b/drivers/vfio/vfio_main.c
> > @@ -35,6 +35,7 @@
> >  #include <linux/pm_runtime.h>
> >  #include <linux/interval_tree.h>
> >  #include <linux/iova_bitmap.h>
> > +#include <linux/iommufd.h>
> >  #include "vfio.h"
> >  
> >  #define DRIVER_VERSION	"0.3"
> > @@ -665,6 +666,16 @@ EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
> >  /*
> >   * VFIO Group fd, /dev/vfio/$GROUP
> >   */
> > +static bool vfio_group_has_iommu(struct vfio_group *group)
> > +{
> > +	lockdep_assert_held(&group->group_lock);
> > +	if (!group->container)
> > +		WARN_ON(group->container_users);
> > +	else
> > +		WARN_ON(!group->container_users);
> 
> I think this is just carrying forward the WARN_ON that gets replaced in
> set_container,

Yes, I've carried this invariant forward through a few series now

> but I don't really see how this bit of paranoia is ever a
> possibility.  

Right, it is an invariant assertion, it should never trigger and we've
never seen it trigger. I think at one point it was harder to see that
this is impossible so an assertion must have been added

> 	WARN_ON(group->container ^ group->container_users);

Ah, this needs a "logical xor" which is a bit obscure. In C I guess
this is the common way to do it:

	/*
	 * There can only be users if there is a container, and if there is a
	 * container there must be users.
	 */
	WARN_ON(!group->container != !group->container_users);

I'm also happy to delete it, not sure it is a valuable invariant.

> > @@ -900,7 +945,14 @@ static int vfio_group_ioctl_get_status(struct vfio_group *group,
> >  		return -ENODEV;
> >  	}
> >  
> > -	if (group->container)
> > +	/*
> > +	 * With the container FD the iommu_group_claim_dma_owner() is done
> > +	 * during SET_CONTAINER but for IOMMFD this is done during
> > +	 * VFIO_GROUP_GET_DEVICE_FD. Meaning that with iommufd
> > +	 * VFIO_GROUP_FLAGS_VIABLE could be set but GET_DEVICE_FD will fail due
> > +	 * to viability.
> > +	 */
> > +	if (group->container || group->iommufd)
> 
> Why didn't this use the vfio_group_has_iommu() helper?  This is only
> skipping the paranoia checks, which aren't currently obvious to me
> anyway.

Yes, it was missed, I fixed it

Thanks,
Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-16 21:05 ` [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c Jason Gunthorpe
@ 2022-11-17 20:14   ` Alex Williamson
  2022-11-18 15:36     ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2022-11-17 20:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Wed, 16 Nov 2022 17:05:29 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> This legacy module knob has become uAPI, when set on the vfio_iommu_type1
> it disables some security protections in the iommu drivers. Move the
> storage for this knob to vfio_main.c so that iommufd can access it too.
> 
> The may need enhancing as we learn more about how necessary
> allow_unsafe_interrupts is in the current state of the world. If vfio
> container is disabled then this option will not be available to the user.
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Yi Liu <yi.l.liu@intel.com>
> Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Tested-by: Yu He <yu.he@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/vfio/vfio.h             | 2 ++
>  drivers/vfio/vfio_iommu_type1.c | 5 ++---
>  drivers/vfio/vfio_main.c        | 3 +++
>  3 files changed, 7 insertions(+), 3 deletions(-)

It's really quite trivial to convert to a vfio_iommu.ko module to host
a separate option for this.  Half of the patch below is undoing what's
done here.  Is your only concern with this approach that we use a few
KB more memory for the separate module?

I don't think a per-device sysfs setting makes a lot of sense, if we're
on a host w/o MSI isolation, all devices are affected.  Thanks,

Alex


 Makefile           |    4 +++-
 iommufd.c          |   12 +++++++++++-
 vfio.h             |    2 --
 vfio_iommu_type1.c |    5 +++--
 vfio_main.c        |    6 +++---
 5 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index b953517dc70f..34b7d91112e5 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -5,9 +5,11 @@ obj-$(CONFIG_VFIO) += vfio.o
 
 vfio-y += vfio_main.o \
 	  iova_bitmap.o
-vfio-$(CONFIG_IOMMUFD) += iommufd.o
 vfio-$(CONFIG_VFIO_CONTAINER) += container.o
 
+obj-$(CONFIG_IOMMUFD) += vfio_iommufd.o
+vfio_iommufd-y += iommufd.o
+
 obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
 obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index dad8935d71f7..b6b4038ba036 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -10,6 +10,12 @@
 MODULE_IMPORT_NS(IOMMUFD);
 MODULE_IMPORT_NS(IOMMUFD_VFIO);
 
+static bool allow_unsafe_interrupts;
+module_param_named(allow_unsafe_interrupts,
+		   allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(allow_unsafe_interrupts,
+		 "Enable VFIO IOMMUFD support on platforms without MSI/X isolation facilities.");
+
 int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx)
 {
 	u32 ioas_id;
@@ -47,6 +53,7 @@ int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx)
 		vdev->ops->unbind_iommufd(vdev);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(vfio_iommufd_bind);
 
 void vfio_iommufd_unbind(struct vfio_device *vdev)
 {
@@ -55,6 +62,7 @@ void vfio_iommufd_unbind(struct vfio_device *vdev)
 	if (vdev->ops->unbind_iommufd)
 		vdev->ops->unbind_iommufd(vdev);
 }
+EXPORT_SYMBOL_GPL(vfio_iommufd_unbind);
 
 /*
  * The physical standard ops mean that the iommufd_device is bound to the
@@ -92,7 +100,7 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 	unsigned int flags = 0;
 	int rc;
 
-	if (vfio_allow_unsafe_interrupts)
+	if (allow_unsafe_interrupts)
 		flags |= IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT;
 	rc = iommufd_device_attach(vdev->iommufd_device, pt_id, flags);
 	if (rc)
@@ -159,3 +167,5 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_attach_ioas);
+
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 3378714a7462..ce5fe3fc493b 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -216,6 +216,4 @@ extern bool vfio_noiommu __read_mostly;
 enum { vfio_noiommu = false };
 #endif
 
-extern bool vfio_allow_unsafe_interrupts;
-
 #endif
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 186e33a006d3..23c24fe98c00 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -44,8 +44,9 @@
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
 #define DRIVER_DESC     "Type1 IOMMU driver for VFIO"
 
+static bool allow_unsafe_interrupts;
 module_param_named(allow_unsafe_interrupts,
-		   vfio_allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
+		   allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(allow_unsafe_interrupts,
 		 "Enable VFIO IOMMU support for on platforms without interrupt remapping support.");
 
@@ -2281,7 +2282,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		    iommu_group_for_each_dev(iommu_group, (void *)IOMMU_CAP_INTR_REMAP,
 					     vfio_iommu_device_capable);
 
-	if (!vfio_allow_unsafe_interrupts && !msi_remap) {
+	if (!allow_unsafe_interrupts && !msi_remap) {
 		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
 		       __func__);
 		ret = -EPERM;
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index f3c48b8c4562..48b3aa8582aa 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -42,6 +42,9 @@
 #define DRIVER_AUTHOR	"Alex Williamson <alex.williamson@redhat.com>"
 #define DRIVER_DESC	"VFIO - User Level meta-driver"
 
+MODULE_IMPORT_NS(IOMMUFD);
+MODULE_IMPORT_NS(IOMMUFD_VFIO);
+
 static struct vfio {
 	struct class			*class;
 	struct list_head		group_list;
@@ -52,9 +55,6 @@ static struct vfio {
 	struct ida			device_ida;
 } vfio;
 
-bool vfio_allow_unsafe_interrupts;
-EXPORT_SYMBOL_GPL(vfio_allow_unsafe_interrupts);
-
 static DEFINE_XARRAY(vfio_device_set_xa);
 static const struct file_operations vfio_group_fops;
 


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 11/11] iommufd: Allow iommufd to supply /dev/vfio/vfio
  2022-11-16 21:05 ` [PATCH v3 11/11] iommufd: Allow iommufd to supply /dev/vfio/vfio Jason Gunthorpe
@ 2022-11-17 20:34   ` Alex Williamson
  2022-11-18 13:07     ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2022-11-17 20:34 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Wed, 16 Nov 2022 17:05:36 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> If the VFIO container is compiled out, give a kconfig option for iommufd
> to provide the miscdev node with the same name and permissions as vfio
> uses.
> 
> The compatibility node supports the same ioctls as VFIO and automatically
> enables the VFIO compatible pinned page accounting mode.
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Yi Liu <yi.l.liu@intel.com>
> Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Tested-by: Yu He <yu.he@intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Reviewed-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/iommufd/Kconfig | 12 ++++++++++++
>  drivers/iommu/iommufd/main.c  | 36 +++++++++++++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
> 
> diff --git a/drivers/iommu/iommufd/Kconfig b/drivers/iommu/iommufd/Kconfig
> index 399a2edeaef6de..f387f803dc6f7f 100644
> --- a/drivers/iommu/iommufd/Kconfig
> +++ b/drivers/iommu/iommufd/Kconfig
> @@ -12,6 +12,18 @@ config IOMMUFD
>  	  If you don't know what to do here, say N.
>  
>  if IOMMUFD
> +config IOMMUFD_VFIO_CONTAINER
> +	bool "IOMMUFD provides the VFIO container /dev/vfio/vfio"
> +	depends on VFIO && !VFIO_CONTAINER
> +	default VFIO && !VFIO_CONTAINER
> +	help
> +	  IOMMUFD will provide /dev/vfio/vfio instead of VFIO. This relies on
> +	  IOMMUFD providing compatibility emulation to give the same ioctls.
> +	  It provides an option to build a kernel with legacy VFIO components
> +	  removed.
> +
> +	  Unless testing IOMMUFD say N here.
> +

"Unless testing..." alone is a bit more subtle that I thought we were
discussing.  I was expecting something more like:

  IOMMUFD VFIO container emulation is known to lack certain features of
  the native VFIO container, such as no-IOMMU support, peer-to-peer DMA
  mapping, PPC IOMMU support, as well as other potentially undiscovered
  gaps.  This option is currently intended for the purpose of testing
  IOMMUFD with unmodified userspace supporting VFIO and making use of
  the Type1 VFIO IOMMU backend.  General purpose enabling of this
  option is currently discouraged.

  Unless testing IOMMUFD, say N here.

Thanks,
Alex


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 07/11] vfio-iommufd: Support iommufd for physical VFIO devices
  2022-11-16 21:05 ` [PATCH v3 07/11] vfio-iommufd: Support iommufd for physical VFIO devices Jason Gunthorpe
@ 2022-11-18  1:30   ` Tian, Kevin
  0 siblings, 0 replies; 35+ messages in thread
From: Tian, Kevin @ 2022-11-18  1:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, kvm, Yang,
	Lixiao, Matthew Rosato, Nicolin Chen, Liu, Yi L, He, Yu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, November 17, 2022 5:06 AM
> 
> This creates the iommufd_device for the physical VFIO drivers. These are
> all the drivers that are calling vfio_register_group_dev() and expect the
> type1 code to setup a real iommu_domain against their parent struct
> device.
> 
> The design gives the driver a choice in how it gets connected to iommufd
> by providing bind_iommufd/unbind_iommufd/attach_ioas callbacks to
> implement as required. The core code provides three default callbacks for
> physical mode using a real iommu_domain. This is suitable for drivers
> using vfio_register_group_dev()
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Yi Liu <yi.l.liu@intel.com>
> Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Tested-by: Yu He <yu.he@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 08/11] vfio-iommufd: Support iommufd for emulated VFIO devices
  2022-11-16 21:05 ` [PATCH v3 08/11] vfio-iommufd: Support iommufd for emulated " Jason Gunthorpe
@ 2022-11-18  2:03   ` Tian, Kevin
  0 siblings, 0 replies; 35+ messages in thread
From: Tian, Kevin @ 2022-11-18  2:03 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, kvm, Yang,
	Lixiao, Matthew Rosato, Nicolin Chen, Liu, Yi L, He, Yu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, November 17, 2022 5:06 AM
> 
> Emulated VFIO devices are calling vfio_register_emulated_iommu_dev() and
> consist of all the mdev drivers.
> 
> Like the physical drivers, support for iommufd is provided by the driver
> supplying the correct standard ops. Provide ops from the core that
> duplicate what vfio_register_emulated_iommu_dev() does.
> 
> Emulated drivers are where it is more likely to see variation in the
> iommfd support ops. For instance IDXD will probably need to setup both a
> iommfd_device context linked to a PASID and an iommufd_access context to
> support all their mdev operations.
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Yi Liu <yi.l.liu@intel.com>
> Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Tested-by: Yu He <yu.he@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 10/11] vfio: Make vfio_container optionally compiled
  2022-11-16 21:05 ` [PATCH v3 10/11] vfio: Make vfio_container optionally compiled Jason Gunthorpe
@ 2022-11-18  2:05   ` Tian, Kevin
  0 siblings, 0 replies; 35+ messages in thread
From: Tian, Kevin @ 2022-11-18  2:05 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, kvm, Yang,
	Lixiao, Matthew Rosato, Nicolin Chen, Liu, Yi L, He, Yu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, November 17, 2022 5:06 AM
> 
> Add a kconfig CONFIG_VFIO_CONTAINER that controls compiling the
> container
> code. If 'n' then only iommufd will provide the container service. All the
> support for vfio iommu drivers, including type1, will not be built.
> 
> This allows a compilation check that no inappropriate dependencies between
> the device/group and container have been created.
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Yi Liu <yi.l.liu@intel.com>
> Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Tested-by: Yu He <yu.he@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 11/11] iommufd: Allow iommufd to supply /dev/vfio/vfio
  2022-11-17 20:34   ` Alex Williamson
@ 2022-11-18 13:07     ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-18 13:07 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Thu, Nov 17, 2022 at 01:34:17PM -0700, Alex Williamson wrote:
> > +config IOMMUFD_VFIO_CONTAINER
> > +	bool "IOMMUFD provides the VFIO container /dev/vfio/vfio"
> > +	depends on VFIO && !VFIO_CONTAINER
> > +	default VFIO && !VFIO_CONTAINER
> > +	help
> > +	  IOMMUFD will provide /dev/vfio/vfio instead of VFIO. This relies on
> > +	  IOMMUFD providing compatibility emulation to give the same ioctls.
> > +	  It provides an option to build a kernel with legacy VFIO components
> > +	  removed.
> > +
> > +	  Unless testing IOMMUFD say N here.
> > +
> 
> "Unless testing..." alone is a bit more subtle that I thought we were
> discussing.  I was expecting something more like:
> 
>   IOMMUFD VFIO container emulation is known to lack certain features of
>   the native VFIO container, such as no-IOMMU support, peer-to-peer DMA
>   mapping, PPC IOMMU support, as well as other potentially undiscovered
>   gaps.  This option is currently intended for the purpose of testing
>   IOMMUFD with unmodified userspace supporting VFIO and making use of
>   the Type1 VFIO IOMMU backend.  General purpose enabling of this
>   option is currently discouraged.
> 
>   Unless testing IOMMUFD, say N here.

Done, thanks

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-17 20:14   ` Alex Williamson
@ 2022-11-18 15:36     ` Jason Gunthorpe
  2022-11-18 20:36       ` Alex Williamson
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-18 15:36 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Thu, Nov 17, 2022 at 01:14:51PM -0700, Alex Williamson wrote:
> On Wed, 16 Nov 2022 17:05:29 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > This legacy module knob has become uAPI, when set on the vfio_iommu_type1
> > it disables some security protections in the iommu drivers. Move the
> > storage for this knob to vfio_main.c so that iommufd can access it too.
> > 
> > The may need enhancing as we learn more about how necessary
> > allow_unsafe_interrupts is in the current state of the world. If vfio
> > container is disabled then this option will not be available to the user.
> > 
> > Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> > Tested-by: Yi Liu <yi.l.liu@intel.com>
> > Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> > Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> > Tested-by: Yu He <yu.he@intel.com>
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >  drivers/vfio/vfio.h             | 2 ++
> >  drivers/vfio/vfio_iommu_type1.c | 5 ++---
> >  drivers/vfio/vfio_main.c        | 3 +++
> >  3 files changed, 7 insertions(+), 3 deletions(-)
> 
> It's really quite trivial to convert to a vfio_iommu.ko module to host
> a separate option for this.  Half of the patch below is undoing what's
> done here.  Is your only concern with this approach that we use a few
> KB more memory for the separate module?

My main dislike is that it just seems arbitary to shunt iommufd
support to a module when it is always required by vfio.ko. In general
if you have a module that is only ever used by 1 other module, you
should probably just combine them. It saves memory and simplifies
operation (eg you don't have to unload a zoo of modules during
development testing)

If it wasn't for the module option ABI problem I would propse to merge
vfio_type1/spapr into vfio.ko too - vfio.ko doesn't work without them
and the module soft dependencies are hint something is weird here.

An alternative suggestion is to just retain a stub vfio_iommu_type1.ko
which only exposes the module option if iommufd is enabled. At least
this would preserve the semi-ABI.

However, if you think this fits your vision for VFIO better I will
take it.

> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 186e33a006d3..23c24fe98c00 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -44,8 +44,9 @@
>  #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
>  #define DRIVER_DESC     "Type1 IOMMU driver for VFIO"
>  
> +static bool allow_unsafe_interrupts;
>  module_param_named(allow_unsafe_interrupts,
> -		   vfio_allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
> +		   allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
>  MODULE_PARM_DESC(allow_unsafe_interrupts,
>  		 "Enable VFIO IOMMU support for on platforms without
>  interrupt remapping support.");

Except this, I think we still should have iommufd compat with the
current module ABI, so this should still get moved into vfio.ko and
both vfio_iommu_type1.ko and vfio_iommufd.ko should jointly manipulate
the same memory with their module options.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-18 15:36     ` Jason Gunthorpe
@ 2022-11-18 20:36       ` Alex Williamson
  2022-11-22  1:59         ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2022-11-18 20:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Fri, 18 Nov 2022 11:36:14 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Thu, Nov 17, 2022 at 01:14:51PM -0700, Alex Williamson wrote:
> > On Wed, 16 Nov 2022 17:05:29 -0400
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > This legacy module knob has become uAPI, when set on the vfio_iommu_type1
> > > it disables some security protections in the iommu drivers. Move the
> > > storage for this knob to vfio_main.c so that iommufd can access it too.
> > > 
> > > The may need enhancing as we learn more about how necessary
> > > allow_unsafe_interrupts is in the current state of the world. If vfio
> > > container is disabled then this option will not be available to the user.
> > > 
> > > Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> > > Tested-by: Yi Liu <yi.l.liu@intel.com>
> > > Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> > > Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> > > Tested-by: Yu He <yu.he@intel.com>
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > ---
> > >  drivers/vfio/vfio.h             | 2 ++
> > >  drivers/vfio/vfio_iommu_type1.c | 5 ++---
> > >  drivers/vfio/vfio_main.c        | 3 +++
> > >  3 files changed, 7 insertions(+), 3 deletions(-)  
> > 
> > It's really quite trivial to convert to a vfio_iommu.ko module to host
> > a separate option for this.  Half of the patch below is undoing what's
> > done here.  Is your only concern with this approach that we use a few
> > KB more memory for the separate module?  
> 
> My main dislike is that it just seems arbitary to shunt iommufd
> support to a module when it is always required by vfio.ko. In general
> if you have a module that is only ever used by 1 other module, you
> should probably just combine them. It saves memory and simplifies
> operation (eg you don't have to unload a zoo of modules during
> development testing)

These are all great reasons for why iommufd should host this option, as
it's fundamentally part of the DMA isolation of the device, which vfio
relies on iommufd to provide in this case.  Adding a module option to
vfio.ko conflicts with the existing option on vfio_iommu_type1.ko,
which leads to the problem of duplicate module options manipulating the
same variable, or options that disappear when other modules are
deprecated, which are both issues I have with the originally proposed
code.  Since iommufd won't accept such an option, we maintain the
logical association via modularizing the vfio interface to iommufd.

> If it wasn't for the module option ABI problem I would propse to merge
> vfio_type1/spapr into vfio.ko too - vfio.ko doesn't work without them
> and the module soft dependencies are hint something is weird here.

In fact no-iommu mode works w/o them.

> An alternative suggestion is to just retain a stub vfio_iommu_type1.ko
> which only exposes the module option if iommufd is enabled. At least
> this would preserve the semi-ABI.

See below.

> However, if you think this fits your vision for VFIO better I will
> take it.
> 
> > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> > index 186e33a006d3..23c24fe98c00 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -44,8 +44,9 @@
> >  #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
> >  #define DRIVER_DESC     "Type1 IOMMU driver for VFIO"
> >  
> > +static bool allow_unsafe_interrupts;
> >  module_param_named(allow_unsafe_interrupts,
> > -		   vfio_allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
> > +		   allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
> >  MODULE_PARM_DESC(allow_unsafe_interrupts,
> >  		 "Enable VFIO IOMMU support for on platforms without
> >  interrupt remapping support.");  
> 
> Except this, I think we still should have iommufd compat with the
> current module ABI, so this should still get moved into vfio.ko and
> both vfio_iommu_type1.ko and vfio_iommufd.ko should jointly manipulate
> the same memory with their module options.

Modules implicitly interacting in this way is exactly what I find so
terrible in the original proposal.  The idea of a stub type1 module to
preserve that uAPI was only proposed as a known terrible solution to the
problem.

The more I think about it, the less convinced I am that the kernel has
a responsibility to automatically carry forward the behavior of a
module option for a module that's no longer used.  The only way we make
use of IOMMUFD is if either the distribution has opt'd to disable
VFIO_CONTAINER and enable IOMMUFD emulation or userspace has opt'd to
pass an iommufd to vfio rather than a container.  The former could only
be forced after a deprecation period for VFIO_CONTAINER, after which I
think it's fair to require a re-opt-in by the user.  In the latter
case, userspace is intentionally choosing to use a highly compatible
uAPI, but nonetheless, it's still a different uAPI.

Thus the proposal here for a separate, but equivalent module option in
the vfio interface to iommufd.  Let's also not forget that at it's
core, this option is an opt-in to allow less secure configurations.
It's prudent to tread lightly with automatically carrying it forward.
Thanks,

Alex


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-18 20:36       ` Alex Williamson
@ 2022-11-22  1:59         ` Jason Gunthorpe
  2022-11-22 17:34           ` Alex Williamson
  2022-11-23 19:47           ` Jason Gunthorpe
  0 siblings, 2 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-22  1:59 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Fri, Nov 18, 2022 at 01:36:46PM -0700, Alex Williamson wrote:
> On Fri, 18 Nov 2022 11:36:14 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Thu, Nov 17, 2022 at 01:14:51PM -0700, Alex Williamson wrote:
> > > On Wed, 16 Nov 2022 17:05:29 -0400
> > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >   
> > > > This legacy module knob has become uAPI, when set on the vfio_iommu_type1
> > > > it disables some security protections in the iommu drivers. Move the
> > > > storage for this knob to vfio_main.c so that iommufd can access it too.
> > > > 
> > > > The may need enhancing as we learn more about how necessary
> > > > allow_unsafe_interrupts is in the current state of the world. If vfio
> > > > container is disabled then this option will not be available to the user.
> > > > 
> > > > Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> > > > Tested-by: Yi Liu <yi.l.liu@intel.com>
> > > > Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> > > > Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> > > > Tested-by: Yu He <yu.he@intel.com>
> > > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > > ---
> > > >  drivers/vfio/vfio.h             | 2 ++
> > > >  drivers/vfio/vfio_iommu_type1.c | 5 ++---
> > > >  drivers/vfio/vfio_main.c        | 3 +++
> > > >  3 files changed, 7 insertions(+), 3 deletions(-)  
> > > 
> > > It's really quite trivial to convert to a vfio_iommu.ko module to host
> > > a separate option for this.  Half of the patch below is undoing what's
> > > done here.  Is your only concern with this approach that we use a few
> > > KB more memory for the separate module?  
> > 
> > My main dislike is that it just seems arbitary to shunt iommufd
> > support to a module when it is always required by vfio.ko. In general
> > if you have a module that is only ever used by 1 other module, you
> > should probably just combine them. It saves memory and simplifies
> > operation (eg you don't have to unload a zoo of modules during
> > development testing)
> 
> These are all great reasons for why iommufd should host this option, as
> it's fundamentally part of the DMA isolation of the device, which vfio
> relies on iommufd to provide in this case. 

Fine, lets do that.

> > Except this, I think we still should have iommufd compat with the
> > current module ABI, so this should still get moved into vfio.ko and
> > both vfio_iommu_type1.ko and vfio_iommufd.ko should jointly manipulate
> > the same memory with their module options.
> 
> Modules implicitly interacting in this way is exactly what I find so
> terrible in the original proposal.  The idea of a stub type1 module to
> preserve that uAPI was only proposed as a known terrible solution to the
> problem.

And I take it you prefer we remove this compat code as well and just
leave the module option on vfio_type1 non-working?

> think it's fair to require a re-opt-in by the user.  In the latter
> case, userspace is intentionally choosing to use a highly compatible
> uAPI, but nonetheless, it's still a different uAPI.

Well, the later case is likely going to be a choice made by the
distribution, eg I would expect that libvirt will start automatically
favoring iommufd if it is available.

So, instructions someone might find saying to tweak the module option
and then use libvirt are going to stop working at some point.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-22  1:59         ` Jason Gunthorpe
@ 2022-11-22 17:34           ` Alex Williamson
  2022-11-22 17:41             ` Jason Gunthorpe
  2022-11-23 19:47           ` Jason Gunthorpe
  1 sibling, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2022-11-22 17:34 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Mon, 21 Nov 2022 21:59:28 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Fri, Nov 18, 2022 at 01:36:46PM -0700, Alex Williamson wrote:
> > On Fri, 18 Nov 2022 11:36:14 -0400
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Thu, Nov 17, 2022 at 01:14:51PM -0700, Alex Williamson wrote:  
> > > > On Wed, 16 Nov 2022 17:05:29 -0400
> > > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >     
> > > > > This legacy module knob has become uAPI, when set on the vfio_iommu_type1
> > > > > it disables some security protections in the iommu drivers. Move the
> > > > > storage for this knob to vfio_main.c so that iommufd can access it too.
> > > > > 
> > > > > The may need enhancing as we learn more about how necessary
> > > > > allow_unsafe_interrupts is in the current state of the world. If vfio
> > > > > container is disabled then this option will not be available to the user.
> > > > > 
> > > > > Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> > > > > Tested-by: Yi Liu <yi.l.liu@intel.com>
> > > > > Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> > > > > Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> > > > > Tested-by: Yu He <yu.he@intel.com>
> > > > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > > > ---
> > > > >  drivers/vfio/vfio.h             | 2 ++
> > > > >  drivers/vfio/vfio_iommu_type1.c | 5 ++---
> > > > >  drivers/vfio/vfio_main.c        | 3 +++
> > > > >  3 files changed, 7 insertions(+), 3 deletions(-)    
> > > > 
> > > > It's really quite trivial to convert to a vfio_iommu.ko module to host
> > > > a separate option for this.  Half of the patch below is undoing what's
> > > > done here.  Is your only concern with this approach that we use a few
> > > > KB more memory for the separate module?    
> > > 
> > > My main dislike is that it just seems arbitary to shunt iommufd
> > > support to a module when it is always required by vfio.ko. In general
> > > if you have a module that is only ever used by 1 other module, you
> > > should probably just combine them. It saves memory and simplifies
> > > operation (eg you don't have to unload a zoo of modules during
> > > development testing)  
> > 
> > These are all great reasons for why iommufd should host this option, as
> > it's fundamentally part of the DMA isolation of the device, which vfio
> > relies on iommufd to provide in this case.   
> 
> Fine, lets do that.
> 
> > > Except this, I think we still should have iommufd compat with the
> > > current module ABI, so this should still get moved into vfio.ko and
> > > both vfio_iommu_type1.ko and vfio_iommufd.ko should jointly manipulate
> > > the same memory with their module options.  
> > 
> > Modules implicitly interacting in this way is exactly what I find so
> > terrible in the original proposal.  The idea of a stub type1 module to
> > preserve that uAPI was only proposed as a known terrible solution to the
> > problem.  
> 
> And I take it you prefer we remove this compat code as well and just
> leave the module option on vfio_type1 non-working?

In the case where userspace provides an iommufd for the container, yes,
the type1 module then isn't involved.
 
> > think it's fair to require a re-opt-in by the user.  In the latter
> > case, userspace is intentionally choosing to use a highly compatible
> > uAPI, but nonetheless, it's still a different uAPI.  
> 
> Well, the later case is likely going to be a choice made by the
> distribution, eg I would expect that libvirt will start automatically
> favoring iommufd if it is available.
> 
> So, instructions someone might find saying to tweak the module option
> and then use libvirt are going to stop working at some point.

libvirt doesn't currently pass any file descriptors for vfio devices in
QEMU, so we're looking a ways down the road here.  Once QEMU gains
native iommufd support, libvirt will launch QEMU in a way that
explicitly enables this iommufd support.  I'd expect libvirt might
implement this by creating a "vfio-legacy" vs "vfio-iommufd" driver
option, where "vfio" becomes an alias to one of those.  That allows both
a staged transition and a fallback for issues.  I'd expect a first
debugging step would be to fallback to legacy support.  But ultimately,
yes, in the rare case that this option is actually necessary, the admin
would need to re-opt-in, and hopefully a dmesg log from iommufd would
make it apparent this is the problem.

I just can't wrap my head around shared module options and stub drivers
being a sane solution simply to make this potentially rare condition
more transparent w/o reminding user's their setup has a vulnerability.

BTW, what is the actual expected use case of passing an iommufd as a
vfio container?  I have doubts that it'd make sense to have QEMU look
for an iommufd in place of a vfio container for anything beyond yet
another means for early testing of iommufd.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-22 17:34           ` Alex Williamson
@ 2022-11-22 17:41             ` Jason Gunthorpe
  2022-11-23  1:21               ` Tian, Kevin
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-22 17:41 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

> BTW, what is the actual expected use case of passing an iommufd as a
> vfio container?  I have doubts that it'd make sense to have QEMU look
> for an iommufd in place of a vfio container for anything beyond yet
> another means for early testing of iommufd.  Thanks,

I don't think there is one in production for qemu.

For something like DPDK I can imagine replacing the open logic to use
vfio device cdevs and iommufd, but keeping the rest of the logic the
same so the FD looks and feels like a VFIO container. There is little
value in replacing the VFIO map/unmap/etc ioctls with the IOMMUFD
equivalents.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-22 17:41             ` Jason Gunthorpe
@ 2022-11-23  1:21               ` Tian, Kevin
  2022-11-23 16:38                 ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Tian, Kevin @ 2022-11-23  1:21 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, kvm, Yang, Lixiao,
	Matthew Rosato, Nicolin Chen, Liu, Yi L, He, Yu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, November 23, 2022 1:41 AM
> 
> > BTW, what is the actual expected use case of passing an iommufd as a
> > vfio container?  I have doubts that it'd make sense to have QEMU look
> > for an iommufd in place of a vfio container for anything beyond yet
> > another means for early testing of iommufd.  Thanks,

One case is that the admin links /dev/vfio/vfio to /dev/iommu then all
legacy vfio applications are implicitly converted to use iommufd as vfio
container.

> 
> I don't think there is one in production for qemu.
> 
> For something like DPDK I can imagine replacing the open logic to use
> vfio device cdevs and iommufd, but keeping the rest of the logic the
> same so the FD looks and feels like a VFIO container. There is little
> value in replacing the VFIO map/unmap/etc ioctls with the IOMMUFD
> equivalents.
> 

I'm not sure the value of entering this mixed world. I could envision
dpdk starting to add cdev/iommufd support when it wants to use
new features (pasid, iopf, etc.) which are available only via iommufd
native api. Before that point it just stays with full vfio legacy.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 00/11] Connect VFIO to IOMMUFD
  2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
                   ` (10 preceding siblings ...)
  2022-11-16 21:05 ` [PATCH v3 11/11] iommufd: Allow iommufd to supply /dev/vfio/vfio Jason Gunthorpe
@ 2022-11-23  2:44 ` Yi Liu
  2022-11-23 12:59   ` Jason Gunthorpe
  11 siblings, 1 reply; 35+ messages in thread
From: Yi Liu @ 2022-11-23  2:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yu He

Hi Jason,

On 2022/11/17 05:05, Jason Gunthorpe wrote:
> This series provides an alternative container layer for VFIO implemented
> using iommufd. This is optional, if CONFIG_IOMMUFD is not set then it will
> not be compiled in.
> 
> At this point iommufd can be injected by passing in a iommfd FD to
> VFIO_GROUP_SET_CONTAINER which will use the VFIO compat layer in iommufd
> to obtain the compat IOAS and then connect up all the VFIO drivers as
> appropriate.
> 
> This is temporary stopping point, a following series will provide a way to
> directly open a VFIO device FD and directly connect it to IOMMUFD using
> native ioctls that can expose the IOMMUFD features like hwpt, future
> vPASID and dynamic attachment.
> 
> This series, in compat mode, has passed all the qemu tests we have
> available, including the test suites for the Intel GVT mdev. Aside from
> the temporary limitation with P2P memory this is belived to be fully
> compatible with VFIO.
> 
> This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd
> 
> It requires the iommufd series:
> 
> https://lore.kernel.org/r/0-v5-4001c2997bd0+30c-iommufd_jgg@nvidia.com

gvtg test encountered broken display with below commit in your for-next
branch.

https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/commit/?h=for-next&id=57f62422b6f0477afaddd2fc77a4bb9b94275f42

I noticed there are diffs in drivers/vfio/ and drivers/iommu/iommufd/
between this commit and the last tested commit (37c9e6e44d77a). Seems
to have regression due to the diffs.

> v3:
>   - Fix iommufd_attached to be only used in the vfio_iommufd_physical_*
>     funcs
>   - Always check for iommufd before invoking a iommufd function
>   - Fix mismatch between vfio_pin_pages and iommufd_access when the IOVA
>     is not aligned. Resolves problems on S390
> v2: https://lore.kernel.org/r/0-v2-65016290f146+33e-vfio_iommufd_jgg@nvidia.com
>   - Rebase to v6.1-rc3, v4 iommufd series
>   - Fixup comments and commit messages from list remarks
>   - Fix leaking of the iommufd for mdevs
>   - New patch to fix vfio modaliases when vfio container is disabled
>   - Add a dmesg once when the iommufd provided /dev/vfio/vfio is opened
>     to signal that iommufd is providing this
> v1: https://lore.kernel.org/r/0-v1-4991695894d8+211-vfio_iommufd_jgg@nvidia.com
> 
> Jason Gunthorpe (11):
>    vfio: Move vfio_device driver open/close code to a function
>    vfio: Move vfio_device_assign_container() into
>      vfio_device_first_open()
>    vfio: Rename vfio_device_assign/unassign_container()
>    vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
>    vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for
>      vfio_file_enforced_coherent()
>    vfio-iommufd: Allow iommufd to be used in place of a container fd
>    vfio-iommufd: Support iommufd for physical VFIO devices
>    vfio-iommufd: Support iommufd for emulated VFIO devices
>    vfio: Move container related MODULE_ALIAS statements into container.c
>    vfio: Make vfio_container optionally compiled
>    iommufd: Allow iommufd to supply /dev/vfio/vfio
> 
>   drivers/gpu/drm/i915/gvt/kvmgt.c              |   3 +
>   drivers/iommu/iommufd/Kconfig                 |  12 +
>   drivers/iommu/iommufd/main.c                  |  36 ++
>   drivers/s390/cio/vfio_ccw_ops.c               |   3 +
>   drivers/s390/crypto/vfio_ap_ops.c             |   3 +
>   drivers/vfio/Kconfig                          |  36 +-
>   drivers/vfio/Makefile                         |   5 +-
>   drivers/vfio/container.c                      | 141 ++-----
>   drivers/vfio/fsl-mc/vfio_fsl_mc.c             |   3 +
>   drivers/vfio/iommufd.c                        | 161 ++++++++
>   .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    |   6 +
>   drivers/vfio/pci/mlx5/main.c                  |   3 +
>   drivers/vfio/pci/vfio_pci.c                   |   3 +
>   drivers/vfio/platform/vfio_amba.c             |   3 +
>   drivers/vfio/platform/vfio_platform.c         |   3 +
>   drivers/vfio/vfio.h                           | 100 ++++-
>   drivers/vfio/vfio_iommu_type1.c               |   5 +-
>   drivers/vfio/vfio_main.c                      | 348 ++++++++++++++----
>   include/linux/vfio.h                          |  39 ++
>   19 files changed, 714 insertions(+), 199 deletions(-)
>   create mode 100644 drivers/vfio/iommufd.c
> 
> 
> base-commit: 9d367dc905dd278614aaf601afb28e511b82fb3b

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 00/11] Connect VFIO to IOMMUFD
  2022-11-23  2:44 ` [PATCH v3 00/11] Connect VFIO to IOMMUFD Yi Liu
@ 2022-11-23 12:59   ` Jason Gunthorpe
  2022-11-23 13:04     ` Yi Liu
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-23 12:59 UTC (permalink / raw)
  To: Yi Liu
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yu He

On Wed, Nov 23, 2022 at 10:44:12AM +0800, Yi Liu wrote:
> Hi Jason,
> 
> On 2022/11/17 05:05, Jason Gunthorpe wrote:
> > This series provides an alternative container layer for VFIO implemented
> > using iommufd. This is optional, if CONFIG_IOMMUFD is not set then it will
> > not be compiled in.
> > 
> > At this point iommufd can be injected by passing in a iommfd FD to
> > VFIO_GROUP_SET_CONTAINER which will use the VFIO compat layer in iommufd
> > to obtain the compat IOAS and then connect up all the VFIO drivers as
> > appropriate.
> > 
> > This is temporary stopping point, a following series will provide a way to
> > directly open a VFIO device FD and directly connect it to IOMMUFD using
> > native ioctls that can expose the IOMMUFD features like hwpt, future
> > vPASID and dynamic attachment.
> > 
> > This series, in compat mode, has passed all the qemu tests we have
> > available, including the test suites for the Intel GVT mdev. Aside from
> > the temporary limitation with P2P memory this is belived to be fully
> > compatible with VFIO.
> > 
> > This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd
> > 
> > It requires the iommufd series:
> > 
> > https://lore.kernel.org/r/0-v5-4001c2997bd0+30c-iommufd_jgg@nvidia.com
> 
> gvtg test encountered broken display with below commit in your for-next
> branch.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/commit/?h=for-next&id=57f62422b6f0477afaddd2fc77a4bb9b94275f42
> 
> I noticed there are diffs in drivers/vfio/ and drivers/iommu/iommufd/
> between this commit and the last tested commit (37c9e6e44d77a). Seems
> to have regression due to the diffs.

Do you have something more to go on? I am checking the diff and not
getting any idea. The above also merges v6.1-rc5 into the tree, is
there a chance rc5 is the gvt problem?

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 00/11] Connect VFIO to IOMMUFD
  2022-11-23 12:59   ` Jason Gunthorpe
@ 2022-11-23 13:04     ` Yi Liu
  2022-11-29 12:41       ` Yi Liu
  0 siblings, 1 reply; 35+ messages in thread
From: Yi Liu @ 2022-11-23 13:04 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yu He

On 2022/11/23 20:59, Jason Gunthorpe wrote:
> On Wed, Nov 23, 2022 at 10:44:12AM +0800, Yi Liu wrote:
>> Hi Jason,
>>
>> On 2022/11/17 05:05, Jason Gunthorpe wrote:
>>> This series provides an alternative container layer for VFIO implemented
>>> using iommufd. This is optional, if CONFIG_IOMMUFD is not set then it will
>>> not be compiled in.
>>>
>>> At this point iommufd can be injected by passing in a iommfd FD to
>>> VFIO_GROUP_SET_CONTAINER which will use the VFIO compat layer in iommufd
>>> to obtain the compat IOAS and then connect up all the VFIO drivers as
>>> appropriate.
>>>
>>> This is temporary stopping point, a following series will provide a way to
>>> directly open a VFIO device FD and directly connect it to IOMMUFD using
>>> native ioctls that can expose the IOMMUFD features like hwpt, future
>>> vPASID and dynamic attachment.
>>>
>>> This series, in compat mode, has passed all the qemu tests we have
>>> available, including the test suites for the Intel GVT mdev. Aside from
>>> the temporary limitation with P2P memory this is belived to be fully
>>> compatible with VFIO.
>>>
>>> This is on github: https://github.com/jgunthorpe/linux/commits/vfio_iommufd
>>>
>>> It requires the iommufd series:
>>>
>>> https://lore.kernel.org/r/0-v5-4001c2997bd0+30c-iommufd_jgg@nvidia.com
>>
>> gvtg test encountered broken display with below commit in your for-next
>> branch.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/commit/?h=for-next&id=57f62422b6f0477afaddd2fc77a4bb9b94275f42
>>
>> I noticed there are diffs in drivers/vfio/ and drivers/iommu/iommufd/
>> between this commit and the last tested commit (37c9e6e44d77a). Seems
>> to have regression due to the diffs.
> 
> Do you have something more to go on? I am checking the diff and not
> getting any idea. The above also merges v6.1-rc5 into the tree, is
> there a chance rc5 is the gvt problem?

that is possible, I'll let my colleague revert it and try.

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-23  1:21               ` Tian, Kevin
@ 2022-11-23 16:38                 ` Jason Gunthorpe
  2022-11-24  5:30                   ` Tian, Kevin
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-23 16:38 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, kvm, Yang,
	Lixiao, Matthew Rosato, Nicolin Chen, Liu, Yi L, He, Yu

On Wed, Nov 23, 2022 at 01:21:30AM +0000, Tian, Kevin wrote:

> I'm not sure the value of entering this mixed world. I could envision
> dpdk starting to add cdev/iommufd support when it wants to use
> new features (pasid, iopf, etc.) which are available only via iommufd
> native api. Before that point it just stays with full vfio legacy.

I think the value is for the distro that would benefit from getting
apps validated and running on iommufd with the least effort invested.

So I'd like to see all the vfio apps we can convert to the new device
interface and iommufd just to be converted, even if they don't make
use of new features.

It gives a consistent experiance and support knowledge base if
everything uses one interface.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-22  1:59         ` Jason Gunthorpe
  2022-11-22 17:34           ` Alex Williamson
@ 2022-11-23 19:47           ` Jason Gunthorpe
  2022-11-24  5:26             ` Tian, Kevin
  1 sibling, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-23 19:47 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian, kvm,
	Lixiao Yang, Matthew Rosato, Nicolin Chen, Yi Liu, Yu He

On Mon, Nov 21, 2022 at 09:59:28PM -0400, Jason Gunthorpe wrote:
> On Fri, Nov 18, 2022 at 01:36:46PM -0700, Alex Williamson wrote:
> > On Fri, 18 Nov 2022 11:36:14 -0400
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > 
> > > On Thu, Nov 17, 2022 at 01:14:51PM -0700, Alex Williamson wrote:
> > > > On Wed, 16 Nov 2022 17:05:29 -0400
> > > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >   
> > > > > This legacy module knob has become uAPI, when set on the vfio_iommu_type1
> > > > > it disables some security protections in the iommu drivers. Move the
> > > > > storage for this knob to vfio_main.c so that iommufd can access it too.
> > > > > 
> > > > > The may need enhancing as we learn more about how necessary
> > > > > allow_unsafe_interrupts is in the current state of the world. If vfio
> > > > > container is disabled then this option will not be available to the user.
> > > > > 
> > > > > Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> > > > > Tested-by: Yi Liu <yi.l.liu@intel.com>
> > > > > Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> > > > > Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> > > > > Tested-by: Yu He <yu.he@intel.com>
> > > > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > > > ---
> > > > >  drivers/vfio/vfio.h             | 2 ++
> > > > >  drivers/vfio/vfio_iommu_type1.c | 5 ++---
> > > > >  drivers/vfio/vfio_main.c        | 3 +++
> > > > >  3 files changed, 7 insertions(+), 3 deletions(-)  
> > > > 
> > > > It's really quite trivial to convert to a vfio_iommu.ko module to host
> > > > a separate option for this.  Half of the patch below is undoing what's
> > > > done here.  Is your only concern with this approach that we use a few
> > > > KB more memory for the separate module?  
> > > 
> > > My main dislike is that it just seems arbitary to shunt iommufd
> > > support to a module when it is always required by vfio.ko. In general
> > > if you have a module that is only ever used by 1 other module, you
> > > should probably just combine them. It saves memory and simplifies
> > > operation (eg you don't have to unload a zoo of modules during
> > > development testing)
> > 
> > These are all great reasons for why iommufd should host this option, as
> > it's fundamentally part of the DMA isolation of the device, which vfio
> > relies on iommufd to provide in this case. 
> 
> Fine, lets do that.

It looks like this:

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 07d4dcc0dbf5e1..6d088af776034b 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -9,6 +9,13 @@
 #include "io_pagetable.h"
 #include "iommufd_private.h"
 
+static bool allow_unsafe_interrupts;
+module_param(allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(
+	allow_unsafe_interrupts,
+	"Allow IOMMUFD to bind to devices even if the platform cannot isolate "
+	"the MSI interrupt window. Enabling this is a security weakness.");
+
 /*
  * A iommufd_device object represents the binding relationship between a
  * consuming driver and the iommufd. These objects are created/destroyed by
@@ -127,8 +134,7 @@ EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD);
 
 static int iommufd_device_setup_msi(struct iommufd_device *idev,
 				    struct iommufd_hw_pagetable *hwpt,
-				    phys_addr_t sw_msi_start,
-				    unsigned int flags)
+				    phys_addr_t sw_msi_start)
 {
 	int rc;
 
@@ -174,12 +180,11 @@ static int iommufd_device_setup_msi(struct iommufd_device *idev,
 	 * historical compat with VFIO allow a module parameter to ignore the
 	 * insecurity.
 	 */
-	if (!(flags & IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT))
+	if (!allow_unsafe_interrupts)
 		return -EPERM;
-
 	dev_warn(
 		idev->dev,
-		"Device interrupts cannot be isolated by the IOMMU, this platform in insecure. Use an \"allow_unsafe_interrupts\" module parameter to override\n");
+		"MSI interrupt window cannot be isolated by the IOMMU, this platform in insecure. Use the \"allow_unsafe_interrupts\" module parameter to override\n");
 	return 0;
 }
 
@@ -195,8 +200,7 @@ static bool iommufd_hw_pagetable_has_group(struct iommufd_hw_pagetable *hwpt,
 }
 
 static int iommufd_device_do_attach(struct iommufd_device *idev,
-				    struct iommufd_hw_pagetable *hwpt,
-				    unsigned int flags)
+				    struct iommufd_hw_pagetable *hwpt)
 {
 	phys_addr_t sw_msi_start = 0;
 	int rc;
@@ -226,7 +230,7 @@ static int iommufd_device_do_attach(struct iommufd_device *idev,
 	if (rc)
 		goto out_unlock;
 
-	rc = iommufd_device_setup_msi(idev, hwpt, sw_msi_start, flags);
+	rc = iommufd_device_setup_msi(idev, hwpt, sw_msi_start);
 	if (rc)
 		goto out_iova;
 
@@ -268,8 +272,7 @@ static int iommufd_device_do_attach(struct iommufd_device *idev,
  * Automatic domain selection will never pick a manually created domain.
  */
 static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
-					  struct iommufd_ioas *ioas,
-					  unsigned int flags)
+					  struct iommufd_ioas *ioas)
 {
 	struct iommufd_hw_pagetable *hwpt;
 	int rc;
@@ -284,7 +287,7 @@ static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
 		if (!hwpt->auto_domain)
 			continue;
 
-		rc = iommufd_device_do_attach(idev, hwpt, flags);
+		rc = iommufd_device_do_attach(idev, hwpt);
 
 		/*
 		 * -EINVAL means the domain is incompatible with the device.
@@ -303,7 +306,7 @@ static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
 	}
 	hwpt->auto_domain = true;
 
-	rc = iommufd_device_do_attach(idev, hwpt, flags);
+	rc = iommufd_device_do_attach(idev, hwpt);
 	if (rc)
 		goto out_abort;
 	list_add_tail(&hwpt->hwpt_item, &ioas->hwpt_list);
@@ -324,7 +327,6 @@ static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
  * @idev: device to attach
  * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
  *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
- * @flags: Optional flags
  *
  * This connects the device to an iommu_domain, either automatically or manually
  * selected. Once this completes the device could do DMA.
@@ -332,8 +334,7 @@ static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
  * The caller should return the resulting pt_id back to userspace.
  * This function is undone by calling iommufd_device_detach().
  */
-int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id,
-			  unsigned int flags)
+int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
 {
 	struct iommufd_object *pt_obj;
 	int rc;
@@ -347,7 +348,7 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id,
 		struct iommufd_hw_pagetable *hwpt =
 			container_of(pt_obj, struct iommufd_hw_pagetable, obj);
 
-		rc = iommufd_device_do_attach(idev, hwpt, flags);
+		rc = iommufd_device_do_attach(idev, hwpt);
 		if (rc)
 			goto out_put_pt_obj;
 
@@ -360,7 +361,7 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id,
 		struct iommufd_ioas *ioas =
 			container_of(pt_obj, struct iommufd_ioas, obj);
 
-		rc = iommufd_device_auto_get_domain(idev, ioas, flags);
+		rc = iommufd_device_auto_get_domain(idev, ioas);
 		if (rc)
 			goto out_put_pt_obj;
 		break;
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index 5a7ce4d9fbae0a..da50feb24b6e1d 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -108,12 +108,9 @@ EXPORT_SYMBOL_GPL(vfio_iommufd_physical_unbind);
 
 int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 {
-	unsigned int flags = 0;
 	int rc;
 
-	if (vfio_allow_unsafe_interrupts)
-		flags |= IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT;
-	rc = iommufd_device_attach(vdev->iommufd_device, pt_id, flags);
+	rc = iommufd_device_attach(vdev->iommufd_device, pt_id);
 	if (rc)
 		return rc;
 	vdev->iommufd_attached = true;
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 3378714a746274..ce5fe3fc493b4e 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -216,6 +216,4 @@ extern bool vfio_noiommu __read_mostly;
 enum { vfio_noiommu = false };
 #endif
 
-extern bool vfio_allow_unsafe_interrupts;
-
 #endif
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 186e33a006d314..23c24fe98c00d4 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -44,8 +44,9 @@
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
 #define DRIVER_DESC     "Type1 IOMMU driver for VFIO"
 
+static bool allow_unsafe_interrupts;
 module_param_named(allow_unsafe_interrupts,
-		   vfio_allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
+		   allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(allow_unsafe_interrupts,
 		 "Enable VFIO IOMMU support for on platforms without interrupt remapping support.");
 
@@ -2281,7 +2282,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		    iommu_group_for_each_dev(iommu_group, (void *)IOMMU_CAP_INTR_REMAP,
 					     vfio_iommu_device_capable);
 
-	if (!vfio_allow_unsafe_interrupts && !msi_remap) {
+	if (!allow_unsafe_interrupts && !msi_remap) {
 		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
 		       __func__);
 		ret = -EPERM;
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 451a07eb702b34..593d45f43a16ba 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -52,9 +52,6 @@ static struct vfio {
 	struct ida			device_ida;
 } vfio;
 
-bool vfio_allow_unsafe_interrupts;
-EXPORT_SYMBOL_GPL(vfio_allow_unsafe_interrupts);
-
 static DEFINE_XARRAY(vfio_device_set_xa);
 static const struct file_operations vfio_group_fops;
 
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index bf2b3ea5f90fd2..9d1afd417215d0 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -21,11 +21,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 					   struct device *dev, u32 *id);
 void iommufd_device_unbind(struct iommufd_device *idev);
 
-enum {
-	IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT = 1 << 0,
-};
-int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id,
-			  unsigned int flags);
+int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id);
 void iommufd_device_detach(struct iommufd_device *idev);
 
 struct iommufd_access_ops {

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-23 19:47           ` Jason Gunthorpe
@ 2022-11-24  5:26             ` Tian, Kevin
  0 siblings, 0 replies; 35+ messages in thread
From: Tian, Kevin @ 2022-11-24  5:26 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: Lu Baolu, Cornelia Huck, Eric Auger, kvm, Yang, Lixiao,
	Matthew Rosato, Nicolin Chen, Liu, Yi L, He, Yu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, November 24, 2022 3:48 AM
> 
> On Mon, Nov 21, 2022 at 09:59:28PM -0400, Jason Gunthorpe wrote:
> > On Fri, Nov 18, 2022 at 01:36:46PM -0700, Alex Williamson wrote:
> > > On Fri, 18 Nov 2022 11:36:14 -0400
> > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > > On Thu, Nov 17, 2022 at 01:14:51PM -0700, Alex Williamson wrote:
> > > > > On Wed, 16 Nov 2022 17:05:29 -0400
> > > > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > > >
> > > > > > This legacy module knob has become uAPI, when set on the
> vfio_iommu_type1
> > > > > > it disables some security protections in the iommu drivers. Move the
> > > > > > storage for this knob to vfio_main.c so that iommufd can access it
> too.
> > > > > >
> > > > > > The may need enhancing as we learn more about how necessary
> > > > > > allow_unsafe_interrupts is in the current state of the world. If vfio
> > > > > > container is disabled then this option will not be available to the
> user.
> > > > > >
> > > > > > Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> > > > > > Tested-by: Yi Liu <yi.l.liu@intel.com>
> > > > > > Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> > > > > > Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> > > > > > Tested-by: Yu He <yu.he@intel.com>
> > > > > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > > > > ---
> > > > > >  drivers/vfio/vfio.h             | 2 ++
> > > > > >  drivers/vfio/vfio_iommu_type1.c | 5 ++---
> > > > > >  drivers/vfio/vfio_main.c        | 3 +++
> > > > > >  3 files changed, 7 insertions(+), 3 deletions(-)
> > > > >
> > > > > It's really quite trivial to convert to a vfio_iommu.ko module to host
> > > > > a separate option for this.  Half of the patch below is undoing what's
> > > > > done here.  Is your only concern with this approach that we use a few
> > > > > KB more memory for the separate module?
> > > >
> > > > My main dislike is that it just seems arbitary to shunt iommufd
> > > > support to a module when it is always required by vfio.ko. In general
> > > > if you have a module that is only ever used by 1 other module, you
> > > > should probably just combine them. It saves memory and simplifies
> > > > operation (eg you don't have to unload a zoo of modules during
> > > > development testing)
> > >
> > > These are all great reasons for why iommufd should host this option, as
> > > it's fundamentally part of the DMA isolation of the device, which vfio
> > > relies on iommufd to provide in this case.
> >
> > Fine, lets do that.
> 
> It looks like this:
> 
> diff --git a/drivers/iommu/iommufd/device.c
> b/drivers/iommu/iommufd/device.c
> index 07d4dcc0dbf5e1..6d088af776034b 100644
> --- a/drivers/iommu/iommufd/device.c
> +++ b/drivers/iommu/iommufd/device.c
> @@ -9,6 +9,13 @@
>  #include "io_pagetable.h"
>  #include "iommufd_private.h"
> 
> +static bool allow_unsafe_interrupts;
> +module_param(allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
> +MODULE_PARM_DESC(
> +	allow_unsafe_interrupts,
> +	"Allow IOMMUFD to bind to devices even if the platform cannot
> isolate "
> +	"the MSI interrupt window. Enabling this is a security weakness.");
> +
>  /*
>   * A iommufd_device object represents the binding relationship between a
>   * consuming driver and the iommufd. These objects are created/destroyed
> by
> @@ -127,8 +134,7 @@ EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind,
> IOMMUFD);
> 
>  static int iommufd_device_setup_msi(struct iommufd_device *idev,
>  				    struct iommufd_hw_pagetable *hwpt,
> -				    phys_addr_t sw_msi_start,
> -				    unsigned int flags)
> +				    phys_addr_t sw_msi_start)
>  {
>  	int rc;
> 
> @@ -174,12 +180,11 @@ static int iommufd_device_setup_msi(struct
> iommufd_device *idev,
>  	 * historical compat with VFIO allow a module parameter to ignore
> the
>  	 * insecurity.
>  	 */
> -	if (!(flags &
> IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT))
> +	if (!allow_unsafe_interrupts)
>  		return -EPERM;
> -

keep the blank line.

>  	dev_warn(
>  		idev->dev,
> -		"Device interrupts cannot be isolated by the IOMMU, this
> platform in insecure. Use an \"allow_unsafe_interrupts\" module parameter
> to override\n");
> +		"MSI interrupt window cannot be isolated by the IOMMU,
> this platform in insecure. Use the \"allow_unsafe_interrupts\" module

s/in/is/

> parameter to override\n");
>  	return 0;
>  }
> 
> @@ -195,8 +200,7 @@ static bool
> iommufd_hw_pagetable_has_group(struct iommufd_hw_pagetable *hwpt,
>  }
> 
>  static int iommufd_device_do_attach(struct iommufd_device *idev,
> -				    struct iommufd_hw_pagetable *hwpt,
> -				    unsigned int flags)
> +				    struct iommufd_hw_pagetable *hwpt)
>  {
>  	phys_addr_t sw_msi_start = 0;
>  	int rc;
> @@ -226,7 +230,7 @@ static int iommufd_device_do_attach(struct
> iommufd_device *idev,
>  	if (rc)
>  		goto out_unlock;
> 
> -	rc = iommufd_device_setup_msi(idev, hwpt, sw_msi_start, flags);
> +	rc = iommufd_device_setup_msi(idev, hwpt, sw_msi_start);
>  	if (rc)
>  		goto out_iova;
> 
> @@ -268,8 +272,7 @@ static int iommufd_device_do_attach(struct
> iommufd_device *idev,
>   * Automatic domain selection will never pick a manually created domain.
>   */
>  static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
> -					  struct iommufd_ioas *ioas,
> -					  unsigned int flags)
> +					  struct iommufd_ioas *ioas)
>  {
>  	struct iommufd_hw_pagetable *hwpt;
>  	int rc;
> @@ -284,7 +287,7 @@ static int iommufd_device_auto_get_domain(struct
> iommufd_device *idev,
>  		if (!hwpt->auto_domain)
>  			continue;
> 
> -		rc = iommufd_device_do_attach(idev, hwpt, flags);
> +		rc = iommufd_device_do_attach(idev, hwpt);
> 
>  		/*
>  		 * -EINVAL means the domain is incompatible with the device.
> @@ -303,7 +306,7 @@ static int iommufd_device_auto_get_domain(struct
> iommufd_device *idev,
>  	}
>  	hwpt->auto_domain = true;
> 
> -	rc = iommufd_device_do_attach(idev, hwpt, flags);
> +	rc = iommufd_device_do_attach(idev, hwpt);
>  	if (rc)
>  		goto out_abort;
>  	list_add_tail(&hwpt->hwpt_item, &ioas->hwpt_list);
> @@ -324,7 +327,6 @@ static int iommufd_device_auto_get_domain(struct
> iommufd_device *idev,
>   * @idev: device to attach
>   * @pt_id: Input a IOMMUFD_OBJ_IOAS, or
> IOMMUFD_OBJ_HW_PAGETABLE
>   *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
> - * @flags: Optional flags
>   *
>   * This connects the device to an iommu_domain, either automatically or
> manually
>   * selected. Once this completes the device could do DMA.
> @@ -332,8 +334,7 @@ static int iommufd_device_auto_get_domain(struct
> iommufd_device *idev,
>   * The caller should return the resulting pt_id back to userspace.
>   * This function is undone by calling iommufd_device_detach().
>   */
> -int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id,
> -			  unsigned int flags)
> +int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
>  {
>  	struct iommufd_object *pt_obj;
>  	int rc;
> @@ -347,7 +348,7 @@ int iommufd_device_attach(struct iommufd_device
> *idev, u32 *pt_id,
>  		struct iommufd_hw_pagetable *hwpt =
>  			container_of(pt_obj, struct iommufd_hw_pagetable,
> obj);
> 
> -		rc = iommufd_device_do_attach(idev, hwpt, flags);
> +		rc = iommufd_device_do_attach(idev, hwpt);
>  		if (rc)
>  			goto out_put_pt_obj;
> 
> @@ -360,7 +361,7 @@ int iommufd_device_attach(struct iommufd_device
> *idev, u32 *pt_id,
>  		struct iommufd_ioas *ioas =
>  			container_of(pt_obj, struct iommufd_ioas, obj);
> 
> -		rc = iommufd_device_auto_get_domain(idev, ioas, flags);
> +		rc = iommufd_device_auto_get_domain(idev, ioas);
>  		if (rc)
>  			goto out_put_pt_obj;
>  		break;
> diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
> index 5a7ce4d9fbae0a..da50feb24b6e1d 100644
> --- a/drivers/vfio/iommufd.c
> +++ b/drivers/vfio/iommufd.c
> @@ -108,12 +108,9 @@
> EXPORT_SYMBOL_GPL(vfio_iommufd_physical_unbind);
> 
>  int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
>  {
> -	unsigned int flags = 0;
>  	int rc;
> 
> -	if (vfio_allow_unsafe_interrupts)
> -		flags |=
> IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT;
> -	rc = iommufd_device_attach(vdev->iommufd_device, pt_id, flags);
> +	rc = iommufd_device_attach(vdev->iommufd_device, pt_id);
>  	if (rc)
>  		return rc;
>  	vdev->iommufd_attached = true;
> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> index 3378714a746274..ce5fe3fc493b4e 100644
> --- a/drivers/vfio/vfio.h
> +++ b/drivers/vfio/vfio.h
> @@ -216,6 +216,4 @@ extern bool vfio_noiommu __read_mostly;
>  enum { vfio_noiommu = false };
>  #endif
> 
> -extern bool vfio_allow_unsafe_interrupts;
> -
>  #endif
> diff --git a/drivers/vfio/vfio_iommu_type1.c
> b/drivers/vfio/vfio_iommu_type1.c
> index 186e33a006d314..23c24fe98c00d4 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -44,8 +44,9 @@
>  #define DRIVER_AUTHOR   "Alex Williamson
> <alex.williamson@redhat.com>"
>  #define DRIVER_DESC     "Type1 IOMMU driver for VFIO"
> 
> +static bool allow_unsafe_interrupts;
>  module_param_named(allow_unsafe_interrupts,
> -		   vfio_allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
> +		   allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
>  MODULE_PARM_DESC(allow_unsafe_interrupts,
>  		 "Enable VFIO IOMMU support for on platforms without
> interrupt remapping support.");
> 
> @@ -2281,7 +2282,7 @@ static int vfio_iommu_type1_attach_group(void
> *iommu_data,
>  		    iommu_group_for_each_dev(iommu_group, (void
> *)IOMMU_CAP_INTR_REMAP,
>  					     vfio_iommu_device_capable);
> 
> -	if (!vfio_allow_unsafe_interrupts && !msi_remap) {
> +	if (!allow_unsafe_interrupts && !msi_remap) {
>  		pr_warn("%s: No interrupt remapping support.  Use the
> module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support
> on this platform\n",
>  		       __func__);
>  		ret = -EPERM;
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index 451a07eb702b34..593d45f43a16ba 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -52,9 +52,6 @@ static struct vfio {
>  	struct ida			device_ida;
>  } vfio;
> 
> -bool vfio_allow_unsafe_interrupts;
> -EXPORT_SYMBOL_GPL(vfio_allow_unsafe_interrupts);
> -
>  static DEFINE_XARRAY(vfio_device_set_xa);
>  static const struct file_operations vfio_group_fops;
> 
> diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
> index bf2b3ea5f90fd2..9d1afd417215d0 100644
> --- a/include/linux/iommufd.h
> +++ b/include/linux/iommufd.h
> @@ -21,11 +21,7 @@ struct iommufd_device *iommufd_device_bind(struct
> iommufd_ctx *ictx,
>  					   struct device *dev, u32 *id);
>  void iommufd_device_unbind(struct iommufd_device *idev);
> 
> -enum {
> -	IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT = 1 << 0,
> -};
> -int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id,
> -			  unsigned int flags);
> +int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id);
>  void iommufd_device_detach(struct iommufd_device *idev);
> 
>  struct iommufd_access_ops {

this looks good to me:

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-23 16:38                 ` Jason Gunthorpe
@ 2022-11-24  5:30                   ` Tian, Kevin
  2022-11-24 13:24                     ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Tian, Kevin @ 2022-11-24  5:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, kvm, Yang,
	Lixiao, Matthew Rosato, Nicolin Chen, Liu, Yi L, He, Yu

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, November 24, 2022 12:38 AM
> 
> On Wed, Nov 23, 2022 at 01:21:30AM +0000, Tian, Kevin wrote:
> 
> > I'm not sure the value of entering this mixed world. I could envision
> > dpdk starting to add cdev/iommufd support when it wants to use
> > new features (pasid, iopf, etc.) which are available only via iommufd
> > native api. Before that point it just stays with full vfio legacy.
> 
> I think the value is for the distro that would benefit from getting
> apps validated and running on iommufd with the least effort invested.
> 
> So I'd like to see all the vfio apps we can convert to the new device
> interface and iommufd just to be converted, even if they don't make
> use of new features.
> 

We'll certainly change Qemu but then it will support either legacy vfio
or new iommufd native. No intermediate step like above.

what'd be the practice to push other vfio apps to do such conversion?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c
  2022-11-24  5:30                   ` Tian, Kevin
@ 2022-11-24 13:24                     ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2022-11-24 13:24 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, kvm, Yang,
	Lixiao, Matthew Rosato, Nicolin Chen, Liu, Yi L, He, Yu

On Thu, Nov 24, 2022 at 05:30:00AM +0000, Tian, Kevin wrote:

> what'd be the practice to push other vfio apps to do such conversion?

If you are operating a custom hypervisor environment there is security
and maintenance value in minimizing the kernel code enabled.

So eg needing iommufd turned on to run qemu and vfio but also needing
vfio container & group turned on to run dpdk as part of the OVS is not
desirable compared to just updating dpdk.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 00/11] Connect VFIO to IOMMUFD
  2022-11-23 13:04     ` Yi Liu
@ 2022-11-29 12:41       ` Yi Liu
  0 siblings, 0 replies; 35+ messages in thread
From: Yi Liu @ 2022-11-29 12:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Lu Baolu, Cornelia Huck, Eric Auger, Kevin Tian,
	kvm, Lixiao Yang, Matthew Rosato, Nicolin Chen, Yu He

On 2022/11/23 21:04, Yi Liu wrote:
> On 2022/11/23 20:59, Jason Gunthorpe wrote:
>> On Wed, Nov 23, 2022 at 10:44:12AM +0800, Yi Liu wrote:
>>> Hi Jason,
>>>
>>> On 2022/11/17 05:05, Jason Gunthorpe wrote:
>>>> This series provides an alternative container layer for VFIO implemented
>>>> using iommufd. This is optional, if CONFIG_IOMMUFD is not set then it will
>>>> not be compiled in.
>>>>
>>>> At this point iommufd can be injected by passing in a iommfd FD to
>>>> VFIO_GROUP_SET_CONTAINER which will use the VFIO compat layer in iommufd
>>>> to obtain the compat IOAS and then connect up all the VFIO drivers as
>>>> appropriate.
>>>>
>>>> This is temporary stopping point, a following series will provide a way to
>>>> directly open a VFIO device FD and directly connect it to IOMMUFD using
>>>> native ioctls that can expose the IOMMUFD features like hwpt, future
>>>> vPASID and dynamic attachment.
>>>>
>>>> This series, in compat mode, has passed all the qemu tests we have
>>>> available, including the test suites for the Intel GVT mdev. Aside from
>>>> the temporary limitation with P2P memory this is belived to be fully
>>>> compatible with VFIO.
>>>>
>>>> This is on github: 
>>>> https://github.com/jgunthorpe/linux/commits/vfio_iommufd
>>>>
>>>> It requires the iommufd series:
>>>>
>>>> https://lore.kernel.org/r/0-v5-4001c2997bd0+30c-iommufd_jgg@nvidia.com
>>>
>>> gvtg test encountered broken display with below commit in your for-next
>>> branch.
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git/commit/?h=for-next&id=57f62422b6f0477afaddd2fc77a4bb9b94275f42 
>>>
>>>
>>> I noticed there are diffs in drivers/vfio/ and drivers/iommu/iommufd/
>>> between this commit and the last tested commit (37c9e6e44d77a). Seems
>>> to have regression due to the diffs.
>>
>> Do you have something more to go on? I am checking the diff and not
>> getting any idea. The above also merges v6.1-rc5 into the tree, is
>> there a chance rc5 is the gvt problem?
> 
> that is possible, I'll let my colleague revert it and try.

after reverting the v6.1-rc5 merge commit, the test is passed. also
tried your latest branch which has v6.1-rc6 merge commit, still observed
blank display. but this time, this issue can be recovered by restarting
gdm. It looks like the regression is not due to iommufd. May due to diff
in gvt and i915 itself. So reported a regression as below link.

https://lists.freedesktop.org/archives/intel-gvt-dev/2022-November/011488.html

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2022-11-29 12:41 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-16 21:05 [PATCH v3 00/11] Connect VFIO to IOMMUFD Jason Gunthorpe
2022-11-16 21:05 ` [PATCH v3 01/11] vfio: Move vfio_device driver open/close code to a function Jason Gunthorpe
2022-11-16 21:05 ` [PATCH v3 02/11] vfio: Move vfio_device_assign_container() into vfio_device_first_open() Jason Gunthorpe
2022-11-16 21:05 ` [PATCH v3 03/11] vfio: Rename vfio_device_assign/unassign_container() Jason Gunthorpe
2022-11-16 21:05 ` [PATCH v3 04/11] vfio: Move storage of allow_unsafe_interrupts to vfio_main.c Jason Gunthorpe
2022-11-17 20:14   ` Alex Williamson
2022-11-18 15:36     ` Jason Gunthorpe
2022-11-18 20:36       ` Alex Williamson
2022-11-22  1:59         ` Jason Gunthorpe
2022-11-22 17:34           ` Alex Williamson
2022-11-22 17:41             ` Jason Gunthorpe
2022-11-23  1:21               ` Tian, Kevin
2022-11-23 16:38                 ` Jason Gunthorpe
2022-11-24  5:30                   ` Tian, Kevin
2022-11-24 13:24                     ` Jason Gunthorpe
2022-11-23 19:47           ` Jason Gunthorpe
2022-11-24  5:26             ` Tian, Kevin
2022-11-16 21:05 ` [PATCH v3 05/11] vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent() Jason Gunthorpe
2022-11-16 21:05 ` [PATCH v3 06/11] vfio-iommufd: Allow iommufd to be used in place of a container fd Jason Gunthorpe
2022-11-16 23:31   ` Alex Williamson
2022-11-17  0:20     ` Jason Gunthorpe
2022-11-16 21:05 ` [PATCH v3 07/11] vfio-iommufd: Support iommufd for physical VFIO devices Jason Gunthorpe
2022-11-18  1:30   ` Tian, Kevin
2022-11-16 21:05 ` [PATCH v3 08/11] vfio-iommufd: Support iommufd for emulated " Jason Gunthorpe
2022-11-18  2:03   ` Tian, Kevin
2022-11-16 21:05 ` [PATCH v3 09/11] vfio: Move container related MODULE_ALIAS statements into container.c Jason Gunthorpe
2022-11-16 21:05 ` [PATCH v3 10/11] vfio: Make vfio_container optionally compiled Jason Gunthorpe
2022-11-18  2:05   ` Tian, Kevin
2022-11-16 21:05 ` [PATCH v3 11/11] iommufd: Allow iommufd to supply /dev/vfio/vfio Jason Gunthorpe
2022-11-17 20:34   ` Alex Williamson
2022-11-18 13:07     ` Jason Gunthorpe
2022-11-23  2:44 ` [PATCH v3 00/11] Connect VFIO to IOMMUFD Yi Liu
2022-11-23 12:59   ` Jason Gunthorpe
2022-11-23 13:04     ` Yi Liu
2022-11-29 12:41       ` Yi Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.