* [PATCH v3 00/15] Add vfio_device cdev for iommufd support
From: Yi Liu @ 2023-02-13 15:13 UTC
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

Existing VFIO provides group-centric user APIs. Userspace must open
/dev/vfio/$group_id first, then get a device fd from the group, and only
then gains access to the device. This is not the desired model for iommufd.
Per the conclusion of the community discussion[1], iommufd provides
device-centric kAPIs and requires its consumers (like VFIO) to provide
device-centric user APIs. Such user APIs are used to associate a device
with an iommufd and with the I/O address spaces managed by that iommufd.

This series first introduces a per-device file structure to prepare for
further enhancement, and refactors the kvm-vfio code to prepare for
accepting a device file from userspace. It then refactors vfio so that it
can handle iommufd binding. This refactoring includes a mechanism for
blocking device access before the iommufd bind, and makes
vfio_device_open() exclusive between the group path and the cdev path.
Finally, it adds cdev support for vfio devices and makes the group
infrastructure optional, as it is not needed when vfio device cdev is
compiled in.
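
The exclusion between the group path and the cdev path can be pictured with
a small userspace model. This is only a sketch under stated assumptions: the
struct and function names (model_group, model_group_path_open, and so on)
are hypothetical, locking is omitted, and the real patches may implement the
check differently. It only illustrates the idea of a per-group counter of
cdev opens that blocks the legacy path, and vice versa.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a vfio group; not the kernel's struct vfio_group. */
struct model_group {
	bool group_path_open;       /* legacy group fd currently open */
	unsigned int cdev_open_cnt; /* devices in this group opened via cdev */
};

/* Legacy path: refuse if any device in the group is open through its cdev. */
int model_group_path_open(struct model_group *g)
{
	if (g->cdev_open_cnt > 0 || g->group_path_open)
		return -EBUSY;
	g->group_path_open = true;
	return 0;
}

void model_group_path_close(struct model_group *g)
{
	assert(g->group_path_open);
	g->group_path_open = false;
}

/* cdev path: refuse if the group fd is open, otherwise count the open. */
int model_cdev_path_open(struct model_group *g)
{
	if (g->group_path_open)
		return -EBUSY;
	g->cdev_open_cnt++;
	return 0;
}

void model_cdev_path_close(struct model_group *g)
{
	assert(g->cdev_open_cnt > 0);
	g->cdev_open_cnt--;
}
```

In this model, whichever path opens first wins, and the other path gets
-EBUSY until the first one is fully closed. A real implementation would need
to do the check and the update under a lock.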

This is also a prerequisite for iommu nesting for vfio device[2].

The complete code can be found in the branch below. Simple tests were done
with both the legacy group path and the cdev path. A draft QEMU branch can
be found at [3].

https://github.com/yiliu1765/iommufd/tree/vfio_device_cdev_v3
(config CONFIG_IOMMUFD=y CONFIG_VFIO_DEVICE_CDEV=y)

base-commit: 06a24ad

[1] https://lore.kernel.org/kvm/BN9PR11MB5433B1E4AE5B0480369F97178C189@BN9PR11MB5433.namprd11.prod.outlook.com/
[2] https://lore.kernel.org/linux-iommu/20230209043153.14964-1-yi.l.liu@intel.com/
[3] https://github.com/yiliu1765/qemu/tree/iommufd_rfcv3 (it is based on Eric's
    QEMU iommufd rfcv3 (https://lore.kernel.org/kvm/20230131205305.2726330-1-eric.auger@redhat.com/)
    plus two commits to align with vfio_device_cdev v3)

Change log:

v3:
 - Add r-b from Kevin on patch 03, 06, 07, 08.
 - Refine the group and cdev path exclusion. Remove vfio_device:single_open;
   add vfio_group::cdev_device_open_cnt to achieve exclusion between the
   group path and the cdev path (Kevin, Jason)
 - Fix a bug in the error handling path (Yan Zhao)
 - Address misc remarks from Kevin

v2: https://lore.kernel.org/kvm/20230206090532.95598-1-yi.l.liu@intel.com/
 - Add r-b from Kevin and Eric on patch 01 02 04.
 - Split "kvm/vfio: Provide struct kvm_device_ops::release() insted of ::destroy()"
   from this series; it has been applied. (Alex, Kevin, Jason, Matthew)
 - Add kvm_ref_lock to protect vfio_device_file->kvm instead of reusing
   dev_set->lock, as a deadlock is observed with vfio-ap, which would try to
   acquire kvm_lock. This is the opposite lock order from kvm_device_release(),
   which holds kvm_lock first and then takes dev_set->lock. (Kevin)
 - Use a separate ioctl for detaching IOAS. (Alex)
 - Rename vfio_device_file::single_open to be is_cdev_device (Kevin, Alex)
 - Move the vfio device cdev code into device_cdev.c and add a VFIO_DEVICE_CDEV
   kconfig for it. (Kevin, Jason)

v1: https://lore.kernel.org/kvm/20230117134942.101112-1-yi.l.liu@intel.com/
 - Fix the circular refcount between kvm struct and device file reference. (JasonG)
 - Address comments from KevinT
 - Kept the ioctl for detach; whether it stays is up to Alex's taste
   (https://lore.kernel.org/kvm/BN9PR11MB5276BE9F4B0613EE859317028CFF9@BN9PR11MB5276.namprd11.prod.outlook.com/)

rfc: https://lore.kernel.org/kvm/20221219084718.9342-1-yi.l.liu@intel.com/

Thanks,
	Yi Liu

Yi Liu (15):
  vfio: Allocate per device file structure
  vfio: Refine vfio file kAPIs
  vfio: Accept vfio device file in the driver facing kAPI
  kvm/vfio: Rename kvm_vfio_group to prepare for accepting vfio device
    fd
  kvm/vfio: Accept vfio device file from userspace
  vfio: Pass struct vfio_device_file * to vfio_device_open/close()
  vfio: Block device access via device fd until device is opened
  vfio: Add infrastructure for bind_iommufd from userspace
  vfio-iommufd: Add detach_ioas support for physical VFIO devices
  vfio-iommufd: Add detach_ioas for emulated VFIO devices
  vfio: Add cdev_device_open_cnt to vfio_group
  vfio: Make vfio_device_open() single open for device cdev path
  vfio: Add cdev for vfio_device
  vfio: Add ioctls for device cdev using iommufd
  vfio: Compile group optionally

 Documentation/driver-api/vfio.rst             |   8 +-
 Documentation/virt/kvm/devices/vfio.rst       |  45 ++-
 drivers/gpu/drm/i915/gvt/kvmgt.c              |   1 +
 drivers/s390/cio/vfio_ccw_ops.c               |   1 +
 drivers/s390/crypto/vfio_ap_ops.c             |   1 +
 drivers/vfio/Kconfig                          |  29 ++
 drivers/vfio/Makefile                         |   3 +-
 drivers/vfio/device_cdev.c                    | 264 ++++++++++++++++
 drivers/vfio/fsl-mc/vfio_fsl_mc.c             |   1 +
 drivers/vfio/group.c                          | 149 +++++----
 drivers/vfio/iommufd.c                        |  59 +++-
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    |   2 +
 drivers/vfio/pci/mlx5/main.c                  |   1 +
 drivers/vfio/pci/vfio_pci.c                   |   1 +
 drivers/vfio/pci/vfio_pci_core.c              |   4 +-
 drivers/vfio/platform/vfio_amba.c             |   1 +
 drivers/vfio/platform/vfio_platform.c         |   1 +
 drivers/vfio/vfio.h                           | 168 +++++++++-
 drivers/vfio/vfio_main.c                      | 295 ++++++++++++++++--
 include/linux/iommufd.h                       |   6 +
 include/linux/vfio.h                          |  28 +-
 include/uapi/linux/kvm.h                      |  16 +-
 include/uapi/linux/vfio.h                     |  86 +++++
 virt/kvm/vfio.c                               | 141 ++++-----
 24 files changed, 1106 insertions(+), 205 deletions(-)
 create mode 100644 drivers/vfio/device_cdev.c

-- 
2.34.1


* [PATCH v3 01/15] vfio: Allocate per device file structure
From: Yi Liu @ 2023-02-13 15:13 UTC
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

This is preparation for adding vfio device cdev support. vfio device
cdev requires:
1) per-device-file memory to store the kvm pointer set by KVM. It will
   be propagated to vfio_device:kvm after the device cdev file is bound
   to an iommufd
2) a mechanism to block device access through the device cdev fd before it
   is bound to an iommufd

To address the above requirements, this adds a per-device-file structure
named vfio_device_file. For now, it is only a wrapper around a struct
vfio_device pointer. Other fields will be added to this per-file structure
in future commits.
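
The wrapper pattern described above can be sketched in plain userspace C.
All names here (model_device, model_device_file, model_file and the
functions) are hypothetical stand-ins, calloc()/free() replace
kzalloc()/kfree(), and an int out-parameter replaces the kernel's
ERR_PTR() convention. The sketch shows only the ownership flow: allocate
the per-file object before opening the device, unwind in reverse order on
failure, and free the wrapper on release.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel objects in this patch. */
struct model_device {
	int open_count;
};

struct model_device_file {
	struct model_device *device; /* the only field for now */
};

struct model_file {
	void *private_data;
};

/* Allocate the per-file wrapper; returns NULL and sets *err on failure. */
struct model_device_file *
model_allocate_device_file(struct model_device *device, int *err)
{
	struct model_device_file *df = calloc(1, sizeof(*df));

	if (!df) {
		*err = -ENOMEM;
		return NULL;
	}
	df->device = device;
	*err = 0;
	return df;
}

/* Open: wrapper first, then the device; unwind in reverse order on error. */
int model_open_file(struct model_device *device, struct model_file *filep,
		    int fail_open)
{
	struct model_device_file *df;
	int ret;

	df = model_allocate_device_file(device, &ret);
	if (!df)
		goto err_out;

	if (fail_open) { /* simulate the device open step failing */
		ret = -EBUSY;
		goto err_free;
	}
	device->open_count++;
	filep->private_data = df; /* the file now owns the wrapper */
	return 0;

err_free:
	free(df);
err_out:
	return ret;
}

/* Release: reach the device through the wrapper, then free the wrapper. */
void model_release_file(struct model_file *filep)
{
	struct model_device_file *df = filep->private_data;
	struct model_device *device = df->device;

	assert(device->open_count > 0);
	device->open_count--;
	free(df);
	filep->private_data = NULL;
}
```

Every file operation that used to read the device straight from
private_data now takes one extra hop through the wrapper, which is what
leaves room for per-file state in later patches.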

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/vfio/group.c     | 13 +++++++++++--
 drivers/vfio/vfio.h      |  6 ++++++
 drivers/vfio/vfio_main.c | 31 ++++++++++++++++++++++++++-----
 3 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index 0e9036e2b9c4..cf51e1a0fd96 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -215,19 +215,26 @@ void vfio_device_group_close(struct vfio_device *device)
 
 static struct file *vfio_device_open_file(struct vfio_device *device)
 {
+	struct vfio_device_file *df;
 	struct file *filep;
 	int ret;
 
+	df = vfio_allocate_device_file(device);
+	if (IS_ERR(df)) {
+		ret = PTR_ERR(df);
+		goto err_out;
+	}
+
 	ret = vfio_device_group_open(device);
 	if (ret)
-		goto err_out;
+		goto err_free;
 
 	/*
 	 * We can't use anon_inode_getfd() because we need to modify
 	 * the f_mode flags directly to allow more than just ioctls
 	 */
 	filep = anon_inode_getfile("[vfio-device]", &vfio_device_fops,
-				   device, O_RDWR);
+				   df, O_RDWR);
 	if (IS_ERR(filep)) {
 		ret = PTR_ERR(filep);
 		goto err_close_device;
@@ -251,6 +258,8 @@ static struct file *vfio_device_open_file(struct vfio_device *device)
 
 err_close_device:
 	vfio_device_group_close(device);
+err_free:
+	kfree(df);
 err_out:
 	return ERR_PTR(ret);
 }
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index e9721d8424bc..61bbf673e672 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -16,11 +16,17 @@ struct iommu_group;
 struct vfio_device;
 struct vfio_container;
 
+struct vfio_device_file {
+	struct vfio_device *device;
+};
+
 void vfio_device_put_registration(struct vfio_device *device);
 bool vfio_device_try_get_registration(struct vfio_device *device);
 int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd);
 void vfio_device_close(struct vfio_device *device,
 		       struct iommufd_ctx *iommufd);
+struct vfio_device_file *
+vfio_allocate_device_file(struct vfio_device *device);
 
 extern const struct file_operations vfio_device_fops;
 
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 3a597e799918..d99fa0cec18e 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -396,6 +396,20 @@ static bool vfio_assert_device_open(struct vfio_device *device)
 	return !WARN_ON_ONCE(!READ_ONCE(device->open_count));
 }
 
+struct vfio_device_file *
+vfio_allocate_device_file(struct vfio_device *device)
+{
+	struct vfio_device_file *df;
+
+	df = kzalloc(sizeof(*df), GFP_KERNEL_ACCOUNT);
+	if (!df)
+		return ERR_PTR(-ENOMEM);
+
+	df->device = device;
+
+	return df;
+}
+
 static int vfio_device_first_open(struct vfio_device *device,
 				  struct iommufd_ctx *iommufd)
 {
@@ -509,12 +523,15 @@ static inline void vfio_device_pm_runtime_put(struct vfio_device *device)
  */
 static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 {
-	struct vfio_device *device = filep->private_data;
+	struct vfio_device_file *df = filep->private_data;
+	struct vfio_device *device = df->device;
 
 	vfio_device_group_close(device);
 
 	vfio_device_put_registration(device);
 
+	kfree(df);
+
 	return 0;
 }
 
@@ -1079,7 +1096,8 @@ static int vfio_ioctl_device_feature(struct vfio_device *device,
 static long vfio_device_fops_unl_ioctl(struct file *filep,
 				       unsigned int cmd, unsigned long arg)
 {
-	struct vfio_device *device = filep->private_data;
+	struct vfio_device_file *df = filep->private_data;
+	struct vfio_device *device = df->device;
 	int ret;
 
 	ret = vfio_device_pm_runtime_get(device);
@@ -1106,7 +1124,8 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
 static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf,
 				     size_t count, loff_t *ppos)
 {
-	struct vfio_device *device = filep->private_data;
+	struct vfio_device_file *df = filep->private_data;
+	struct vfio_device *device = df->device;
 
 	if (unlikely(!device->ops->read))
 		return -EINVAL;
@@ -1118,7 +1137,8 @@ static ssize_t vfio_device_fops_write(struct file *filep,
 				      const char __user *buf,
 				      size_t count, loff_t *ppos)
 {
-	struct vfio_device *device = filep->private_data;
+	struct vfio_device_file *df = filep->private_data;
+	struct vfio_device *device = df->device;
 
 	if (unlikely(!device->ops->write))
 		return -EINVAL;
@@ -1128,7 +1148,8 @@ static ssize_t vfio_device_fops_write(struct file *filep,
 
 static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma)
 {
-	struct vfio_device *device = filep->private_data;
+	struct vfio_device_file *df = filep->private_data;
+	struct vfio_device *device = df->device;
 
 	if (unlikely(!device->ops->mmap))
 		return -EINVAL;
-- 
2.34.1


* [PATCH v3 02/15] vfio: Refine vfio file kAPIs
From: Yi Liu @ 2023-02-13 15:13 UTC
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

This prepares the kAPIs below to accept both a group file and a device
file, instead of only a vfio group file.

  bool vfio_file_enforced_coherent(struct file *file);
  void vfio_file_set_kvm(struct file *file, struct kvm *kvm);
  bool vfio_file_has_dev(struct file *file, struct vfio_device *device);

Besides the above change, vfio_file_is_group() is renamed to
vfio_file_is_valid().
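
The dispatch idea behind this refactor can be sketched in userspace C. The
types and names below (model_fops, model_file, model_group and the
functions) are hypothetical stand-ins, not the kernel's. The sketch shows a
single helper that identifies a group file by comparing its operations
table, and file-level wrappers that fall back to a default when the file is
not a group file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a file operations table. */
struct model_fops {
	const char *name;
};

const struct model_fops model_group_fops = { "group" };
const struct model_fops model_device_fops = { "device" };

struct model_group {
	int id;
	bool coherent;
};

struct model_file {
	const struct model_fops *f_op;
	void *private_data;
};

/* Return the group behind the file, or NULL if it is not a group file. */
struct model_group *model_group_from_file(struct model_file *file)
{
	assert(file != NULL);
	if (file->f_op != &model_group_fops)
		return NULL;
	return file->private_data;
}

/* For now only group files are recognised as valid. */
bool model_file_is_valid(struct model_file *file)
{
	return model_group_from_file(file) != NULL;
}

/* Group files delegate to the group; other files default to true. */
bool model_file_enforced_coherent(struct model_file *file)
{
	struct model_group *group = model_group_from_file(file);

	if (group)
		return group->coherent;
	return true;
}
```

With the type check in one helper, a later patch can add a second branch
for device files in each wrapper without touching the callers.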

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/vfio/group.c             | 74 ++++++++------------------------
 drivers/vfio/pci/vfio_pci_core.c |  4 +-
 drivers/vfio/vfio.h              |  4 ++
 drivers/vfio/vfio_main.c         | 62 ++++++++++++++++++++++++++
 include/linux/vfio.h             |  2 +-
 virt/kvm/vfio.c                  | 10 ++---
 6 files changed, 92 insertions(+), 64 deletions(-)

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index cf51e1a0fd96..cc0eded19a9f 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -751,6 +751,15 @@ bool vfio_device_has_container(struct vfio_device *device)
 	return device->group->container;
 }
 
+struct vfio_group *vfio_group_from_file(struct file *file)
+{
+	struct vfio_group *group = file->private_data;
+
+	if (file->f_op != &vfio_group_fops)
+		return NULL;
+	return group;
+}
+
 /**
  * vfio_file_iommu_group - Return the struct iommu_group for the vfio group file
  * @file: VFIO group file
@@ -761,13 +770,13 @@ bool vfio_device_has_container(struct vfio_device *device)
  */
 struct iommu_group *vfio_file_iommu_group(struct file *file)
 {
-	struct vfio_group *group = file->private_data;
+	struct vfio_group *group = vfio_group_from_file(file);
 	struct iommu_group *iommu_group = NULL;
 
 	if (!IS_ENABLED(CONFIG_SPAPR_TCE_IOMMU))
 		return NULL;
 
-	if (!vfio_file_is_group(file))
+	if (!group)
 		return NULL;
 
 	mutex_lock(&group->group_lock);
@@ -780,34 +789,11 @@ struct iommu_group *vfio_file_iommu_group(struct file *file)
 }
 EXPORT_SYMBOL_GPL(vfio_file_iommu_group);
 
-/**
- * vfio_file_is_group - True if the file is usable with VFIO aPIS
- * @file: VFIO group file
- */
-bool vfio_file_is_group(struct file *file)
-{
-	return file->f_op == &vfio_group_fops;
-}
-EXPORT_SYMBOL_GPL(vfio_file_is_group);
-
-/**
- * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file
- *        is always CPU cache coherent
- * @file: VFIO group file
- *
- * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop
- * bit in DMA transactions. A return of false indicates that the user has
- * rights to access additional instructions such as wbinvd on x86.
- */
-bool vfio_file_enforced_coherent(struct file *file)
+bool vfio_group_enforced_coherent(struct vfio_group *group)
 {
-	struct vfio_group *group = file->private_data;
 	struct vfio_device *device;
 	bool ret = true;
 
-	if (!vfio_file_is_group(file))
-		return true;
-
 	/*
 	 * If the device does not have IOMMU_CAP_ENFORCE_CACHE_COHERENCY then
 	 * any domain later attached to it will also not support it. If the cap
@@ -825,46 +811,22 @@ bool vfio_file_enforced_coherent(struct file *file)
 	mutex_unlock(&group->device_lock);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent);
 
-/**
- * vfio_file_set_kvm - Link a kvm with VFIO drivers
- * @file: VFIO group file
- * @kvm: KVM to link
- *
- * When a VFIO device is first opened the KVM will be available in
- * device->kvm if one was associated with the group.
- */
-void vfio_file_set_kvm(struct file *file, struct kvm *kvm)
+void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm)
 {
-	struct vfio_group *group = file->private_data;
-
-	if (!vfio_file_is_group(file))
-		return;
-
+	/*
+	 * When a VFIO device is first opened the KVM will be available in
+	 * device->kvm if one was associated with the group.
+	 */
 	spin_lock(&group->kvm_ref_lock);
 	group->kvm = kvm;
 	spin_unlock(&group->kvm_ref_lock);
 }
-EXPORT_SYMBOL_GPL(vfio_file_set_kvm);
 
-/**
- * vfio_file_has_dev - True if the VFIO file is a handle for device
- * @file: VFIO file to check
- * @device: Device that must be part of the file
- *
- * Returns true if given file has permission to manipulate the given device.
- */
-bool vfio_file_has_dev(struct file *file, struct vfio_device *device)
+bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device *device)
 {
-	struct vfio_group *group = file->private_data;
-
-	if (!vfio_file_is_group(file))
-		return false;
-
 	return group == device->group;
 }
-EXPORT_SYMBOL_GPL(vfio_file_has_dev);
 
 static char *vfio_devnode(const struct device *dev, umode_t *mode)
 {
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index a6492a25ff6a..4704c1babae3 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1320,8 +1320,8 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev,
 			break;
 		}
 
-		/* Ensure the FD is a vfio group FD.*/
-		if (!vfio_file_is_group(file)) {
+		/* Ensure the FD is a vfio FD.*/
+		if (!vfio_file_is_valid(file)) {
 			fput(file);
 			ret = -EINVAL;
 			break;
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 61bbf673e672..f237e9410d1e 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -90,6 +90,10 @@ void vfio_device_group_unregister(struct vfio_device *device);
 int vfio_device_group_use_iommu(struct vfio_device *device);
 void vfio_device_group_unuse_iommu(struct vfio_device *device);
 void vfio_device_group_close(struct vfio_device *device);
+struct vfio_group *vfio_group_from_file(struct file *file);
+bool vfio_group_enforced_coherent(struct vfio_group *group);
+void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm);
+bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device *device);
 bool vfio_device_has_container(struct vfio_device *device);
 int __init vfio_group_init(void);
 void vfio_group_cleanup(void);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index d99fa0cec18e..8612ba112e7f 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -1167,6 +1167,68 @@ const struct file_operations vfio_device_fops = {
 	.mmap		= vfio_device_fops_mmap,
 };
 
+/**
+ * vfio_file_is_valid - True if the file is usable with VFIO APIS
+ * @file: VFIO group file or VFIO device file
+ */
+bool vfio_file_is_valid(struct file *file)
+{
+	return vfio_group_from_file(file);
+}
+EXPORT_SYMBOL_GPL(vfio_file_is_valid);
+
+/**
+ * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file
+ *        is always CPU cache coherent
+ * @file: VFIO group file or VFIO device file
+ *
+ * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop
+ * bit in DMA transactions. A return of false indicates that the user has
+ * rights to access additional instructions such as wbinvd on x86.
+ */
+bool vfio_file_enforced_coherent(struct file *file)
+{
+	struct vfio_group *group = vfio_group_from_file(file);
+
+	if (group)
+		return vfio_group_enforced_coherent(group);
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent);
+
+/**
+ * vfio_file_set_kvm - Link a kvm with VFIO drivers
+ * @file: VFIO group file or VFIO device file
+ * @kvm: KVM to link
+ *
+ */
+void vfio_file_set_kvm(struct file *file, struct kvm *kvm)
+{
+	struct vfio_group *group = vfio_group_from_file(file);
+
+	if (group)
+		vfio_group_set_kvm(group, kvm);
+}
+EXPORT_SYMBOL_GPL(vfio_file_set_kvm);
+
+/**
+ * vfio_file_has_dev - True if the VFIO file is a handle for device
+ * @file: VFIO file to check, VFIO group file or VFIO device file
+ * @device: Device that must be part of the file
+ *
+ * Returns true if given file has permission to manipulate the given device.
+ */
+bool vfio_file_has_dev(struct file *file, struct vfio_device *device)
+{
+	struct vfio_group *group = vfio_group_from_file(file);
+
+	if (group)
+		return vfio_group_has_dev(group, device);
+	return false;
+}
+EXPORT_SYMBOL_GPL(vfio_file_has_dev);
+
 /*
  * Sub-module support
  */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 93134b023968..6a07e1c6c38e 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -245,7 +245,7 @@ int vfio_mig_get_next_state(struct vfio_device *device,
  * External user API
  */
 struct iommu_group *vfio_file_iommu_group(struct file *file);
-bool vfio_file_is_group(struct file *file);
+bool vfio_file_is_valid(struct file *file);
 bool vfio_file_enforced_coherent(struct file *file);
 void vfio_file_set_kvm(struct file *file, struct kvm *kvm);
 bool vfio_file_has_dev(struct file *file, struct vfio_device *device);
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 9584eb57e0ed..8bac308ba630 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -64,18 +64,18 @@ static bool kvm_vfio_file_enforced_coherent(struct file *file)
 	return ret;
 }
 
-static bool kvm_vfio_file_is_group(struct file *file)
+static bool kvm_vfio_file_is_valid(struct file *file)
 {
 	bool (*fn)(struct file *file);
 	bool ret;
 
-	fn = symbol_get(vfio_file_is_group);
+	fn = symbol_get(vfio_file_is_valid);
 	if (!fn)
 		return false;
 
 	ret = fn(file);
 
-	symbol_put(vfio_file_is_group);
+	symbol_put(vfio_file_is_valid);
 
 	return ret;
 }
@@ -154,8 +154,8 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
 	if (!filp)
 		return -EBADF;
 
-	/* Ensure the FD is a vfio group FD.*/
-	if (!kvm_vfio_file_is_group(filp)) {
+	/* Ensure the FD is a vfio FD.*/
+	if (!kvm_vfio_file_is_valid(filp)) {
 		ret = -EINVAL;
 		goto err_fput;
 	}
-- 
2.34.1
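
For readers following the refactor above, here is a minimal userspace
sketch of the pattern: a file is recognised as a VFIO group file purely by
the identity of its operations table, and the file-facing wrappers fall
back to a safe default for any other file. The struct layouts, the
enforced_coherent field and other_fops are hypothetical stand-ins for
illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, reduced to what is needed. */
struct file_operations { int unused; };
struct vfio_group { bool enforced_coherent; };
struct file {
	const struct file_operations *f_op;
	void *private_data;
};

/* Two distinct operations tables; only their addresses matter here. */
static const struct file_operations vfio_group_fops = { 0 };
static const struct file_operations other_fops = { 0 };

/* A file is a group file only if it uses the group operations table. */
static struct vfio_group *vfio_group_from_file(struct file *file)
{
	if (file->f_op != &vfio_group_fops)
		return NULL;
	return file->private_data;
}

static bool vfio_file_is_valid(struct file *file)
{
	return vfio_group_from_file(file) != NULL;
}

/* Files that are not VFIO files are reported as coherent, as in the patch. */
static bool vfio_file_enforced_coherent(struct file *file)
{
	struct vfio_group *group = vfio_group_from_file(file);

	if (group)
		return group->enforced_coherent;
	return true;
}
```

Checking f_op before trusting private_data is what makes the cast safe:
private_data is only interpreted once the file is known to have been
created by the group code.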


* [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

This makes the vfio file kAPIs accept vfio device files, in preparation
for vfio device cdev support.

When kvm is set through a vfio device file, the kvm pointer is stored in
struct vfio_device_file, and kvm_ref_lock protects both setting the kvm
pointer and its use within VFIO. The kvm pointer is propagated to
vfio_device after the device file is bound to iommufd in the cdev path.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio.h      |  2 ++
 drivers/vfio/vfio_main.c | 51 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index f237e9410d1e..cee979a1b90f 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -18,6 +18,8 @@ struct vfio_container;
 
 struct vfio_device_file {
 	struct vfio_device *device;
+	spinlock_t kvm_ref_lock; /* protect kvm field */
+	struct kvm *kvm;
 };
 
 void vfio_device_put_registration(struct vfio_device *device);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 8612ba112e7f..c529f609fecc 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -406,6 +406,7 @@ vfio_allocate_device_file(struct vfio_device *device)
 		return ERR_PTR(-ENOMEM);
 
 	df->device = device;
+	spin_lock_init(&df->kvm_ref_lock);
 
 	return df;
 }
@@ -1167,13 +1168,23 @@ const struct file_operations vfio_device_fops = {
 	.mmap		= vfio_device_fops_mmap,
 };
 
+static struct vfio_device *vfio_device_from_file(struct file *file)
+{
+	struct vfio_device_file *df = file->private_data;
+
+	if (file->f_op != &vfio_device_fops)
+		return NULL;
+	return df->device;
+}
+
 /**
  * vfio_file_is_valid - True if the file is usable with VFIO APIs
  * @file: VFIO group file or VFIO device file
  */
 bool vfio_file_is_valid(struct file *file)
 {
-	return vfio_group_from_file(file);
+	return vfio_group_from_file(file) ||
+	       vfio_device_from_file(file);
 }
 EXPORT_SYMBOL_GPL(vfio_file_is_valid);
 
@@ -1188,15 +1199,36 @@ EXPORT_SYMBOL_GPL(vfio_file_is_valid);
  */
 bool vfio_file_enforced_coherent(struct file *file)
 {
-	struct vfio_group *group = vfio_group_from_file(file);
+	struct vfio_group *group;
+	struct vfio_device *device;
 
+	group = vfio_group_from_file(file);
 	if (group)
 		return vfio_group_enforced_coherent(group);
 
+	device = vfio_device_from_file(file);
+	if (device)
+		return device_iommu_capable(device->dev,
+					    IOMMU_CAP_ENFORCE_CACHE_COHERENCY);
+
 	return true;
 }
 EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent);
 
+static void vfio_device_file_set_kvm(struct file *file, struct kvm *kvm)
+{
+	struct vfio_device_file *df = file->private_data;
+
+	/*
+	 * The kvm is first recorded in the vfio_device_file, and will
+	 * be propagated to vfio_device::kvm when the file is bound to
+	 * iommufd successfully in the vfio device cdev path.
+	 */
+	spin_lock(&df->kvm_ref_lock);
+	df->kvm = kvm;
+	spin_unlock(&df->kvm_ref_lock);
+}
+
 /**
  * vfio_file_set_kvm - Link a kvm with VFIO drivers
  * @file: VFIO group file or VFIO device file
@@ -1205,10 +1237,14 @@ EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent);
  */
 void vfio_file_set_kvm(struct file *file, struct kvm *kvm)
 {
-	struct vfio_group *group = vfio_group_from_file(file);
+	struct vfio_group *group;
 
+	group = vfio_group_from_file(file);
 	if (group)
 		vfio_group_set_kvm(group, kvm);
+
+	if (vfio_device_from_file(file))
+		vfio_device_file_set_kvm(file, kvm);
 }
 EXPORT_SYMBOL_GPL(vfio_file_set_kvm);
 
@@ -1221,10 +1257,17 @@ EXPORT_SYMBOL_GPL(vfio_file_set_kvm);
  */
 bool vfio_file_has_dev(struct file *file, struct vfio_device *device)
 {
-	struct vfio_group *group = vfio_group_from_file(file);
+	struct vfio_group *group;
+	struct vfio_device *vdev;
 
+	group = vfio_group_from_file(file);
 	if (group)
 		return vfio_group_has_dev(group, device);
+
+	vdev = vfio_device_from_file(file);
+	if (vdev)
+		return vdev == device;
+
 	return false;
 }
 EXPORT_SYMBOL_GPL(vfio_file_has_dev);
-- 
2.34.1


* [PATCH v3 04/15] kvm/vfio: Rename kvm_vfio_group to prepare for accepting vfio device fd
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

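Rename struct kvm_vfio_group to struct kvm_vfio_file, along with the
related helpers, so that the names no longer imply group-only files.
No functional change is intended.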

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 virt/kvm/vfio.c | 115 ++++++++++++++++++++++++------------------------
 1 file changed, 58 insertions(+), 57 deletions(-)

diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 8bac308ba630..857d6ba349e1 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -21,7 +21,7 @@
 #include <asm/kvm_ppc.h>
 #endif
 
-struct kvm_vfio_group {
+struct kvm_vfio_file {
 	struct list_head node;
 	struct file *file;
 #ifdef CONFIG_SPAPR_TCE_IOMMU
@@ -30,7 +30,7 @@ struct kvm_vfio_group {
 };
 
 struct kvm_vfio {
-	struct list_head group_list;
+	struct list_head file_list;
 	struct mutex lock;
 	bool noncoherent;
 };
@@ -98,34 +98,35 @@ static struct iommu_group *kvm_vfio_file_iommu_group(struct file *file)
 }
 
 static void kvm_spapr_tce_release_vfio_group(struct kvm *kvm,
-					     struct kvm_vfio_group *kvg)
+					     struct kvm_vfio_file *kvf)
 {
-	if (WARN_ON_ONCE(!kvg->iommu_group))
+	if (WARN_ON_ONCE(!kvf->iommu_group))
 		return;
 
-	kvm_spapr_tce_release_iommu_group(kvm, kvg->iommu_group);
-	iommu_group_put(kvg->iommu_group);
-	kvg->iommu_group = NULL;
+	kvm_spapr_tce_release_iommu_group(kvm, kvf->iommu_group);
+	iommu_group_put(kvf->iommu_group);
+	kvf->iommu_group = NULL;
 }
 #endif
 
 /*
- * Groups can use the same or different IOMMU domains.  If the same then
- * adding a new group may change the coherency of groups we've previously
- * been told about.  We don't want to care about any of that so we retest
- * each group and bail as soon as we find one that's noncoherent.  This
- * means we only ever [un]register_noncoherent_dma once for the whole device.
+ * Groups/devices can use the same or different IOMMU domains. If the same
+ * then adding a new group/device may change the coherency of groups/devices
+ * we've previously been told about. We don't want to care about any of
+ * that so we retest each group/device and bail as soon as we find one that's
+ * noncoherent.  This means we only ever [un]register_noncoherent_dma once
+ * for the whole device.
  */
 static void kvm_vfio_update_coherency(struct kvm_device *dev)
 {
 	struct kvm_vfio *kv = dev->private;
 	bool noncoherent = false;
-	struct kvm_vfio_group *kvg;
+	struct kvm_vfio_file *kvf;
 
 	mutex_lock(&kv->lock);
 
-	list_for_each_entry(kvg, &kv->group_list, node) {
-		if (!kvm_vfio_file_enforced_coherent(kvg->file)) {
+	list_for_each_entry(kvf, &kv->file_list, node) {
+		if (!kvm_vfio_file_enforced_coherent(kvf->file)) {
 			noncoherent = true;
 			break;
 		}
@@ -143,10 +144,10 @@ static void kvm_vfio_update_coherency(struct kvm_device *dev)
 	mutex_unlock(&kv->lock);
 }
 
-static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
+static int kvm_vfio_file_add(struct kvm_device *dev, unsigned int fd)
 {
 	struct kvm_vfio *kv = dev->private;
-	struct kvm_vfio_group *kvg;
+	struct kvm_vfio_file *kvf;
 	struct file *filp;
 	int ret;
 
@@ -162,27 +163,27 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
 
 	mutex_lock(&kv->lock);
 
-	list_for_each_entry(kvg, &kv->group_list, node) {
-		if (kvg->file == filp) {
+	list_for_each_entry(kvf, &kv->file_list, node) {
+		if (kvf->file == filp) {
 			ret = -EEXIST;
 			goto err_unlock;
 		}
 	}
 
-	kvg = kzalloc(sizeof(*kvg), GFP_KERNEL_ACCOUNT);
-	if (!kvg) {
+	kvf = kzalloc(sizeof(*kvf), GFP_KERNEL_ACCOUNT);
+	if (!kvf) {
 		ret = -ENOMEM;
 		goto err_unlock;
 	}
 
-	kvg->file = filp;
-	list_add_tail(&kvg->node, &kv->group_list);
+	kvf->file = filp;
+	list_add_tail(&kvf->node, &kv->file_list);
 
 	kvm_arch_start_assignment(dev->kvm);
 
 	mutex_unlock(&kv->lock);
 
-	kvm_vfio_file_set_kvm(kvg->file, dev->kvm);
+	kvm_vfio_file_set_kvm(kvf->file, dev->kvm);
 	kvm_vfio_update_coherency(dev);
 
 	return 0;
@@ -193,10 +194,10 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
 	return ret;
 }
 
-static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd)
+static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd)
 {
 	struct kvm_vfio *kv = dev->private;
-	struct kvm_vfio_group *kvg;
+	struct kvm_vfio_file *kvf;
 	struct fd f;
 	int ret;
 
@@ -208,18 +209,18 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd)
 
 	mutex_lock(&kv->lock);
 
-	list_for_each_entry(kvg, &kv->group_list, node) {
-		if (kvg->file != f.file)
+	list_for_each_entry(kvf, &kv->file_list, node) {
+		if (kvf->file != f.file)
 			continue;
 
-		list_del(&kvg->node);
+		list_del(&kvf->node);
 		kvm_arch_end_assignment(dev->kvm);
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-		kvm_spapr_tce_release_vfio_group(dev->kvm, kvg);
+		kvm_spapr_tce_release_vfio_group(dev->kvm, kvf);
 #endif
-		kvm_vfio_file_set_kvm(kvg->file, NULL);
-		fput(kvg->file);
-		kfree(kvg);
+		kvm_vfio_file_set_kvm(kvf->file, NULL);
+		fput(kvf->file);
+		kfree(kvf);
 		ret = 0;
 		break;
 	}
@@ -234,12 +235,12 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd)
 }
 
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev,
-					void __user *arg)
+static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev,
+				       void __user *arg)
 {
 	struct kvm_vfio_spapr_tce param;
 	struct kvm_vfio *kv = dev->private;
-	struct kvm_vfio_group *kvg;
+	struct kvm_vfio_file *kvf;
 	struct fd f;
 	int ret;
 
@@ -254,20 +255,20 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev,
 
 	mutex_lock(&kv->lock);
 
-	list_for_each_entry(kvg, &kv->group_list, node) {
-		if (kvg->file != f.file)
+	list_for_each_entry(kvf, &kv->file_list, node) {
+		if (kvf->file != f.file)
 			continue;
 
-		if (!kvg->iommu_group) {
-			kvg->iommu_group = kvm_vfio_file_iommu_group(kvg->file);
-			if (WARN_ON_ONCE(!kvg->iommu_group)) {
+		if (!kvf->iommu_group) {
+			kvf->iommu_group = kvm_vfio_file_iommu_group(kvf->file);
+			if (WARN_ON_ONCE(!kvf->iommu_group)) {
 				ret = -EIO;
 				goto err_fdput;
 			}
 		}
 
 		ret = kvm_spapr_tce_attach_iommu_group(dev->kvm, param.tablefd,
-						       kvg->iommu_group);
+						       kvf->iommu_group);
 		break;
 	}
 
@@ -278,8 +279,8 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev,
 }
 #endif
 
-static int kvm_vfio_set_group(struct kvm_device *dev, long attr,
-			      void __user *arg)
+static int kvm_vfio_set_file(struct kvm_device *dev, long attr,
+			     void __user *arg)
 {
 	int32_t __user *argp = arg;
 	int32_t fd;
@@ -288,16 +289,16 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr,
 	case KVM_DEV_VFIO_GROUP_ADD:
 		if (get_user(fd, argp))
 			return -EFAULT;
-		return kvm_vfio_group_add(dev, fd);
+		return kvm_vfio_file_add(dev, fd);
 
 	case KVM_DEV_VFIO_GROUP_DEL:
 		if (get_user(fd, argp))
 			return -EFAULT;
-		return kvm_vfio_group_del(dev, fd);
+		return kvm_vfio_file_del(dev, fd);
 
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
-		return kvm_vfio_group_set_spapr_tce(dev, arg);
+		return kvm_vfio_file_set_spapr_tce(dev, arg);
 #endif
 	}
 
@@ -309,8 +310,8 @@ static int kvm_vfio_set_attr(struct kvm_device *dev,
 {
 	switch (attr->group) {
 	case KVM_DEV_VFIO_GROUP:
-		return kvm_vfio_set_group(dev, attr->attr,
-					  u64_to_user_ptr(attr->addr));
+		return kvm_vfio_set_file(dev, attr->attr,
+					 u64_to_user_ptr(attr->addr));
 	}
 
 	return -ENXIO;
@@ -339,16 +340,16 @@ static int kvm_vfio_has_attr(struct kvm_device *dev,
 static void kvm_vfio_release(struct kvm_device *dev)
 {
 	struct kvm_vfio *kv = dev->private;
-	struct kvm_vfio_group *kvg, *tmp;
+	struct kvm_vfio_file *kvf, *tmp;
 
-	list_for_each_entry_safe(kvg, tmp, &kv->group_list, node) {
+	list_for_each_entry_safe(kvf, tmp, &kv->file_list, node) {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-		kvm_spapr_tce_release_vfio_group(dev->kvm, kvg);
+		kvm_spapr_tce_release_vfio_group(dev->kvm, kvf);
 #endif
-		kvm_vfio_file_set_kvm(kvg->file, NULL);
-		fput(kvg->file);
-		list_del(&kvg->node);
-		kfree(kvg);
+		kvm_vfio_file_set_kvm(kvf->file, NULL);
+		fput(kvf->file);
+		list_del(&kvf->node);
+		kfree(kvf);
 		kvm_arch_end_assignment(dev->kvm);
 	}
 
@@ -382,7 +383,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type)
 	if (!kv)
 		return -ENOMEM;
 
-	INIT_LIST_HEAD(&kv->group_list);
+	INIT_LIST_HEAD(&kv->file_list);
 	mutex_init(&kv->lock);
 
 	dev->private = kv;
-- 
2.34.1



* [Intel-gfx] [PATCH v3 04/15] kvm/vfio: Rename kvm_vfio_group to prepare for accepting vfio device fd
@ 2023-02-13 15:13   ` Yi Liu
  0 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: linux-s390, yi.l.liu, yi.y.sun, kvm, mjrosato, jasowang, cohuck,
	peterx, eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

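Rename struct kvm_vfio_group to struct kvm_vfio_file, along with the
related helpers, so that the names no longer imply group-only files.
No functional change is intended.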

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 virt/kvm/vfio.c | 115 ++++++++++++++++++++++++------------------------
 1 file changed, 58 insertions(+), 57 deletions(-)

diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 8bac308ba630..857d6ba349e1 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -21,7 +21,7 @@
 #include <asm/kvm_ppc.h>
 #endif
 
-struct kvm_vfio_group {
+struct kvm_vfio_file {
 	struct list_head node;
 	struct file *file;
 #ifdef CONFIG_SPAPR_TCE_IOMMU
@@ -30,7 +30,7 @@ struct kvm_vfio_group {
 };
 
 struct kvm_vfio {
-	struct list_head group_list;
+	struct list_head file_list;
 	struct mutex lock;
 	bool noncoherent;
 };
@@ -98,34 +98,35 @@ static struct iommu_group *kvm_vfio_file_iommu_group(struct file *file)
 }
 
 static void kvm_spapr_tce_release_vfio_group(struct kvm *kvm,
-					     struct kvm_vfio_group *kvg)
+					     struct kvm_vfio_file *kvf)
 {
-	if (WARN_ON_ONCE(!kvg->iommu_group))
+	if (WARN_ON_ONCE(!kvf->iommu_group))
 		return;
 
-	kvm_spapr_tce_release_iommu_group(kvm, kvg->iommu_group);
-	iommu_group_put(kvg->iommu_group);
-	kvg->iommu_group = NULL;
+	kvm_spapr_tce_release_iommu_group(kvm, kvf->iommu_group);
+	iommu_group_put(kvf->iommu_group);
+	kvf->iommu_group = NULL;
 }
 #endif
 
 /*
- * Groups can use the same or different IOMMU domains.  If the same then
- * adding a new group may change the coherency of groups we've previously
- * been told about.  We don't want to care about any of that so we retest
- * each group and bail as soon as we find one that's noncoherent.  This
- * means we only ever [un]register_noncoherent_dma once for the whole device.
+ * Groups/devices can use the same or different IOMMU domains. If the same
+ * then adding a new group/device may change the coherency of groups/devices
+ * we've previously been told about. We don't want to care about any of
+ * that so we retest each group/device and bail as soon as we find one that's
+ * noncoherent.  This means we only ever [un]register_noncoherent_dma once
+ * for the whole device.
  */
 static void kvm_vfio_update_coherency(struct kvm_device *dev)
 {
 	struct kvm_vfio *kv = dev->private;
 	bool noncoherent = false;
-	struct kvm_vfio_group *kvg;
+	struct kvm_vfio_file *kvf;
 
 	mutex_lock(&kv->lock);
 
-	list_for_each_entry(kvg, &kv->group_list, node) {
-		if (!kvm_vfio_file_enforced_coherent(kvg->file)) {
+	list_for_each_entry(kvf, &kv->file_list, node) {
+		if (!kvm_vfio_file_enforced_coherent(kvf->file)) {
 			noncoherent = true;
 			break;
 		}
@@ -143,10 +144,10 @@ static void kvm_vfio_update_coherency(struct kvm_device *dev)
 	mutex_unlock(&kv->lock);
 }
 
-static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
+static int kvm_vfio_file_add(struct kvm_device *dev, unsigned int fd)
 {
 	struct kvm_vfio *kv = dev->private;
-	struct kvm_vfio_group *kvg;
+	struct kvm_vfio_file *kvf;
 	struct file *filp;
 	int ret;
 
@@ -162,27 +163,27 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
 
 	mutex_lock(&kv->lock);
 
-	list_for_each_entry(kvg, &kv->group_list, node) {
-		if (kvg->file == filp) {
+	list_for_each_entry(kvf, &kv->file_list, node) {
+		if (kvf->file == filp) {
 			ret = -EEXIST;
 			goto err_unlock;
 		}
 	}
 
-	kvg = kzalloc(sizeof(*kvg), GFP_KERNEL_ACCOUNT);
-	if (!kvg) {
+	kvf = kzalloc(sizeof(*kvf), GFP_KERNEL_ACCOUNT);
+	if (!kvf) {
 		ret = -ENOMEM;
 		goto err_unlock;
 	}
 
-	kvg->file = filp;
-	list_add_tail(&kvg->node, &kv->group_list);
+	kvf->file = filp;
+	list_add_tail(&kvf->node, &kv->file_list);
 
 	kvm_arch_start_assignment(dev->kvm);
 
 	mutex_unlock(&kv->lock);
 
-	kvm_vfio_file_set_kvm(kvg->file, dev->kvm);
+	kvm_vfio_file_set_kvm(kvf->file, dev->kvm);
 	kvm_vfio_update_coherency(dev);
 
 	return 0;
@@ -193,10 +194,10 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
 	return ret;
 }
 
-static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd)
+static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd)
 {
 	struct kvm_vfio *kv = dev->private;
-	struct kvm_vfio_group *kvg;
+	struct kvm_vfio_file *kvf;
 	struct fd f;
 	int ret;
 
@@ -208,18 +209,18 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd)
 
 	mutex_lock(&kv->lock);
 
-	list_for_each_entry(kvg, &kv->group_list, node) {
-		if (kvg->file != f.file)
+	list_for_each_entry(kvf, &kv->file_list, node) {
+		if (kvf->file != f.file)
 			continue;
 
-		list_del(&kvg->node);
+		list_del(&kvf->node);
 		kvm_arch_end_assignment(dev->kvm);
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-		kvm_spapr_tce_release_vfio_group(dev->kvm, kvg);
+		kvm_spapr_tce_release_vfio_group(dev->kvm, kvf);
 #endif
-		kvm_vfio_file_set_kvm(kvg->file, NULL);
-		fput(kvg->file);
-		kfree(kvg);
+		kvm_vfio_file_set_kvm(kvf->file, NULL);
+		fput(kvf->file);
+		kfree(kvf);
 		ret = 0;
 		break;
 	}
@@ -234,12 +235,12 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd)
 }
 
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev,
-					void __user *arg)
+static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev,
+				       void __user *arg)
 {
 	struct kvm_vfio_spapr_tce param;
 	struct kvm_vfio *kv = dev->private;
-	struct kvm_vfio_group *kvg;
+	struct kvm_vfio_file *kvf;
 	struct fd f;
 	int ret;
 
@@ -254,20 +255,20 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev,
 
 	mutex_lock(&kv->lock);
 
-	list_for_each_entry(kvg, &kv->group_list, node) {
-		if (kvg->file != f.file)
+	list_for_each_entry(kvf, &kv->file_list, node) {
+		if (kvf->file != f.file)
 			continue;
 
-		if (!kvg->iommu_group) {
-			kvg->iommu_group = kvm_vfio_file_iommu_group(kvg->file);
-			if (WARN_ON_ONCE(!kvg->iommu_group)) {
+		if (!kvf->iommu_group) {
+			kvf->iommu_group = kvm_vfio_file_iommu_group(kvf->file);
+			if (WARN_ON_ONCE(!kvf->iommu_group)) {
 				ret = -EIO;
 				goto err_fdput;
 			}
 		}
 
 		ret = kvm_spapr_tce_attach_iommu_group(dev->kvm, param.tablefd,
-						       kvg->iommu_group);
+						       kvf->iommu_group);
 		break;
 	}
 
@@ -278,8 +279,8 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev,
 }
 #endif
 
-static int kvm_vfio_set_group(struct kvm_device *dev, long attr,
-			      void __user *arg)
+static int kvm_vfio_set_file(struct kvm_device *dev, long attr,
+			     void __user *arg)
 {
 	int32_t __user *argp = arg;
 	int32_t fd;
@@ -288,16 +289,16 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr,
 	case KVM_DEV_VFIO_GROUP_ADD:
 		if (get_user(fd, argp))
 			return -EFAULT;
-		return kvm_vfio_group_add(dev, fd);
+		return kvm_vfio_file_add(dev, fd);
 
 	case KVM_DEV_VFIO_GROUP_DEL:
 		if (get_user(fd, argp))
 			return -EFAULT;
-		return kvm_vfio_group_del(dev, fd);
+		return kvm_vfio_file_del(dev, fd);
 
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
-		return kvm_vfio_group_set_spapr_tce(dev, arg);
+		return kvm_vfio_file_set_spapr_tce(dev, arg);
 #endif
 	}
 
@@ -309,8 +310,8 @@ static int kvm_vfio_set_attr(struct kvm_device *dev,
 {
 	switch (attr->group) {
 	case KVM_DEV_VFIO_GROUP:
-		return kvm_vfio_set_group(dev, attr->attr,
-					  u64_to_user_ptr(attr->addr));
+		return kvm_vfio_set_file(dev, attr->attr,
+					 u64_to_user_ptr(attr->addr));
 	}
 
 	return -ENXIO;
@@ -339,16 +340,16 @@ static int kvm_vfio_has_attr(struct kvm_device *dev,
 static void kvm_vfio_release(struct kvm_device *dev)
 {
 	struct kvm_vfio *kv = dev->private;
-	struct kvm_vfio_group *kvg, *tmp;
+	struct kvm_vfio_file *kvf, *tmp;
 
-	list_for_each_entry_safe(kvg, tmp, &kv->group_list, node) {
+	list_for_each_entry_safe(kvf, tmp, &kv->file_list, node) {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-		kvm_spapr_tce_release_vfio_group(dev->kvm, kvg);
+		kvm_spapr_tce_release_vfio_group(dev->kvm, kvf);
 #endif
-		kvm_vfio_file_set_kvm(kvg->file, NULL);
-		fput(kvg->file);
-		list_del(&kvg->node);
-		kfree(kvg);
+		kvm_vfio_file_set_kvm(kvf->file, NULL);
+		fput(kvf->file);
+		list_del(&kvf->node);
+		kfree(kvf);
 		kvm_arch_end_assignment(dev->kvm);
 	}
 
@@ -382,7 +383,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type)
 	if (!kv)
 		return -ENOMEM;
 
-	INIT_LIST_HEAD(&kv->group_list);
+	INIT_LIST_HEAD(&kv->file_list);
 	mutex_init(&kv->lock);
 
 	dev->private = kv;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 135+ messages in thread
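
The bookkeeping that kvm_vfio_file_add(), kvm_vfio_file_del() and
kvm_vfio_release() perform on kv->file_list can be sketched in plain
userspace C. This is only an illustrative model, not the kernel code: all
demo_* names are hypothetical, a singly linked list stands in for
struct list_head, new entries go to the head rather than the tail, locking
and fput() are left out, and the -ENOENT result for an unknown file is an
assumption (the hunks above only show the ret = 0 path).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kvm_vfio_file: one tracked file. */
struct demo_tracked_file {
	struct demo_tracked_file *next;
	const void *file;	/* compared by pointer identity only */
};

/* Hypothetical stand-in for struct kvm_vfio and its file_list. */
struct demo_tracker {
	struct demo_tracked_file *head;
};

/* Track a file once; a second add of the same file fails with -EEXIST. */
int demo_file_add(struct demo_tracker *t, const void *file)
{
	struct demo_tracked_file *f;

	assert(t);
	for (f = t->head; f; f = f->next)
		if (f->file == file)
			return -EEXIST;

	f = calloc(1, sizeof(*f));
	if (!f)
		return -ENOMEM;

	f->file = file;
	f->next = t->head;
	t->head = f;
	return 0;
}

/* Stop tracking a file; -ENOENT (assumed) if it was never added. */
int demo_file_del(struct demo_tracker *t, const void *file)
{
	struct demo_tracked_file **pp;

	assert(t);
	for (pp = &t->head; *pp; pp = &(*pp)->next) {
		struct demo_tracked_file *f = *pp;

		if (f->file != file)
			continue;
		*pp = f->next;
		free(f);
		return 0;
	}
	return -ENOENT;
}

/* Drop every entry, as a release path would; safe while unlinking. */
void demo_release(struct demo_tracker *t)
{
	assert(t);
	while (t->head) {
		struct demo_tracked_file *f = t->head;

		t->head = f->next;
		free(f);
	}
}
```

Because entries are matched by file pointer only, the same structure works
whether the tracked file is a group file or a device file, which is what
the rename in this patch prepares for.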

* [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

This defines KVM_DEV_VFIO_FILE* and makes KVM_DEV_VFIO_GROUP* aliases of
them. Old userspace that uses KVM_DEV_VFIO_GROUP* continues to work.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 Documentation/virt/kvm/devices/vfio.rst | 45 ++++++++++++++++---------
 include/uapi/linux/kvm.h                | 16 ++++++---
 virt/kvm/vfio.c                         | 16 ++++-----
 3 files changed, 50 insertions(+), 27 deletions(-)

diff --git a/Documentation/virt/kvm/devices/vfio.rst b/Documentation/virt/kvm/devices/vfio.rst
index 2d20dc561069..90f22933dcfa 100644
--- a/Documentation/virt/kvm/devices/vfio.rst
+++ b/Documentation/virt/kvm/devices/vfio.rst
@@ -9,24 +9,37 @@ Device types supported:
   - KVM_DEV_TYPE_VFIO
 
 Only one VFIO instance may be created per VM.  The created device
-tracks VFIO groups in use by the VM and features of those groups
-important to the correctness and acceleration of the VM.  As groups
-are enabled and disabled for use by the VM, KVM should be updated
-about their presence.  When registered with KVM, a reference to the
-VFIO-group is held by KVM.
+tracks VFIO files (group or device) in use by the VM and features
+of those groups/devices important to the correctness and acceleration
+of the VM.  As groups/devices are enabled and disabled for use by the
+VM, KVM should be updated about their presence.  When registered with
+KVM, a reference to the VFIO file is held by KVM.
 
 Groups:
-  KVM_DEV_VFIO_GROUP
-
-KVM_DEV_VFIO_GROUP attributes:
-  KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
-	kvm_device_attr.addr points to an int32_t file descriptor
-	for the VFIO group.
-  KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking
-	kvm_device_attr.addr points to an int32_t file descriptor
-	for the VFIO group.
-  KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table
+  KVM_DEV_VFIO_FILE
+	alias: KVM_DEV_VFIO_GROUP
+
+KVM_DEV_VFIO_FILE attributes:
+  KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device
+	tracking
+
+	alias: KVM_DEV_VFIO_GROUP_ADD
+
+	kvm_device_attr.addr points to an int32_t file descriptor for the
+	VFIO file.
+  KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM
+	device tracking
+
+	alias: KVM_DEV_VFIO_GROUP_DEL
+
+	kvm_device_attr.addr points to an int32_t file descriptor for the
+	VFIO file.
+
+  KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table
 	allocated by sPAPR KVM.
+
+	alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE
+
 	kvm_device_attr.addr points to a struct::
 
 		struct kvm_vfio_spapr_tce {
@@ -39,3 +52,5 @@ KVM_DEV_VFIO_GROUP attributes:
 	- @groupfd is a file descriptor for a VFIO group;
 	- @tablefd is a file descriptor for a TCE table allocated via
 	  KVM_CREATE_SPAPR_TCE.
+
+	only accepts vfio group file
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 55155e262646..484a8133bc69 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1401,10 +1401,18 @@ struct kvm_device_attr {
 	__u64	addr;		/* userspace address of attr data */
 };
 
-#define  KVM_DEV_VFIO_GROUP			1
-#define   KVM_DEV_VFIO_GROUP_ADD			1
-#define   KVM_DEV_VFIO_GROUP_DEL			2
-#define   KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE		3
+#define  KVM_DEV_VFIO_FILE	1
+
+#define   KVM_DEV_VFIO_FILE_ADD			1
+#define   KVM_DEV_VFIO_FILE_DEL			2
+#define   KVM_DEV_VFIO_FILE_SET_SPAPR_TCE	3
+
+/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */
+#define  KVM_DEV_VFIO_GROUP	KVM_DEV_VFIO_FILE
+
+#define   KVM_DEV_VFIO_GROUP_ADD	KVM_DEV_VFIO_FILE_ADD
+#define   KVM_DEV_VFIO_GROUP_DEL	KVM_DEV_VFIO_FILE_DEL
+#define   KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE	KVM_DEV_VFIO_FILE_SET_SPAPR_TCE
 
 enum kvm_device_type {
 	KVM_DEV_TYPE_FSL_MPIC_20	= 1,
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 857d6ba349e1..d869913baafd 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr,
 	int32_t fd;
 
 	switch (attr) {
-	case KVM_DEV_VFIO_GROUP_ADD:
+	case KVM_DEV_VFIO_FILE_ADD:
 		if (get_user(fd, argp))
 			return -EFAULT;
 		return kvm_vfio_file_add(dev, fd);
 
-	case KVM_DEV_VFIO_GROUP_DEL:
+	case KVM_DEV_VFIO_FILE_DEL:
 		if (get_user(fd, argp))
 			return -EFAULT;
 		return kvm_vfio_file_del(dev, fd);
 
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-	case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
+	case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE:
 		return kvm_vfio_file_set_spapr_tce(dev, arg);
 #endif
 	}
@@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev,
 			     struct kvm_device_attr *attr)
 {
 	switch (attr->group) {
-	case KVM_DEV_VFIO_GROUP:
+	case KVM_DEV_VFIO_FILE:
 		return kvm_vfio_set_file(dev, attr->attr,
 					 u64_to_user_ptr(attr->addr));
 	}
@@ -321,12 +321,12 @@ static int kvm_vfio_has_attr(struct kvm_device *dev,
 			     struct kvm_device_attr *attr)
 {
 	switch (attr->group) {
-	case KVM_DEV_VFIO_GROUP:
+	case KVM_DEV_VFIO_FILE:
 		switch (attr->attr) {
-		case KVM_DEV_VFIO_GROUP_ADD:
-		case KVM_DEV_VFIO_GROUP_DEL:
+		case KVM_DEV_VFIO_FILE_ADD:
+		case KVM_DEV_VFIO_FILE_DEL:
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-		case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
+		case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE:
 #endif
 			return 0;
 		}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 135+ messages in thread
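
From the userspace side, the uAPI in this patch might be exercised roughly
as below. This is a sketch under stated assumptions, not code from the
series: the demo_* helpers are hypothetical, the #ifndef fallback is only
there so the example builds against uapi headers that predate
KVM_DEV_VFIO_FILE*, and the caller is assumed to already hold a KVM VFIO
device fd (normally obtained with KVM_CREATE_DEVICE and KVM_DEV_TYPE_VFIO).

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Older headers only carry the GROUP names; the values are identical. */
#ifndef KVM_DEV_VFIO_FILE
#define KVM_DEV_VFIO_FILE	KVM_DEV_VFIO_GROUP
#define KVM_DEV_VFIO_FILE_ADD	KVM_DEV_VFIO_GROUP_ADD
#define KVM_DEV_VFIO_FILE_DEL	KVM_DEV_VFIO_GROUP_DEL
#endif

/*
 * Hypothetical helper: build the attribute for a FILE_ADD or FILE_DEL
 * request. attr.addr must point at an int32_t holding the VFIO fd.
 */
struct kvm_device_attr demo_vfio_file_attr(uint64_t op, const int32_t *fd)
{
	struct kvm_device_attr attr = {
		.flags = 0,
		.group = KVM_DEV_VFIO_FILE,
		.attr = op,
		.addr = (uint64_t)(uintptr_t)fd,
	};

	assert(fd);
	return attr;
}

/*
 * Hypothetical helper: ask KVM to track a VFIO group fd or device fd.
 * Returns the ioctl() result: 0 on success, -1 with errno set on error.
 */
int demo_kvm_vfio_file_add(int kvm_vfio_dev_fd, int32_t vfio_fd)
{
	struct kvm_device_attr attr =
		demo_vfio_file_attr(KVM_DEV_VFIO_FILE_ADD, &vfio_fd);

	return ioctl(kvm_vfio_dev_fd, KVM_SET_DEVICE_ATTR, &attr);
}
```

Since the GROUP names are plain aliases of the FILE names, a binary built
with either spelling issues the same ioctl.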

* [PATCH v3 06/15] vfio: Pass struct vfio_device_file * to vfio_device_open/close()
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

This avoids passing too many parameters through multiple functions.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/group.c     | 19 +++++++++++++------
 drivers/vfio/vfio.h      |  8 ++++----
 drivers/vfio/vfio_main.c | 25 +++++++++++++++----------
 3 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index cc0eded19a9f..2abf55c69281 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -166,8 +166,9 @@ static void vfio_device_group_get_kvm_safe(struct vfio_device *device)
 	spin_unlock(&device->group->kvm_ref_lock);
 }
 
-static int vfio_device_group_open(struct vfio_device *device)
+static int vfio_device_group_open(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
 	int ret;
 
 	mutex_lock(&device->group->group_lock);
@@ -187,7 +188,11 @@ static int vfio_device_group_open(struct vfio_device *device)
 	if (device->open_count == 0)
 		vfio_device_group_get_kvm_safe(device);
 
-	ret = vfio_device_open(device, device->group->iommufd);
+	df->iommufd = device->group->iommufd;
+
+	ret = vfio_device_open(df);
+	if (ret)
+		df->iommufd = NULL;
 
 	if (device->open_count == 0)
 		vfio_device_put_kvm(device);
@@ -199,12 +204,14 @@ static int vfio_device_group_open(struct vfio_device *device)
 	return ret;
 }
 
-void vfio_device_group_close(struct vfio_device *device)
+void vfio_device_group_close(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
+
 	mutex_lock(&device->group->group_lock);
 	mutex_lock(&device->dev_set->lock);
 
-	vfio_device_close(device, device->group->iommufd);
+	vfio_device_close(df);
 
 	if (device->open_count == 0)
 		vfio_device_put_kvm(device);
@@ -225,7 +232,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device)
 		goto err_out;
 	}
 
-	ret = vfio_device_group_open(device);
+	ret = vfio_device_group_open(df);
 	if (ret)
 		goto err_free;
 
@@ -257,7 +264,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device)
 	return filep;
 
 err_close_device:
-	vfio_device_group_close(device);
+	vfio_device_group_close(df);
 err_free:
 	kfree(df);
 err_out:
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index cee979a1b90f..11e56fe079a1 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -20,13 +20,13 @@ struct vfio_device_file {
 	struct vfio_device *device;
 	spinlock_t kvm_ref_lock; /* protect kvm field */
 	struct kvm *kvm;
+	struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */
 };
 
 void vfio_device_put_registration(struct vfio_device *device);
 bool vfio_device_try_get_registration(struct vfio_device *device);
-int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd);
-void vfio_device_close(struct vfio_device *device,
-		       struct iommufd_ctx *iommufd);
+int vfio_device_open(struct vfio_device_file *df);
+void vfio_device_close(struct vfio_device_file *df);
 struct vfio_device_file *
 vfio_allocate_device_file(struct vfio_device *device);
 
@@ -91,7 +91,7 @@ void vfio_device_group_register(struct vfio_device *device);
 void vfio_device_group_unregister(struct vfio_device *device);
 int vfio_device_group_use_iommu(struct vfio_device *device);
 void vfio_device_group_unuse_iommu(struct vfio_device *device);
-void vfio_device_group_close(struct vfio_device *device);
+void vfio_device_group_close(struct vfio_device_file *df);
 struct vfio_group *vfio_group_from_file(struct file *file);
 bool vfio_group_enforced_coherent(struct vfio_group *group);
 void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index c529f609fecc..c517252aba19 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -411,9 +411,10 @@ vfio_allocate_device_file(struct vfio_device *device)
 	return df;
 }
 
-static int vfio_device_first_open(struct vfio_device *device,
-				  struct iommufd_ctx *iommufd)
+static int vfio_device_first_open(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
+	struct iommufd_ctx *iommufd = df->iommufd;
 	int ret;
 
 	lockdep_assert_held(&device->dev_set->lock);
@@ -445,9 +446,11 @@ static int vfio_device_first_open(struct vfio_device *device,
 	return ret;
 }
 
-static void vfio_device_last_close(struct vfio_device *device,
-				   struct iommufd_ctx *iommufd)
+static void vfio_device_last_close(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
+	struct iommufd_ctx *iommufd = df->iommufd;
+
 	lockdep_assert_held(&device->dev_set->lock);
 
 	if (device->ops->close_device)
@@ -459,15 +462,16 @@ static void vfio_device_last_close(struct vfio_device *device,
 	module_put(device->dev->driver->owner);
 }
 
-int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd)
+int vfio_device_open(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
 	int ret = 0;
 
 	lockdep_assert_held(&device->dev_set->lock);
 
 	device->open_count++;
 	if (device->open_count == 1) {
-		ret = vfio_device_first_open(device, iommufd);
+		ret = vfio_device_first_open(df);
 		if (ret)
 			device->open_count--;
 	}
@@ -475,14 +479,15 @@ int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd)
 	return ret;
 }
 
-void vfio_device_close(struct vfio_device *device,
-		       struct iommufd_ctx *iommufd)
+void vfio_device_close(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
+
 	lockdep_assert_held(&device->dev_set->lock);
 
 	vfio_assert_device_open(device);
 	if (device->open_count == 1)
-		vfio_device_last_close(device, iommufd);
+		vfio_device_last_close(df);
 	device->open_count--;
 }
 
@@ -527,7 +532,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
 
-	vfio_device_group_close(device);
+	vfio_device_group_close(df);
 
 	vfio_device_put_registration(device);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 135+ messages in thread
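
The shape of this refactor, carrying the iommufd context in the per-file
structure instead of threading it through every call, can be modelled in
userspace as below. This is a simplified sketch, not the kernel code: all
demo_* names and the fail_first_open knob are hypothetical, locking is
omitted, and "bound" merely records what the first open saw.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_iommufd_ctx { int id; };

struct demo_device {
	unsigned int open_count;
	struct demo_iommufd_ctx *bound;	/* set by first open, cleared by last close */
	int fail_first_open;		/* test knob: make the first open fail */
};

/* Per-open state; the context travels with it instead of as an argument. */
struct demo_device_file {
	struct demo_device *device;
	struct demo_iommufd_ctx *iommufd;
};

static int demo_first_open(struct demo_device_file *df)
{
	if (df->device->fail_first_open)
		return -EIO;
	df->device->bound = df->iommufd;
	return 0;
}

static void demo_last_close(struct demo_device_file *df)
{
	df->device->bound = NULL;
}

/* Only the 0 -> 1 transition does real work; roll back on failure. */
int demo_device_open(struct demo_device_file *df)
{
	struct demo_device *device = df->device;
	int ret = 0;

	device->open_count++;
	if (device->open_count == 1) {
		ret = demo_first_open(df);
		if (ret)
			device->open_count--;
	}
	return ret;
}

/* Only the 1 -> 0 transition tears the device down. */
void demo_device_close(struct demo_device_file *df)
{
	struct demo_device *device = df->device;

	assert(device->open_count > 0);
	if (device->open_count == 1)
		demo_last_close(df);
	device->open_count--;
}

/* Group path: stash the group's context in df, undo that on failure. */
int demo_group_open(struct demo_device_file *df,
		    struct demo_iommufd_ctx *group_iommufd)
{
	int ret;

	df->iommufd = group_iommufd;
	ret = demo_device_open(df);
	if (ret)
		df->iommufd = NULL;
	return ret;
}
```

With the context stored in the file structure, a later open path can
populate the same field from a different source without changing the
signatures of the open and close helpers.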

* [Intel-gfx] [PATCH v3 06/15] vfio: Pass struct vfio_device_file * to vfio_device_open/close()
@ 2023-02-13 15:13   ` Yi Liu
  0 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: linux-s390, yi.l.liu, yi.y.sun, kvm, mjrosato, jasowang, cohuck,
	peterx, eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

This avoids passing too much parameters in multiple functions.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/group.c     | 19 +++++++++++++------
 drivers/vfio/vfio.h      |  8 ++++----
 drivers/vfio/vfio_main.c | 25 +++++++++++++++----------
 3 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index cc0eded19a9f..2abf55c69281 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -166,8 +166,9 @@ static void vfio_device_group_get_kvm_safe(struct vfio_device *device)
 	spin_unlock(&device->group->kvm_ref_lock);
 }
 
-static int vfio_device_group_open(struct vfio_device *device)
+static int vfio_device_group_open(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
 	int ret;
 
 	mutex_lock(&device->group->group_lock);
@@ -187,7 +188,11 @@ static int vfio_device_group_open(struct vfio_device *device)
 	if (device->open_count == 0)
 		vfio_device_group_get_kvm_safe(device);
 
-	ret = vfio_device_open(device, device->group->iommufd);
+	df->iommufd = device->group->iommufd;
+
+	ret = vfio_device_open(df);
+	if (ret)
+		df->iommufd = NULL;
 
 	if (device->open_count == 0)
 		vfio_device_put_kvm(device);
@@ -199,12 +204,14 @@ static int vfio_device_group_open(struct vfio_device *device)
 	return ret;
 }
 
-void vfio_device_group_close(struct vfio_device *device)
+void vfio_device_group_close(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
+
 	mutex_lock(&device->group->group_lock);
 	mutex_lock(&device->dev_set->lock);
 
-	vfio_device_close(device, device->group->iommufd);
+	vfio_device_close(df);
 
 	if (device->open_count == 0)
 		vfio_device_put_kvm(device);
@@ -225,7 +232,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device)
 		goto err_out;
 	}
 
-	ret = vfio_device_group_open(device);
+	ret = vfio_device_group_open(df);
 	if (ret)
 		goto err_free;
 
@@ -257,7 +264,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device)
 	return filep;
 
 err_close_device:
-	vfio_device_group_close(device);
+	vfio_device_group_close(df);
 err_free:
 	kfree(df);
 err_out:
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index cee979a1b90f..11e56fe079a1 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -20,13 +20,13 @@ struct vfio_device_file {
 	struct vfio_device *device;
 	spinlock_t kvm_ref_lock; /* protect kvm field */
 	struct kvm *kvm;
+	struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */
 };
 
 void vfio_device_put_registration(struct vfio_device *device);
 bool vfio_device_try_get_registration(struct vfio_device *device);
-int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd);
-void vfio_device_close(struct vfio_device *device,
-		       struct iommufd_ctx *iommufd);
+int vfio_device_open(struct vfio_device_file *df);
+void vfio_device_close(struct vfio_device_file *df);
 struct vfio_device_file *
 vfio_allocate_device_file(struct vfio_device *device);
 
@@ -91,7 +91,7 @@ void vfio_device_group_register(struct vfio_device *device);
 void vfio_device_group_unregister(struct vfio_device *device);
 int vfio_device_group_use_iommu(struct vfio_device *device);
 void vfio_device_group_unuse_iommu(struct vfio_device *device);
-void vfio_device_group_close(struct vfio_device *device);
+void vfio_device_group_close(struct vfio_device_file *df);
 struct vfio_group *vfio_group_from_file(struct file *file);
 bool vfio_group_enforced_coherent(struct vfio_group *group);
 void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index c529f609fecc..c517252aba19 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -411,9 +411,10 @@ vfio_allocate_device_file(struct vfio_device *device)
 	return df;
 }
 
-static int vfio_device_first_open(struct vfio_device *device,
-				  struct iommufd_ctx *iommufd)
+static int vfio_device_first_open(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
+	struct iommufd_ctx *iommufd = df->iommufd;
 	int ret;
 
 	lockdep_assert_held(&device->dev_set->lock);
@@ -445,9 +446,11 @@ static int vfio_device_first_open(struct vfio_device *device,
 	return ret;
 }
 
-static void vfio_device_last_close(struct vfio_device *device,
-				   struct iommufd_ctx *iommufd)
+static void vfio_device_last_close(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
+	struct iommufd_ctx *iommufd = df->iommufd;
+
 	lockdep_assert_held(&device->dev_set->lock);
 
 	if (device->ops->close_device)
@@ -459,15 +462,16 @@ static void vfio_device_last_close(struct vfio_device *device,
 	module_put(device->dev->driver->owner);
 }
 
-int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd)
+int vfio_device_open(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
 	int ret = 0;
 
 	lockdep_assert_held(&device->dev_set->lock);
 
 	device->open_count++;
 	if (device->open_count == 1) {
-		ret = vfio_device_first_open(device, iommufd);
+		ret = vfio_device_first_open(df);
 		if (ret)
 			device->open_count--;
 	}
@@ -475,14 +479,15 @@ int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd)
 	return ret;
 }
 
-void vfio_device_close(struct vfio_device *device,
-		       struct iommufd_ctx *iommufd)
+void vfio_device_close(struct vfio_device_file *df)
 {
+	struct vfio_device *device = df->device;
+
 	lockdep_assert_held(&device->dev_set->lock);
 
 	vfio_assert_device_open(device);
 	if (device->open_count == 1)
-		vfio_device_last_close(device, iommufd);
+		vfio_device_last_close(df);
 	device->open_count--;
 }
 
@@ -527,7 +532,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
 
-	vfio_device_group_close(device);
+	vfio_device_group_close(df);
 
 	vfio_device_put_registration(device);
 
-- 
2.34.1



* [PATCH v3 07/15] vfio: Block device access via device fd until device is opened
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

Allow the vfio_device file to be in a state where the device FD is
opened but the device cannot be used by userspace (i.e. its .open_device()
hasn't been called). This in-between state is not used when the device
FD is spawned from the group FD. However, when the device FD is created
directly by opening a cdev, it is opened in the blocked state.

The reason for the in-between state is that userspace only gets an FD but
does not gain access permission until it binds the FD to an iommufd. So in
the blocked state, only the bind operation is allowed. Completing the bind
allows the user to access the device further.

This is implemented by adding a flag in struct vfio_device_file to mark
the blocked state. vfio_device_open() sets the flag with
smp_store_release(), and the device fops read it with a simple
smp_load_acquire(), which serializes all the device setup with the thread
accessing this device.

Following this lockless scheme, the device FD can safely go from unbound
to bound, but not from bound to unbound. Allowing the latter would require
a lock on all the vfio ioctls, which seems costly. So once the device FD
is bound, it remains bound until the FD is closed.
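
The release/acquire pairing described above can be pictured with a small
userspace sketch. This is only an analogue, assuming C11 atomics in place of
the kernel's smp_store_release()/smp_load_acquire(); the names
demo_device_file, demo_device_open() and demo_device_access() and the value
42 are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct vfio_device_file. */
struct demo_device_file {
	int setup_state;            /* plain data published by the open path */
	atomic_bool access_granted; /* false: blocked, true: usable */
};

/* Analogue of vfio_device_open(): finish setup, then publish the flag. */
static int demo_device_open(struct demo_device_file *df)
{
	df->setup_state = 42; /* all setup happens before the release store */

	/* Release store: pairs with the acquire load in demo_device_access(). */
	atomic_store_explicit(&df->access_granted, true, memory_order_release);
	return 0;
}

/* Analogue of the ioctl/read/write/mmap entry points. */
static int demo_device_access(struct demo_device_file *df, int *out)
{
	assert(out != NULL);

	/* Acquire load: if the flag reads true, setup_state is visible too. */
	if (!atomic_load_explicit(&df->access_granted, memory_order_acquire))
		return -EINVAL;

	*out = df->setup_state;
	return 0;
}
```

Because the flag only ever moves from false to true, a reader that sees true
needs no lock. Clearing the flag again would race with readers that have
already passed the check, which is why bound->unbound is not supported.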

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/vfio.h      |  1 +
 drivers/vfio/vfio_main.c | 34 +++++++++++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 11e56fe079a1..d56cdb114024 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -18,6 +18,7 @@ struct vfio_container;
 
 struct vfio_device_file {
 	struct vfio_device *device;
+	bool access_granted;
 	spinlock_t kvm_ref_lock; /* protect kvm field */
 	struct kvm *kvm;
 	struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index c517252aba19..2267057240bd 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -476,7 +476,15 @@ int vfio_device_open(struct vfio_device_file *df)
 			device->open_count--;
 	}
 
-	return ret;
+	if (ret)
+		return ret;
+
+	/*
+	 * Paired with smp_load_acquire() in vfio_device_fops::ioctl/
+	 * read/write/mmap
+	 */
+	smp_store_release(&df->access_granted, true);
+	return 0;
 }
 
 void vfio_device_close(struct vfio_device_file *df)
@@ -1104,8 +1112,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
 {
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
+	bool access;
 	int ret;
 
+	/* Paired with smp_store_release() in vfio_device_open() */
+	access = smp_load_acquire(&df->access_granted);
+	if (!access)
+		return -EINVAL;
+
 	ret = vfio_device_pm_runtime_get(device);
 	if (ret)
 		return ret;
@@ -1132,6 +1146,12 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf,
 {
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
+	bool access;
+
+	/* Paired with smp_store_release() in vfio_device_open() */
+	access = smp_load_acquire(&df->access_granted);
+	if (!access)
+		return -EINVAL;
 
 	if (unlikely(!device->ops->read))
 		return -EINVAL;
@@ -1145,6 +1165,12 @@ static ssize_t vfio_device_fops_write(struct file *filep,
 {
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
+	bool access;
+
+	/* Paired with smp_store_release() in vfio_device_open() */
+	access = smp_load_acquire(&df->access_granted);
+	if (!access)
+		return -EINVAL;
 
 	if (unlikely(!device->ops->write))
 		return -EINVAL;
@@ -1156,6 +1182,12 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma)
 {
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
+	bool access;
+
+	/* Paired with smp_store_release() in vfio_device_open() */
+	access = smp_load_acquire(&df->access_granted);
+	if (!access)
+		return -EINVAL;
 
 	if (unlikely(!device->ops->mmap))
 		return -EINVAL;
-- 
2.34.1



* [PATCH v3 08/15] vfio: Add infrastructure for bind_iommufd from userspace
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

For a device fd opened from a cdev, userspace needs to bind it to an
iommufd and attach it to an IOAS managed by that iommufd. With these
operations, userspace can set up a secure DMA context and hence access
the device.

This changes the existing vfio_iommufd_bind() to accept a pt_id pointer
as an optional input, and a dev_id pointer to optionally return the
dev_id. This prepares for adding the bind_iommufd ioctl, which does the
bind first and then attaches the IOAS.
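
The optional-pointer convention described above can be sketched in plain C.
This is an illustration only: demo_device, demo_bind() and the id value 7 are
hypothetical, and the real function calls into the driver's bind_iommufd and
attach_ioas ops instead of setting fields.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state touched by bind and attach. */
struct demo_device {
	int bound;
	int attached;
	uint32_t attached_pt_id;
};

/*
 * Analogue of the reworked vfio_iommufd_bind():
 *  - pt_id is an optional input: attach only when the caller supplies one;
 *  - dev_id is an optional output: report the id only when asked.
 */
static int demo_bind(struct demo_device *dev, uint32_t *dev_id, uint32_t *pt_id)
{
	uint32_t device_id = 7; /* made-up id; a real bind would allocate it */

	assert(dev != NULL);
	dev->bound = 1;

	if (pt_id) {
		dev->attached = 1;
		dev->attached_pt_id = *pt_id;
	}

	if (dev_id)
		*dev_id = device_id;
	return 0;
}
```

In this sketch the group path corresponds to demo_bind(dev, NULL, &ioas_id),
matching the vfio_device_open(df, NULL, &ioas_id) call in the diff below. A
caller that wants the device id back and attaches later would pass a dev_id
pointer and a NULL pt_id.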

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 drivers/vfio/group.c     | 17 ++++++++++++++---
 drivers/vfio/iommufd.c   | 21 +++++++++------------
 drivers/vfio/vfio.h      |  9 ++++++---
 drivers/vfio/vfio_main.c | 10 ++++++----
 4 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index 2abf55c69281..9f3f6f0e4942 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -169,6 +169,7 @@ static void vfio_device_group_get_kvm_safe(struct vfio_device *device)
 static int vfio_device_group_open(struct vfio_device_file *df)
 {
 	struct vfio_device *device = df->device;
+	u32 ioas_id;
 	int ret;
 
 	mutex_lock(&device->group->group_lock);
@@ -177,6 +178,13 @@ static int vfio_device_group_open(struct vfio_device_file *df)
 		goto out_unlock;
 	}
 
+	if (device->group->iommufd) {
+		ret = iommufd_vfio_compat_ioas_id(device->group->iommufd,
+						  &ioas_id);
+		if (ret)
+			goto out_unlock;
+	}
+
 	mutex_lock(&device->dev_set->lock);
 
 	/*
@@ -188,9 +196,12 @@ static int vfio_device_group_open(struct vfio_device_file *df)
 	if (device->open_count == 0)
 		vfio_device_group_get_kvm_safe(device);
 
-	df->iommufd = device->group->iommufd;
-
-	ret = vfio_device_open(df);
+	if (device->group->iommufd) {
+		df->iommufd = device->group->iommufd;
+		ret = vfio_device_open(df, NULL, &ioas_id);
+	} else {
+		ret = vfio_device_open(df, NULL, NULL);
+	}
 	if (ret)
 		df->iommufd = NULL;
 
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index 4f82a6fa7c6c..beef6ca21107 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -10,9 +10,9 @@
 MODULE_IMPORT_NS(IOMMUFD);
 MODULE_IMPORT_NS(IOMMUFD_VFIO);
 
-int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx)
+int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx,
+		      u32 *dev_id, u32 *pt_id)
 {
-	u32 ioas_id;
 	u32 device_id;
 	int ret;
 
@@ -29,17 +29,14 @@ int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx)
 	if (ret)
 		return ret;
 
-	ret = iommufd_vfio_compat_ioas_id(ictx, &ioas_id);
-	if (ret)
-		goto err_unbind;
-	ret = vdev->ops->attach_ioas(vdev, &ioas_id);
-	if (ret)
-		goto err_unbind;
+	if (pt_id) {
+		ret = vdev->ops->attach_ioas(vdev, pt_id);
+		if (ret)
+			goto err_unbind;
+	}
 
-	/*
-	 * The legacy path has no way to return the device id or the selected
-	 * pt_id
-	 */
+	if (dev_id)
+		*dev_id = device_id;
 	return 0;
 
 err_unbind:
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index d56cdb114024..6f063e31d08a 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -26,7 +26,8 @@ struct vfio_device_file {
 
 void vfio_device_put_registration(struct vfio_device *device);
 bool vfio_device_try_get_registration(struct vfio_device *device);
-int vfio_device_open(struct vfio_device_file *df);
+int vfio_device_open(struct vfio_device_file *df,
+		     u32 *dev_id, u32 *pt_id);
 void vfio_device_close(struct vfio_device_file *df);
 struct vfio_device_file *
 vfio_allocate_device_file(struct vfio_device *device);
@@ -224,11 +225,13 @@ static inline void vfio_container_cleanup(void)
 #endif
 
 #if IS_ENABLED(CONFIG_IOMMUFD)
-int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx);
+int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx,
+		      u32 *dev_id, u32 *pt_id);
 void vfio_iommufd_unbind(struct vfio_device *device);
 #else
 static inline int vfio_iommufd_bind(struct vfio_device *device,
-				    struct iommufd_ctx *ictx)
+				    struct iommufd_ctx *ictx,
+				    u32 *dev_id, u32 *pt_id)
 {
 	return -EOPNOTSUPP;
 }
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 2267057240bd..b40c2d95f693 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -411,7 +411,8 @@ vfio_allocate_device_file(struct vfio_device *device)
 	return df;
 }
 
-static int vfio_device_first_open(struct vfio_device_file *df)
+static int vfio_device_first_open(struct vfio_device_file *df,
+				  u32 *dev_id, u32 *pt_id)
 {
 	struct vfio_device *device = df->device;
 	struct iommufd_ctx *iommufd = df->iommufd;
@@ -423,7 +424,7 @@ static int vfio_device_first_open(struct vfio_device_file *df)
 		return -ENODEV;
 
 	if (iommufd)
-		ret = vfio_iommufd_bind(device, iommufd);
+		ret = vfio_iommufd_bind(device, iommufd, dev_id, pt_id);
 	else
 		ret = vfio_device_group_use_iommu(device);
 	if (ret)
@@ -462,7 +463,8 @@ static void vfio_device_last_close(struct vfio_device_file *df)
 	module_put(device->dev->driver->owner);
 }
 
-int vfio_device_open(struct vfio_device_file *df)
+int vfio_device_open(struct vfio_device_file *df,
+		     u32 *dev_id, u32 *pt_id)
 {
 	struct vfio_device *device = df->device;
 	int ret = 0;
@@ -471,7 +473,7 @@ int vfio_device_open(struct vfio_device_file *df)
 
 	device->open_count++;
 	if (device->open_count == 1) {
-		ret = vfio_device_first_open(df);
+		ret = vfio_device_first_open(df, dev_id, pt_id);
 		if (ret)
 			device->open_count--;
 	}
-- 
2.34.1



* [PATCH v3 09/15] vfio-iommufd: Add detach_ioas support for physical VFIO devices
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

This prepares for adding the DETACH ioctl for physical VFIO devices.
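
The attach/detach state checks added in the diff below can be summarised with
a small userspace sketch. It is a simplified model, not the kernel code:
demo_vdev, demo_attach_ioas() and demo_detach_ioas() are hypothetical names,
the bound flag stands in for a non-NULL vdev->iommufd_device, and the locking
asserted by lockdep in the patch is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the per-device iommufd state. */
struct demo_vdev {
	bool bound;     /* stands in for vdev->iommufd_device != NULL */
	bool attached;  /* stands in for vdev->iommufd_attached */
	uint32_t pt_id; /* page table the device is attached to */
};

/* Attach fails if the device is not bound or is already attached. */
static int demo_attach_ioas(struct demo_vdev *vdev, const uint32_t *pt_id)
{
	assert(pt_id != NULL);

	if (!vdev->bound)
		return -EINVAL;
	if (vdev->attached)
		return -EBUSY;

	vdev->pt_id = *pt_id;
	vdev->attached = true;
	return 0;
}

/* Detach is a no-op unless the device is bound and attached. */
static void demo_detach_ioas(struct demo_vdev *vdev)
{
	if (!vdev->bound || !vdev->attached)
		return;

	vdev->attached = false;
}
```

Making detach return void and tolerate repeated calls keeps the caller's
cleanup path simple, at the cost of not reporting a redundant detach.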

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 Documentation/driver-api/vfio.rst             |  8 +++++---
 drivers/vfio/fsl-mc/vfio_fsl_mc.c             |  1 +
 drivers/vfio/iommufd.c                        | 20 +++++++++++++++++++
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    |  2 ++
 drivers/vfio/pci/mlx5/main.c                  |  1 +
 drivers/vfio/pci/vfio_pci.c                   |  1 +
 drivers/vfio/platform/vfio_amba.c             |  1 +
 drivers/vfio/platform/vfio_platform.c         |  1 +
 drivers/vfio/vfio_main.c                      |  3 ++-
 include/linux/vfio.h                          |  8 +++++++-
 10 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
index 50b690f7f663..44527420f20d 100644
--- a/Documentation/driver-api/vfio.rst
+++ b/Documentation/driver-api/vfio.rst
@@ -279,6 +279,7 @@ similar to a file operations structure::
 					struct iommufd_ctx *ictx, u32 *out_device_id);
 		void	(*unbind_iommufd)(struct vfio_device *vdev);
 		int	(*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
+		void	(*detach_ioas)(struct vfio_device *vdev);
 		int	(*open_device)(struct vfio_device *vdev);
 		void	(*close_device)(struct vfio_device *vdev);
 		ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
@@ -315,9 +316,10 @@ container_of().
 	- The [un]bind_iommufd callbacks are issued when the device is bound to
 	  and unbound from iommufd.
 
-	- The attach_ioas callback is issued when the device is attached to an
-	  IOAS managed by the bound iommufd. The attached IOAS is automatically
-	  detached when the device is unbound from iommufd.
+	- The [de]attach_ioas callback is issued when the device is attached to
+	  and detached from an IOAS managed by the bound iommufd. However, the
+	  attached IOAS can also be automatically detached when the device is
+	  unbound from iommufd.
 
 	- The read/write/mmap callbacks implement the device region access defined
 	  by the device's own VFIO_DEVICE_GET_REGION_INFO ioctl.
diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index c89a047a4cd8..d540cf683d93 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -594,6 +594,7 @@ static const struct vfio_device_ops vfio_fsl_mc_ops = {
 	.bind_iommufd	= vfio_iommufd_physical_bind,
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
+	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
 };
 
 static struct fsl_mc_driver vfio_fsl_mc_driver = {
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index beef6ca21107..bfaa9876499b 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -88,6 +88,14 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 {
 	int rc;
 
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (!vdev->iommufd_device)
+		return -EINVAL;
+
+	if (vdev->iommufd_attached)
+		return -EBUSY;
+
 	rc = iommufd_device_attach(vdev->iommufd_device, pt_id);
 	if (rc)
 		return rc;
@@ -96,6 +104,18 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_attach_ioas);
 
+void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (!vdev->iommufd_device || !vdev->iommufd_attached)
+		return;
+
+	iommufd_device_detach(vdev->iommufd_device);
+	vdev->iommufd_attached = false;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas);
+
 /*
  * The emulated standard ops mean that vfio_device is going to use the
  * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index a117eaf21c14..b2f9778c8366 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -1373,6 +1373,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
 	.bind_iommufd = vfio_iommufd_physical_bind,
 	.unbind_iommufd = vfio_iommufd_physical_unbind,
 	.attach_ioas = vfio_iommufd_physical_attach_ioas,
+	.detach_ioas = vfio_iommufd_physical_detach_ioas,
 };
 
 static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
@@ -1391,6 +1392,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
 	.bind_iommufd = vfio_iommufd_physical_bind,
 	.unbind_iommufd = vfio_iommufd_physical_unbind,
 	.attach_ioas = vfio_iommufd_physical_attach_ioas,
+	.detach_ioas = vfio_iommufd_physical_detach_ioas,
 };
 
 static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index e897537a9e8a..6fc3410989eb 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -1326,6 +1326,7 @@ static const struct vfio_device_ops mlx5vf_pci_ops = {
 	.bind_iommufd = vfio_iommufd_physical_bind,
 	.unbind_iommufd = vfio_iommufd_physical_unbind,
 	.attach_ioas = vfio_iommufd_physical_attach_ioas,
+	.detach_ioas = vfio_iommufd_physical_detach_ioas,
 };
 
 static int mlx5vf_pci_probe(struct pci_dev *pdev,
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 29091ee2e984..cb5b7f865d58 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -141,6 +141,7 @@ static const struct vfio_device_ops vfio_pci_ops = {
 	.bind_iommufd	= vfio_iommufd_physical_bind,
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
+	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
 };
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index 83fe54015595..6464b3939ebc 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -119,6 +119,7 @@ static const struct vfio_device_ops vfio_amba_ops = {
 	.bind_iommufd	= vfio_iommufd_physical_bind,
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
+	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
 };
 
 static const struct amba_id pl330_ids[] = {
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 22a1efca32a8..8cf22fa65baa 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -108,6 +108,7 @@ static const struct vfio_device_ops vfio_platform_ops = {
 	.bind_iommufd	= vfio_iommufd_physical_bind,
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
+	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
 };
 
 static struct platform_driver vfio_platform_driver = {
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index b40c2d95f693..05dd4b89e9d1 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -250,7 +250,8 @@ static int __vfio_register_dev(struct vfio_device *device,
 
 	if (WARN_ON(device->ops->bind_iommufd &&
 		    (!device->ops->unbind_iommufd ||
-		     !device->ops->attach_ioas)))
+		     !device->ops->attach_ioas ||
+		     !device->ops->detach_ioas)))
 		return -EINVAL;
 
 	/*
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 6a07e1c6c38e..584aa909c8bc 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -73,7 +73,9 @@ struct vfio_device {
  * @bind_iommufd: Called when binding the device to an iommufd
  * @unbind_iommufd: Opposite of bind_iommufd
  * @attach_ioas: Called when attaching device to an IOAS/HWPT managed by the
- *		 bound iommufd. Undo in unbind_iommufd.
+ *		 bound iommufd. Undo in unbind_iommufd if @detach_ioas is not
+ *		 called
+ * @detach_ioas: Opposite of attach_ioas
  * @open_device: Called when the first file descriptor is opened for this device
  * @close_device: Opposite of open_device
  * @read: Perform read(2) on device file descriptor
@@ -97,6 +99,7 @@ struct vfio_device_ops {
 				struct iommufd_ctx *ictx, u32 *out_device_id);
 	void	(*unbind_iommufd)(struct vfio_device *vdev);
 	int	(*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
+	void	(*detach_ioas)(struct vfio_device *vdev);
 	int	(*open_device)(struct vfio_device *vdev);
 	void	(*close_device)(struct vfio_device *vdev);
 	ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
@@ -118,6 +121,7 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev);
 int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
+void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev);
 int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_emulated_unbind(struct vfio_device *vdev);
@@ -130,6 +134,8 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 	((void (*)(struct vfio_device *vdev)) NULL)
 #define vfio_iommufd_physical_attach_ioas \
 	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
+#define vfio_iommufd_physical_detach_ioas \
+	((void (*)(struct vfio_device *vdev)) NULL)
 #define vfio_iommufd_emulated_bind                                      \
 	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx,   \
 		  u32 *out_device_id)) NULL)
-- 
2.34.1


* [Intel-gfx] [PATCH v3 09/15] vfio-iommufd: Add detach_ioas support for physical VFIO devices
@ 2023-02-13 15:13   ` Yi Liu
  0 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: linux-s390, yi.l.liu, yi.y.sun, kvm, mjrosato, jasowang, cohuck,
	peterx, eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

This prepares for adding the DETACH ioctl for physical VFIO devices.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 Documentation/driver-api/vfio.rst             |  8 +++++---
 drivers/vfio/fsl-mc/vfio_fsl_mc.c             |  1 +
 drivers/vfio/iommufd.c                        | 20 +++++++++++++++++++
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c    |  2 ++
 drivers/vfio/pci/mlx5/main.c                  |  1 +
 drivers/vfio/pci/vfio_pci.c                   |  1 +
 drivers/vfio/platform/vfio_amba.c             |  1 +
 drivers/vfio/platform/vfio_platform.c         |  1 +
 drivers/vfio/vfio_main.c                      |  3 ++-
 include/linux/vfio.h                          |  8 +++++++-
 10 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
index 50b690f7f663..44527420f20d 100644
--- a/Documentation/driver-api/vfio.rst
+++ b/Documentation/driver-api/vfio.rst
@@ -279,6 +279,7 @@ similar to a file operations structure::
 					struct iommufd_ctx *ictx, u32 *out_device_id);
 		void	(*unbind_iommufd)(struct vfio_device *vdev);
 		int	(*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
+		void	(*detach_ioas)(struct vfio_device *vdev);
 		int	(*open_device)(struct vfio_device *vdev);
 		void	(*close_device)(struct vfio_device *vdev);
 		ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
@@ -315,9 +316,10 @@ container_of().
 	- The [un]bind_iommufd callbacks are issued when the device is bound to
 	  and unbound from iommufd.
 
-	- The attach_ioas callback is issued when the device is attached to an
-	  IOAS managed by the bound iommufd. The attached IOAS is automatically
-	  detached when the device is unbound from iommufd.
+	- The attach_ioas and detach_ioas callbacks are issued when the device
+	  is attached to and detached from an IOAS managed by the bound
+	  iommufd. If detach_ioas is not called, the attached IOAS is
+	  automatically detached when the device is unbound from iommufd.
 
 	- The read/write/mmap callbacks implement the device region access defined
 	  by the device's own VFIO_DEVICE_GET_REGION_INFO ioctl.
diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index c89a047a4cd8..d540cf683d93 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -594,6 +594,7 @@ static const struct vfio_device_ops vfio_fsl_mc_ops = {
 	.bind_iommufd	= vfio_iommufd_physical_bind,
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
+	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
 };
 
 static struct fsl_mc_driver vfio_fsl_mc_driver = {
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index beef6ca21107..bfaa9876499b 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -88,6 +88,14 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 {
 	int rc;
 
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (!vdev->iommufd_device)
+		return -EINVAL;
+
+	if (vdev->iommufd_attached)
+		return -EBUSY;
+
 	rc = iommufd_device_attach(vdev->iommufd_device, pt_id);
 	if (rc)
 		return rc;
@@ -96,6 +104,18 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_attach_ioas);
 
+void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (!vdev->iommufd_device || !vdev->iommufd_attached)
+		return;
+
+	iommufd_device_detach(vdev->iommufd_device);
+	vdev->iommufd_attached = false;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas);
+
 /*
  * The emulated standard ops mean that vfio_device is going to use the
  * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index a117eaf21c14..b2f9778c8366 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -1373,6 +1373,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
 	.bind_iommufd = vfio_iommufd_physical_bind,
 	.unbind_iommufd = vfio_iommufd_physical_unbind,
 	.attach_ioas = vfio_iommufd_physical_attach_ioas,
+	.detach_ioas = vfio_iommufd_physical_detach_ioas,
 };
 
 static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
@@ -1391,6 +1392,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
 	.bind_iommufd = vfio_iommufd_physical_bind,
 	.unbind_iommufd = vfio_iommufd_physical_unbind,
 	.attach_ioas = vfio_iommufd_physical_attach_ioas,
+	.detach_ioas = vfio_iommufd_physical_detach_ioas,
 };
 
 static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index e897537a9e8a..6fc3410989eb 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -1326,6 +1326,7 @@ static const struct vfio_device_ops mlx5vf_pci_ops = {
 	.bind_iommufd = vfio_iommufd_physical_bind,
 	.unbind_iommufd = vfio_iommufd_physical_unbind,
 	.attach_ioas = vfio_iommufd_physical_attach_ioas,
+	.detach_ioas = vfio_iommufd_physical_detach_ioas,
 };
 
 static int mlx5vf_pci_probe(struct pci_dev *pdev,
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 29091ee2e984..cb5b7f865d58 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -141,6 +141,7 @@ static const struct vfio_device_ops vfio_pci_ops = {
 	.bind_iommufd	= vfio_iommufd_physical_bind,
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
+	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
 };
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index 83fe54015595..6464b3939ebc 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -119,6 +119,7 @@ static const struct vfio_device_ops vfio_amba_ops = {
 	.bind_iommufd	= vfio_iommufd_physical_bind,
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
+	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
 };
 
 static const struct amba_id pl330_ids[] = {
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 22a1efca32a8..8cf22fa65baa 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -108,6 +108,7 @@ static const struct vfio_device_ops vfio_platform_ops = {
 	.bind_iommufd	= vfio_iommufd_physical_bind,
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
+	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
 };
 
 static struct platform_driver vfio_platform_driver = {
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index b40c2d95f693..05dd4b89e9d1 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -250,7 +250,8 @@ static int __vfio_register_dev(struct vfio_device *device,
 
 	if (WARN_ON(device->ops->bind_iommufd &&
 		    (!device->ops->unbind_iommufd ||
-		     !device->ops->attach_ioas)))
+		     !device->ops->attach_ioas ||
+		     !device->ops->detach_ioas)))
 		return -EINVAL;
 
 	/*
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 6a07e1c6c38e..584aa909c8bc 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -73,7 +73,9 @@ struct vfio_device {
  * @bind_iommufd: Called when binding the device to an iommufd
  * @unbind_iommufd: Opposite of bind_iommufd
  * @attach_ioas: Called when attaching device to an IOAS/HWPT managed by the
- *		 bound iommufd. Undo in unbind_iommufd.
+ *		 bound iommufd. Undo in unbind_iommufd if @detach_ioas is not
+ *		 called
+ * @detach_ioas: Opposite of attach_ioas
  * @open_device: Called when the first file descriptor is opened for this device
  * @close_device: Opposite of open_device
  * @read: Perform read(2) on device file descriptor
@@ -97,6 +99,7 @@ struct vfio_device_ops {
 				struct iommufd_ctx *ictx, u32 *out_device_id);
 	void	(*unbind_iommufd)(struct vfio_device *vdev);
 	int	(*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
+	void	(*detach_ioas)(struct vfio_device *vdev);
 	int	(*open_device)(struct vfio_device *vdev);
 	void	(*close_device)(struct vfio_device *vdev);
 	ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
@@ -118,6 +121,7 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev);
 int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
+void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev);
 int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_emulated_unbind(struct vfio_device *vdev);
@@ -130,6 +134,8 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 	((void (*)(struct vfio_device *vdev)) NULL)
 #define vfio_iommufd_physical_attach_ioas \
 	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
+#define vfio_iommufd_physical_detach_ioas \
+	((void (*)(struct vfio_device *vdev)) NULL)
 #define vfio_iommufd_emulated_bind                                      \
 	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx,   \
 		  u32 *out_device_id)) NULL)
-- 
2.34.1


* [PATCH v3 10/15] vfio-iommufd: Add detach_ioas for emulated VFIO devices
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

This prepares for adding the DETACH ioctl for emulated VFIO devices.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/gpu/drm/i915/gvt/kvmgt.c  |  1 +
 drivers/s390/cio/vfio_ccw_ops.c   |  1 +
 drivers/s390/crypto/vfio_ap_ops.c |  1 +
 drivers/vfio/iommufd.c            | 18 ++++++++++++++++++
 include/linux/vfio.h              |  3 +++
 5 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 8ae7039b3683..8a76a84bc3c1 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1474,6 +1474,7 @@ static const struct vfio_device_ops intel_vgpu_dev_ops = {
 	.bind_iommufd	= vfio_iommufd_emulated_bind,
 	.unbind_iommufd = vfio_iommufd_emulated_unbind,
 	.attach_ioas	= vfio_iommufd_emulated_attach_ioas,
+	.detach_ioas	= vfio_iommufd_emulated_detach_ioas,
 };
 
 static int intel_vgpu_probe(struct mdev_device *mdev)
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 5b53b94f13c7..cba4971618ff 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -632,6 +632,7 @@ static const struct vfio_device_ops vfio_ccw_dev_ops = {
 	.bind_iommufd = vfio_iommufd_emulated_bind,
 	.unbind_iommufd = vfio_iommufd_emulated_unbind,
 	.attach_ioas = vfio_iommufd_emulated_attach_ioas,
+	.detach_ioas = vfio_iommufd_emulated_detach_ioas,
 };
 
 struct mdev_driver vfio_ccw_mdev_driver = {
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 9c01957e56b3..f99c69d40982 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1802,6 +1802,7 @@ static const struct vfio_device_ops vfio_ap_matrix_dev_ops = {
 	.bind_iommufd = vfio_iommufd_emulated_bind,
 	.unbind_iommufd = vfio_iommufd_emulated_unbind,
 	.attach_ioas = vfio_iommufd_emulated_attach_ioas,
+	.detach_ioas = vfio_iommufd_emulated_detach_ioas,
 };
 
 static struct mdev_driver vfio_ap_matrix_driver = {
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index bfaa9876499b..faf2516b0f06 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -165,6 +165,12 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 
 	lockdep_assert_held(&vdev->dev_set->lock);
 
+	if (!vdev->iommufd_ictx)
+		return -EINVAL;
+
+	if (vdev->iommufd_access)
+		return -EBUSY;
+
 	user = iommufd_access_create(vdev->iommufd_ictx, *pt_id, &vfio_user_ops,
 				     vdev);
 	if (IS_ERR(user))
@@ -173,3 +179,15 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_attach_ioas);
+
+void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (!vdev->iommufd_ictx || !vdev->iommufd_access)
+		return;
+
+	iommufd_access_destroy(vdev->iommufd_access);
+	vdev->iommufd_access = NULL;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_detach_ioas);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 584aa909c8bc..50ee3efbc1f9 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -126,6 +126,7 @@ int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_emulated_unbind(struct vfio_device *vdev);
 int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
+void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev);
 #else
 #define vfio_iommufd_physical_bind                                      \
 	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx,   \
@@ -143,6 +144,8 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 	((void (*)(struct vfio_device *vdev)) NULL)
 #define vfio_iommufd_emulated_attach_ioas \
 	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
+#define vfio_iommufd_emulated_detach_ioas \
+	((void (*)(struct vfio_device *vdev)) NULL)
 #endif
 
 /**
-- 
2.34.1


* [PATCH v3 11/15] vfio: Add cdev_device_open_cnt to vfio_group
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

Add cdev_device_open_cnt to count the devices that are opened via the
cdev path. The cdev path increments and decrements this count, and the
group path checks it to achieve exclusion with the cdev path. With this,
only one path (the group path or the cdev path) will claim DMA ownership.
This avoids scenarios in which devices within the same group are opened
via different paths.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/group.c | 5 +++++
 drivers/vfio/vfio.h  | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index 9f3f6f0e4942..f3f5f4589cdd 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -403,6 +403,11 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep)
 		goto out_unlock;
 	}
 
+	if (group->cdev_device_open_cnt) {
+		ret = -EBUSY;
+		goto out_unlock;
+	}
+
 	/*
 	 * Do we need multiple instances of the group open?  Seems not.
 	 */
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 6f063e31d08a..7a77fb12bd2c 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -84,6 +84,7 @@ struct vfio_group {
 	struct blocking_notifier_head	notifier;
 	struct iommufd_ctx		*iommufd;
 	spinlock_t			kvm_ref_lock;
+	unsigned int			cdev_device_open_cnt;
 };
 
 int vfio_device_set_group(struct vfio_device *device,
-- 
2.34.1


* [Intel-gfx] [PATCH v3 12/15] vfio: Make vfio_device_open() single open for device cdev path
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: linux-s390, yi.l.liu, yi.y.sun, kvm, mjrosato, jasowang, cohuck,
	peterx, eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

With the introduction of the vfio device cdev, userspace can get device
access through either the legacy group path or the cdev path. A VFIO
device can be opened through only one of these paths at a time, e.g.
when the device is opened via the cdev path, an open via the group path
should fail. Both paths call into vfio_device_open(), so the exclusion
is done there.

VFIO group has historically allowed multi-open of the device FD. This
was made secure because the "open" was executed via an ioctl to the
group FD which is itself only single open.

However, there is no known use of multiple device FDs today. It is kind
of a strange thing to do because new device FDs can naturally be created
via dup().

When we implement the new device uAPI (only used in the cdev path) there
is no natural way to allow the device itself to be multi-opened in a
secure manner. Without the group FD we cannot prove the security context
of the opener.

Thus, when moving to the new uAPI we block the ability to multi-open
the device. The old group path still allows it.

vfio_device_open() needs to support both the legacy behavior, i.e.
multi-open in the group path, and the new behavior, i.e. single-open in
the cdev path. This mixture leads to the introduction of a new
is_cdev_device flag in struct vfio_device_file.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/vfio.h      |  2 ++
 drivers/vfio/vfio_main.c | 16 +++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 7a77fb12bd2c..620ebcf966fc 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -18,6 +18,8 @@ struct vfio_container;
 
 struct vfio_device_file {
 	struct vfio_device *device;
+	bool is_cdev_device;
+
 	bool access_granted;
 	spinlock_t kvm_ref_lock; /* protect kvm field */
 	struct kvm *kvm;
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 05dd4b89e9d1..c0be4b27f96c 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -472,6 +472,15 @@ int vfio_device_open(struct vfio_device_file *df,
 
 	lockdep_assert_held(&device->dev_set->lock);
 
+	/*
+	 * The device cdev path cannot support multiple opens of a
+	 * device since it has no secure way to do so. A second open
+	 * attempt therefore fails if the caller comes from the cdev
+	 * path.
+	 */
+	if (device->open_count != 0 && df->is_cdev_device)
+		return -EINVAL;
+
 	device->open_count++;
 	if (device->open_count == 1) {
 		ret = vfio_device_first_open(df, dev_id, pt_id);
@@ -543,7 +552,12 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
 
-	vfio_device_group_close(df);
+	/*
+	 * group path supports multiple device open, while cdev doesn't.
+	 * So use vfio_device_group_close() for !is_cdev_device case.
+	 */
+	if (!df->is_cdev_device)
+		vfio_device_group_close(df);
 
 	vfio_device_put_registration(device);
 
-- 
2.34.1


* [Intel-gfx] [PATCH v3 13/15] vfio: Add cdev for vfio_device
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: linux-s390, yi.l.liu, yi.y.sun, kvm, mjrosato, jasowang, cohuck,
	peterx, eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

This allows userspace to open a vfio device directly, without using the
legacy container/group interface, as a prerequisite for supporting new
iommu features like nested translation.

The device fd opened in this manner cannot access the device, because
the fops open() doesn't open the device until a successful BIND_IOMMUFD,
which will be added in the next patch.

With this patch, devices registered to the vfio core have both the group
and device interfaces created.

- group interface : /dev/vfio/$groupID
- device interface: /dev/vfio/devices/vfioX  (X is the minor number and
					      is unique across devices)

Given a vfio device the user can identify the matching vfioX by checking
the sysfs path of the device. Take PCI device (0000:6a:01.0) for example,
/sys/bus/pci/devices/0000\:6a\:01.0/vfio-dev/vfio0/dev contains the
major:minor of the matching vfioX.

Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat
that the major:minor matches.
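
As an illustration only (not part of this patch), the check described
above could look roughly like the following userspace sketch. The helper
names parse_devt(), fd_matches_devt() and vfio_cdev_check() are made up
for this example, and the sysfs "dev" attribute is assumed to hold a
single "major:minor" line as described above.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/*
 * Parse the content of a sysfs "dev" attribute ("MAJOR:MINOR\n").
 * Returns 0 on success, -1 if the text does not have that form.
 */
static int parse_devt(const char *text, unsigned int *maj, unsigned int *min)
{
	return sscanf(text, "%u:%u", maj, min) == 2 ? 0 : -1;
}

/*
 * Returns 1 if fd is a character device with the given major:minor,
 * 0 if it is not, and -1 if fstat() fails.
 */
static int fd_matches_devt(int fd, unsigned int maj, unsigned int min)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	if (!S_ISCHR(st.st_mode))
		return 0;
	return major(st.st_rdev) == maj && minor(st.st_rdev) == min;
}

/*
 * Read major:minor from the sysfs path (hypothetical helper) and compare
 * it with the already opened fd. Same return values as fd_matches_devt().
 */
static int vfio_cdev_check(const char *sysfs_dev_path, int fd)
{
	char buf[32];
	unsigned int maj, min;
	FILE *f = fopen(sysfs_dev_path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	if (parse_devt(buf, &maj, &min))
		return -1;
	return fd_matches_devt(fd, maj, min);
}
```

A caller would open /dev/vfio/devices/vfioX first and then pass the fd
together with the sysfs path of the device, treating any result other
than 1 as a mismatch.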

The vfio_device cdev logic in this patch:
*) __vfio_register_dev() path ends up doing cdev_device_add() for each
   vfio_device;
*) vfio_unregister_group_dev() path does cdev_device_del();

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/Kconfig       | 11 +++++++
 drivers/vfio/Makefile      |  1 +
 drivers/vfio/device_cdev.c | 64 ++++++++++++++++++++++++++++++++++++++
 drivers/vfio/vfio.h        | 26 ++++++++++++++++
 drivers/vfio/vfio_main.c   | 41 +++++++++++++++++++++---
 include/linux/vfio.h       |  4 +++
 6 files changed, 143 insertions(+), 4 deletions(-)
 create mode 100644 drivers/vfio/device_cdev.c

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index a8f544629467..0476abf154f2 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -12,6 +12,17 @@ menuconfig VFIO
 	  If you don't know what to do here, say N.
 
 if VFIO
+config VFIO_DEVICE_CDEV
+	bool "Support for the VFIO cdev /dev/vfio/devices/vfioX"
+	depends on IOMMUFD
+	help
+	  The VFIO device cdev is another way for userspace to get device
+	  access. Userspace gets a device fd by opening the device cdev at
+	  /dev/vfio/devices/vfioX, and then binds the device fd to an iommufd
+	  to set up a secure context for device access.
+
+	  If you don't know what to do here, say N.
+
 config VFIO_CONTAINER
 	bool "Support for the VFIO container /dev/vfio/vfio"
 	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 70e7dcb302ef..245394aeb94b 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_VFIO) += vfio.o
 vfio-y += vfio_main.o \
 	  group.o \
 	  iova_bitmap.o
+vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o
 vfio-$(CONFIG_IOMMUFD) += iommufd.o
 vfio-$(CONFIG_VFIO_CONTAINER) += container.o
 vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o
diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
new file mode 100644
index 000000000000..07869fde1c0c
--- /dev/null
+++ b/drivers/vfio/device_cdev.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023 Intel Corporation.
+ */
+#include <linux/vfio.h>
+
+#include "vfio.h"
+
+static dev_t device_devt;
+
+void vfio_init_device_cdev(struct vfio_device *device)
+{
+	device->device.devt = MKDEV(MAJOR(device_devt), device->index);
+	cdev_init(&device->cdev, &vfio_device_fops);
+	device->cdev.owner = THIS_MODULE;
+}
+
+/*
+ * cdev open op. device access via the fd opened by this function
+ * is blocked until .open_device() is called successfully during
+ * BIND_IOMMUFD.
+ */
+int vfio_device_fops_open(struct inode *inode, struct file *filep)
+{
+	struct vfio_device *device = container_of(inode->i_cdev,
+						  struct vfio_device, cdev);
+	struct vfio_device_file *df;
+	int ret;
+
+	if (!vfio_device_try_get_registration(device))
+		return -ENODEV;
+
+	df = vfio_allocate_device_file(device);
+	if (IS_ERR(df)) {
+		ret = PTR_ERR(df);
+		goto err_put_registration;
+	}
+
+	df->is_cdev_device = true;
+	filep->private_data = df;
+
+	return 0;
+
+err_put_registration:
+	vfio_device_put_registration(device);
+	return ret;
+}
+
+static char *vfio_device_devnode(const struct device *dev, umode_t *mode)
+{
+	return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev));
+}
+
+int vfio_cdev_init(struct class *device_class)
+{
+	device_class->devnode = vfio_device_devnode;
+	return alloc_chrdev_region(&device_devt, 0,
+				   MINORMASK + 1, "vfio-dev");
+}
+
+void vfio_cdev_cleanup(void)
+{
+	unregister_chrdev_region(device_devt, MINORMASK + 1);
+}
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 620ebcf966fc..be93a1c953f8 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -244,6 +244,32 @@ static inline void vfio_iommufd_unbind(struct vfio_device *device)
 }
 #endif
 
+#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)
+void vfio_init_device_cdev(struct vfio_device *device);
+int vfio_device_fops_open(struct inode *inode, struct file *filep);
+int vfio_cdev_init(struct class *device_class);
+void vfio_cdev_cleanup(void);
+#else
+static inline void vfio_init_device_cdev(struct vfio_device *device)
+{
+}
+
+static inline int vfio_device_fops_open(struct inode *inode,
+					struct file *filep)
+{
+	return 0;
+}
+
+static inline int vfio_cdev_init(struct class *device_class)
+{
+	return 0;
+}
+
+static inline void vfio_cdev_cleanup(void)
+{
+}
+#endif /* CONFIG_VFIO_DEVICE_CDEV */
+
 #if IS_ENABLED(CONFIG_VFIO_VIRQFD)
 int __init vfio_virqfd_init(void);
 void vfio_virqfd_exit(void);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index c0be4b27f96c..a7eb2727c613 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -235,6 +235,7 @@ static int vfio_init_device(struct vfio_device *device, struct device *dev,
 	device->device.release = vfio_device_release;
 	device->device.class = vfio.device_class;
 	device->device.parent = device->dev;
+	vfio_init_device_cdev(device);
 	return 0;
 
 out_uninit:
@@ -243,6 +244,25 @@ static int vfio_init_device(struct vfio_device *device, struct device *dev,
 	return ret;
 }
 
+static int vfio_device_add(struct vfio_device *device)
+{
+	int ret;
+
+	if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))
+		ret = cdev_device_add(&device->cdev, &device->device);
+	else
+		ret = device_add(&device->device);
+	return ret;
+}
+
+static void vfio_device_del(struct vfio_device *device)
+{
+	if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))
+		cdev_device_del(&device->cdev, &device->device);
+	else
+		device_del(&device->device);
+}
+
 static int __vfio_register_dev(struct vfio_device *device,
 			       enum vfio_group_type type)
 {
@@ -269,7 +289,7 @@ static int __vfio_register_dev(struct vfio_device *device,
 	if (ret)
 		return ret;
 
-	ret = device_add(&device->device);
+	ret = vfio_device_add(device);
 	if (ret)
 		goto err_out;
 
@@ -309,6 +329,13 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	bool interrupted = false;
 	long rc;
 
+	/*
+	 * Balances vfio_device_add in register path. Do this first in
+	 * unregister so that no new cdev open can take a registration
+	 * refcount while the device is being unregistered.
+	 */
+	vfio_device_del(device);
+
 	vfio_device_put_registration(device);
 	rc = try_wait_for_completion(&device->comp);
 	while (rc <= 0) {
@@ -334,9 +361,6 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 
 	vfio_device_group_unregister(device);
 
-	/* Balances device_add in register path */
-	device_del(&device->device);
-
 	/* Balances vfio_device_set_group in register path */
 	vfio_device_remove_group(device);
 }
@@ -1214,6 +1238,7 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma)
 
 const struct file_operations vfio_device_fops = {
 	.owner		= THIS_MODULE,
+	.open		= vfio_device_fops_open,
 	.release	= vfio_device_fops_release,
 	.read		= vfio_device_fops_read,
 	.write		= vfio_device_fops_write,
@@ -1587,9 +1612,16 @@ static int __init vfio_init(void)
 		goto err_dev_class;
 	}
 
+	ret = vfio_cdev_init(vfio.device_class);
+	if (ret)
+		goto err_alloc_dev_chrdev;
+
 	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
 	return 0;
 
+err_alloc_dev_chrdev:
+	class_destroy(vfio.device_class);
+	vfio.device_class = NULL;
 err_dev_class:
 	vfio_virqfd_exit();
 err_virqfd:
@@ -1600,6 +1632,7 @@ static int __init vfio_init(void)
 static void __exit vfio_cleanup(void)
 {
 	ida_destroy(&vfio.device_ida);
+	vfio_cdev_cleanup();
 	class_destroy(vfio.device_class);
 	vfio.device_class = NULL;
 	vfio_virqfd_exit();
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 50ee3efbc1f9..6b554ce6245a 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -13,6 +13,7 @@
 #include <linux/mm.h>
 #include <linux/workqueue.h>
 #include <linux/poll.h>
+#include <linux/cdev.h>
 #include <uapi/linux/vfio.h>
 #include <linux/iova_bitmap.h>
 
@@ -51,6 +52,9 @@ struct vfio_device {
 	/* Members below here are private, not for driver use */
 	unsigned int index;
 	struct device device;	/* device.kref covers object life circle */
+#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)
+	struct cdev cdev;
+#endif
 	refcount_t refcount;	/* user count on registered device*/
 	unsigned int open_count;
 	struct completion comp;
-- 
2.34.1



* [Intel-gfx] [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: linux-s390, yi.l.liu, yi.y.sun, kvm, mjrosato, jasowang, cohuck,
	peterx, eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

This adds three vfio device ioctls for userspace using iommufd to set up
a secure DMA context for device access.

    VFIO_DEVICE_BIND_IOMMUFD: bind the device to an iommufd, hence gaining
			      the DMA control provided by the iommufd. The
			      open_device op is called after the
			      bind_iommufd op. VFIO noiommu mode is
			      indicated by passing a negative iommufd value.
    VFIO_DEVICE_ATTACH_IOMMUFD_PT: attach the device to an IOAS or
				   hw_pagetable managed by iommufd. The
				   attach can be undone by
				   VFIO_DEVICE_DETACH_IOMMUFD_PT or by
				   closing the device fd.
    VFIO_DEVICE_DETACH_IOMMUFD_PT: detach the device from the currently
				   attached IOAS or hw_pagetable managed
				   by iommufd.
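
For illustration only, a userspace caller might prepare the bind argument
as sketched below. struct bind_iommufd_arg is a local mirror of the
struct vfio_device_bind_iommufd layout proposed in this patch, and
bind_arg_init() is a hypothetical helper; neither name exists in the
uapi header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Local mirror (hypothetical name) of the struct vfio_device_bind_iommufd
 * layout proposed in this patch. dev_cookie is forced to 8-byte alignment
 * to match __aligned_u64 in the uapi definition.
 */
struct bind_iommufd_arg {
	uint32_t argsz;
	uint32_t flags;
	uint64_t dev_cookie __attribute__((aligned(8)));
	int32_t  iommufd;
	uint32_t out_devid;
};

/*
 * Fill the argument for a bind request. Pass the fd of an opened iommufd
 * for normal operation, or a negative value to request noiommu mode.
 * flags stays 0 because the patch rejects any non-zero flags.
 */
static void bind_arg_init(struct bind_iommufd_arg *arg, int iommufd,
			  uint64_t cookie)
{
	memset(arg, 0, sizeof(*arg));
	arg->argsz = sizeof(*arg);	/* covers up to and including out_devid */
	arg->dev_cookie = cookie;
	arg->iommufd = iommufd;
}
```

The filled argument would then be passed to
ioctl(device_fd, VFIO_DEVICE_BIND_IOMMUFD, &arg) on a device fd opened
from /dev/vfio/devices/vfioX; on success the kernel writes the device id
to out_devid.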

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/device_cdev.c | 200 +++++++++++++++++++++++++++++++++++++
 drivers/vfio/group.c       |  27 +++++
 drivers/vfio/vfio.h        |  35 ++++++-
 drivers/vfio/vfio_main.c   |  38 ++++++-
 include/linux/iommufd.h    |   6 ++
 include/uapi/linux/vfio.h  |  86 ++++++++++++++++
 6 files changed, 387 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
index 07869fde1c0c..e5297bf99cc4 100644
--- a/drivers/vfio/device_cdev.c
+++ b/drivers/vfio/device_cdev.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2023 Intel Corporation.
  */
 #include <linux/vfio.h>
+#include <linux/iommufd.h>
 
 #include "vfio.h"
 
@@ -46,6 +47,205 @@ int vfio_device_fops_open(struct inode *inode, struct file *filep)
 	return ret;
 }
 
+static void vfio_device_get_kvm_safe(struct vfio_device_file *df)
+{
+	spin_lock(&df->kvm_ref_lock);
+	if (!df->kvm)
+		goto unlock;
+
+	_vfio_device_get_kvm_safe(df->device, df->kvm);
+
+unlock:
+	spin_unlock(&df->kvm_ref_lock);
+}
+
+void vfio_device_cdev_close(struct vfio_device_file *df)
+{
+	struct vfio_device *device = df->device;
+
+	mutex_lock(&device->dev_set->lock);
+	if (!device->open_count) {
+		mutex_unlock(&device->dev_set->lock);
+		return;
+	}
+	vfio_device_close(df);
+	vfio_device_put_kvm(device);
+	mutex_unlock(&device->dev_set->lock);
+	vfio_device_release_group(device);
+}
+
+long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
+				    unsigned long arg)
+{
+	struct vfio_device *device = df->device;
+	struct vfio_device_bind_iommufd bind;
+	struct iommufd_ctx *iommufd = NULL;
+	struct fd f;
+	unsigned long minsz;
+	int ret;
+
+	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
+
+	if (copy_from_user(&bind, (void __user *)arg, minsz))
+		return -EFAULT;
+
+	if (bind.argsz < minsz || bind.flags)
+		return -EINVAL;
+
+	if (!device->ops->bind_iommufd)
+		return -ENODEV;
+
+	ret = vfio_device_claim_group(device);
+	if (ret)
+		return ret;
+
+	mutex_lock(&device->dev_set->lock);
+	/*
+	 * Fail if the device file is already bound to an iommufd or is
+	 * already in noiommu mode.
+	 */
+	if (df->iommufd || df->noiommu) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	/* iommufd < 0 means noiommu mode */
+	if (bind.iommufd < 0) {
+		if (!capable(CAP_SYS_RAWIO)) {
+			ret = -EPERM;
+			goto out_unlock;
+		}
+		df->noiommu = true;
+	} else {
+		f = fdget(bind.iommufd);
+		if (!f.file) {
+			ret = -EBADF;
+			goto out_unlock;
+		}
+		iommufd = iommufd_ctx_from_file(f.file);
+		if (IS_ERR(iommufd)) {
+			ret = PTR_ERR(iommufd);
+			goto out_put_file;
+		}
+	}
+
+	/*
+	 * Before the device open, get the KVM pointer currently
+	 * associated with the device file (if there is one) and obtain
+	 * a reference. This reference is held until the device is closed.
+	 * Save the pointer in the device for use by drivers.
+	 */
+	vfio_device_get_kvm_safe(df);
+
+	df->iommufd = iommufd;
+	ret = vfio_device_open(df, &bind.out_devid, NULL);
+	if (ret)
+		goto out_put_kvm;
+
+	ret = copy_to_user((void __user *)arg +
+			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
+			   &bind.out_devid,
+			   sizeof(bind.out_devid)) ? -EFAULT : 0;
+	if (ret)
+		goto out_close_device;
+
+	if (iommufd)
+		fdput(f);
+	else if (df->noiommu)
+		dev_warn(device->dev, "vfio-noiommu device used by user "
+			 "(%s:%d)\n", current->comm, task_pid_nr(current));
+	mutex_unlock(&device->dev_set->lock);
+	return 0;
+
+out_close_device:
+	vfio_device_close(df);
+out_put_kvm:
+	df->iommufd = NULL;
+	df->noiommu = false;
+	vfio_device_put_kvm(device);
+out_put_file:
+	if (iommufd)
+		fdput(f);
+out_unlock:
+	mutex_unlock(&device->dev_set->lock);
+	vfio_device_release_group(device);
+	return ret;
+}
+
+int vfio_ioctl_device_attach(struct vfio_device_file *df,
+			     void __user *arg)
+{
+	struct vfio_device *device = df->device;
+	struct vfio_device_attach_iommufd_pt attach;
+	unsigned long minsz;
+	int ret;
+
+	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
+
+	if (copy_from_user(&attach, (void __user *)arg, minsz))
+		return -EFAULT;
+
+	if (attach.argsz < minsz || attach.flags ||
+	    attach.pt_id == IOMMUFD_INVALID_ID)
+		return -EINVAL;
+
+	if (!device->ops->bind_iommufd)
+		return -ENODEV;
+
+	mutex_lock(&device->dev_set->lock);
+	if (df->noiommu) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	ret = device->ops->attach_ioas(device, &attach.pt_id);
+	if (ret)
+		goto out_unlock;
+
+	ret = copy_to_user((void __user *)arg +
+			   offsetofend(struct vfio_device_attach_iommufd_pt, flags),
+			   &attach.pt_id,
+			   sizeof(attach.pt_id)) ? -EFAULT : 0;
+	if (ret)
+		goto out_detach;
+	mutex_unlock(&device->dev_set->lock);
+	return 0;
+
+out_detach:
+	device->ops->detach_ioas(device);
+out_unlock:
+	mutex_unlock(&device->dev_set->lock);
+	return ret;
+}
+
+int vfio_ioctl_device_detach(struct vfio_device_file *df,
+			     void __user *arg)
+{
+	struct vfio_device *device = df->device;
+	struct vfio_device_detach_iommufd_pt detach;
+	unsigned long minsz;
+
+	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
+
+	if (copy_from_user(&detach, (void __user *)arg, minsz))
+		return -EFAULT;
+
+	if (detach.argsz < minsz || detach.flags)
+		return -EINVAL;
+
+	if (!device->ops->bind_iommufd)
+		return -ENODEV;
+
+	mutex_lock(&device->dev_set->lock);
+	if (df->noiommu) {
+		mutex_unlock(&device->dev_set->lock);
+		return -EINVAL;
+	}
+	device->ops->detach_ioas(device);
+	mutex_unlock(&device->dev_set->lock);
+	return 0;
+}
+
 static char *vfio_device_devnode(const struct device *dev, umode_t *mode)
 {
 	return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev));
diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index f3f5f4589cdd..8ee06d8b17fa 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -381,6 +381,33 @@ static long vfio_group_fops_unl_ioctl(struct file *filep,
 	}
 }
 
+int vfio_device_claim_group(struct vfio_device *device)
+{
+	struct vfio_group *group = device->group;
+	int ret = 0;
+
+	mutex_lock(&group->group_lock);
+	if (group->opened_file) {
+		ret = -EBUSY;
+		goto out_unlock;
+	}
+
+	group->cdev_device_open_cnt++;
+
+out_unlock:
+	mutex_unlock(&group->group_lock);
+	return ret;
+}
+
+void vfio_device_release_group(struct vfio_device *device)
+{
+	struct vfio_group *group = device->group;
+
+	mutex_lock(&group->group_lock);
+	group->cdev_device_open_cnt--;
+	mutex_unlock(&group->group_lock);
+}
+
 static int vfio_group_fops_open(struct inode *inode, struct file *filep)
 {
 	struct vfio_group *group =
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index be93a1c953f8..421492518ab5 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -23,7 +23,9 @@ struct vfio_device_file {
 	bool access_granted;
 	spinlock_t kvm_ref_lock; /* protect kvm field */
 	struct kvm *kvm;
-	struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */
+	/* protected by struct vfio_device_set::lock */
+	struct iommufd_ctx *iommufd;
+	bool noiommu;
 };
 
 void vfio_device_put_registration(struct vfio_device *device);
@@ -89,6 +91,8 @@ struct vfio_group {
 	unsigned int			cdev_device_open_cnt;
 };
 
+int vfio_device_claim_group(struct vfio_device *device);
+void vfio_device_release_group(struct vfio_device *device);
 int vfio_device_set_group(struct vfio_device *device,
 			  enum vfio_group_type type);
 void vfio_device_remove_group(struct vfio_device *device);
@@ -247,6 +251,13 @@ static inline void vfio_iommufd_unbind(struct vfio_device *device)
 #if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)
 void vfio_init_device_cdev(struct vfio_device *device);
 int vfio_device_fops_open(struct inode *inode, struct file *filep);
+void vfio_device_cdev_close(struct vfio_device_file *df);
+long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
+				    unsigned long arg);
+int vfio_ioctl_device_attach(struct vfio_device_file *df,
+			     void __user *arg);
+int vfio_ioctl_device_detach(struct vfio_device_file *df,
+			     void __user *arg);
 int vfio_cdev_init(struct class *device_class);
 void vfio_cdev_cleanup(void);
 #else
@@ -260,6 +271,28 @@ static inline int vfio_device_fops_open(struct inode *inode,
 	return 0;
 }
 
+static inline void vfio_device_cdev_close(struct vfio_device_file *df)
+{
+}
+
+static inline long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
+						  unsigned long arg)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int vfio_ioctl_device_attach(struct vfio_device_file *df,
+					   void __user *arg)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int vfio_ioctl_device_detach(struct vfio_device_file *df,
+					   void __user *arg)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline int vfio_cdev_init(struct class *device_class)
 {
 	return 0;
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index a7eb2727c613..933319083282 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -37,6 +37,7 @@
 #include <linux/interval_tree.h>
 #include <linux/iova_bitmap.h>
 #include <linux/iommufd.h>
+#include <uapi/linux/iommufd.h>
 #include "vfio.h"
 
 #define DRIVER_VERSION	"0.3"
@@ -441,16 +442,32 @@ static int vfio_device_first_open(struct vfio_device_file *df,
 {
 	struct vfio_device *device = df->device;
 	struct iommufd_ctx *iommufd = df->iommufd;
-	int ret;
+	int ret = 0;
 
 	lockdep_assert_held(&device->dev_set->lock);
 
+	if (WARN_ON(iommufd && df->noiommu))
+		return -EINVAL;
+
 	if (!try_module_get(device->dev->driver->owner))
 		return -ENODEV;
 
+	/*
+	 * For the group/container path, the iommufd pointer is NULL on
+	 * entry to this helper. Its noiommu support is in container.c.
+	 *
+	 * For iommufd compat mode, the iommufd pointer is valid here.
+	 * Its noiommu support is in vfio_iommufd_bind().
+	 *
+	 * For the device cdev path, the iommufd pointer is valid in the
+	 * normal case, but NULL in noiommu mode. To tell cdev noiommu
+	 * apart from the group/container path, which also passes in a
+	 * NULL iommufd pointer, check df->noiommu, which is set only in
+	 * the cdev path.
+	 */
 	if (iommufd)
 		ret = vfio_iommufd_bind(device, iommufd, dev_id, pt_id);
-	else
+	else if (!df->noiommu)
 		ret = vfio_device_group_use_iommu(device);
 	if (ret)
 		goto err_module_put;
@@ -465,7 +482,7 @@ static int vfio_device_first_open(struct vfio_device_file *df,
 err_unuse_iommu:
 	if (iommufd)
 		vfio_iommufd_unbind(device);
-	else
+	else if (!df->noiommu)
 		vfio_device_group_unuse_iommu(device);
 err_module_put:
 	module_put(device->dev->driver->owner);
@@ -483,7 +500,7 @@ static void vfio_device_last_close(struct vfio_device_file *df)
 		device->ops->close_device(device);
 	if (iommufd)
 		vfio_iommufd_unbind(device);
-	else
+	else if (!df->noiommu)
 		vfio_device_group_unuse_iommu(device);
 	module_put(device->dev->driver->owner);
 }
@@ -582,6 +599,8 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 	 */
 	if (!df->is_cdev_device)
 		vfio_device_group_close(df);
+	else
+		vfio_device_cdev_close(df);
 
 	vfio_device_put_registration(device);
 
@@ -1156,6 +1175,9 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
 	bool access;
 	int ret;
 
+	if (cmd == VFIO_DEVICE_BIND_IOMMUFD)
+		return vfio_device_ioctl_bind_iommufd(df, arg);
+
 	/* Paired with smp_store_release() in vfio_device_open() */
 	access = smp_load_acquire(&df->access_granted);
 	if (!access)
@@ -1170,6 +1192,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
 		ret = vfio_ioctl_device_feature(device, (void __user *)arg);
 		break;
 
+	case VFIO_DEVICE_ATTACH_IOMMUFD_PT:
+		ret = vfio_ioctl_device_attach(df, (void __user *)arg);
+		break;
+
+	case VFIO_DEVICE_DETACH_IOMMUFD_PT:
+		ret = vfio_ioctl_device_detach(df, (void __user *)arg);
+		break;
+
 	default:
 		if (unlikely(!device->ops->ioctl))
 			ret = -EINVAL;
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 650d45629647..9672cf839687 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -17,6 +17,12 @@ struct iommufd_ctx;
 struct iommufd_access;
 struct file;
 
+/*
+ * The iommufd core initializes its xarray with XA_FLAGS_ALLOC1, so
+ * valid IDs start from 1.
+ */
+#define IOMMUFD_INVALID_ID 0
+
 struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 					   struct device *dev, u32 *id);
 void iommufd_device_unbind(struct iommufd_device *idev);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 0552e8dcf0cb..026af52cf22e 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -194,6 +194,92 @@ struct vfio_group_status {
 
 /* --------------- IOCTLs for DEVICE file descriptors --------------- */
 
+/*
+ * VFIO_DEVICE_BIND_IOMMUFD - _IOR(VFIO_TYPE, VFIO_BASE + 19,
+ *				   struct vfio_device_bind_iommufd)
+ *
+ * Bind a vfio_device to the specified iommufd.
+ *
+ * The user should provide a device cookie when calling this ioctl. The
+ * cookie is carried only in events (e.g. I/O faults) reported to userspace
+ * via iommufd. The user should use the devid returned by this ioctl to
+ * mark the target device in other ioctls (e.g. capability query via iommufd).
+ *
+ * User is not allowed to access the device before the binding operation
+ * is completed.
+ *
+ * Unbind is automatically conducted when device fd is closed.
+ *
+ * @argsz:	 user filled size of this data.
+ * @flags:	 reserved for future extension.
+ * @dev_cookie:	 a per device cookie provided by userspace.
+ * @iommufd:	 iommufd to bind. a negative value means noiommu.
+ * @out_devid:	 the device id generated by this bind.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+struct vfio_device_bind_iommufd {
+	__u32		argsz;
+	__u32		flags;
+	__aligned_u64	dev_cookie;
+	__s32		iommufd;
+	__u32		out_devid;
+};
+
+#define VFIO_DEVICE_BIND_IOMMUFD	_IO(VFIO_TYPE, VFIO_BASE + 19)
+
+/*
+ * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20,
+ *					struct vfio_device_attach_iommufd_pt)
+ *
+ * Attach a vfio device to an iommufd address space specified by IOAS
+ * id or hw_pagetable (hwpt) id.
+ *
+ * Available only after a device has been bound to iommufd via
+ * VFIO_DEVICE_BIND_IOMMUFD
+ *
+ * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close.
+ *
+ * @argsz:	user filled size of this data.
+ * @flags:	must be 0.
+ * @pt_id:	Input the target id which can represent an ioas or a hwpt
+ *		allocated via iommufd subsystem.
+ *		Output the attached hwpt id which could be the specified
+ *		hwpt itself or a hwpt automatically created for the
+ *		specified ioas by kernel during the attachment.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+struct vfio_device_attach_iommufd_pt {
+	__u32	argsz;
+	__u32	flags;
+	__u32	pt_id;
+};
+
+#define VFIO_DEVICE_ATTACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 20)
+
+/*
+ * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 21,
+ *					struct vfio_device_detach_iommufd_pt)
+ *
+ * Detach a vfio device from the iommufd address space it has been attached
+ * to. After the detach, the device should be in a blocking DMA state.
+ *
+ * Available only after a device has been bound to iommufd via
+ * VFIO_DEVICE_BIND_IOMMUFD
+ *
+ * @argsz:	user filled size of this data.
+ * @flags:	must be 0.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+struct vfio_device_detach_iommufd_pt {
+	__u32	argsz;
+	__u32	flags;
+};
+
+#define VFIO_DEVICE_DETACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 21)
+
 /**
  * VFIO_DEVICE_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 7,
  *						struct vfio_device_info)
-- 
2.34.1



* [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
@ 2023-02-13 15:13   ` Yi Liu
  0 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, intel-gvt-dev, intel-gfx,
	linux-s390

This adds three vfio device ioctls that let userspace use iommufd to set
up a secure DMA context for device access.

    VFIO_DEVICE_BIND_IOMMUFD: bind the device to an iommufd and hence gain
			      the DMA control provided by that iommufd.
			      The open_device op is called after the
			      bind_iommufd op. VFIO noiommu mode is
			      indicated by passing a negative iommufd value.
    VFIO_DEVICE_ATTACH_IOMMUFD_PT: attach the device to an IOAS or
				   hw_pagetable managed by iommufd. An
				   attach can be undone by
				   VFIO_DEVICE_DETACH_IOMMUFD_PT or by
				   closing the device fd.
    VFIO_DEVICE_DETACH_IOMMUFD_PT: detach the device from the currently
				   attached IOAS or hw_pagetable managed
				   by iommufd.
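
For illustration only, the bind -> attach -> detach sequence described
above might look roughly like this from userspace. This is a sketch, not
part of the patch: the sketch_* names are hypothetical local mirrors of the
structures and ioctl numbers proposed in this v3 posting (VFIO_TYPE is ';'
and VFIO_BASE is 100), and the layout that finally lands upstream may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical local mirrors of the v3 uAPI structures in this patch. */
struct sketch_bind_iommufd {
	uint32_t argsz;
	uint32_t flags;
	uint64_t dev_cookie __attribute__((aligned(8)));
	int32_t  iommufd;	/* negative means noiommu */
	uint32_t out_devid;
};

struct sketch_attach_iommufd_pt {
	uint32_t argsz;
	uint32_t flags;
	uint32_t pt_id;		/* in: ioas or hwpt id, out: attached hwpt id */
};

struct sketch_detach_iommufd_pt {
	uint32_t argsz;
	uint32_t flags;
};

static_assert(sizeof(struct sketch_bind_iommufd) == 24, "v3 bind layout");

/* Assumed to match VFIO_BASE + 19/20/21 as proposed in this series. */
#define SKETCH_BIND_IOMMUFD		_IO(';', 100 + 19)
#define SKETCH_ATTACH_IOMMUFD_PT	_IO(';', 100 + 20)
#define SKETCH_DETACH_IOMMUFD_PT	_IO(';', 100 + 21)

/* Bind devfd to iommufd, then attach it to *pt_id. Returns 0 or -errno. */
int sketch_bind_and_attach(int devfd, int iommufd, uint32_t *pt_id,
			   uint32_t *devid)
{
	struct sketch_bind_iommufd bind;
	struct sketch_attach_iommufd_pt attach;

	memset(&bind, 0, sizeof(bind));
	bind.argsz = sizeof(bind);
	bind.iommufd = iommufd;
	if (ioctl(devfd, SKETCH_BIND_IOMMUFD, &bind))
		return -errno;
	*devid = bind.out_devid;

	memset(&attach, 0, sizeof(attach));
	attach.argsz = sizeof(attach);
	attach.pt_id = *pt_id;
	if (ioctl(devfd, SKETCH_ATTACH_IOMMUFD_PT, &attach))
		return -errno;
	*pt_id = attach.pt_id;	/* the kernel writes back the hwpt id */
	return 0;
}

/* Undo the attach; closing devfd would have the same effect. */
int sketch_detach(int devfd)
{
	struct sketch_detach_iommufd_pt detach;

	memset(&detach, 0, sizeof(detach));
	detach.argsz = sizeof(detach);
	return ioctl(devfd, SKETCH_DETACH_IOMMUFD_PT, &detach) ? -errno : 0;
}
```

A caller would presumably open a node under /dev/vfio/devices/ for devfd,
open /dev/iommu for the iommufd, and allocate an IOAS through iommufd
(not shown) before passing its id in *pt_id.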

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/device_cdev.c | 200 +++++++++++++++++++++++++++++++++++++
 drivers/vfio/group.c       |  27 +++++
 drivers/vfio/vfio.h        |  35 ++++++-
 drivers/vfio/vfio_main.c   |  38 ++++++-
 include/linux/iommufd.h    |   6 ++
 include/uapi/linux/vfio.h  |  86 ++++++++++++++++
 6 files changed, 387 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
index 07869fde1c0c..e5297bf99cc4 100644
--- a/drivers/vfio/device_cdev.c
+++ b/drivers/vfio/device_cdev.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2023 Intel Corporation.
  */
 #include <linux/vfio.h>
+#include <linux/iommufd.h>
 
 #include "vfio.h"
 
@@ -46,6 +47,205 @@ int vfio_device_fops_open(struct inode *inode, struct file *filep)
 	return ret;
 }
 
+static void vfio_device_get_kvm_safe(struct vfio_device_file *df)
+{
+	spin_lock(&df->kvm_ref_lock);
+	if (!df->kvm)
+		goto unlock;
+
+	_vfio_device_get_kvm_safe(df->device, df->kvm);
+
+unlock:
+	spin_unlock(&df->kvm_ref_lock);
+}
+
+void vfio_device_cdev_close(struct vfio_device_file *df)
+{
+	struct vfio_device *device = df->device;
+
+	mutex_lock(&device->dev_set->lock);
+	if (!device->open_count) {
+		mutex_unlock(&device->dev_set->lock);
+		return;
+	}
+	vfio_device_close(df);
+	vfio_device_put_kvm(device);
+	mutex_unlock(&device->dev_set->lock);
+	vfio_device_release_group(device);
+}
+
+long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
+				    unsigned long arg)
+{
+	struct vfio_device *device = df->device;
+	struct vfio_device_bind_iommufd bind;
+	struct iommufd_ctx *iommufd = NULL;
+	struct fd f;
+	unsigned long minsz;
+	int ret;
+
+	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
+
+	if (copy_from_user(&bind, (void __user *)arg, minsz))
+		return -EFAULT;
+
+	if (bind.argsz < minsz || bind.flags)
+		return -EINVAL;
+
+	if (!device->ops->bind_iommufd)
+		return -ENODEV;
+
+	ret = vfio_device_claim_group(device);
+	if (ret)
+		return ret;
+
+	mutex_lock(&device->dev_set->lock);
+	/*
+	 * Fail if the device file is already bound to an iommufd or is
+	 * already in noiommu mode.
+	 */
+	if (df->iommufd || df->noiommu) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	/* iommufd < 0 means noiommu mode */
+	if (bind.iommufd < 0) {
+		if (!capable(CAP_SYS_RAWIO)) {
+			ret = -EPERM;
+			goto out_unlock;
+		}
+		df->noiommu = true;
+	} else {
+		f = fdget(bind.iommufd);
+		if (!f.file) {
+			ret = -EBADF;
+			goto out_unlock;
+		}
+		iommufd = iommufd_ctx_from_file(f.file);
+		if (IS_ERR(iommufd)) {
+			ret = PTR_ERR(iommufd);
+			goto out_put_file;
+		}
+	}
+
+	/*
+	 * Before the device is opened, get the KVM pointer currently
+	 * associated with the device file (if there is one) and obtain a
+	 * reference. This reference is held until the device is closed.
+	 * Save the pointer in the device for use by drivers.
+	 */
+	vfio_device_get_kvm_safe(df);
+
+	df->iommufd = iommufd;
+	ret = vfio_device_open(df, &bind.out_devid, NULL);
+	if (ret)
+		goto out_put_kvm;
+
+	ret = copy_to_user((void __user *)arg +
+			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
+			   &bind.out_devid,
+			   sizeof(bind.out_devid)) ? -EFAULT : 0;
+	if (ret)
+		goto out_close_device;
+
+	if (iommufd)
+		fdput(f);
+	else if (df->noiommu)
+		dev_warn(device->dev, "vfio-noiommu device used by user "
+			 "(%s:%d)\n", current->comm, task_pid_nr(current));
+	mutex_unlock(&device->dev_set->lock);
+	return 0;
+
+out_close_device:
+	vfio_device_close(df);
+out_put_kvm:
+	df->iommufd = NULL;
+	df->noiommu = false;
+	vfio_device_put_kvm(device);
+out_put_file:
+	if (iommufd)
+		fdput(f);
+out_unlock:
+	mutex_unlock(&device->dev_set->lock);
+	vfio_device_release_group(device);
+	return ret;
+}
+
+int vfio_ioctl_device_attach(struct vfio_device_file *df,
+			     void __user *arg)
+{
+	struct vfio_device *device = df->device;
+	struct vfio_device_attach_iommufd_pt attach;
+	unsigned long minsz;
+	int ret;
+
+	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
+
+	if (copy_from_user(&attach, (void __user *)arg, minsz))
+		return -EFAULT;
+
+	if (attach.argsz < minsz || attach.flags ||
+	    attach.pt_id == IOMMUFD_INVALID_ID)
+		return -EINVAL;
+
+	if (!device->ops->bind_iommufd)
+		return -ENODEV;
+
+	mutex_lock(&device->dev_set->lock);
+	if (df->noiommu) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	ret = device->ops->attach_ioas(device, &attach.pt_id);
+	if (ret)
+		goto out_unlock;
+
+	ret = copy_to_user((void __user *)arg +
+			   offsetofend(struct vfio_device_attach_iommufd_pt, flags),
+			   &attach.pt_id,
+			   sizeof(attach.pt_id)) ? -EFAULT : 0;
+	if (ret)
+		goto out_detach;
+	mutex_unlock(&device->dev_set->lock);
+	return 0;
+
+out_detach:
+	device->ops->detach_ioas(device);
+out_unlock:
+	mutex_unlock(&device->dev_set->lock);
+	return ret;
+}
+
+int vfio_ioctl_device_detach(struct vfio_device_file *df,
+			     void __user *arg)
+{
+	struct vfio_device *device = df->device;
+	struct vfio_device_detach_iommufd_pt detach;
+	unsigned long minsz;
+
+	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
+
+	if (copy_from_user(&detach, (void __user *)arg, minsz))
+		return -EFAULT;
+
+	if (detach.argsz < minsz || detach.flags)
+		return -EINVAL;
+
+	if (!device->ops->bind_iommufd)
+		return -ENODEV;
+
+	mutex_lock(&device->dev_set->lock);
+	if (df->noiommu) {
+		mutex_unlock(&device->dev_set->lock);
+		return -EINVAL;
+	}
+	device->ops->detach_ioas(device);
+	mutex_unlock(&device->dev_set->lock);
+	return 0;
+}
+
 static char *vfio_device_devnode(const struct device *dev, umode_t *mode)
 {
 	return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev));
diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index f3f5f4589cdd..8ee06d8b17fa 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -381,6 +381,33 @@ static long vfio_group_fops_unl_ioctl(struct file *filep,
 	}
 }
 
+int vfio_device_claim_group(struct vfio_device *device)
+{
+	struct vfio_group *group = device->group;
+	int ret = 0;
+
+	mutex_lock(&group->group_lock);
+	if (group->opened_file) {
+		ret = -EBUSY;
+		goto out_unlock;
+	}
+
+	group->cdev_device_open_cnt++;
+
+out_unlock:
+	mutex_unlock(&group->group_lock);
+	return ret;
+}
+
+void vfio_device_release_group(struct vfio_device *device)
+{
+	struct vfio_group *group = device->group;
+
+	mutex_lock(&group->group_lock);
+	group->cdev_device_open_cnt--;
+	mutex_unlock(&group->group_lock);
+}
+
 static int vfio_group_fops_open(struct inode *inode, struct file *filep)
 {
 	struct vfio_group *group =
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index be93a1c953f8..421492518ab5 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -23,7 +23,9 @@ struct vfio_device_file {
 	bool access_granted;
 	spinlock_t kvm_ref_lock; /* protect kvm field */
 	struct kvm *kvm;
-	struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */
+	/* protected by struct vfio_device_set::lock */
+	struct iommufd_ctx *iommufd;
+	bool noiommu;
 };
 
 void vfio_device_put_registration(struct vfio_device *device);
@@ -89,6 +91,8 @@ struct vfio_group {
 	unsigned int			cdev_device_open_cnt;
 };
 
+int vfio_device_claim_group(struct vfio_device *device);
+void vfio_device_release_group(struct vfio_device *device);
 int vfio_device_set_group(struct vfio_device *device,
 			  enum vfio_group_type type);
 void vfio_device_remove_group(struct vfio_device *device);
@@ -247,6 +251,13 @@ static inline void vfio_iommufd_unbind(struct vfio_device *device)
 #if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)
 void vfio_init_device_cdev(struct vfio_device *device);
 int vfio_device_fops_open(struct inode *inode, struct file *filep);
+void vfio_device_cdev_close(struct vfio_device_file *df);
+long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
+				    unsigned long arg);
+int vfio_ioctl_device_attach(struct vfio_device_file *df,
+			     void __user *arg);
+int vfio_ioctl_device_detach(struct vfio_device_file *df,
+			     void __user *arg);
 int vfio_cdev_init(struct class *device_class);
 void vfio_cdev_cleanup(void);
 #else
@@ -260,6 +271,28 @@ static inline int vfio_device_fops_open(struct inode *inode,
 	return 0;
 }
 
+static inline void vfio_device_cdev_close(struct vfio_device_file *df)
+{
+}
+
+static inline long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
+						  unsigned long arg)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int vfio_ioctl_device_attach(struct vfio_device_file *df,
+					   void __user *arg)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int vfio_ioctl_device_detach(struct vfio_device_file *df,
+					   void __user *arg)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline int vfio_cdev_init(struct class *device_class)
 {
 	return 0;
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index a7eb2727c613..933319083282 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -37,6 +37,7 @@
 #include <linux/interval_tree.h>
 #include <linux/iova_bitmap.h>
 #include <linux/iommufd.h>
+#include <uapi/linux/iommufd.h>
 #include "vfio.h"
 
 #define DRIVER_VERSION	"0.3"
@@ -441,16 +442,32 @@ static int vfio_device_first_open(struct vfio_device_file *df,
 {
 	struct vfio_device *device = df->device;
 	struct iommufd_ctx *iommufd = df->iommufd;
-	int ret;
+	int ret = 0;
 
 	lockdep_assert_held(&device->dev_set->lock);
 
+	if (WARN_ON(iommufd && df->noiommu))
+		return -EINVAL;
+
 	if (!try_module_get(device->dev->driver->owner))
 		return -ENODEV;
 
+	/*
+	 * For the group/container path, the iommufd pointer is NULL when
+	 * entering this helper. Its noiommu support is in container.c.
+	 *
+	 * For iommufd compat mode, the iommufd pointer here is valid.
+	 * Its noiommu support is in vfio_iommufd_bind().
+	 *
+	 * For the device cdev path, the iommufd pointer here is valid in
+	 * normal cases, but it is NULL for noiommu. To differentiate
+	 * noiommu from the group/container path, which also passes in a
+	 * NULL iommufd pointer, check df->noiommu, which is set only in
+	 * the cdev path.
+	 */
 	if (iommufd)
 		ret = vfio_iommufd_bind(device, iommufd, dev_id, pt_id);
-	else
+	else if (!df->noiommu)
 		ret = vfio_device_group_use_iommu(device);
 	if (ret)
 		goto err_module_put;
@@ -465,7 +482,7 @@ static int vfio_device_first_open(struct vfio_device_file *df,
 err_unuse_iommu:
 	if (iommufd)
 		vfio_iommufd_unbind(device);
-	else
+	else if (!df->noiommu)
 		vfio_device_group_unuse_iommu(device);
 err_module_put:
 	module_put(device->dev->driver->owner);
@@ -483,7 +500,7 @@ static void vfio_device_last_close(struct vfio_device_file *df)
 		device->ops->close_device(device);
 	if (iommufd)
 		vfio_iommufd_unbind(device);
-	else
+	else if (!df->noiommu)
 		vfio_device_group_unuse_iommu(device);
 	module_put(device->dev->driver->owner);
 }
@@ -582,6 +599,8 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 	 */
 	if (!df->is_cdev_device)
 		vfio_device_group_close(df);
+	else
+		vfio_device_cdev_close(df);
 
 	vfio_device_put_registration(device);
 
@@ -1156,6 +1175,9 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
 	bool access;
 	int ret;
 
+	if (cmd == VFIO_DEVICE_BIND_IOMMUFD)
+		return vfio_device_ioctl_bind_iommufd(df, arg);
+
 	/* Paired with smp_store_release() in vfio_device_open() */
 	access = smp_load_acquire(&df->access_granted);
 	if (!access)
@@ -1170,6 +1192,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
 		ret = vfio_ioctl_device_feature(device, (void __user *)arg);
 		break;
 
+	case VFIO_DEVICE_ATTACH_IOMMUFD_PT:
+		ret = vfio_ioctl_device_attach(df, (void __user *)arg);
+		break;
+
+	case VFIO_DEVICE_DETACH_IOMMUFD_PT:
+		ret = vfio_ioctl_device_detach(df, (void __user *)arg);
+		break;
+
 	default:
 		if (unlikely(!device->ops->ioctl))
 			ret = -EINVAL;
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 650d45629647..9672cf839687 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -17,6 +17,12 @@ struct iommufd_ctx;
 struct iommufd_access;
 struct file;
 
+/*
+ * The iommufd core initializes its xarray with XA_FLAGS_ALLOC1, so valid
+ * IDs start from 1.
+ */
+#define IOMMUFD_INVALID_ID 0
+
 struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 					   struct device *dev, u32 *id);
 void iommufd_device_unbind(struct iommufd_device *idev);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 0552e8dcf0cb..026af52cf22e 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -194,6 +194,92 @@ struct vfio_group_status {
 
 /* --------------- IOCTLs for DEVICE file descriptors --------------- */
 
+/*
+ * VFIO_DEVICE_BIND_IOMMUFD - _IOR(VFIO_TYPE, VFIO_BASE + 19,
+ *				   struct vfio_device_bind_iommufd)
+ *
+ * Bind a vfio_device to the specified iommufd.
+ *
+ * The user should provide a device cookie when calling this ioctl. The
+ * cookie is carried only in events (e.g. I/O faults) reported to userspace
+ * via iommufd. The user should use the devid returned by this ioctl to
+ * mark the target device in other ioctls (e.g. capability query via iommufd).
+ *
+ * User is not allowed to access the device before the binding operation
+ * is completed.
+ *
+ * Unbind is automatically conducted when device fd is closed.
+ *
+ * @argsz:	 user filled size of this data.
+ * @flags:	 reserved for future extension.
+ * @dev_cookie:	 a per device cookie provided by userspace.
+ * @iommufd:	 iommufd to bind. a negative value means noiommu.
+ * @out_devid:	 the device id generated by this bind.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+struct vfio_device_bind_iommufd {
+	__u32		argsz;
+	__u32		flags;
+	__aligned_u64	dev_cookie;
+	__s32		iommufd;
+	__u32		out_devid;
+};
+
+#define VFIO_DEVICE_BIND_IOMMUFD	_IO(VFIO_TYPE, VFIO_BASE + 19)
+
+/*
+ * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20,
+ *					struct vfio_device_attach_iommufd_pt)
+ *
+ * Attach a vfio device to an iommufd address space specified by IOAS
+ * id or hw_pagetable (hwpt) id.
+ *
+ * Available only after a device has been bound to iommufd via
+ * VFIO_DEVICE_BIND_IOMMUFD
+ *
+ * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close.
+ *
+ * @argsz:	user filled size of this data.
+ * @flags:	must be 0.
+ * @pt_id:	Input the target id which can represent an ioas or a hwpt
+ *		allocated via iommufd subsystem.
+ *		Output the attached hwpt id which could be the specified
+ *		hwpt itself or a hwpt automatically created for the
+ *		specified ioas by kernel during the attachment.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+struct vfio_device_attach_iommufd_pt {
+	__u32	argsz;
+	__u32	flags;
+	__u32	pt_id;
+};
+
+#define VFIO_DEVICE_ATTACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 20)
+
+/*
+ * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 21,
+ *					struct vfio_device_detach_iommufd_pt)
+ *
+ * Detach a vfio device from the iommufd address space it has been attached
+ * to. After the detach, the device should be in a blocking DMA state.
+ *
+ * Available only after a device has been bound to iommufd via
+ * VFIO_DEVICE_BIND_IOMMUFD
+ *
+ * @argsz:	user filled size of this data.
+ * @flags:	must be 0.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+struct vfio_device_detach_iommufd_pt {
+	__u32	argsz;
+	__u32	flags;
+};
+
+#define VFIO_DEVICE_DETACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 21)
+
 /**
  * VFIO_DEVICE_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 7,
  *						struct vfio_device_info)
-- 
2.34.1
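
As a reading aid, the argument validation at the top of
vfio_ioctl_device_attach() in the patch above can be modelled in plain
userspace C. This is only a sketch under stated assumptions: the sketch_*
and SKETCH_* names are hypothetical, memcpy() stands in for
copy_from_user(), and the caller is assumed to supply a buffer of at least
minsz bytes (in the kernel a shorter buffer could fault instead).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace equivalent of the kernel's offsetofend() helper. */
#define SKETCH_OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical local mirror of struct vfio_device_attach_iommufd_pt. */
struct sketch_attach_pt {
	uint32_t argsz;
	uint32_t flags;
	uint32_t pt_id;
};

static_assert(sizeof(struct sketch_attach_pt) == 12, "three packed u32s");

#define SKETCH_INVALID_ID 0	/* mirrors IOMMUFD_INVALID_ID */

/*
 * Copy the fixed-size head of the user structure, then reject a too-small
 * argsz, any set flag bit, or the reserved invalid id.
 * Returns 0 on success or -EINVAL.
 */
int sketch_validate_attach(const void *user, struct sketch_attach_pt *out)
{
	size_t minsz = SKETCH_OFFSETOFEND(struct sketch_attach_pt, pt_id);

	memcpy(out, user, minsz);
	if (out->argsz < minsz || out->flags ||
	    out->pt_id == SKETCH_INVALID_ID)
		return -EINVAL;
	return 0;
}
```

Checking argsz against minsz rather than sizeof() is what lets the
structure grow later: a newer userspace may pass a larger argsz and the
handler still only reads the fields it knows about.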



* [Intel-gfx] [PATCH v3 15/15] vfio: Compile group optionally
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 15:13   ` Yi Liu
  -1 siblings, 0 replies; 135+ messages in thread
From: Yi Liu @ 2023-02-13 15:13 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: linux-s390, yi.l.liu, yi.y.sun, kvm, mjrosato, jasowang, cohuck,
	peterx, eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

Group code is not needed for the vfio device cdev, so now that the vfio
device cdev has been introduced, the group infrastructure can be compiled
out when only the cdev is needed.
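
The compile-out approach can be illustrated with a small standalone
sketch. It is not code from this patch: SKETCH_ENABLE_GROUP is a
hypothetical stand-in for CONFIG_VFIO_ENABLE_GROUP, and the sketch_*
helpers only mimic the shape of the inline stubs added to
drivers/vfio/vfio.h and of the iommufd/noiommu/group dispatch in
vfio_device_first_open().

```c
#include <assert.h>
#include <errno.h>

/*
 * SKETCH_ENABLE_GROUP is deliberately left undefined here, so the inline
 * stubs below are compiled in place of a real group implementation.
 */
#ifdef SKETCH_ENABLE_GROUP
int sketch_claim_group(void);		/* would live in a group.c */
int sketch_group_use_iommu(void);
#else
static inline int sketch_claim_group(void)
{
	return 0;			/* nothing to claim without groups */
}

static inline int sketch_group_use_iommu(void)
{
	return -EOPNOTSUPP;		/* group path is not available */
}
#endif

/*
 * Simplified model of the dispatch in vfio_device_first_open():
 * iommufd bind, else noiommu (nothing to do), else the group path.
 */
int sketch_first_open(int has_iommufd, int noiommu)
{
	if (has_iommufd)
		return 0;		/* iommufd bind path, not modelled */
	if (!noiommu)
		return sketch_group_use_iommu();
	return 0;
}
```

With the stubs in place, callers need no #ifdef of their own; only the
group path reports -EOPNOTSUPP when the group code is compiled out.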

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/Kconfig  | 18 ++++++++++
 drivers/vfio/Makefile |  2 +-
 drivers/vfio/vfio.h   | 78 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/vfio.h  | 11 ++++++
 4 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 0476abf154f2..9c6626dfd116 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -15,6 +15,7 @@ if VFIO
 config VFIO_DEVICE_CDEV
 	bool "Support for the VFIO cdev /dev/vfio/devices/vfioX"
 	depends on IOMMUFD
+	default !VFIO_GROUP
 	help
 	  The VFIO device cdev is another way for userspace to get device
 	  access. Userspace gets device fd by opening device cdev under
@@ -23,9 +24,26 @@ config VFIO_DEVICE_CDEV
 
 	  If you don't know what to do here, say N.
 
+config VFIO_ENABLE_GROUP
+	bool
+	default !VFIO_DEVICE_CDEV
+
+config VFIO_GROUP
+	bool "Support for the VFIO group /dev/vfio/$group_id"
+	select VFIO_ENABLE_GROUP
+	default y
+	help
+	   VFIO group is the legacy interface for userspace. With the
+	   introduction of the VFIO device cdev interface, this can be N.
+	   For now, until userspace applications are fully converted to the
+	   new vfio device cdev interface, this should be Y.
+
+	   If you don't know what to do here, say Y.
+
 config VFIO_CONTAINER
 	bool "Support for the VFIO container /dev/vfio/vfio"
 	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
+	depends on VFIO_ENABLE_GROUP
 	default y
 	help
 	  The VFIO container is the classic interface to VFIO for establishing
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 245394aeb94b..4e81c3bbed30 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -2,9 +2,9 @@
 obj-$(CONFIG_VFIO) += vfio.o
 
 vfio-y += vfio_main.o \
-	  group.o \
 	  iova_bitmap.o
 vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o
+vfio-$(CONFIG_VFIO_ENABLE_GROUP) += group.o
 vfio-$(CONFIG_IOMMUFD) += iommufd.o
 vfio-$(CONFIG_VFIO_CONTAINER) += container.o
 vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 421492518ab5..0fc9faaa7780 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -62,6 +62,7 @@ enum vfio_group_type {
 	VFIO_NO_IOMMU,
 };
 
+#if IS_ENABLED(CONFIG_VFIO_ENABLE_GROUP)
 struct vfio_group {
 	struct device 			dev;
 	struct cdev			cdev;
@@ -108,6 +109,83 @@ bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device *device);
 bool vfio_device_has_container(struct vfio_device *device);
 int __init vfio_group_init(void);
 void vfio_group_cleanup(void);
+#else
+struct vfio_group;
+
+static inline int vfio_device_claim_group(struct vfio_device *device)
+{
+	return 0;
+}
+
+static inline void vfio_device_release_group(struct vfio_device *device)
+{
+}
+
+static inline int vfio_device_set_group(struct vfio_device *device,
+					enum vfio_group_type type)
+{
+	return 0;
+}
+
+static inline void vfio_device_remove_group(struct vfio_device *device)
+{
+}
+
+static inline void vfio_device_group_register(struct vfio_device *device)
+{
+}
+
+static inline void vfio_device_group_unregister(struct vfio_device *device)
+{
+}
+
+static inline int vfio_device_group_use_iommu(struct vfio_device *device)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void vfio_device_group_unuse_iommu(struct vfio_device *device)
+{
+}
+
+static inline void vfio_device_group_close(struct vfio_device_file *df)
+{
+}
+
+static inline struct vfio_group *vfio_group_from_file(struct file *file)
+{
+	return NULL;
+}
+
+static inline bool vfio_group_enforced_coherent(struct vfio_group *group)
+{
+	return true;
+}
+
+static inline void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm)
+{
+}
+
+static inline bool vfio_group_has_dev(struct vfio_group *group,
+				      struct vfio_device *device)
+{
+	return false;
+}
+
+static inline bool vfio_device_has_container(struct vfio_device *device)
+{
+	return false;
+}
+
+static inline int __init vfio_group_init(void)
+{
+	return 0;
+}
+
+static inline void vfio_group_cleanup(void)
+{
+}
+#endif /* CONFIG_VFIO_ENABLE_GROUP */
 
 #if IS_ENABLED(CONFIG_VFIO_CONTAINER)
 /**
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 6b554ce6245a..1d9ba93feb0d 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -43,7 +43,9 @@ struct vfio_device {
 	 */
 	const struct vfio_migration_ops *mig_ops;
 	const struct vfio_log_ops *log_ops;
+#if IS_ENABLED(CONFIG_VFIO_ENABLE_GROUP)
 	struct vfio_group *group;
+#endif
 	struct vfio_device_set *dev_set;
 	struct list_head dev_set_list;
 	unsigned int migration_flags;
@@ -58,8 +60,10 @@ struct vfio_device {
 	refcount_t refcount;	/* user count on registered device*/
 	unsigned int open_count;
 	struct completion comp;
+#if IS_ENABLED(CONFIG_VFIO_ENABLE_GROUP)
 	struct list_head group_next;
 	struct list_head iommu_entry;
+#endif
 	struct iommufd_access *iommufd_access;
 	void (*put_kvm)(struct kvm *kvm);
 #if IS_ENABLED(CONFIG_IOMMUFD)
@@ -257,7 +261,14 @@ int vfio_mig_get_next_state(struct vfio_device *device,
 /*
  * External user API
  */
+#if IS_ENABLED(CONFIG_VFIO_ENABLE_GROUP)
 struct iommu_group *vfio_file_iommu_group(struct file *file);
+#else
+static inline struct iommu_group *vfio_file_iommu_group(struct file *file)
+{
+	return NULL;
+}
+#endif
 bool vfio_file_is_valid(struct file *file);
 bool vfio_file_enforced_coherent(struct file *file);
 void vfio_file_set_kvm(struct file *file, struct kvm *kvm);
-- 
2.34.1



* [Intel-gfx] ✗ Fi.CI.BUILD: failure for Add vfio_device cdev for iommufd support (rev2)
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
                   ` (15 preceding siblings ...)
  (?)
@ 2023-02-13 15:30 ` Patchwork
  -1 siblings, 0 replies; 135+ messages in thread
From: Patchwork @ 2023-02-13 15:30 UTC (permalink / raw)
  To: Yi Liu; +Cc: intel-gfx

== Series Details ==

Series: Add vfio_device cdev for iommufd support (rev2)
URL   : https://patchwork.freedesktop.org/series/113696/
State : failure

== Summary ==

Error: patch https://patchwork.freedesktop.org/api/1.0/series/113696/revisions/2/mbox/ not applied
Applying: vfio: Allocate per device file structure
Using index info to reconstruct a base tree...
M	drivers/vfio/group.c
M	drivers/vfio/vfio.h
M	drivers/vfio/vfio_main.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/vfio/vfio_main.c
Auto-merging drivers/vfio/vfio.h
Auto-merging drivers/vfio/group.c
Applying: vfio: Refine vfio file kAPIs
Using index info to reconstruct a base tree...
M	drivers/vfio/group.c
M	drivers/vfio/pci/vfio_pci_core.c
M	drivers/vfio/vfio.h
M	drivers/vfio/vfio_main.c
M	include/linux/vfio.h
Falling back to patching base and 3-way merge...
Auto-merging include/linux/vfio.h
Auto-merging drivers/vfio/vfio_main.c
Auto-merging drivers/vfio/vfio.h
Auto-merging drivers/vfio/pci/vfio_pci_core.c
Auto-merging drivers/vfio/group.c
CONFLICT (content): Merge conflict in drivers/vfio/group.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 vfio: Refine vfio file kAPIs
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
@ 2023-02-13 19:47   ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-13 19:47 UTC (permalink / raw)
  To: Yi Liu
  Cc: joro, jgg, kevin.tian, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

On Mon, 13 Feb 2023 07:13:33 -0800
Yi Liu <yi.l.liu@intel.com> wrote:

> Existing VFIO provides group-centric user APIs for userspace. Userspace
> opens the /dev/vfio/$group_id first before getting device fd and hence
> getting access to device. This is not the desired model for iommufd. Per
> the conclusion of community discussion[1], iommufd provides device-centric
> kAPIs and requires its consumer (like VFIO) to be device-centric user
> APIs. Such user APIs are used to associate device with iommufd and also
> the I/O address spaces managed by the iommufd.
> 
> This series first introduces a per device file structure to be prepared
> for further enhancement and refactors the kvm-vfio code to be prepared
> for accepting device file from userspace. Then refactors the vfio to be
> able to handle iommufd binding. This refactor includes the mechanism of
> blocking device access before iommufd bind, making vfio_device_open() be
> exclusive between the group path and the cdev path. Eventually, adds the
> cdev support for vfio device, and makes group infrastructure optional as
> it is not needed when vfio device cdev is compiled.
> 
> This is also a prerequisite for iommu nesting for vfio device[2].
> 
> The complete code can be found in below branch, simple test done with the
> legacy group path and the cdev path. Draft QEMU branch can be found at[3]
> 
> https://github.com/yiliu1765/iommufd/tree/vfio_device_cdev_v3
> (config CONFIG_IOMMUFD=y CONFIG_VFIO_DEVICE_CDEV=y)

Even using your branch[1], it seems like this has not been tested
except with cdev support enabled:

/home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_add’:
/home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:253:48: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
  253 |                 ret = cdev_device_add(&device->cdev, &device->device);
      |                                                ^~~~
      |                                                dev
/home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_del’:
/home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:262:42: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
  262 |                 cdev_device_del(&device->cdev, &device->device);
      |                                          ^~~~
      |                                          dev

Additionally the VFIO_ENABLE_GROUP Kconfig option doesn't make much
sense to me, it seems entirely redundant to VFIO_GROUP.

I think it's too late for v6.3 already, but given this needs at least
one more spin, let's set expectations of this being v6.4 material.  Thanks,

Alex

[1] 98491da60ae1 cover-letter: Add vfio_device cdev for iommufd support


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-13 19:47   ` [Intel-gfx] " Alex Williamson
@ 2023-02-13 23:21     ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-13 23:21 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Yi Liu, joro, kevin.tian, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

On Mon, Feb 13, 2023 at 12:47:19PM -0700, Alex Williamson wrote:

> I think it's too late for v6.3 already, but given this needs at least
> one more spin, let's set expectations of this being v6.4 material.  Thanks,

Please let's continue to try to get this finished during the merge
window, all the other series depend on it. We can manage it with a
shared branch again..

Thanks,
Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
@ 2023-02-13 23:21     ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-13 23:21 UTC (permalink / raw)
  To: Yi Liu
  Cc: joro, jgg, kevin.tian, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

On Mon, 13 Feb 2023 07:13:36 -0800
Yi Liu <yi.l.liu@intel.com> wrote:

> This makes the vfio file kAPIs accept vfio device files, also a
> preparation for vfio device cdev support.
> 
> For the kvm set with vfio device file, kvm pointer is stored in struct
> vfio_device_file, and use kvm_ref_lock to protect kvm set and kvm
> pointer usage within VFIO. This kvm pointer will be set to vfio_device
> after device file is bound to iommufd in the cdev path.
> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
>  drivers/vfio/vfio.h      |  2 ++
>  drivers/vfio/vfio_main.c | 51 ++++++++++++++++++++++++++++++++++++----
>  2 files changed, 49 insertions(+), 4 deletions(-)

This subtly changes the behavior of the vfio-pci hot reset functions
without updating the uAPI description or implementation to use less
group-centric variables.  The new behavior appears to be that cdev fds
can also be passed to prove ownership of the affected set of devices
for a hot reset, but this probably needs to be examined for gaps.
Thanks,

Alex


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
@ 2023-02-13 23:43     ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-13 23:43 UTC (permalink / raw)
  To: Yi Liu
  Cc: joro, alex.williamson, kevin.tian, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

On Mon, Feb 13, 2023 at 07:13:36AM -0800, Yi Liu wrote:
> +static struct vfio_device *vfio_device_from_file(struct file *file)
> +{
> +	struct vfio_device_file *df = file->private_data;
> +
> +	if (file->f_op != &vfio_device_fops)
> +		return NULL;
> +	return df->device;
> +}
> +
>  /**
>   * vfio_file_is_valid - True if the file is usable with VFIO APIS
>   * @file: VFIO group file or VFIO device file
>   */
>  bool vfio_file_is_valid(struct file *file)
>  {
> -	return vfio_group_from_file(file);
> +	return vfio_group_from_file(file) ||
> +	       vfio_device_from_file(file);
>  }
>  EXPORT_SYMBOL_GPL(vfio_file_is_valid);

This can only succeed on a device cdev that has been fully opened.

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-13 19:47   ` [Intel-gfx] " Alex Williamson
@ 2023-02-14  1:55     ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-14  1:55 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joro, jgg, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Tuesday, February 14, 2023 3:47 AM
> 
> On Mon, 13 Feb 2023 07:13:33 -0800
> Yi Liu <yi.l.liu@intel.com> wrote:
> 
> > Existing VFIO provides group-centric user APIs for userspace. Userspace
> > opens the /dev/vfio/$group_id first before getting device fd and hence
> > getting access to device. This is not the desired model for iommufd. Per
> > the conclusion of community discussion[1], iommufd provides device-centric
> > kAPIs and requires its consumer (like VFIO) to be device-centric user
> > APIs. Such user APIs are used to associate device with iommufd and also
> > the I/O address spaces managed by the iommufd.
> >
> > This series first introduces a per device file structure to be prepared
> > for further enhancement and refactors the kvm-vfio code to be prepared
> > for accepting device file from userspace. Then refactors the vfio to be
> > able to handle iommufd binding. This refactor includes the mechanism of
> > blocking device access before iommufd bind, making vfio_device_open() be
> > exclusive between the group path and the cdev path. Eventually, adds the
> > cdev support for vfio device, and makes group infrastructure optional as
> > it is not needed when vfio device cdev is compiled.
> >
> > This is also a prerequisite for iommu nesting for vfio device[2].
> >
> > The complete code can be found in below branch, simple test done with the
> > legacy group path and the cdev path. Draft QEMU branch can be found at[3]
> >
> > https://github.com/yiliu1765/iommufd/tree/vfio_device_cdev_v3
> > (config CONFIG_IOMMUFD=y CONFIG_VFIO_DEVICE_CDEV=y)
> 
> Even using your branch[1], it seems like this has not been tested
> except with cdev support enabled:
> 
> /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_add’:
> /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:253:48: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
>   253 |                 ret = cdev_device_add(&device->cdev, &device->device);
>       |                                                ^~~~
>       |                                                dev
> /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_del’:
> /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:262:42: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
>   262 |                 cdev_device_del(&device->cdev, &device->device);
>       |                                          ^~~~
>       |                                          dev

Sorry about that. The cdev member is defined under
"#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)", while the code that uses it
is guarded only by "if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))", so the
compiler still sees the member access when the option is off. For
readability, I think it would be better to always define cdev in
vfio_device and keep the current uses of cdev in the code. What do you
think?

> Additionally the VFIO_ENABLE_GROUP Kconfig option doesn't make much
> sense to me, it seems entirely redundant to VFIO_GROUP.

The intention is to keep the group code build matching the existing
behavior. Currently, if VFIO is configured, the group code is compiled by
default. So VFIO_ENABLE_GROUP is a hidden option, and VFIO_GROUP is the
user-visible option. The user needs to explicitly select VFIO_GROUP if
VFIO_DEVICE_CDEV==y. If VFIO_DEVICE_CDEV==n, the group code is compiled
regardless of whether the user selected VFIO_GROUP.

Regards,
Yi Liu


^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-13 23:43     ` [Intel-gfx] " Jason Gunthorpe
@ 2023-02-14  2:02       ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-14  2:02 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: joro, alex.williamson, Tian, Kevin, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, February 14, 2023 7:44 AM
> 
> On Mon, Feb 13, 2023 at 07:13:36AM -0800, Yi Liu wrote:
> > +static struct vfio_device *vfio_device_from_file(struct file *file)
> > +{
> > +	struct vfio_device_file *df = file->private_data;
> > +
> > +	if (file->f_op != &vfio_device_fops)
> > +		return NULL;
> > +	return df->device;
> > +}
> > +
> >  /**
> >   * vfio_file_is_valid - True if the file is usable with VFIO APIS
> >   * @file: VFIO group file or VFIO device file
> >   */
> >  bool vfio_file_is_valid(struct file *file)
> >  {
> > -	return vfio_group_from_file(file);
> > +	return vfio_group_from_file(file) ||
> > +	       vfio_device_from_file(file);
> >  }
> >  EXPORT_SYMBOL_GPL(vfio_file_is_valid);
> 
> This can only succeed on a device cdev that has been fully opened.

Actually, we cannot. This is used in the kvm-vfio code to check whether
the user-provided fd is a vfio fd in the SET_KVM path. The device cdev
is not fully opened until BIND_IOMMUFD, but SET_KVM has to be invoked
before BIND_IOMMUFD because the device open needs the kvm pointer. So we
cannot apply a fully-opened restriction to this interface. Maybe an
updated function comment is needed:

"vfio_file_is_valid - True if the file is a vfio file (group or device)"

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-13 23:21     ` [Intel-gfx] " Alex Williamson
@ 2023-02-14  2:19       ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-14  2:19 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joro, jgg, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Tuesday, February 14, 2023 7:22 AM
> 
> On Mon, 13 Feb 2023 07:13:36 -0800
> Yi Liu <yi.l.liu@intel.com> wrote:
> 
> > This makes the vfio file kAPIs accept vfio device files, also a
> > preparation for vfio device cdev support.
> >
> > For the kvm set with vfio device file, kvm pointer is stored in struct
> > vfio_device_file, and use kvm_ref_lock to protect kvm set and kvm
> > pointer usage within VFIO. This kvm pointer will be set to vfio_device
> > after device file is bound to iommufd in the cdev path.
> >
> > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > ---
> >  drivers/vfio/vfio.h      |  2 ++
> >  drivers/vfio/vfio_main.c | 51 ++++++++++++++++++++++++++++++++++++----
> >  2 files changed, 49 insertions(+), 4 deletions(-)
> 
> This subtly changes the behavior of the vfio-pci hot reset functions
> without updating the uAPI description or implementation to use less
> group-centric variables.  The new behavior appears to be that cdev fds
> can also be passed to prove ownership of the affected set of devices
> for a hot reset, but this probably needs to be examined for gaps.

Yes, userspace could pass cdev fds afterward. I suppose
VFIO_DEVICE_GET_PCI_HOT_RESET_INFO will keep reporting the existing info
(group_id, segment, bus, devfn), while userspace passes device fds to the
kernel for the reset. struct vfio_pci_hot_reset and the kernel reset code
need to be updated accordingly. That is probably material for a follow-up
series. 😊

/**
 * VFIO_DEVICE_GET_PCI_HOT_RESET_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 12,
 *					      struct vfio_pci_hot_reset_info)
 *
 * Return: 0 on success, -errno on failure:
 *	-enospc = insufficient buffer, -enodev = unsupported for device.
 */
struct vfio_pci_dependent_device {
	__u32	group_id;
	__u16	segment;
	__u8	bus;
	__u8	devfn; /* Use PCI_SLOT/PCI_FUNC */
};

struct vfio_pci_hot_reset_info {
	__u32	argsz;
	__u32	flags;
	__u32	count;
	struct vfio_pci_dependent_device	devices[];
};

#define VFIO_DEVICE_GET_PCI_HOT_RESET_INFO	_IO(VFIO_TYPE, VFIO_BASE + 12)

/**
 * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
 *				    struct vfio_pci_hot_reset)
 *
 * Return: 0 on success, -errno on failure.
 */
struct vfio_pci_hot_reset {
	__u32	argsz;
	__u32	flags;
	__u32	count;
	__s32	group_fds[];
};

#define VFIO_DEVICE_PCI_HOT_RESET	_IO(VFIO_TYPE, VFIO_BASE + 13)

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-14  2:02       ` [Intel-gfx] " Liu, Yi L
@ 2023-02-14  7:19         ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-14  7:19 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: joro, alex.williamson, Tian, Kevin, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Tuesday, February 14, 2023 10:03 AM
> 
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, February 14, 2023 7:44 AM
> >
> > On Mon, Feb 13, 2023 at 07:13:36AM -0800, Yi Liu wrote:
> > > +static struct vfio_device *vfio_device_from_file(struct file *file)
> > > +{
> > > +	struct vfio_device_file *df = file->private_data;
> > > +
> > > +	if (file->f_op != &vfio_device_fops)
> > > +		return NULL;
> > > +	return df->device;
> > > +}
> > > +
> > >  /**
> > >   * vfio_file_is_valid - True if the file is usable with VFIO APIS
> > >   * @file: VFIO group file or VFIO device file
> > >   */
> > >  bool vfio_file_is_valid(struct file *file)
> > >  {
> > > -	return vfio_group_from_file(file);
> > > +	return vfio_group_from_file(file) ||
> > > +	       vfio_device_from_file(file);
> > >  }
> > >  EXPORT_SYMBOL_GPL(vfio_file_is_valid);
> >
> > This can only succeed on a device cdev that has been fully opened.
> 
> Actually, we cannot. This is used in the kvm-vfio code to see if the
> user-provided fd is a vfio fd in the SET_KVM path. And we don't
> have the device cdev fully opened until BIND_IOMMUFD. But we do
> need to invoke SET_KVM before issuing BIND_IOMMUFD as the device
> open needs the kvm pointer. So we cannot apply the fully-opened limit to
> this interface. Maybe an updated function comment is needed.
> 
> " vfio_file_is_valid - True if the file is a vfio file (group or device)"

I guess your point is that this is also called in the PCI hot reset path, and
in the reset path the device referred to by the device fd should be fully
opened. vfio_file_is_valid() only checks f_op, which is not enough to
show that the device is fully opened for a cdev fd. However, looking at the
high-level flow, for a cdev fd no device access (neither VFIO_DEVICE_PCI_HOT_RESET
nor VFIO_DEVICE_GET_PCI_HOT_RESET_INFO) is allowed until the
device is fully opened (done in bind_iommufd). So if the
VFIO_DEVICE_PCI_HOT_RESET path gets far enough to call vfio_file_is_valid(),
the device should already have been fully opened.
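
To make the argument concrete, here is a toy userspace model of the ordering, not the kernel code. All toy_* names, the access_granted flag, and the error codes are assumptions for illustration; the only parts taken from the patch are that the validity check looks at the file type alone and that device access is refused until the bind step completes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cdev device file; field names are illustrative. */
struct toy_device_file {
	bool is_vfio_file;	/* models f_op == &vfio_device_fops */
	bool access_granted;	/* models "fully opened" after bind */
};

/* Models vfio_file_is_valid(): a type check only, no open state. */
static bool toy_file_is_valid(const struct toy_device_file *df)
{
	return df != NULL && df->is_vfio_file;
}

/* Models BIND_IOMMUFD finishing the device open. */
static void toy_bind_iommufd(struct toy_device_file *df)
{
	df->access_granted = true;
}

/* Models the device ioctl entry: refused until the file is bound,
 * so the validity check is only reached on a fully opened device. */
static int toy_hot_reset_ioctl(const struct toy_device_file *df)
{
	if (!df->access_granted)
		return -EINVAL;		/* assumed error code */
	if (!toy_file_is_valid(df))
		return -ENOENT;		/* assumed error code */
	return 0;
}
```

Note that the model only covers the fd the ioctl is issued on; fds passed in as arguments are a separate question.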

Regards,
Yi Liu


^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 09/15] vfio-iommufd: Add detach_ioas support for physical VFIO devices
  2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
@ 2023-02-14  8:05     ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2023-02-14  8:05 UTC (permalink / raw)
  To: Liu, Yi L, joro, alex.williamson, jgg, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, February 13, 2023 11:14 PM
> 
> this prepares for adding DETACH ioctl for physical VFIO devices.
> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 10/15] vfio-iommufd: Add detach_ioas for emulated VFIO devices
  2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
@ 2023-02-14  8:06     ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2023-02-14  8:06 UTC (permalink / raw)
  To: Liu, Yi L, joro, alex.williamson, jgg, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, February 13, 2023 11:14 PM
> 
> this prepares for adding DETACH ioctl for emulated VFIO devices.
> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 11/15] vfio: Add cdev_device_open_cnt to vfio_group
  2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
@ 2023-02-14  8:18     ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2023-02-14  8:18 UTC (permalink / raw)
  To: Liu, Yi L, joro, alex.williamson, jgg, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, February 13, 2023 11:14 PM
> 
> for counting the devices that are opened via the cdev path. This count
> is increased and decreased by the cdev path. The group path checks it
> to achieve exclusion with the cdev path. With this, only one path (group
> path or cdev path) will claim DMA ownership. This avoids scenarios in
> which devices within the same group may be opened via different paths.

Please move vfio_device_claim/release_group() from patch 14 into
this patch to make the exclusion logic complete.

> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>  drivers/vfio/group.c | 5 +++++
>  drivers/vfio/vfio.h  | 1 +
>  2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
> index 9f3f6f0e4942..f3f5f4589cdd 100644
> --- a/drivers/vfio/group.c
> +++ b/drivers/vfio/group.c
> @@ -403,6 +403,11 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep)
>  		goto out_unlock;
>  	}
> 
> +	if (group->cdev_device_open_cnt) {
> +		ret = -EBUSY;
> +		goto out_unlock;
> +	}
> +
>  	/*
>  	 * Do we need multiple instances of the group open?  Seems not.
>  	 */
> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> index 6f063e31d08a..7a77fb12bd2c 100644
> --- a/drivers/vfio/vfio.h
> +++ b/drivers/vfio/vfio.h
> @@ -84,6 +84,7 @@ struct vfio_group {
>  	struct blocking_notifier_head	notifier;
>  	struct iommufd_ctx		*iommufd;
>  	spinlock_t			kvm_ref_lock;
> +	unsigned int			cdev_device_open_cnt;
>  };
> 
>  int vfio_device_set_group(struct vfio_device *device,
> --
> 2.34.1
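
The counter check in the quoted diff covers only the group-open side of the exclusion. A toy userspace model of both sides might look like the sketch below. The toy_* names are hypothetical, the cdev-side claim/release is an assumption about what the helpers from patch 14 do, and locking is omitted (the real code would need to hold the group lock around these checks).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of the two fields that drive the exclusion. */
struct toy_group {
	bool group_opened;			/* group fd is open */
	unsigned int cdev_device_open_cnt;	/* devices opened via cdev */
};

/* Group path: refuse to open while any device is open via cdev. */
static int toy_group_open(struct toy_group *g)
{
	if (g->group_opened)
		return -EBUSY;
	if (g->cdev_device_open_cnt)
		return -EBUSY;
	g->group_opened = true;
	return 0;
}

/* cdev path (assumed behavior): refuse while the group fd is open,
 * otherwise count this device as opened via cdev. */
static int toy_cdev_claim(struct toy_group *g)
{
	if (g->group_opened)
		return -EBUSY;
	g->cdev_device_open_cnt++;
	return 0;
}

/* cdev path: drop the count taken by toy_cdev_claim(). */
static void toy_cdev_release(struct toy_group *g)
{
	g->cdev_device_open_cnt--;
}
```

With both halves in one place, it is easy to see that whichever path gets there first locks the other out until it releases.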


^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 12/15] vfio: Make vfio_device_open() single open for device cdev path
  2023-02-13 15:13   ` Yi Liu
@ 2023-02-14  8:25     ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2023-02-14  8:25 UTC (permalink / raw)
  To: Liu, Yi L, joro, alex.williamson, jgg, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, February 13, 2023 11:14 PM
> 
> With the introduction of vfio device cdev, userspace can get device
> access by either the legacy group path or the cdev path. For VFIO devices,
> it can only be opened by one of the group path and the cdev path at one
> time. e.g. when the device is opened via cdev path, the group path should
> be failed. Both paths will call into vfio_device_open(), so the exclusion
> is done in it.

The exclusion between the two paths is handled by the last patch.

This patch should stick to explaining the single-open aspect of the cdev path.

> 
> +	/*
> +	 * Device cdev path cannot support multiple device open since
> +	 * it doesn't have a secure way for it. So a second device
> +	 * open attempt should be failed if the caller is from a cdev
> +	 * path.
> +	 */

remove the last sentence.

> +	if (device->open_count != 0 && df->is_cdev_device)
> +		return -EINVAL;
> +
>  	device->open_count++;
>  	if (device->open_count == 1) {
>  		ret = vfio_device_first_open(df, dev_id, pt_id);
> @@ -543,7 +552,12 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
>  	struct vfio_device_file *df = filep->private_data;
>  	struct vfio_device *device = df->device;
> 
> -	vfio_device_group_close(df);
> +	/*
> +	 * group path supports multiple device open, while cdev doesn't.
> +	 * So use vfio_device_group_close() for !is_cdev_device case.
> +	 */

I don't see why multi-open is the reason to call group_close(). Isn't
it straightforward to do so in the group path? I'd just remove the comment.

> +	if (!df->is_cdev_device)
> +		vfio_device_group_close(df);
> 
>  	vfio_device_put_registration(device);
> 
> --
> 2.34.1
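
The single-open rule in the quoted hunk can be modeled in a few lines of userspace C. This is a sketch only: toy_device and toy_device_open() are hypothetical names, and the first-open work done by vfio_device_first_open() is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model: only the open counter matters for this rule. */
struct toy_device {
	unsigned int open_count;
};

/* Mirrors the quoted check: an open from the cdev path fails if the
 * device is already open; the group path may open repeatedly. */
static int toy_device_open(struct toy_device *dev, bool is_cdev_device)
{
	if (dev->open_count != 0 && is_cdev_device)
		return -EINVAL;

	dev->open_count++;
	return 0;
}
```

A first cdev open succeeds and a second one gets -EINVAL, while repeated group-path opens just bump the count.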


^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 13/15] vfio: Add cdev for vfio_device
  2023-02-13 15:13   ` Yi Liu
@ 2023-02-14  8:32     ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2023-02-14  8:32 UTC (permalink / raw)
  To: Liu, Yi L, joro, alex.williamson, jgg, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, February 13, 2023 11:14 PM
> 
> +/*
> + * cdev open op. device access via the fd opened by this function
> + * is blocked until .open_device() is called successfully during
> + * BIND_IOMMUFD.
> + */

remove "cdev open op"

> +int vfio_device_fops_open(struct inode *inode, struct file *filep)

vfio_device_fops_cdev_open()

> 
> +static int vfio_device_add(struct vfio_device *device)
> +{
> +	int ret;
> +
> +	if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))
> +		ret = cdev_device_add(&device->cdev, &device->device);
> +	else
> +		ret = device_add(&device->device);
> +	return ret;
> +}
> +
> +static void vfio_device_del(struct vfio_device *device)
> +{
> +	if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))
> +		cdev_device_del(&device->cdev, &device->device);
> +	else
> +		device_del(&device->device);
> +}
> +

Move these to the header file and have CONFIG_VFIO_DEVICE_CDEV
wrap vfio_device_add/del() directly.
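
One way to read this suggestion is the usual header pattern of two static inline variants selected at compile time, instead of a runtime-looking IS_ENABLED() branch in the .c file. The sketch below models that pattern in plain userspace C. TOY_CONFIG_CDEV stands in for CONFIG_VFIO_DEVICE_CDEV, and the toy_* names and fields are hypothetical placeholders for cdev_device_add()/device_add() and their del counterparts.

```c
#include <assert.h>

/* Toy device: records which registration variant ran. */
struct toy_device {
	int added;		/* 1 while registered */
	int added_as_cdev;	/* 1 if registered through the cdev variant */
};

#ifdef TOY_CONFIG_CDEV
/* Variant built when the cdev option is enabled. */
static inline int toy_device_add(struct toy_device *d)
{
	d->added = 1;
	d->added_as_cdev = 1;	/* stands in for cdev_device_add() */
	return 0;
}

static inline void toy_device_del(struct toy_device *d)
{
	d->added = 0;
	d->added_as_cdev = 0;	/* stands in for cdev_device_del() */
}
#else
/* Variant built when the cdev option is disabled. */
static inline int toy_device_add(struct toy_device *d)
{
	d->added = 1;		/* stands in for device_add() */
	return 0;
}

static inline void toy_device_del(struct toy_device *d)
{
	d->added = 0;		/* stands in for device_del() */
}
#endif
```

Callers use toy_device_add()/toy_device_del() unconditionally; building with -DTOY_CONFIG_CDEV switches the variant without touching any call site.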

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 13/15] vfio: Add cdev for vfio_device
  2023-02-14  8:32     ` [Intel-gfx] " Tian, Kevin
@ 2023-02-14  8:35       ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-14  8:35 UTC (permalink / raw)
  To: Tian, Kevin, joro, alex.williamson, jgg, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Tian, Kevin <kevin.tian@intel.com>
> Sent: Tuesday, February 14, 2023 4:32 PM
> 
> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Monday, February 13, 2023 11:14 PM
> >
> > +/*
> > + * cdev open op. device access via the fd opened by this function
> > + * is blocked until .open_device() is called successfully during
> > + * BIND_IOMMUFD.
> > + */
> 
> remove "cdev open op"
> 
> > +int vfio_device_fops_open(struct inode *inode, struct file *filep)
> 
> vfio_device_fops_cdev_open()
> 
> >
> > +static int vfio_device_add(struct vfio_device *device)
> > +{
> > +	int ret;
> > +
> > +	if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))
> > +		ret = cdev_device_add(&device->cdev, &device->device);
> > +	else
> > +		ret = device_add(&device->device);
> > +	return ret;
> > +}
> > +
> > +static void vfio_device_del(struct vfio_device *device)
> > +{
> > +	if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))
> > +		cdev_device_del(&device->cdev, &device->device);
> > +	else
> > +		device_del(&device->device);
> > +}
> > +
> 
> move to header file and have CONFIG_VFIO_DEVICE_CDEV
> wrapping vfio_device_add/del() directly.

Ok. 

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-13 15:13   ` Yi Liu
@ 2023-02-14  8:53     ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2023-02-14  8:53 UTC (permalink / raw)
  To: Liu, Yi L, joro, alex.williamson, jgg, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, February 13, 2023 11:14 PM
> 
> This adds three vfio device ioctls for userspace using iommufd to set up
> secure DMA context for device access.
> 
>     VFIO_DEVICE_BIND_IOMMUFD: bind device to an iommufd, hence gain DMA
> 			      control provided by the iommufd. open_device
> 			      op is called after bind_iommufd op.
> 			      VFIO no iommu mode is indicated by passing
> 			      a negative iommufd value.
>     VFIO_DEVICE_ATTACH_IOMMUFD_PT: attach device to IOAS, hw_pagetable
> 				   managed by iommufd. Attach can be
> 				   undone by VFIO_DEVICE_DETACH_IOMMUFD_PT
> 				   or device fd close.
>     VFIO_DEVICE_DETACH_IOMMUFD_PT: detach device from the current attached
> 				   IOAS or hw_pagetable managed by iommufd.

let's split into two patches: bind and attach/detach.

> 
> +int vfio_device_claim_group(struct vfio_device *device)
> +void vfio_device_release_group(struct vfio_device *device)

vfio_device_block_group()
vfio_device_unblock_group()

> 
> +	/*
> +	 * For group/container path, iommufd pointer is NULL when comes
> +	 * into this helper. Its noiommu support is in container.c.

"Its noiommu support is handled by vfio_device_group_use_iommu()"

> +	 *
> +	 * For iommufd compat mode, iommufd pointer here is a valid value.
> +	 * Its noiommu support is in vfio_iommufd_bind().
> +	 *
> +	 * For device cdev path, iommufd pointer here is a valid value for
> +	 * normal cases, but it is NULL if it's noiommu. To differentiate
> +	 * the noiommu from the group/container path which also passes NULL
> +	 * iommufd pointer in, check df->noiommu which is set only in the
> +	 * cdev path.

"Check df->noiommu to differentiate cdev noiommu from the group/
container path which also passes NULL iommufd pointer in. If set
then do nothing."


^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-13 23:21     ` [Intel-gfx] " Jason Gunthorpe
@ 2023-02-14 15:15       ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-14 15:15 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: joro, Tian, Kevin, robin.murphy, cohuck, eric.auger, nicolinc,
	kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, February 14, 2023 7:22 AM
> 
> On Mon, Feb 13, 2023 at 12:47:19PM -0700, Alex Williamson wrote:
> 
> > I think it's too late for v6.3 already, but given this needs at least
> > one more spin, let's set expectations of this being v6.4 material.  Thanks,
> 
> Please let's continue to try to get this finished during the merge
> window, all the other series depend on it. We can manage it with a
> shared branch again..
> 

Sure. I've updated the branch below to address Kevin's latest remarks, and
fixed the compile failure reported by Alex as well.

https://github.com/yiliu1765/iommufd/commits/vfio_device_cdev_v3

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [Intel-gfx] [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-14  1:55     ` [Intel-gfx] " Liu, Yi L
@ 2023-02-14 15:47       ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-14 15:47 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: linux-s390, yi.y.sun, mjrosato, kvm, intel-gvt-dev, joro, cohuck,
	peterx, suravee.suthikulpanit, eric.auger, nicolinc,
	shameerali.kolothum.thodi, jgg, intel-gfx, chao.p.peng, lulu,
	robin.murphy, jasowang

On Tue, 14 Feb 2023 01:55:17 +0000
"Liu, Yi L" <yi.l.liu@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Tuesday, February 14, 2023 3:47 AM
> > 
> > On Mon, 13 Feb 2023 07:13:33 -0800
> > Yi Liu <yi.l.liu@intel.com> wrote:
> >   
> > > Existing VFIO provides group-centric user APIs for userspace. Userspace
> > > opens the /dev/vfio/$group_id first before getting device fd and hence
> > > getting access to device. This is not the desired model for iommufd. Per
> > > > the conclusion of community discussion[1], iommufd provides device-centric
> > > kAPIs and requires its consumer (like VFIO) to be device-centric user
> > > APIs. Such user APIs are used to associate device with iommufd and also
> > > the I/O address spaces managed by the iommufd.
> > >
> > > This series first introduces a per device file structure to be prepared
> > > for further enhancement and refactors the kvm-vfio code to be prepared
> > > for accepting device file from userspace. Then refactors the vfio to be
> > > able to handle iommufd binding. This refactor includes the mechanism of
> > > > blocking device access before iommufd bind, making vfio_device_open() be
> > > exclusive between the group path and the cdev path. Eventually, adds the
> > > cdev support for vfio device, and makes group infrastructure optional as
> > > it is not needed when vfio device cdev is compiled.
> > >
> > > This is also a prerequisite for iommu nesting for vfio device[2].
> > >
> > > > The complete code can be found in below branch, simple test done with the
> > > > legacy group path and the cdev path. Draft QEMU branch can be found at[3]
> > >
> > > https://github.com/yiliu1765/iommufd/tree/vfio_device_cdev_v3
> > > (config CONFIG_IOMMUFD=y CONFIG_VFIO_DEVICE_CDEV=y)  
> > 
> > Even using your branch[1], it seems like this has not been tested
> > except with cdev support enabled:
> > 
> > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function
> > ‘vfio_device_add’:
> > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:253:48: error: ‘struct
> > vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> >   253 |                 ret = cdev_device_add(&device->cdev, &device->device);
> >       |                                                ^~~~
> >       |                                                dev
> > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function
> > ‘vfio_device_del’:
> > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:262:42: error: ‘struct
> > vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> >   262 |                 cdev_device_del(&device->cdev, &device->device);
> >       |                                          ^~~~
> >       |                                          dev  
> 
> Sorry about that. The cdev member is defined under
> "#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)", while the code uses
> "if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))".  I think for
> readability, it would be better to always define cdev in vfio_device
> and keep the uses of cdev in the code as they are. What do you think?

It seems necessary unless we want to litter the code with #ifdefs.
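
To illustrate why, here is a minimal standalone sketch of the pattern.
It is not kernel code: DEMO_CDEV_ENABLED stands in for
IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV), and struct demo_cdev,
struct demo_device and demo_device_add() are hypothetical names. A plain
C "if" on a constant-false condition is still parsed and type-checked,
so the member it touches has to exist in the struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV); 0 = disabled. */
#define DEMO_CDEV_ENABLED 0

/* Stand-in for struct cdev. */
struct demo_cdev {
	int registered;
};

/* The member is declared unconditionally, with no #if around it. */
struct demo_device {
	int id;
	struct demo_cdev cdev;
};

/* Shaped like the failing function: the disabled branch is still compiled. */
static int demo_device_add(struct demo_device *device)
{
	assert(device != NULL);
	if (DEMO_CDEV_ENABLED) {
		/* Type-checked even though the condition is constant false. */
		device->cdev.registered = 1;
		return 1;
	}
	return 0;
}
```

Wrapping the cdev member in a preprocessor conditional while leaving
demo_device_add() unchanged would produce the same kind of "has no
member named" error as in the build log above.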

> > Additionally the VFIO_ENABLE_GROUP Kconfig option doesn't make much
> > sense to me, it seems entirely redundant to VFIO_GROUP.  
> 
> The intention is to make the group code compilation match the existing case.
> Currently, if VFIO is configured, the group code is compiled by default.
> So VFIO_ENABLE_GROUP is a hidden option, and VFIO_GROUP is an option
> for the user.  The user needs to explicitly configure VFIO_GROUP if
> VFIO_DEVICE_CDEV==y.  If VFIO_DEVICE_CDEV==n, then regardless of whether
> the user configured VFIO_GROUP, the group code shall be compiled.

I understand the mechanics, I still find VFIO_ENABLE_GROUP redundant
and unnecessary.  Also, Kconfig should not allow a configuration
without either VFIO_GROUP or VFIO_DEVICE_CDEV as this is not
functional.  Deselecting VFIO_GROUP should select VFIO_DEVICE_CDEV, but
VFIO_DEVICE_CDEV should be an optional addition to VFIO_GROUP.  Thanks,

Alex
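
The constraint requested above can be modelled as a small truth
function. This is a sketch in C rather than Kconfig syntax, under the
assumption that the two booleans stand for VFIO_GROUP and
VFIO_DEVICE_CDEV; struct demo_vfio_config and demo_resolve() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the resulting build configuration. */
struct demo_vfio_config {
	bool group;  /* models VFIO_GROUP */
	bool cdev;   /* models VFIO_DEVICE_CDEV */
};

/*
 * Resolve the user's choices under the suggested rule: deselecting the
 * group interface forces the cdev interface on, while cdev stays
 * optional when the group interface is kept.
 */
static struct demo_vfio_config demo_resolve(bool want_group, bool want_cdev)
{
	struct demo_vfio_config cfg = {
		.group = want_group,
		.cdev = want_cdev || !want_group,
	};

	/* At least one interface must always be built. */
	assert(cfg.group || cfg.cdev);
	return cfg;
}
```

Under this model the non-functional combination (neither interface
built) is unreachable, and no separate hidden option is needed to
express it.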



* Re: [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-14 15:15       ` [Intel-gfx] " Liu, Yi L
@ 2023-02-14 15:54         ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-14 15:54 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: Jason Gunthorpe, joro, Tian, Kevin, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

On Tue, 14 Feb 2023 15:15:28 +0000
"Liu, Yi L" <yi.l.liu@intel.com> wrote:

> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, February 14, 2023 7:22 AM
> > 
> > On Mon, Feb 13, 2023 at 12:47:19PM -0700, Alex Williamson wrote:
> >   
> > > I think it's too late for v6.3 already, but given this needs at least
> > > one more spin, let's set expectations of this being v6.4 material.  Thanks,  
> > 
> > Please let's continue to try to get this finished during the merge
> > window, all the other series depend on it. We can manage it with a
> > shared branch again..
> >   
> 
> Sure. I've updated the below branch to address Kevin's latest remarks.
> Fixed the compiling failure reported by Alex as well.
> 
> https://github.com/yiliu1765/iommufd/commits/vfio_device_cdev_v3


Sorry, I think this is an abuse of the merge window.  We have a new uAPI
proposal that's barely a week old and has no reviews or acks other than
Yi's, we have build configuration issues which suggests a lack of
testing, we're finding subtle implications to other existing uAPIs, and
nobody seems to have finished an upstream review, including me.

Code for the merge window should already be in maintainer trees by now,
the merge window should be for integration.  There are a lot of things
in flight and many more review comments coming in than we should see
for this to be a v6.3 candidate.  Thanks,

Alex



* Re: [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-14 15:54         ` [Intel-gfx] " Alex Williamson
@ 2023-02-14 16:48           ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-14 16:48 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Liu, Yi L, joro, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

On Tue, Feb 14, 2023 at 08:54:19AM -0700, Alex Williamson wrote:
> On Tue, 14 Feb 2023 15:15:28 +0000
> "Liu, Yi L" <yi.l.liu@intel.com> wrote:
> 
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Tuesday, February 14, 2023 7:22 AM
> > > 
> > > On Mon, Feb 13, 2023 at 12:47:19PM -0700, Alex Williamson wrote:
> > >   
> > > > I think it's too late for v6.3 already, but given this needs at least
> > > > one more spin, let's set expectations of this being v6.4 material.  Thanks,  
> > > 
> > > Please let's continue to try to get this finished during the merge
> > > window, all the other series depend on it. We can manage it with a
> > > shared branch again..
> > >   
> > 
> > Sure. I've updated the below branch to address Kevin's latest remarks.
> > Fixed the compiling failure reported by Alex as well.
> > 
> > https://github.com/yiliu1765/iommufd/commits/vfio_device_cdev_v3
> 
> 
> Sorry, I think this is an abuse of the merge window.  We have a new uAPI
> proposal that's barely a week old and has no reviews or acks other than
> Yi's, we have build configuration issues which suggests a lack of
> testing, we're finding subtle implications to other existing uAPIs, and
> nobody seems to have finished an upstream review, including me.
> 
> Code for the merge window should already be in maintainer trees by now,
> the merge window should be for integration.  There are a lot of things
> in flight and many more review comments coming in than we should see
> for this to be a v6.3 candidate.  Thanks,

Sorry, I meant that we continue to review and try to get this ready for
rc1, not abuse the merge window.

Obviously this cycle is lost.

Jason


* Re: [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
@ 2023-02-14 22:26     ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-14 22:26 UTC (permalink / raw)
  To: Yi Liu
  Cc: joro, jgg, kevin.tian, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

On Mon, 13 Feb 2023 07:13:38 -0800
Yi Liu <yi.l.liu@intel.com> wrote:

> This defines KVM_DEV_VFIO_FILE* and makes KVM_DEV_VFIO_GROUP* aliases of
> them. Old userspace that uses KVM_DEV_VFIO_GROUP* continues to work.
> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>  Documentation/virt/kvm/devices/vfio.rst | 45 ++++++++++++++++---------
>  include/uapi/linux/kvm.h                | 16 ++++++---
>  virt/kvm/vfio.c                         | 16 ++++-----
>  3 files changed, 50 insertions(+), 27 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vfio.rst b/Documentation/virt/kvm/devices/vfio.rst
> index 2d20dc561069..90f22933dcfa 100644
> --- a/Documentation/virt/kvm/devices/vfio.rst
> +++ b/Documentation/virt/kvm/devices/vfio.rst
> @@ -9,24 +9,37 @@ Device types supported:
>    - KVM_DEV_TYPE_VFIO
>  
>  Only one VFIO instance may be created per VM.  The created device
> -tracks VFIO groups in use by the VM and features of those groups
> -important to the correctness and acceleration of the VM.  As groups
> -are enabled and disabled for use by the VM, KVM should be updated
> -about their presence.  When registered with KVM, a reference to the
> -VFIO-group is held by KVM.
> +tracks VFIO files (group or device) in use by the VM and features
> +of those groups/devices important to the correctness and acceleration
> +of the VM.  As groups/devices are enabled and disabled for use by the
> +VM, KVM should be updated about their presence.  When registered with
> +KVM, a reference to the VFIO file is held by KVM.
>  
>  Groups:
> -  KVM_DEV_VFIO_GROUP
> -
> -KVM_DEV_VFIO_GROUP attributes:
> -  KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
> -	kvm_device_attr.addr points to an int32_t file descriptor
> -	for the VFIO group.
> -  KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking
> -	kvm_device_attr.addr points to an int32_t file descriptor
> -	for the VFIO group.
> -  KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table
> +  KVM_DEV_VFIO_FILE
> +	alias: KVM_DEV_VFIO_GROUP
> +
> +KVM_DEV_VFIO_FILE attributes:
> +  KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device
> +	tracking
> +
> +	alias: KVM_DEV_VFIO_GROUP_ADD
> +
> +	kvm_device_attr.addr points to an int32_t file descriptor for the
> +	VFIO file.
> +  KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM
> +	device tracking
> +
> +	alias: KVM_DEV_VFIO_GROUP_DEL
> +
> +	kvm_device_attr.addr points to an int32_t file descriptor for the
> +	VFIO file.
> +
> +  KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table
>  	allocated by sPAPR KVM.
> +
> +	alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE
> +
>  	kvm_device_attr.addr points to a struct::
>  
>  		struct kvm_vfio_spapr_tce {
> @@ -39,3 +52,5 @@ KVM_DEV_VFIO_GROUP attributes:
>  	- @groupfd is a file descriptor for a VFIO group;
>  	- @tablefd is a file descriptor for a TCE table allocated via
>  	  KVM_CREATE_SPAPR_TCE.
> +
> +	only accepts vfio group file
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 55155e262646..484a8133bc69 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1401,10 +1401,18 @@ struct kvm_device_attr {
>  	__u64	addr;		/* userspace address of attr data */
>  };
>  
> -#define  KVM_DEV_VFIO_GROUP			1
> -#define   KVM_DEV_VFIO_GROUP_ADD			1
> -#define   KVM_DEV_VFIO_GROUP_DEL			2
> -#define   KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE		3
> +#define  KVM_DEV_VFIO_FILE	1
> +
> +#define   KVM_DEV_VFIO_FILE_ADD			1
> +#define   KVM_DEV_VFIO_FILE_DEL			2
> +#define   KVM_DEV_VFIO_FILE_SET_SPAPR_TCE	3
> +
> +/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */
> +#define  KVM_DEV_VFIO_GROUP	KVM_DEV_VFIO_FILE
> +
> +#define   KVM_DEV_VFIO_GROUP_ADD	KVM_DEV_VFIO_FILE_ADD
> +#define   KVM_DEV_VFIO_GROUP_DEL	KVM_DEV_VFIO_FILE_DEL
> +#define   KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE	KVM_DEV_VFIO_FILE_SET_SPAPR_TCE
>  
>  enum kvm_device_type {
>  	KVM_DEV_TYPE_FSL_MPIC_20	= 1,
> diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
> index 857d6ba349e1..d869913baafd 100644
> --- a/virt/kvm/vfio.c
> +++ b/virt/kvm/vfio.c
> @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr,
>  	int32_t fd;
>  
>  	switch (attr) {
> -	case KVM_DEV_VFIO_GROUP_ADD:
> +	case KVM_DEV_VFIO_FILE_ADD:
>  		if (get_user(fd, argp))
>  			return -EFAULT;
>  		return kvm_vfio_file_add(dev, fd);
>  
> -	case KVM_DEV_VFIO_GROUP_DEL:
> +	case KVM_DEV_VFIO_FILE_DEL:
>  		if (get_user(fd, argp))
>  			return -EFAULT;
>  		return kvm_vfio_file_del(dev, fd);
>  
>  #ifdef CONFIG_SPAPR_TCE_IOMMU
> -	case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
> +	case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE:
>  		return kvm_vfio_file_set_spapr_tce(dev, arg);

I don't see that the SPAPR code is so easily fungible to a device file
descriptor.  The kvm_vfio_spapr_tce data structure includes a groupfd,
which is required to match a groupfd on the file_list.  So a SPAPR user
cannot pass a cdev via FILE_ADD if they intend to use this TCE code.

Maybe SPAPR code is therefore tied to the group interface since there's
nobody around advancing this code any longer.

That also makes me wonder what we're really gaining in forcing this
generalization from group to file.  There's no userspace that's
suddenly going to find itself with a device cdev to require an ABI
compatible interface.  So we allow either groups or cdevs, but then we
potentially end up with a file_list of both groups and device cdevs.
Given the SPAPR issue above, is there some advantage to creating a
parallel FILE interface alongside the GROUP interface, with separate
file lists for each?  At least that would allow SPAPR to expose only a
GROUP interface via the has_attr interface below.  Maybe there's some
utility in general for being able to probe device cdev support here?
Thanks,

Alex
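
As a hedged illustration of what the aliasing in the quoted uapi hunk
implies, the sketch below uses local DEMO_ copies of the constants
rather than including the real header, and demo_set_file() is a
hypothetical dispatcher shaped like kvm_vfio_set_file(). Its return
codes are invented for the example.

```c
#include <assert.h>

/* Values mirror the quoted hunk; the DEMO_ prefix marks them as local copies. */
#define DEMO_KVM_DEV_VFIO_FILE_ADD	1
#define DEMO_KVM_DEV_VFIO_FILE_DEL	2

/* Aliases kept so that source written against the old names still builds. */
#define DEMO_KVM_DEV_VFIO_GROUP_ADD	DEMO_KVM_DEV_VFIO_FILE_ADD
#define DEMO_KVM_DEV_VFIO_GROUP_DEL	DEMO_KVM_DEV_VFIO_FILE_DEL

/* The old and new names are the same number at compile time. */
static_assert(DEMO_KVM_DEV_VFIO_GROUP_ADD == DEMO_KVM_DEV_VFIO_FILE_ADD,
	      "alias must match the new name");

/* Hypothetical dispatcher: returns 1 for add, 2 for del, -1 for unknown. */
static int demo_set_file(long attr)
{
	switch (attr) {
	case DEMO_KVM_DEV_VFIO_FILE_ADD:
		return 1;
	case DEMO_KVM_DEV_VFIO_FILE_DEL:
		return 2;
	}
	return -1;
}
```

Because both spellings reach the same case label, a dispatcher like
this cannot tell which name userspace was built against. That seems
consistent with the point above that only a parallel interface with its
own attribute values would let has_attr advertise GROUP support without
FILE support.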

>  #endif
>  	}
> @@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev,
>  			     struct kvm_device_attr *attr)
>  {
>  	switch (attr->group) {
> -	case KVM_DEV_VFIO_GROUP:
> +	case KVM_DEV_VFIO_FILE:
>  		return kvm_vfio_set_file(dev, attr->attr,
>  					 u64_to_user_ptr(attr->addr));
>  	}
> @@ -321,12 +321,12 @@ static int kvm_vfio_has_attr(struct kvm_device *dev,
>  			     struct kvm_device_attr *attr)
>  {
>  	switch (attr->group) {
> -	case KVM_DEV_VFIO_GROUP:
> +	case KVM_DEV_VFIO_FILE:
>  		switch (attr->attr) {
> -		case KVM_DEV_VFIO_GROUP_ADD:
> -		case KVM_DEV_VFIO_GROUP_DEL:
> +		case KVM_DEV_VFIO_FILE_ADD:
> +		case KVM_DEV_VFIO_FILE_DEL:
>  #ifdef CONFIG_SPAPR_TCE_IOMMU
> -		case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
> +		case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE:
>  #endif
>  			return 0;
>  		}



* Re: [Intel-gfx] [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
@ 2023-02-14 22:26     ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-14 22:26 UTC (permalink / raw)
  To: Yi Liu
  Cc: linux-s390, yi.y.sun, mjrosato, kvm, intel-gvt-dev, joro, cohuck,
	peterx, suravee.suthikulpanit, eric.auger, nicolinc,
	shameerali.kolothum.thodi, jgg, intel-gfx, chao.p.peng, lulu,
	robin.murphy, jasowang

On Mon, 13 Feb 2023 07:13:38 -0800
Yi Liu <yi.l.liu@intel.com> wrote:

> This defines KVM_DEV_VFIO_FILE* and make alias with KVM_DEV_VFIO_GROUP*.
> Old userspace uses KVM_DEV_VFIO_GROUP* works as well.
> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>  Documentation/virt/kvm/devices/vfio.rst | 45 ++++++++++++++++---------
>  include/uapi/linux/kvm.h                | 16 ++++++---
>  virt/kvm/vfio.c                         | 16 ++++-----
>  3 files changed, 50 insertions(+), 27 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vfio.rst b/Documentation/virt/kvm/devices/vfio.rst
> index 2d20dc561069..90f22933dcfa 100644
> --- a/Documentation/virt/kvm/devices/vfio.rst
> +++ b/Documentation/virt/kvm/devices/vfio.rst
> @@ -9,24 +9,37 @@ Device types supported:
>    - KVM_DEV_TYPE_VFIO
>  
>  Only one VFIO instance may be created per VM.  The created device
> -tracks VFIO groups in use by the VM and features of those groups
> -important to the correctness and acceleration of the VM.  As groups
> -are enabled and disabled for use by the VM, KVM should be updated
> -about their presence.  When registered with KVM, a reference to the
> -VFIO-group is held by KVM.
> +tracks VFIO files (group or device) in use by the VM and features
> +of those groups/devices important to the correctness and acceleration
> +of the VM.  As groups/devices are enabled and disabled for use by the
> +VM, KVM should be updated about their presence.  When registered with
> +KVM, a reference to the VFIO file is held by KVM.
>  
>  Groups:
> -  KVM_DEV_VFIO_GROUP
> -
> -KVM_DEV_VFIO_GROUP attributes:
> -  KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
> -	kvm_device_attr.addr points to an int32_t file descriptor
> -	for the VFIO group.
> -  KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking
> -	kvm_device_attr.addr points to an int32_t file descriptor
> -	for the VFIO group.
> -  KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table
> +  KVM_DEV_VFIO_FILE
> +	alias: KVM_DEV_VFIO_GROUP
> +
> +KVM_DEV_VFIO_FILE attributes:
> +  KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device
> +	tracking
> +
> +	alias: KVM_DEV_VFIO_GROUP_ADD
> +
> +	kvm_device_attr.addr points to an int32_t file descriptor for the
> +	VFIO file.
> +  KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM
> +	device tracking
> +
> +	alias: KVM_DEV_VFIO_GROUP_DEL
> +
> +	kvm_device_attr.addr points to an int32_t file descriptor for the
> +	VFIO file.
> +
> +  KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table
>  	allocated by sPAPR KVM.
> +
> +	alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE
> +
>  	kvm_device_attr.addr points to a struct::
>  
>  		struct kvm_vfio_spapr_tce {
> @@ -39,3 +52,5 @@ KVM_DEV_VFIO_GROUP attributes:
>  	- @groupfd is a file descriptor for a VFIO group;
>  	- @tablefd is a file descriptor for a TCE table allocated via
>  	  KVM_CREATE_SPAPR_TCE.
> +
> +	only accepts vfio group file
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 55155e262646..484a8133bc69 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1401,10 +1401,18 @@ struct kvm_device_attr {
>  	__u64	addr;		/* userspace address of attr data */
>  };
>  
> -#define  KVM_DEV_VFIO_GROUP			1
> -#define   KVM_DEV_VFIO_GROUP_ADD			1
> -#define   KVM_DEV_VFIO_GROUP_DEL			2
> -#define   KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE		3
> +#define  KVM_DEV_VFIO_FILE	1
> +
> +#define   KVM_DEV_VFIO_FILE_ADD			1
> +#define   KVM_DEV_VFIO_FILE_DEL			2
> +#define   KVM_DEV_VFIO_FILE_SET_SPAPR_TCE	3
> +
> +/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */
> +#define  KVM_DEV_VFIO_GROUP	KVM_DEV_VFIO_FILE
> +
> +#define   KVM_DEV_VFIO_GROUP_ADD	KVM_DEV_VFIO_FILE_ADD
> +#define   KVM_DEV_VFIO_GROUP_DEL	KVM_DEV_VFIO_FILE_DEL
> +#define   KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE	KVM_DEV_VFIO_FILE_SET_SPAPR_TCE
>  
>  enum kvm_device_type {
>  	KVM_DEV_TYPE_FSL_MPIC_20	= 1,
> diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
> index 857d6ba349e1..d869913baafd 100644
> --- a/virt/kvm/vfio.c
> +++ b/virt/kvm/vfio.c
> @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr,
>  	int32_t fd;
>  
>  	switch (attr) {
> -	case KVM_DEV_VFIO_GROUP_ADD:
> +	case KVM_DEV_VFIO_FILE_ADD:
>  		if (get_user(fd, argp))
>  			return -EFAULT;
>  		return kvm_vfio_file_add(dev, fd);
>  
> -	case KVM_DEV_VFIO_GROUP_DEL:
> +	case KVM_DEV_VFIO_FILE_DEL:
>  		if (get_user(fd, argp))
>  			return -EFAULT;
>  		return kvm_vfio_file_del(dev, fd);
>  
>  #ifdef CONFIG_SPAPR_TCE_IOMMU
> -	case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
> +	case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE:
>  		return kvm_vfio_file_set_spapr_tce(dev, arg);

I don't see that the SPAPR code is so easily fungible to a device file
descriptor.  The kvm_vfio_spapr_tce data structure includes a groupfd,
which is required to match a groupfd on the file_list.  So a SPAPR user
cannot pass a cdev via FILE_ADD if they intend to use this TCE code.

Maybe SPAPR code is therefore tied to the group interface since there's
nobody around advancing this code any longer.

That also makes me wonder what we're really gaining in forcing this
generalization from group to file.  There's no userspace that's
suddenly going to find itself with a device cdev to require an ABI
compatible interface.  So we allow either groups or cdevs, but then we
potentially end up with a file_list of both groups and device cdevs.
Given the SPAPR issue above, is there some advantage to creating a
parallel FILE interface alongside the GROUP interface, with separate
file lists for each?  At least that would allow SPAPR to expose only a
GROUP interface via the has_attr interface below.  Maybe there's some
utility in general for being able to probe device cdev support here?
Thanks,

Alex
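
For reference, probing through the has_attr interface from userspace
could look roughly like the sketch below.  It assumes a KVM VFIO device
fd obtained via KVM_CREATE_DEVICE with KVM_DEV_TYPE_VFIO; the helper
name probe_kvm_vfio_attr() is hypothetical and not part of this series.
It uses the existing KVM_DEV_VFIO_GROUP define, which this patch keeps
as an alias of KVM_DEV_VFIO_FILE.

```c
#include <linux/kvm.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: ask the KVM VFIO device whether it implements
 * the given attribute.  Any ioctl failure is reported as "no".
 */
bool probe_kvm_vfio_attr(int vfio_dev_fd, uint64_t attr)
{
	struct kvm_device_attr probe = {
		.group = KVM_DEV_VFIO_GROUP, /* aliased to ..._FILE by this patch */
		.attr = attr,
	};

	return ioctl(vfio_dev_fd, KVM_HAS_DEVICE_ATTR, &probe) == 0;
}
```

A caller would pass KVM_DEV_VFIO_GROUP_ADD or
KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE as attr and fall back when the probe
reports false.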

>  #endif
>  	}
> @@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev,
>  			     struct kvm_device_attr *attr)
>  {
>  	switch (attr->group) {
> -	case KVM_DEV_VFIO_GROUP:
> +	case KVM_DEV_VFIO_FILE:
>  		return kvm_vfio_set_file(dev, attr->attr,
>  					 u64_to_user_ptr(attr->addr));
>  	}
> @@ -321,12 +321,12 @@ static int kvm_vfio_has_attr(struct kvm_device *dev,
>  			     struct kvm_device_attr *attr)
>  {
>  	switch (attr->group) {
> -	case KVM_DEV_VFIO_GROUP:
> +	case KVM_DEV_VFIO_FILE:
>  		switch (attr->attr) {
> -		case KVM_DEV_VFIO_GROUP_ADD:
> -		case KVM_DEV_VFIO_GROUP_DEL:
> +		case KVM_DEV_VFIO_FILE_ADD:
> +		case KVM_DEV_VFIO_FILE_DEL:
>  #ifdef CONFIG_SPAPR_TCE_IOMMU
> -		case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
> +		case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE:
>  #endif
>  			return 0;
>  		}


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [Intel-gfx] [PATCH v3 07/15] vfio: Block device access via device fd until device is opened
  2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
@ 2023-02-14 22:46     ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-14 22:46 UTC (permalink / raw)
  To: Yi Liu
  Cc: linux-s390, yi.y.sun, mjrosato, kvm, intel-gvt-dev, joro, cohuck,
	peterx, suravee.suthikulpanit, eric.auger, nicolinc,
	shameerali.kolothum.thodi, jgg, intel-gfx, chao.p.peng, lulu,
	robin.murphy, jasowang

On Mon, 13 Feb 2023 07:13:40 -0800
Yi Liu <yi.l.liu@intel.com> wrote:

> Allow the vfio_device file to be in a state where the device FD is
> opened but the device cannot be used by userspace (i.e. its .open_device()
> hasn't been called). This inbetween state is not used when the device
> FD is spawned from the group FD, however when we create the device FD
> directly by opening a cdev it will be opened in the blocked state.
> 
> The reason for the inbetween state is that userspace only gets a FD but
> doesn't gain access permission until binding the FD to an iommufd. So in
> the blocked state, only the bind operation is allowed. Completing bind
> will allow user to further access the device.
> 
> This is implemented by adding a flag in struct vfio_device_file to mark
> the blocked state and using a simple smp_load_acquire() to obtain the
> flag value and serialize all the device setup with the thread accessing
> this device.
> 
> Following this lockless scheme, it can safely handle the device FD
> unbound->bound but it cannot handle bound->unbound. To allow this we'd
> need to add a lock on all the vfio ioctls which seems costly. So once
> device FD is bound, it remains bound until the FD is closed.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
>  drivers/vfio/vfio.h      |  1 +
>  drivers/vfio/vfio_main.c | 34 +++++++++++++++++++++++++++++++++-
>  2 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> index 11e56fe079a1..d56cdb114024 100644
> --- a/drivers/vfio/vfio.h
> +++ b/drivers/vfio/vfio.h
> @@ -18,6 +18,7 @@ struct vfio_container;
>  
>  struct vfio_device_file {
>  	struct vfio_device *device;
> +	bool access_granted;
>  	spinlock_t kvm_ref_lock; /* protect kvm field */
>  	struct kvm *kvm;
>  	struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index c517252aba19..2267057240bd 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -476,7 +476,15 @@ int vfio_device_open(struct vfio_device_file *df)
>  			device->open_count--;
>  	}
>  
> -	return ret;
> +	if (ret)
> +		return ret;
> +
> +	/*
> +	 * Paired with smp_load_acquire() in vfio_device_fops::ioctl/
> +	 * read/write/mmap
> +	 */
> +	smp_store_release(&df->access_granted, true);
> +	return 0;
>  }
>  
>  void vfio_device_close(struct vfio_device_file *df)
> @@ -1104,8 +1112,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
>  {
>  	struct vfio_device_file *df = filep->private_data;
>  	struct vfio_device *device = df->device;
> +	bool access;
>  	int ret;
>  
> +	/* Paired with smp_store_release() in vfio_device_open() */
> +	access = smp_load_acquire(&df->access_granted);
> +	if (!access)
> +		return -EINVAL;
> +

Nit,

	if (!smp_load_acquire(&df->access_granted))
		...

Thanks,
Alex
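
The gate discussed above can be modelled in userspace as follows.  This
is only a sketch: it assumes C11 <stdatomic.h> acquire/release
operations as stand-ins for the kernel's smp_store_release() and
smp_load_acquire(), and the model_* names are hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vfio_device_file. */
struct model_device_file {
	int setup_state;            /* plain data written before publication */
	atomic_bool access_granted; /* false while the FD is still blocked */
};

/* Open path: finish the setup, then publish it with release semantics. */
int model_device_open(struct model_device_file *df)
{
	df->setup_state = 1;
	atomic_store_explicit(&df->access_granted, true, memory_order_release);
	return 0;
}

/* ioctl/read/write/mmap path: one acquire load, no temporary variable. */
long model_device_ioctl(struct model_device_file *df)
{
	if (!atomic_load_explicit(&df->access_granted, memory_order_acquire))
		return -EINVAL;

	/* The acquire load orders this read after the open-side setup. */
	return df->setup_state;
}
```

Folding the load into the condition keeps each fops entry point to a
two-line check and drops the local bool.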

>  	ret = vfio_device_pm_runtime_get(device);
>  	if (ret)
>  		return ret;
> @@ -1132,6 +1146,12 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf,
>  {
>  	struct vfio_device_file *df = filep->private_data;
>  	struct vfio_device *device = df->device;
> +	bool access;
> +
> +	/* Paired with smp_store_release() in vfio_device_open() */
> +	access = smp_load_acquire(&df->access_granted);
> +	if (!access)
> +		return -EINVAL;
>  
>  	if (unlikely(!device->ops->read))
>  		return -EINVAL;
> @@ -1145,6 +1165,12 @@ static ssize_t vfio_device_fops_write(struct file *filep,
>  {
>  	struct vfio_device_file *df = filep->private_data;
>  	struct vfio_device *device = df->device;
> +	bool access;
> +
> +	/* Paired with smp_store_release() in vfio_device_open() */
> +	access = smp_load_acquire(&df->access_granted);
> +	if (!access)
> +		return -EINVAL;
>  
>  	if (unlikely(!device->ops->write))
>  		return -EINVAL;
> @@ -1156,6 +1182,12 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma)
>  {
>  	struct vfio_device_file *df = filep->private_data;
>  	struct vfio_device *device = df->device;
> +	bool access;
> +
> +	/* Paired with smp_store_release() in vfio_device_open() */
> +	access = smp_load_acquire(&df->access_granted);
> +	if (!access)
> +		return -EINVAL;
>  
>  	if (unlikely(!device->ops->mmap))
>  		return -EINVAL;


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-14 22:26     ` [Intel-gfx] " Alex Williamson
@ 2023-02-14 23:25       ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-14 23:25 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Yi Liu, joro, kevin.tian, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

On Tue, Feb 14, 2023 at 03:26:27PM -0700, Alex Williamson wrote:
> > index 857d6ba349e1..d869913baafd 100644
> > --- a/virt/kvm/vfio.c
> > +++ b/virt/kvm/vfio.c
> > @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr,
> >  	int32_t fd;
> >  
> >  	switch (attr) {
> > -	case KVM_DEV_VFIO_GROUP_ADD:
> > +	case KVM_DEV_VFIO_FILE_ADD:
> >  		if (get_user(fd, argp))
> >  			return -EFAULT;
> >  		return kvm_vfio_file_add(dev, fd);
> >  
> > -	case KVM_DEV_VFIO_GROUP_DEL:
> > +	case KVM_DEV_VFIO_FILE_DEL:
> >  		if (get_user(fd, argp))
> >  			return -EFAULT;
> >  		return kvm_vfio_file_del(dev, fd);
> >  
> >  #ifdef CONFIG_SPAPR_TCE_IOMMU
> > -	case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
> > +	case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE:
> >  		return kvm_vfio_file_set_spapr_tce(dev, arg);
> 
> I don't see that the SPAPR code is so easily fungible to a device
> file descriptor.  The kvm_vfio_spapr_tce data structure includes a
> groupfd, which is required to match a groupfd on the file_list.  So
> a SPAPR user cannot pass a cdev via FILE_ADD if they intend to use
> this TCE code.

SPAPR cannot use cdev at all, cdev mode only works with iommufd.

So with my other remark about blocking unbound cdevs, in SPAPR mode
you can never open a cdev and make it bound thus
kvm_vfio_file_iommu_group() and others will return NULL always for
cdev.

Thus AFAICT this is all fine.

Yi, you should also add some kconfig stuff to ensure that SPAPR always
has the group interface compiled in.

> That also makes me wonder what we're really gaining in forcing this
> generalization from group to file.  

I think it is just less code overall. Otherwise we need to needlessly
double quite a lot of stuff, rather pointlessly, IMHO.

I'm still thinking about proposing to just delete all this SPAPR
stuff. Power still hasn't had the patches applied to make it work
again so it seems to all be dead.

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-13 15:13   ` Yi Liu
@ 2023-02-14 23:39     ` Yan Zhao
  -1 siblings, 0 replies; 135+ messages in thread
From: Yan Zhao @ 2023-02-14 23:39 UTC (permalink / raw)
  To: Yi Liu
  Cc: joro, alex.williamson, jgg, kevin.tian, robin.murphy, linux-s390,
	yi.y.sun, kvm, mjrosato, jasowang, cohuck, peterx, eric.auger,
	nicolinc, shameerali.kolothum.thodi, suravee.suthikulpanit,
	chao.p.peng, lulu, intel-gvt-dev, intel-gfx

On Mon, Feb 13, 2023 at 07:13:47AM -0800, Yi Liu wrote:
...
> +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> +				    unsigned long arg)
> +{
> +	struct vfio_device *device = df->device;
> +	struct vfio_device_bind_iommufd bind;
> +	struct iommufd_ctx *iommufd = NULL;
> +	struct fd f;
> +	unsigned long minsz;
> +	int ret;
> +
> +	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
> +
> +	if (copy_from_user(&bind, (void __user *)arg, minsz))
> +		return -EFAULT;
> +
> +	if (bind.argsz < minsz || bind.flags)
> +		return -EINVAL;
> +
> +	if (!device->ops->bind_iommufd)
> +		return -ENODEV;
> +
> +	ret = vfio_device_claim_group(device);
> +	if (ret)
> +		return ret;
> +
> +	mutex_lock(&device->dev_set->lock);
> +	/*
> +	 * If already been bound to an iommufd, or already set noiommu
> +	 * then fail it.
> +	 */
> +	if (df->iommufd || df->noiommu) {
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}
> +
> +	/* iommufd < 0 means noiommu mode */
> +	if (bind.iommufd < 0) {
> +		if (!capable(CAP_SYS_RAWIO)) {
> +			ret = -EPERM;
> +			goto out_unlock;
> +		}
> +		df->noiommu = true;
> +	} else {
> +		f = fdget(bind.iommufd);
> +		if (!f.file) {
> +			ret = -EBADF;
> +			goto out_unlock;
> +		}
> +		iommufd = iommufd_ctx_from_file(f.file);
> +		if (IS_ERR(iommufd)) {
> +			ret = PTR_ERR(iommufd);
> +			goto out_put_file;
> +		}
> +	}
> +
> +	/*
> +	 * Before the device open, get the KVM pointer currently
> +	 * associated with the device file (if there is) and obtain a
> +	 * reference. This reference is held until device closed. Save
> +	 * the pointer in the device for use by drivers.
> +	 */
> +	vfio_device_get_kvm_safe(df);
> +
> +	df->iommufd = iommufd;
> +	ret = vfio_device_open(df, &bind.out_devid, NULL);
> +	if (ret)
> +		goto out_put_kvm;
> +
> +	ret = copy_to_user((void __user *)arg +
> +			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
> +			   &bind.out_devid,
> +			   sizeof(bind.out_devid)) ? -EFAULT : 0;
> +	if (ret)
> +		goto out_close_device;
> +
> +	if (iommufd)
> +		fdput(f);
> +	else if (df->noiommu)
> +		dev_warn(device->dev, "vfio-noiommu device used by user "
> +			 "(%s:%d)\n", current->comm, task_pid_nr(current));

IMHO, the "smp_store_release(&df->access_granted, true);" from patch 7
should be moved here, so that it only runs once the bind has actually
succeeded.


> +	mutex_unlock(&device->dev_set->lock);
> +	return 0;
> +
> +out_close_device:
> +	vfio_device_close(df);
> +out_put_kvm:
> +	df->iommufd = NULL;
> +	df->noiommu = false;
> +	vfio_device_put_kvm(device);
> +out_put_file:
> +	if (iommufd)
> +		fdput(f);
> +out_unlock:
> +	mutex_unlock(&device->dev_set->lock);
> +	vfio_device_release_group(device);
> +	return ret;
> +}
...
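
One reading of the suggestion above, as a userspace sketch: publish the
access flag only after the last step that can fail.  It assumes C11
atomics in place of smp_store_release()/smp_load_acquire(); the model_*
names and the copy_out_fails parameter are hypothetical and only
simulate a copy_to_user() failure after a successful open.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_bind_file {
	bool opened;                /* models a successful vfio_device_open() */
	atomic_bool access_granted; /* gate checked by the fops paths */
};

/*
 * Hypothetical bind flow: copy_out_fails simulates copy_to_user()
 * returning an error after the device was already opened.
 */
int model_bind(struct model_bind_file *df, bool copy_out_fails)
{
	df->opened = true;              /* open step succeeded */

	if (copy_out_fails) {
		df->opened = false;     /* unwind, like vfio_device_close() */
		return -EFAULT;         /* the gate was never published */
	}

	/* Publish only once nothing after this point can fail. */
	atomic_store_explicit(&df->access_granted, true, memory_order_release);
	return 0;
}

bool model_access_allowed(struct model_bind_file *df)
{
	return atomic_load_explicit(&df->access_granted, memory_order_acquire);
}
```

With this ordering a failed bind leaves the FD in the blocked state, so
a later retry starts from the same point as a fresh open.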

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [Intel-gfx] [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-14 23:25       ` [Intel-gfx] " Jason Gunthorpe
@ 2023-02-14 23:42         ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-14 23:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-s390, Yi Liu, yi.y.sun, mjrosato, kvm, Michael Ellerman,
	intel-gvt-dev, joro, cohuck, peterx, eric.auger, Timothy Pearson,
	nicolinc, shameerali.kolothum.thodi, suravee.suthikulpanit,
	intel-gfx, chao.p.peng, lulu, robin.murphy, jasowang

On Tue, 14 Feb 2023 19:25:19 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Feb 14, 2023 at 03:26:27PM -0700, Alex Williamson wrote:
> > > index 857d6ba349e1..d869913baafd 100644
> > > --- a/virt/kvm/vfio.c
> > > +++ b/virt/kvm/vfio.c
> > > @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr,
> > >  	int32_t fd;
> > >  
> > >  	switch (attr) {
> > > -	case KVM_DEV_VFIO_GROUP_ADD:
> > > +	case KVM_DEV_VFIO_FILE_ADD:
> > >  		if (get_user(fd, argp))
> > >  			return -EFAULT;
> > >  		return kvm_vfio_file_add(dev, fd);
> > >  
> > > -	case KVM_DEV_VFIO_GROUP_DEL:
> > > +	case KVM_DEV_VFIO_FILE_DEL:
> > >  		if (get_user(fd, argp))
> > >  			return -EFAULT;
> > >  		return kvm_vfio_file_del(dev, fd);
> > >  
> > >  #ifdef CONFIG_SPAPR_TCE_IOMMU
> > > -	case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
> > > +	case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE:
> > >  		return kvm_vfio_file_set_spapr_tce(dev, arg);  
> > 
> > I don't see that the SPAPR code is so easily fungible to a device
> > file descriptor.  The kvm_vfio_spapr_tce data structure includes a
> > groupfd, which is required to match a groupfd on the file_list.  So
> > a SPAPR user cannot pass a cdev via FILE_ADD if they intend to use
> > this TCE code.  
> 
> SPAPR cannot use cdev at all, cdev mode only works with iommufd.
> 
> So with my other remark about blocking unbound cdevs, in SPAPR mode
> you can never open a cdev and make it bound thus
> kvm_vfio_file_iommu_group() and others will return NULL always for
> cdev.
> 
> Thus AFAICT this is all fine.

A device file opened through a group could be passed through this
interface though, right?  Do we just chalk that up to user error?
Maybe the SPAPR extension at least needs to be documented as relying on
registering groups rather than devices.
 
> Yi, you should also add some kconfig stuff to ensure that SPAPR always
> has the group interface compiled in.
> 
> > That also makes me wonder what we're really gaining in forcing this
> > generalization from group to file.    
> 
> I think it is just less code overall. Otherwise we need to needlessly
> double quite a lot of stuff, rather pointlessly, IMHO.
> 
> I'm still thinking about proposing to just delete all this SPAPR
> stuff. Power still hasn't had the patches applied to make it work
> again so it seems to all be dead.

There's been some off-list discussion about at least fixing SPAPR
support, but yes, it either needs to get some love or we ought to think
about its future.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-14 23:42         ` Alex Williamson
@ 2023-02-15  0:17           ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-15  0:17 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Yi Liu, joro, kevin.tian, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390, Timothy Pearson,
	Michael Ellerman

On Tue, Feb 14, 2023 at 04:42:35PM -0700, Alex Williamson wrote:

> A device file opened through a group could be passed through this
> interface though, right?

Yes, I think so

> Do we just chalk that up to user error?  Maybe the SPAPR extension
> at least needs to be documented as relying on registering groups
> rather than devices.

The way these APIs work is that you have to pass the same FD to all of
them. The SPAPR stuff is no different: if you used a cdev with
KVM_DEV_VFIO_GROUP_ADD then you have to use the same cdev fd as the
SPAPR group_fd. Yi just didn't rename it.

It is weird, but logically self-consistent, I think.

> > I'm still thinking about proposing to just delete all this SPAPR
> > stuff. Power still hasn't had the patches applied to make it work
> > again so it seems to all be dead.
> 
> There's been some off-list discussion about at least fixing SPAPR
> support, but yes, it either needs to get some love or we ought to think
> about its future.  Thanks,

The patches exist, they just need to be applied AFAIK. If the people
responsible can't care enough about this to even do that then I find
it hard to care at all about the state of SPAPR.

Jason
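
The "same FD everywhere" rule described above can be sketched as a
small model.  This is an illustration only: the model_* names are
hypothetical, the list compares fd numbers where the kernel compares
the underlying files, and the error codes are chosen for the example.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_FILES 8

/* Hypothetical stand-in for the kvm-vfio file_list. */
struct model_kvm_vfio {
	int fds[MODEL_MAX_FILES];
	size_t count;
};

/* Models the ADD attribute: remember the FD the caller registered. */
int model_file_add(struct model_kvm_vfio *kv, int fd)
{
	for (size_t i = 0; i < kv->count; i++)
		if (kv->fds[i] == fd)
			return -EEXIST;
	if (kv->count == MODEL_MAX_FILES)
		return -ENOMEM;
	kv->fds[kv->count++] = fd;
	return 0;
}

/* Models SET_SPAPR_TCE: the FD passed in must already be on the list. */
int model_set_spapr_tce(const struct model_kvm_vfio *kv, int groupfd)
{
	for (size_t i = 0; i < kv->count; i++)
		if (kv->fds[i] == groupfd)
			return 0;
	return -ENOENT;
}
```

Whatever kind of FD was registered with ADD, group or cdev, is the only
one the later call will match.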

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-15  0:17           ` [Intel-gfx] " Jason Gunthorpe
@ 2023-02-15  0:27             ` Timothy Pearson
  -1 siblings, 0 replies; 135+ messages in thread
From: Timothy Pearson @ 2023-02-15  0:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Alex Williamson, Yi Liu, joro, kevin tian, robin murphy, cohuck,
	eric auger, nicolinc, kvm, mjrosato, chao p peng, yi y sun,
	peterx, jasowang, shameerali kolothum thodi, lulu,
	suravee suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390,
	Michael Ellerman



----- Original Message -----
> From: "Jason Gunthorpe" <jgg@nvidia.com>
> To: "Alex Williamson" <alex.williamson@redhat.com>
> Cc: "Yi Liu" <yi.l.liu@intel.com>, joro@8bytes.org, "kevin tian" <kevin.tian@intel.com>, "robin murphy"
> <robin.murphy@arm.com>, cohuck@redhat.com, "eric auger" <eric.auger@redhat.com>, nicolinc@nvidia.com, "kvm"
> <kvm@vger.kernel.org>, mjrosato@linux.ibm.com, "chao p peng" <chao.p.peng@linux.intel.com>, "yi y sun"
> <yi.y.sun@linux.intel.com>, peterx@redhat.com, jasowang@redhat.com, "shameerali kolothum thodi"
> <shameerali.kolothum.thodi@huawei.com>, lulu@redhat.com, "suravee suthikulpanit" <suravee.suthikulpanit@amd.com>,
> intel-gvt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, "linux-s390" <linux-s390@vger.kernel.org>,
> "Timothy Pearson" <tpearson@raptorengineering.com>, "Michael Ellerman" <mpe@ellerman.id.au>
> Sent: Tuesday, February 14, 2023 6:17:46 PM
> Subject: Re: [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace

> On Tue, Feb 14, 2023 at 04:42:35PM -0700, Alex Williamson wrote:
> 
>> A device file opened through a group could be passed through this
>> interface though, right?
> 
> Yes, I think so
> 
>> Do we just chalk that up to user error?  Maybe the SPAPR extension
>> at least needs to be documented as relying on registering groups
>> rather than devices.
> 
> The way these APIs work is you have to pass the same FD to all of
> them. The SPAPR stuff is no different, if you used a cdev with
> KVM_DEV_VFIO_GROUP_ADD then you have to use the same cdev fd with the
> SPAPR group_fd. Yi just didn't rename it.
> 
> It is weird, but logically self consistent, I think.
> 
>> > I'm still thinking about proposing to just delete all this SPAPR
>> > stuff. Power still hasn't had the patches applied to make it work
>> > again so it seems to all be dead.
>> 
>> There's been some off-list discussion about at least fixing SPAPR
>> support, but yes, it either needs to get some love or we ought to think
>> about its future.  Thanks,
> 
> The patches exist, they just need to be applied AFAIK. If the people
> responsible can't care enough about this to even do that then I find
> it hard to care at all about the state of SPAPR.
> 
> Jason

I've been discussing the state of the patches offline; apologies for the delay in checking in here.

I'll be taking over SPAPR support going forward, as we need it for our product line.  My current plan is to rebase, fix, and test the patches that were already generated, to at least get support re-enabled; then we can coordinate on further changes needed to maintain the support going forward.

I should have a rebased patchset ready later this week.

Thank you!

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-14 23:39     ` [Intel-gfx] " Yan Zhao
@ 2023-02-15  2:04       ` Tian, Kevin
  -1 siblings, 0 replies; 135+ messages in thread
From: Tian, Kevin @ 2023-02-15  2:04 UTC (permalink / raw)
  To: Zhao, Yan Y, Liu, Yi L
  Cc: joro, alex.williamson, jgg, robin.murphy, linux-s390, yi.y.sun,
	kvm, mjrosato, jasowang, cohuck, peterx, eric.auger, nicolinc,
	shameerali.kolothum.thodi, suravee.suthikulpanit, chao.p.peng,
	lulu, intel-gvt-dev, intel-gfx

> From: Zhao, Yan Y <yan.y.zhao@intel.com>
> Sent: Wednesday, February 15, 2023 7:39 AM
> 
> On Mon, Feb 13, 2023 at 07:13:47AM -0800, Yi Liu wrote:
> ...
> > +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> > +				    unsigned long arg)
> > +{
> > +	struct vfio_device *device = df->device;
> > +	struct vfio_device_bind_iommufd bind;
> > +	struct iommufd_ctx *iommufd = NULL;
> > +	struct fd f;
> > +	unsigned long minsz;
> > +	int ret;
> > +
> > +	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
> > +
> > +	if (copy_from_user(&bind, (void __user *)arg, minsz))
> > +		return -EFAULT;
> > +
> > +	if (bind.argsz < minsz || bind.flags)
> > +		return -EINVAL;
> > +
> > +	if (!device->ops->bind_iommufd)
> > +		return -ENODEV;
> > +
> > +	ret = vfio_device_claim_group(device);
> > +	if (ret)
> > +		return ret;
> > +
> > +	mutex_lock(&device->dev_set->lock);
> > +	/*
> > +	 * If already been bound to an iommufd, or already set noiommu
> > +	 * then fail it.
> > +	 */
> > +	if (df->iommufd || df->noiommu) {
> > +		ret = -EINVAL;
> > +		goto out_unlock;
> > +	}
> > +
> > +	/* iommufd < 0 means noiommu mode */
> > +	if (bind.iommufd < 0) {
> > +		if (!capable(CAP_SYS_RAWIO)) {
> > +			ret = -EPERM;
> > +			goto out_unlock;
> > +		}
> > +		df->noiommu = true;
> > +	} else {
> > +		f = fdget(bind.iommufd);
> > +		if (!f.file) {
> > +			ret = -EBADF;
> > +			goto out_unlock;
> > +		}
> > +		iommufd = iommufd_ctx_from_file(f.file);
> > +		if (IS_ERR(iommufd)) {
> > +			ret = PTR_ERR(iommufd);
> > +			goto out_put_file;
> > +		}
> > +	}
> > +
> > +	/*
> > +	 * Before the device open, get the KVM pointer currently
> > +	 * associated with the device file (if there is) and obtain a
> > +	 * reference. This reference is held until device closed. Save
> > +	 * the pointer in the device for use by drivers.
> > +	 */
> > +	vfio_device_get_kvm_safe(df);
> > +
> > +	df->iommufd = iommufd;
> > +	ret = vfio_device_open(df, &bind.out_devid, NULL);
> > +	if (ret)
> > +		goto out_put_kvm;
> > +
> > +	ret = copy_to_user((void __user *)arg +
> > +			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
> > +			   &bind.out_devid,
> > +			   sizeof(bind.out_devid)) ? -EFAULT : 0;
> > +	if (ret)
> > +		goto out_close_device;
> > +
> > +	if (iommufd)
> > +		fdput(f);
> > +	else if (df->noiommu)
> > +		dev_warn(device->dev, "vfio-noiommu device used by user "
> > +			 "(%s:%d)\n", current->comm, task_pid_nr(current));
> 
> IMHO, the "smp_store_release(&df->access_granted, true);" in patch 7
> should be moved to here when bind is indeed successful.
> 

Yes. In that case patch 7 should put the smp_store_release() in
vfio_device_group_open(), and this patch should add a new one here.
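
For readers following the quoted function, the argument check and the out_devid copy-back offset can be sketched in userspace. This is a minimal sketch, not the uAPI: `struct bind_args` is an assumed layout for illustration (the real struct may carry other fields), `memcpy()` stands in for copy_from_user(), and `check_bind_args()` and `out_devid_offset()` are hypothetical helpers.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace equivalent of the kernel's offsetofend() helper. */
#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Assumed layout for illustration; the real uAPI struct may differ. */
struct bind_args {
	uint32_t argsz;
	uint32_t flags;
	int32_t  iommufd;
	uint32_t out_devid;
};

/* Copy the fixed part of the argument and apply the quoted checks. */
static int check_bind_args(const void *user_buf, struct bind_args *out)
{
	size_t minsz = offsetofend(struct bind_args, out_devid);

	memcpy(out, user_buf, minsz);	/* stands in for copy_from_user() */
	if (out->argsz < minsz || out->flags)
		return -EINVAL;
	return 0;
}

/* Offset at which out_devid is written back to the caller. */
static size_t out_devid_offset(void)
{
	return offsetofend(struct bind_args, iommufd);
}
```

Note that `offsetofend(..., iommufd)` lands on `out_devid` only because the two fields are adjacent with no padding between them in this assumed layout.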

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 07/15] vfio: Block device access via device fd until device is opened
  2023-02-14 22:46     ` Alex Williamson
@ 2023-02-15  6:12       ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-15  6:12 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joro, jgg, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, February 15, 2023 6:47 AM
> 
> On Mon, 13 Feb 2023 07:13:40 -0800
> Yi Liu <yi.l.liu@intel.com> wrote:
> 
> > Allow the vfio_device file to be in a state where the device FD is
> > opened but the device cannot be used by userspace (i.e. its .open_device()
> > hasn't been called). This inbetween state is not used when the device
> > FD is spawned from the group FD, however when we create the device FD
> > directly by opening a cdev it will be opened in the blocked state.
> >
> > The reason for the inbetween state is that userspace only gets a FD but
> > doesn't gain access permission until binding the FD to an iommufd. So in
> > the blocked state, only the bind operation is allowed. Completing bind
> > will allow user to further access the device.
> >
> > This is implemented by adding a flag in struct vfio_device_file to mark
> > the blocked state and using a simple smp_load_acquire() to obtain the
> > flag value and serialize all the device setup with the thread accessing
> > this device.
> >
> > Following this lockless scheme, it can safely handle the device FD
> > unbound->bound but it cannot handle bound->unbound. To allow this we'd
> > need to add a lock on all the vfio ioctls which seems costly. So once
> > device FD is bound, it remains bound until the FD is closed.
> >
> > Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > ---
> >  drivers/vfio/vfio.h      |  1 +
> >  drivers/vfio/vfio_main.c | 34 +++++++++++++++++++++++++++++++++-
> >  2 files changed, 34 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> > index 11e56fe079a1..d56cdb114024 100644
> > --- a/drivers/vfio/vfio.h
> > +++ b/drivers/vfio/vfio.h
> > @@ -18,6 +18,7 @@ struct vfio_container;
> >
> >  struct vfio_device_file {
> >  	struct vfio_device *device;
> > +	bool access_granted;
> >  	spinlock_t kvm_ref_lock; /* protect kvm field */
> >  	struct kvm *kvm;
> >  	struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */
> > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> > index c517252aba19..2267057240bd 100644
> > --- a/drivers/vfio/vfio_main.c
> > +++ b/drivers/vfio/vfio_main.c
> > @@ -476,7 +476,15 @@ int vfio_device_open(struct vfio_device_file *df)
> >  			device->open_count--;
> >  	}
> >
> > -	return ret;
> > +	if (ret)
> > +		return ret;
> > +
> > +	/*
> > +	 * Paired with smp_load_acquire() in vfio_device_fops::ioctl/
> > +	 * read/write/mmap
> > +	 */
> > +	smp_store_release(&df->access_granted, true);
> > +	return 0;
> >  }
> >
> >  void vfio_device_close(struct vfio_device_file *df)
> > @@ -1104,8 +1112,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
> >  {
> >  	struct vfio_device_file *df = filep->private_data;
> >  	struct vfio_device *device = df->device;
> > +	bool access;
> >  	int ret;
> >
> > +	/* Paired with smp_store_release() in vfio_device_open() */
> > +	access = smp_load_acquire(&df->access_granted);
> > +	if (!access)
> > +		return -EINVAL;
> > +
> 
> Nit,
> 
> 	if (!smp_load_acquire(&df->access_granted))
> 		...

Got it. I have also updated the other similar lines.
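
The gating scheme from the commit message, with the check written in the compact form from the nit, can be modelled in userspace with C11 atomics. This is a sketch under stated assumptions: `struct device_file`, `grant_access()` and `device_ioctl()` are hypothetical stand-ins, and `atomic_store_explicit()`/`atomic_load_explicit()` with release/acquire ordering stand in for the kernel's smp_store_release()/smp_load_acquire().

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vfio_device_file. */
struct device_file {
	int device_state;		/* set up before access is granted */
	atomic_bool access_granted;
};

/* Publisher: finish setup, then release-store the flag. */
static void grant_access(struct device_file *df, int state)
{
	df->device_state = state;
	atomic_store_explicit(&df->access_granted, true,
			      memory_order_release);
}

/* Reader: load the flag with acquire semantics directly in the test. */
static int device_ioctl(struct device_file *df, int *state_out)
{
	if (!atomic_load_explicit(&df->access_granted, memory_order_acquire))
		return -EINVAL;

	*state_out = df->device_state;	/* ordered after the flag load */
	return 0;
}
```

A reader that observes the flag as true is guaranteed to also observe the setup done before the release store; a reader that observes false returns -EINVAL without touching the device state.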

> Thanks,
> Alex
> 
> >  	ret = vfio_device_pm_runtime_get(device);
> >  	if (ret)
> >  		return ret;
> > @@ -1132,6 +1146,12 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf,
> >  {
> >  	struct vfio_device_file *df = filep->private_data;
> >  	struct vfio_device *device = df->device;
> > +	bool access;
> > +
> > +	/* Paired with smp_store_release() in vfio_device_open() */
> > +	access = smp_load_acquire(&df->access_granted);
> > +	if (!access)
> > +		return -EINVAL;
> >
> >  	if (unlikely(!device->ops->read))
> >  		return -EINVAL;
> > @@ -1145,6 +1165,12 @@ static ssize_t vfio_device_fops_write(struct file *filep,
> >  {
> >  	struct vfio_device_file *df = filep->private_data;
> >  	struct vfio_device *device = df->device;
> > +	bool access;
> > +
> > +	/* Paired with smp_store_release() in vfio_device_open() */
> > +	access = smp_load_acquire(&df->access_granted);
> > +	if (!access)
> > +		return -EINVAL;
> >
> >  	if (unlikely(!device->ops->write))
> >  		return -EINVAL;
> > @@ -1156,6 +1182,12 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma)
> >  {
> >  	struct vfio_device_file *df = filep->private_data;
> >  	struct vfio_device *device = df->device;
> > +	bool access;
> > +
> > +	/* Paired with smp_store_release() in vfio_device_open() */
> > +	access = smp_load_acquire(&df->access_granted);
> > +	if (!access)
> > +		return -EINVAL;
> >
> >  	if (unlikely(!device->ops->mmap))
> >  		return -EINVAL;

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-15  2:04       ` [Intel-gfx] " Tian, Kevin
@ 2023-02-15  7:37         ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-15  7:37 UTC (permalink / raw)
  To: Tian, Kevin, Zhao, Yan Y
  Cc: joro, alex.williamson, jgg, robin.murphy, linux-s390, yi.y.sun,
	kvm, mjrosato, jasowang, cohuck, peterx, eric.auger, nicolinc,
	shameerali.kolothum.thodi, suravee.suthikulpanit, chao.p.peng,
	lulu, intel-gvt-dev, intel-gfx

> From: Tian, Kevin <kevin.tian@intel.com>
> Sent: Wednesday, February 15, 2023 10:05 AM
> 
> > From: Zhao, Yan Y <yan.y.zhao@intel.com>
> > Sent: Wednesday, February 15, 2023 7:39 AM
> >
> > On Mon, Feb 13, 2023 at 07:13:47AM -0800, Yi Liu wrote:
> > ...
> > > +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> > > +				    unsigned long arg)
> > > +{
> > > +	struct vfio_device *device = df->device;
> > > +	struct vfio_device_bind_iommufd bind;
> > > +	struct iommufd_ctx *iommufd = NULL;
> > > +	struct fd f;
> > > +	unsigned long minsz;
> > > +	int ret;
> > > +
> > > +	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
> > > +
> > > +	if (copy_from_user(&bind, (void __user *)arg, minsz))
> > > +		return -EFAULT;
> > > +
> > > +	if (bind.argsz < minsz || bind.flags)
> > > +		return -EINVAL;
> > > +
> > > +	if (!device->ops->bind_iommufd)
> > > +		return -ENODEV;
> > > +
> > > +	ret = vfio_device_claim_group(device);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	mutex_lock(&device->dev_set->lock);
> > > +	/*
> > > +	 * If already been bound to an iommufd, or already set noiommu
> > > +	 * then fail it.
> > > +	 */
> > > +	if (df->iommufd || df->noiommu) {
> > > +		ret = -EINVAL;
> > > +		goto out_unlock;
> > > +	}
> > > +
> > > +	/* iommufd < 0 means noiommu mode */
> > > +	if (bind.iommufd < 0) {
> > > +		if (!capable(CAP_SYS_RAWIO)) {
> > > +			ret = -EPERM;
> > > +			goto out_unlock;
> > > +		}
> > > +		df->noiommu = true;
> > > +	} else {
> > > +		f = fdget(bind.iommufd);
> > > +		if (!f.file) {
> > > +			ret = -EBADF;
> > > +			goto out_unlock;
> > > +		}
> > > +		iommufd = iommufd_ctx_from_file(f.file);
> > > +		if (IS_ERR(iommufd)) {
> > > +			ret = PTR_ERR(iommufd);
> > > +			goto out_put_file;
> > > +		}
> > > +	}
> > > +
> > > +	/*
> > > +	 * Before the device open, get the KVM pointer currently
> > > +	 * associated with the device file (if there is) and obtain a
> > > +	 * reference. This reference is held until device closed. Save
> > > +	 * the pointer in the device for use by drivers.
> > > +	 */
> > > +	vfio_device_get_kvm_safe(df);
> > > +
> > > +	df->iommufd = iommufd;
> > > +	ret = vfio_device_open(df, &bind.out_devid, NULL);
> > > +	if (ret)
> > > +		goto out_put_kvm;
> > > +
> > > +	ret = copy_to_user((void __user *)arg +
> > > +			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
> > > +			   &bind.out_devid,
> > > +			   sizeof(bind.out_devid)) ? -EFAULT : 0;
> > > +	if (ret)
> > > +		goto out_close_device;
> > > +
> > > +	if (iommufd)
> > > +		fdput(f);
> > > +	else if (df->noiommu)
> > > +		dev_warn(device->dev, "vfio-noiommu device used by user "
> > > +			 "(%s:%d)\n", current->comm, task_pid_nr(current));
> >
> > IMHO, the "smp_store_release(&df->access_granted, true);" in patch 7
> > should be moved to here when bind is indeed successful.
> >
> 
> yes. in that case patch7 should put release in vfio_device_group_open()
> and then add a new release here.

Right. This needs to be set in the caller instead of in vfio_device_open().
Done in the latest branch on GitHub.
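
The reason for moving the store into the caller can be seen in the quoted function: a failing copy_to_user() jumps to out_close_device after vfio_device_open() has already returned, so a flag set inside vfio_device_open() would be published before the bind is known to have succeeded. A minimal userspace model of the revised ordering, assuming hypothetical names (`struct cdev_file`, `model_open()`, `model_close()`, `model_bind()`) and C11 atomics in place of smp_store_release():

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-fd state of a cdev open. */
struct cdev_file {
	bool opened;
	atomic_bool access_granted;
};

/* Models vfio_device_open(): it no longer touches access_granted. */
static int model_open(struct cdev_file *df, bool fail)
{
	if (fail)
		return -ENODEV;
	df->opened = true;
	return 0;
}

static void model_close(struct cdev_file *df)
{
	df->opened = false;
}

/* Models the bind path: publish the flag only after every step passed. */
static int model_bind(struct cdev_file *df, bool open_fails,
		      bool copy_out_fails)
{
	int ret = model_open(df, open_fails);

	if (ret)
		return ret;

	if (copy_out_fails) {	/* stands in for a copy_to_user() failure */
		model_close(df);
		return -EFAULT;
	}

	atomic_store_explicit(&df->access_granted, true,
			      memory_order_release);
	return 0;
}
```

With this ordering a failed bind leaves the flag clear, so the fd stays in the blocked state and the bind can be retried.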

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-14 23:25       ` [Intel-gfx] " Jason Gunthorpe
@ 2023-02-15  7:37         ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-15  7:37 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: joro, Tian, Kevin, robin.murphy, cohuck, eric.auger, nicolinc,
	kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, February 15, 2023 7:25 AM
> 
> On Tue, Feb 14, 2023 at 03:26:27PM -0700, Alex Williamson wrote:
> > > index 857d6ba349e1..d869913baafd 100644
> > > --- a/virt/kvm/vfio.c
> > > +++ b/virt/kvm/vfio.c
> > > @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr,
> > >  	int32_t fd;
> > >
> > >  	switch (attr) {
> > > -	case KVM_DEV_VFIO_GROUP_ADD:
> > > +	case KVM_DEV_VFIO_FILE_ADD:
> > >  		if (get_user(fd, argp))
> > >  			return -EFAULT;
> > >  		return kvm_vfio_file_add(dev, fd);
> > >
> > > -	case KVM_DEV_VFIO_GROUP_DEL:
> > > +	case KVM_DEV_VFIO_FILE_DEL:
> > >  		if (get_user(fd, argp))
> > >  			return -EFAULT;
> > >  		return kvm_vfio_file_del(dev, fd);
> > >
> > >  #ifdef CONFIG_SPAPR_TCE_IOMMU
> > > -	case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
> > > +	case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE:
> > >  		return kvm_vfio_file_set_spapr_tce(dev, arg);
> >
> > I don't see that the SPAPR code is so easily fungible to a device
> > file descriptor.  The kvm_vfio_spapr_tce data structure includes a
> > groupfd, which is required to match a groupfd on the file_list.  So
> > a SPAPR user cannot pass a cdev via FILE_ADD if they intend to use
> > this TCE code.
> 
> SPAPR cannot use cdev at all, cdev mode only works with iommufd.
> 
> So with my other remark about blocking unbound cdevs, in SPAPR mode
> you can never open a cdev and make it bound, thus
> kvm_vfio_file_iommu_group() and others will always return NULL for
> cdev.
> 
> Thus AFAICT this is all fine.
> 
> Yi, you should also add some kconfig stuff to ensure that SPAPR always
> has the group interface compiled in.

OK. I can make VFIO select VFIO_GROUP for SPAPR.

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-14 15:47       ` Alex Williamson
@ 2023-02-15  7:54         ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-15  7:54 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joro, jgg, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Tuesday, February 14, 2023 11:47 PM
> 
> On Tue, 14 Feb 2023 01:55:17 +0000
> "Liu, Yi L" <yi.l.liu@intel.com> wrote:
> 
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Tuesday, February 14, 2023 3:47 AM
> > >
> > > On Mon, 13 Feb 2023 07:13:33 -0800
> > > Yi Liu <yi.l.liu@intel.com> wrote:
> > >
> > > > Existing VFIO provides group-centric user APIs for userspace. Userspace
> > > > opens the /dev/vfio/$group_id first before getting device fd and hence
> > > > getting access to device. This is not the desired model for iommufd. Per
> > > > the conclusion of community discussion[1], iommufd provides device-centric
> > > > kAPIs and requires its consumer (like VFIO) to be device-centric user
> > > > APIs. Such user APIs are used to associate device with iommufd and also
> > > > the I/O address spaces managed by the iommufd.
> > > >
> > > > This series first introduces a per device file structure to be prepared
> > > > for further enhancement and refactors the kvm-vfio code to be prepared
> > > > for accepting device file from userspace. Then refactors the vfio to be
> > > > able to handle iommufd binding. This refactor includes the mechanism of
> > > > blocking device access before iommufd bind, making vfio_device_open() be
> > > > exclusive between the group path and the cdev path. Eventually, adds the
> > > > cdev support for vfio device, and makes group infrastructure optional as
> > > > it is not needed when vfio device cdev is compiled.
> > > >
> > > > This is also a prerequisite for iommu nesting for vfio device[2].
> > > >
> > > > The complete code can be found in below branch, simple test done with the
> > > > legacy group path and the cdev path. Draft QEMU branch can be found at[3]
> > > >
> > > > https://github.com/yiliu1765/iommufd/tree/vfio_device_cdev_v3
> > > > (config CONFIG_IOMMUFD=y CONFIG_VFIO_DEVICE_CDEV=y)
> > >
> > > Even using your branch[1], it seems like this has not been tested
> > > except with cdev support enabled:
> > >
> > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_add’:
> > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:253:48: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> > >   253 |                 ret = cdev_device_add(&device->cdev, &device->device);
> > >       |                                                ^~~~
> > >       |                                                dev
> > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_del’:
> > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:262:42: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> > >   262 |                 cdev_device_del(&device->cdev, &device->device);
> > >       |                                          ^~~~
> > >       |                                          dev
> >
> > Sorry about that. It happens because the cdev definition is under
> > "#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)", while the code uses
> > "if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))".  I think for
> > readability it would be better to always define cdev in vfio_device
> > and keep the uses of cdev in the code. What do you think?
> 
> It seems necessary unless we want to litter the code with #ifdefs.

I've moved it to the header file and call cdev_device_add()
under "#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)".
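
As background on the build failure: a plain C "if" on a compile-time constant
still requires both branches to compile, so a struct member hidden behind a
preprocessor "#if" cannot be named inside it. Below is a minimal userspace
sketch of the header-wrapper pattern, assuming a hypothetical
MOCK_CDEV_ENABLED switch in place of the real Kconfig symbol. All mock_*
names are illustrative and not taken from the series.

```c
#include <assert.h>

/* Hypothetical build switch standing in for the real Kconfig symbol. */
#ifndef MOCK_CDEV_ENABLED
#define MOCK_CDEV_ENABLED 0
#endif

struct mock_cdev {
	int registered;
};

struct mock_device {
	int id;
#if MOCK_CDEV_ENABLED
	struct mock_cdev cdev; /* member exists only when the feature is built */
#endif
};

/* "Header" part: one wrapper per configuration, so callers need no #ifdef. */
#if MOCK_CDEV_ENABLED
static inline int mock_device_add_cdev(struct mock_device *dev)
{
	dev->cdev.registered = 1;
	return 0;
}
#else
static inline int mock_device_add_cdev(struct mock_device *dev)
{
	(void)dev; /* nothing to register in this configuration */
	return 0;
}
#endif

/* Caller: compiles either way because it never names dev->cdev itself. */
static int mock_device_add(struct mock_device *dev)
{
	assert(dev);
	return mock_device_add_cdev(dev);
}
```

The conditional compilation stays in one place, and the call sites read the
same in both configurations.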

> > > Additionally the VFIO_ENABLE_GROUP Kconfig option doesn't make much
> > > sense to me, it seems entirely redundant to VFIO_GROUP.
> >
> > The intention is to make the group code compilation match the existing case.
> > Currently, if VFIO is configured, the group code is compiled by default.
> > So VFIO_ENABLE_GROUP is a hidden option, and VFIO_GROUP is an option
> > for the user.  The user needs to explicitly configure VFIO_GROUP if
> > VFIO_DEVICE_CDEV==y. If VFIO_DEVICE_CDEV==n, then the group code is
> > compiled regardless of whether the user configured VFIO_GROUP.
> 
> I understand the mechanics, I still find VFIO_ENABLE_GROUP redundant
> and unnecessary.  Also, Kconfig should not allow a configuration
> without either VFIO_GROUP or VFIO_DEVICE_CDEV as this is not
> functional.  Deselecting VFIO_GROUP should select VFIO_DEVICE_CDEV,
> but  VFIO_DEVICE_CDEV should be an optional addition to VFIO_GROUP.

How about the below? Per Jason's remark on patch 0003, cdev is not available
for SPAPR.

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 0476abf154f2..96535adc2301 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -4,6 +4,8 @@ menuconfig VFIO
 	select IOMMU_API
 	depends on IOMMUFD || !IOMMUFD
 	select INTERVAL_TREE
+	select VFIO_GROUP if SPAPR_TCE_IOMMU
+	select VFIO_DEVICE_CDEV if !VFIO_GROUP && (X86 || S390 || ARM || ARM64)
 	select VFIO_CONTAINER if IOMMUFD=n
 	help
 	  VFIO provides a framework for secure userspace device drivers.
@@ -14,7 +16,8 @@ menuconfig VFIO
 if VFIO
 config VFIO_DEVICE_CDEV
 	bool "Support for the VFIO cdev /dev/vfio/devices/vfioX"
 	depends on IOMMUFD && (X86 || S390 || ARM || ARM64)
+	default !VFIO_GROUP
 	help
 	  The VFIO device cdev is another way for userspace to get device
 	  access. Userspace gets device fd by opening device cdev under
@@ -23,9 +26,21 @@ config VFIO_DEVICE_CDEV
 
 	  If you don't know what to do here, say N.
 
+config VFIO_GROUP
+	bool "Support for the VFIO group /dev/vfio/$group_id"
+	default y
+	help
+	  VFIO group is the legacy interface for userspace. With the
+	  introduction of the VFIO device cdev interface, this can be N.
+	  For now, until userspace applications are fully converted to the
+	  new VFIO device cdev interface, this should be Y.
+
+	  If you don't know what to do here, say Y.
+
 config VFIO_CONTAINER
 	bool "Support for the VFIO container /dev/vfio/vfio"
 	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
+	depends on VFIO_GROUP
 	default y
 	help
 	  The VFIO container is the classic interface to VFIO for establishing

Regards,
Yi Liu

^ permalink raw reply related	[flat|nested] 135+ messages in thread

* Re: [Intel-gfx] [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-14  2:02       ` [Intel-gfx] " Liu, Yi L
@ 2023-02-15 12:38         ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-15 12:38 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: linux-s390, yi.y.sun, mjrosato, kvm, intel-gvt-dev, joro, cohuck,
	peterx, eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, intel-gfx, chao.p.peng, lulu,
	robin.murphy, jasowang

On Tue, Feb 14, 2023 at 02:02:37AM +0000, Liu, Yi L wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, February 14, 2023 7:44 AM
> > 
> > On Mon, Feb 13, 2023 at 07:13:36AM -0800, Yi Liu wrote:
> > > +static struct vfio_device *vfio_device_from_file(struct file *file)
> > > +{
> > > +	struct vfio_device_file *df = file->private_data;
> > > +
> > > +	if (file->f_op != &vfio_device_fops)
> > > +		return NULL;
> > > +	return df->device;
> > > +}
> > > +
> > >  /**
> > >   * vfio_file_is_valid - True if the file is usable with VFIO APIS
> > >   * @file: VFIO group file or VFIO device file
> > >   */
> > >  bool vfio_file_is_valid(struct file *file)
> > >  {
> > > -	return vfio_group_from_file(file);
> > > +	return vfio_group_from_file(file) ||
> > > +	       vfio_device_from_file(file);
> > >  }
> > >  EXPORT_SYMBOL_GPL(vfio_file_is_valid);
> > 
> > This can only succeed on a device cdev that has been fully opened.
> 
> Actually, we cannot. This is used in the kvm-vfio code to see if the
> user-provided fd is a vfio fd in the SET_KVM path. And we don't
> have the device cdev fully opened until BIND_IOMMUFD. But we do
> need to invoke SET_KVM before issuing BIND_IOMMUFD as the device
> open needs the kvm pointer. So we cannot apply the fully-opened limit to
> this interface. Maybe an updated function comment is needed.

This also seems sketchy, KVM is using the VFIO fd as a "proof" to
enable the wbinvd stuff. A half opened cdev should not be used as that
proof.

Regardless it needs to be fixed for the pci usage.

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-15 12:38         ` Jason Gunthorpe
@ 2023-02-15 14:43           ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-15 14:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: joro, alex.williamson, Tian, Kevin, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, February 15, 2023 8:39 PM
> 
> On Tue, Feb 14, 2023 at 02:02:37AM +0000, Liu, Yi L wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Tuesday, February 14, 2023 7:44 AM
> > >
> > > On Mon, Feb 13, 2023 at 07:13:36AM -0800, Yi Liu wrote:
> > > > +static struct vfio_device *vfio_device_from_file(struct file *file)
> > > > +{
> > > > +	struct vfio_device_file *df = file->private_data;
> > > > +
> > > > +	if (file->f_op != &vfio_device_fops)
> > > > +		return NULL;
> > > > +	return df->device;
> > > > +}
> > > > +
> > > >  /**
> > > >   * vfio_file_is_valid - True if the file is usable with VFIO APIS
> > > >   * @file: VFIO group file or VFIO device file
> > > >   */
> > > >  bool vfio_file_is_valid(struct file *file)
> > > >  {
> > > > -	return vfio_group_from_file(file);
> > > > +	return vfio_group_from_file(file) ||
> > > > +	       vfio_device_from_file(file);
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(vfio_file_is_valid);
> > >
> > > This can only succeed on a device cdev that has been fully opened.
> >
> > Actually, we cannot. This is used in the kvm-vfio code to see if the
> > user-provided fd is a vfio fd in the SET_KVM path. And we don't
> > have the device cdev fully opened until BIND_IOMMUFD. But we do
> > need to invoke SET_KVM before issuing BIND_IOMMUFD as the device
> > open needs the kvm pointer. So we cannot apply the fully-opened limit to
> > this interface. Maybe an updated function comment is needed.
> 
> This also seems sketchy, KVM is using the VFIO fd as a "proof" to
> enable the wbinvd stuff. A half opened cdev should not be used as that
> proof.

From this angle, the group path seems to have the same concern. The device is
not opened until VFIO_GROUP_GET_DEVICE_FD, and GROUP_ADD happens before
VFIO_GROUP_GET_DEVICE_FD.

But the group path has one advantage, which makes it OK. A group can only be
opened by one application. So once it is opened, the devices within the
group are effectively held by that application until the group fd is closed.

The cdev path could do something similar, e.g. allow a cdev to be opened by
only one application. Then it should be OK to use the cdev fd as proof to
enable the wbinvd stuff even if the device is not opened yet.
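
To make the "opened by only one application" idea concrete, here is a minimal
userspace sketch of an exclusive-open gate. It assumes a hypothetical
struct mock_cdev_node and uses a C11 atomic_flag; it only illustrates the
idea and says nothing about how the series does or should implement it.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical per-node state; initialize .opened with ATOMIC_FLAG_INIT. */
struct mock_cdev_node {
	atomic_flag opened;
};

/* First opener wins; any concurrent or later opener gets -EBUSY. */
static int mock_cdev_open(struct mock_cdev_node *node)
{
	assert(node);
	if (atomic_flag_test_and_set(&node->opened))
		return -EBUSY;
	return 0;
}

/* Closing the file makes the node available to the next opener. */
static void mock_cdev_release(struct mock_cdev_node *node)
{
	assert(node);
	atomic_flag_clear(&node->opened);
}
```

Under this model a second open fails until the first holder releases the
node, which is the property the group path is said to provide above.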

> Regardless it needs to be fixed for the pci usage.

For the pci usage, does my reply below make sense?

https://lore.kernel.org/kvm/DS0PR11MB7529CFCE99E8A77AAC76DC7CC3A39@DS0PR11MB7529.namprd11.prod.outlook.com/T/#m7c00ae5dcae15f42b6dc0b3767c7037b99f53a56

Thanks,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-15 14:43           ` [Intel-gfx] " Liu, Yi L
@ 2023-02-15 14:46             ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-15 14:46 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: joro, alex.williamson, Tian, Kevin, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

On Wed, Feb 15, 2023 at 02:43:20PM +0000, Liu, Yi L wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, February 15, 2023 8:39 PM
> > 
> > On Tue, Feb 14, 2023 at 02:02:37AM +0000, Liu, Yi L wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Tuesday, February 14, 2023 7:44 AM
> > > >
> > > > On Mon, Feb 13, 2023 at 07:13:36AM -0800, Yi Liu wrote:
> > > > > +static struct vfio_device *vfio_device_from_file(struct file *file)
> > > > > +{
> > > > > +	struct vfio_device_file *df = file->private_data;
> > > > > +
> > > > > +	if (file->f_op != &vfio_device_fops)
> > > > > +		return NULL;
> > > > > +	return df->device;
> > > > > +}
> > > > > +
> > > > >  /**
> > > > >   * vfio_file_is_valid - True if the file is usable with VFIO APIS
> > > > >   * @file: VFIO group file or VFIO device file
> > > > >   */
> > > > >  bool vfio_file_is_valid(struct file *file)
> > > > >  {
> > > > > -	return vfio_group_from_file(file);
> > > > > +	return vfio_group_from_file(file) ||
> > > > > +	       vfio_device_from_file(file);
> > > > >  }
> > > > >  EXPORT_SYMBOL_GPL(vfio_file_is_valid);
> > > >
> > > > This can only succeed on a device cdev that has been fully opened.
> > >
> > > Actually, we cannot. This is used in the kvm-vfio code to see if the
> > > user-provided fd is a vfio fd in the SET_KVM path. And we don't
> > > have the device cdev fully opened until BIND_IOMMUFD. But we do
> > > need to invoke SET_KVM before issuing BIND_IOMMUFD as the device
> > > open needs the kvm pointer. So we cannot apply the fully-opened limit to
> > > this interface. Maybe an updated function comment is needed.
> > 
> > This also seems sketchy, KVM is using the VFIO fd as a "proof" to
> > enable the wbinvd stuff. A half opened cdev should not be used as that
> > proof.
> 
> From this angle, the group path seems to have the same concern. The device is
> not opened until VFIO_GROUP_GET_DEVICE_FD.

Well, classically the device at least had DMA ownership claimed.

> But the group path has one advantage, which makes it OK. A group can only be
> opened by one application. So once it is opened, the devices within the
> group are effectively held by that application until the group fd is closed.

It depends on what we want the KVM proof to actually mean.

Is simply having permissions on the cdev node sufficient proof for
wbinvd?

I admit I poorly understand the threat model for this in kvm beyond
that kvm doesn't want everyone to use wbinvd.

> > Regardless it needs to be fixed for the pci usage.
> 
> For the pci usage, does my reply below make sense?
> 
> https://lore.kernel.org/kvm/DS0PR11MB7529CFCE99E8A77AAC76DC7CC3A39@DS0PR11MB7529.namprd11.prod.outlook.com/T/#m7c00ae5dcae15f42b6dc0b3767c7037b99f53a56

You basically end up with two APIs that test two different levels of
openness ("I have permissions" vs "I actually am the driver owning this
device").

Document it carefully at least.
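
As an illustration of the two levels, here is a userspace sketch with
hypothetical mock_* types. The access_granted field is assumed to mark a
completed bind, following the flag discussed elsewhere in this thread; none
of this is the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the file, its ops table, and per-file state. */
struct mock_file_ops { const char *name; };
static const struct mock_file_ops mock_device_fops = { "mock-vfio-device" };

struct mock_device { int id; };

struct mock_device_file {
	struct mock_device *device;
	bool access_granted; /* assumed to be set once the bind has completed */
};

struct mock_file {
	const struct mock_file_ops *f_op;
	void *private_data;
};

/* Level 1: "I have permissions". The fd merely is a device file of our type. */
static bool mock_file_is_valid(const struct mock_file *file)
{
	return file && file->f_op == &mock_device_fops;
}

/* Level 2: "I own this device". The file must also be fully opened. */
static struct mock_device *mock_file_opened_device(const struct mock_file *file)
{
	const struct mock_device_file *df;

	if (!mock_file_is_valid(file))
		return NULL;
	df = file->private_data;
	assert(df);
	return df->access_granted ? df->device : NULL;
}
```

A half-opened file passes the first check but not the second, which is the
distinction that would need documenting for each caller.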

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-15 14:46             ` [Intel-gfx] " Jason Gunthorpe
@ 2023-02-15 15:32               ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-15 15:32 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Liu, Yi L, joro, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390, Paolo Bonzini

[Cc +Paolo]

On Wed, 15 Feb 2023 10:46:34 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Feb 15, 2023 at 02:43:20PM +0000, Liu, Yi L wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Wednesday, February 15, 2023 8:39 PM
> > > 
> > > On Tue, Feb 14, 2023 at 02:02:37AM +0000, Liu, Yi L wrote:  
> > > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > > Sent: Tuesday, February 14, 2023 7:44 AM
> > > > >
> > > > > On Mon, Feb 13, 2023 at 07:13:36AM -0800, Yi Liu wrote:  
> > > > > > +static struct vfio_device *vfio_device_from_file(struct file *file)
> > > > > > +{
> > > > > > +	struct vfio_device_file *df = file->private_data;
> > > > > > +
> > > > > > +	if (file->f_op != &vfio_device_fops)
> > > > > > +		return NULL;
> > > > > > +	return df->device;
> > > > > > +}
> > > > > > +
> > > > > >  /**
> > > > > >   * vfio_file_is_valid - True if the file is usable with VFIO APIS
> > > > > >   * @file: VFIO group file or VFIO device file
> > > > > >   */
> > > > > >  bool vfio_file_is_valid(struct file *file)
> > > > > >  {
> > > > > > -	return vfio_group_from_file(file);
> > > > > > +	return vfio_group_from_file(file) ||
> > > > > > +	       vfio_device_from_file(file);
> > > > > >  }
> > > > > >  EXPORT_SYMBOL_GPL(vfio_file_is_valid);  
> > > > >
> > > > > This can only succeed on a device cdev that has been fully opened.  
> > > >
> > > > Actually, we cannot. This is used in the kvm-vfio code to see if the
> > > > user-provided fd is vfio fds in the SET_KVM path. And we don't
> > > > have the device cdev fully opened until BIND_IOMMUFD. But we do
> > > > need to invoke SET_KVM before issuing BIND_IOMMUFD as the device
> > > > open needs kvm pointer. So if we cannot apply fully opened limit to this
> > > > interface. Maybe an updated function comment is needed.  
> > > 
> > > This also seems sketchy, KVM is using the VFIO fd as a "proof" to
> > > enable the wbinvd stuff. A half opened cdev should not be used as that
> > > proof.  
> > 
> > From this angle, the group path seems has the same concern. Device is not
> > opened until VFIO_GROUP_GET_DEVICE_FD.   
> 
> Well, classically the device was DMA ownership claimed at least.
> 
> > But group path has one advantage, which make it ok. Group can only be
> > opened by one application. So once it is opened, the devices within the
> > group are somehow obtained by the application until group fd close.  
> 
> It depends on what do we want the KVM proof to actually mean.
> 
> Is simply having permissions on the cdev node sufficient proof for
> wbinvd?
> 
> I admit I poorly understand the threat model for this in kvm beyond
> that kvm doesn't want everyone to use wbinvd.

We've discussed this with Paolo before and I believe the bar of proof
is not very high.  I suspect it's not a problem that the device itself
is not yet accessible, so long as the user can prove they have the
ability to access the device, such as access to a restricted file.  In
most cases this isn't going to turn on wbinvd anyway since DMA will be
coherent.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-15 15:32               ` [Intel-gfx] " Alex Williamson
@ 2023-02-15 17:04                 ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-15 17:04 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Liu, Yi L, joro, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390, Paolo Bonzini

On Wed, Feb 15, 2023 at 08:32:34AM -0700, Alex Williamson wrote:

> We've discussed this with Paolo before and I believe the bar of proof
> is not very high.  I suspect it's not a problem that the device itself
> is not yet accessible, so long as the user can prove they have the
> ability to access the device, such as access to a restricted file.  In
> most cases this isn't going to turn on wbinvd anyway since DMA will be
> coherent.  Thanks,

Isn't that a second problem, we don't know if the device is coherent
until it is bound?

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [Intel-gfx] [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-15 17:04                 ` [Intel-gfx] " Jason Gunthorpe
@ 2023-02-15 17:19                   ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-15 17:19 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-s390, Liu, Yi L, yi.y.sun, mjrosato, kvm, intel-gvt-dev,
	joro, cohuck, peterx, eric.auger, Paolo Bonzini, nicolinc,
	shameerali.kolothum.thodi, suravee.suthikulpanit, intel-gfx,
	chao.p.peng, lulu, robin.murphy, jasowang

On Wed, 15 Feb 2023 13:04:13 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Wed, Feb 15, 2023 at 08:32:34AM -0700, Alex Williamson wrote:
> 
> > We've discussed this with Paolo before and I believe the bar of proof
> > is not very high.  I suspect it's not a problem that the device itself
> > is not yet accessible, so long as the user can prove they have the
> > ability to access the device, such as access to a restricted file.  In
> > most cases this isn't going to turn on wbinvd anyway since DMA will be
> > coherent.  Thanks,  
> 
> Isn't that a second problem, we don't know if the device is coherent
> until it is bound?

I think this is already accounted for in the conversion to device level
IOMMU ops, i.e. device_iommu_capable() follows
dev->iommu->iommu_dev->ops, where for example intel_iommu_capable() is
only looking at the capabilities of the IOMMU managing the device.  We
did some hand-waving simplifications that were sufficient at some point,
IIRC.  Thanks,

Alex
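
The lookup described above can be modelled in userspace as follows. This is
a simplified sketch, not the kernel implementation: the structures keep only
the fields named in the message, coherent_iommu_capable() is a hypothetical
driver callback, and the point is merely that the answer comes from the
IOMMU managing the device rather than from any VFIO or iommufd bind state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum iommu_cap { IOMMU_CAP_CACHE_COHERENCY };

struct device;

/* Reduced to the single callback this sketch needs. */
struct iommu_ops {
	bool (*capable)(struct device *dev, enum iommu_cap cap);
};

struct iommu_device { const struct iommu_ops *ops; };
struct dev_iommu { struct iommu_device *iommu_dev; };
struct device { struct dev_iommu *iommu; };

/* Follow dev->iommu->iommu_dev->ops and ask the driver's callback. */
bool device_iommu_capable(struct device *dev, enum iommu_cap cap)
{
	const struct iommu_ops *ops;

	if (!dev->iommu || !dev->iommu->iommu_dev)
		return false;
	ops = dev->iommu->iommu_dev->ops;
	if (!ops || !ops->capable)
		return false;
	return ops->capable(dev, cap);
}

/* Hypothetical driver callback: every device behind this IOMMU is coherent. */
static bool coherent_iommu_capable(struct device *dev, enum iommu_cap cap)
{
	(void)dev;
	return cap == IOMMU_CAP_CACHE_COHERENCY;
}

static const struct iommu_ops coherent_iommu_ops = {
	.capable = coherent_iommu_capable,
};

void example(void)
{
	struct iommu_device iommu = { .ops = &coherent_iommu_ops };
	struct dev_iommu link = { .iommu_dev = &iommu };
	struct device managed = { .iommu = &link };
	struct device unmanaged = { .iommu = NULL };

	/* No bind step is modelled: the capability is known from the IOMMU. */
	assert(device_iommu_capable(&managed, IOMMU_CAP_CACHE_COHERENCY));
	assert(!device_iommu_capable(&unmanaged, IOMMU_CAP_CACHE_COHERENCY));
}
```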


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-15 17:19                   ` Alex Williamson
@ 2023-02-15 17:33                     ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-15 17:33 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Liu, Yi L, joro, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390, Paolo Bonzini

On Wed, Feb 15, 2023 at 10:19:35AM -0700, Alex Williamson wrote:
> On Wed, 15 Feb 2023 13:04:13 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Wed, Feb 15, 2023 at 08:32:34AM -0700, Alex Williamson wrote:
> > 
> > > We've discussed this with Paolo before and I believe the bar of proof
> > > is not very high.  I suspect it's not a problem that the device itself
> > > is not yet accessible, so long as the user can prove they have the
> > > ability to access the device, such as access to a restricted file.  In
> > > most cases this isn't going to turn on wbinvd anyway since DMA will be
> > > coherent.  Thanks,  
> > 
> > Isn't that a second problem, we don't know if the device is coherent
> > until it is bound?
> 
> I think this is already accounted for in the conversion to device level
> IOMMU ops, ie. device_iommu_capable() follows the
> dev->iommu->iommu_dev->ops, where for example intel_iommu_capable() is
> only looking at the capabilities of the IOMMU managing the device.  We
> did some hand waving simplifications that was sufficient at some point,
> IIRC.  Thanks,

Oh right, I remember this now :)

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-15  7:54         ` [Intel-gfx] " Liu, Yi L
@ 2023-02-15 20:09           ` Alex Williamson
  -1 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-15 20:09 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: joro, jgg, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

On Wed, 15 Feb 2023 07:54:31 +0000
"Liu, Yi L" <yi.l.liu@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Tuesday, February 14, 2023 11:47 PM
> > 
> > On Tue, 14 Feb 2023 01:55:17 +0000
> > "Liu, Yi L" <yi.l.liu@intel.com> wrote:
> >   
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Tuesday, February 14, 2023 3:47 AM
> > > >
> > > > On Mon, 13 Feb 2023 07:13:33 -0800
> > > > Yi Liu <yi.l.liu@intel.com> wrote:
> > > >  
> > > > > Existing VFIO provides group-centric user APIs for userspace. Userspace
> > > > > opens the /dev/vfio/$group_id first before getting device fd and hence
> > > > > getting access to device. This is not the desired model for iommufd. Per
> > > > > the conclusion of community discussion[1], iommufd provides device-centric
> > > > > kAPIs and requires its consumer (like VFIO) to be device-centric user
> > > > > APIs. Such user APIs are used to associate device with iommufd and also
> > > > > the I/O address spaces managed by the iommufd.
> > > > >
> > > > > This series first introduces a per device file structure to be prepared
> > > > > for further enhancement and refactors the kvm-vfio code to be prepared
> > > > > for accepting device file from userspace. Then refactors the vfio to be
> > > > > able to handle iommufd binding. This refactor includes the mechanism of
> > > > > blocking device access before iommufd bind, making vfio_device_open() be
> > > > > exclusive between the group path and the cdev path. Eventually, adds the
> > > > > cdev support for vfio device, and makes group infrastructure optional as
> > > > > it is not needed when vfio device cdev is compiled.
> > > > >
> > > > > This is also a prerequisite for iommu nesting for vfio device[2].
> > > > >
> > > > > The complete code can be found in below branch, simple test done with the
> > > > > legacy group path and the cdev path. Draft QEMU branch can be found at[3]
> > > > >
> > > > > https://github.com/yiliu1765/iommufd/tree/vfio_device_cdev_v3
> > > > > (config CONFIG_IOMMUFD=y CONFIG_VFIO_DEVICE_CDEV=y)
> > > >
> > > > Even using your branch[1], it seems like this has not been tested
> > > > except with cdev support enabled:
> > > >
> > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_add’:
> > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:253:48: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> > > >   253 |                 ret = cdev_device_add(&device->cdev, &device->device);
> > > >       |                                                ^~~~
> > > >       |                                                dev
> > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_del’:
> > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:262:42: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> > > >   262 |                 cdev_device_del(&device->cdev, &device->device);
> > > >       |                                          ^~~~
> > > >       |                                          dev
> > >
> > > Sorry for it. It is due to the cdev definition is under
> > > "#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)". While, in the code it
> > > uses "if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))".  I think for
> > > readability, it would be better to always define cdev in vfio_device,
> > > and keep the using of cdev in code. How about your taste?  
> > 
> > It seems necessary unless we want to litter the code with #ifdefs.  
> 
> I've moved it to the header file and call cdev_device_add()
> under #if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))".
> 
> > > > Additionally the VFIO_ENABLE_GROUP Kconfig option doesn't make much
> > > > sense to me, it seems entirely redundant to VFIO_GROUP.  
> > >
> > > The intention is to make the group code compiling match existing case.
> > > Currently, if VFIO is configured, group code is by default compiled.
> > > So VFIO_ENABLE_GROUP a hidden option, and VFIO_GROUP an option
> > > for user.  User needs to explicitly config VFIO_GROUP if VFIO_DEVICE_CDEV==y.
> > > If VFIO_DEVICE_CDEV==n, then no matter user configed VFIO_GROUP or
> > > not, the group code shall be compiled.  
> > 
> > I understand the mechanics, I still find VFIO_ENABLE_GROUP redundant
> > and unnecessary.  Also, Kconfig should not allow a configuration
> > without either VFIO_GROUP or VFIO_DEVICE_CDEV as this is not
> > functional.  Deselecting VFIO_GROUP should select VFIO_DEVICE_CDEV,
> > but  VFIO_DEVICE_CDEV should be an optional addition to VFIO_GROUP.  
> 
> How about below? As Jason's remark on patch 0003, cdev is not available
> for SPAPR.
> 
> diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> index 0476abf154f2..96535adc2301 100644
> --- a/drivers/vfio/Kconfig
> +++ b/drivers/vfio/Kconfig
> @@ -4,6 +4,8 @@ menuconfig VFIO
>  	select IOMMU_API
>  	depends on IOMMUFD || !IOMMUFD
>  	select INTERVAL_TREE
> +	select VFIO_GROUP if SPAPR_TCE_IOMMU
> +	select VFIO_DEVICE_CDEV if !VFIO_GROUP && (X86 || S390 || ARM || ARM64)
>  	select VFIO_CONTAINER if IOMMUFD=n
>  	help
>  	  VFIO provides a framework for secure userspace device drivers.
> @@ -14,7 +16,8 @@ menuconfig VFIO
>  if VFIO
>  config VFIO_DEVICE_CDEV
>  	bool "Support for the VFIO cdev /dev/vfio/devices/vfioX"
>  	depends on IOMMUFD && (X86 || S390 || ARM || ARM64)
> +	default !VFIO_GROUP
>  	help
>  	  The VFIO device cdev is another way for userspace to get device
>  	  access. Userspace gets device fd by opening device cdev under
> @@ -23,9 +26,21 @@ config VFIO_DEVICE_CDEV
>  
>  	  If you don't know what to do here, say N.
>  
> +config VFIO_GROUP
> +	bool "Support for the VFIO group /dev/vfio/$group_id"
> +	default y
> +	help
> +	   VFIO group is legacy interface for userspace. As the introduction
> +	   of VFIO device cdev interface, this can be N. For now, before
> +	   userspace applications are fully converted to new vfio device cdev
> +	   interface, this should be Y.
> +
> +	   If you don't know what to do here, say Y.
> +

I think this does the correct thing, but I'll reserve final judgment
until I can try to break it ;)

This help message needs some tuning though.  We're not far enough along
the path of cdev access to consider the group interface "legacy" (imo)
or to expect that any userspace applications have been converted.  It
also gives multiple setting recommendations, which will befuddle a
layperson.  Perhaps:

	VFIO group support provides the traditional model for accessing
	devices through VFIO and is used by the majority of userspace
	applications and drivers making use of VFIO.

	If you don't know what to do here, say Y.

Thanks,
Alex

>  config VFIO_CONTAINER
>  	bool "Support for the VFIO container /dev/vfio/vfio"
>  	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
> +	depends on VFIO_GROUP
>  	default y
>  	help
>  	  The VFIO container is the classic interface to VFIO for establishing
> 
> Regards,
> Yi Liu
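
The build failure quoted earlier in this message comes from mixing a
preprocessor guard on the struct member with a plain C condition at the use
site. A minimal userspace sketch of the "always define the member" variant
follows; CDEV_ENABLED is a hypothetical stand-in for
IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV), and struct cdev and cdev_add_stub() are
simplified placeholders rather than the kernel definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for the Kconfig option; 0 models "option is off". */
#define CDEV_ENABLED 0

struct cdev { int registered; };

struct vfio_device {
	int dev;
	/* Declared unconditionally so code under if (CDEV_ENABLED) compiles. */
	struct cdev cdev;
};

static int cdev_add_stub(struct cdev *cdev)
{
	cdev->registered = 1;
	return 0;
}

int vfio_device_add(struct vfio_device *device)
{
	/*
	 * A plain C "if" is parsed and type-checked even when its condition
	 * is the constant 0; the dead branch is only discarded afterwards.
	 * Hiding the member behind #if while keeping this "if" is what
	 * produces "no member named 'cdev'".
	 */
	if (CDEV_ENABLED)
		return cdev_add_stub(&device->cdev);
	return 0;
}

void example(void)
{
	struct vfio_device d = { 0 };

	assert(vfio_device_add(&d) == 0);
	assert(d.cdev.registered == 0);	/* branch not taken when disabled */
}
```

The other consistent choice is to guard both the member and every use with
#if, at the cost of the extra #ifdefs mentioned above.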


^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [Intel-gfx] [PATCH v3 00/15] Add vfio_device cdev for iommufd support
@ 2023-02-15 20:09           ` Alex Williamson
  0 siblings, 0 replies; 135+ messages in thread
From: Alex Williamson @ 2023-02-15 20:09 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: linux-s390, yi.y.sun, mjrosato, kvm, intel-gvt-dev, joro, cohuck,
	peterx, suravee.suthikulpanit, eric.auger, nicolinc,
	shameerali.kolothum.thodi, jgg, intel-gfx, chao.p.peng, lulu,
	robin.murphy, jasowang

On Wed, 15 Feb 2023 07:54:31 +0000
"Liu, Yi L" <yi.l.liu@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Tuesday, February 14, 2023 11:47 PM
> > 
> > On Tue, 14 Feb 2023 01:55:17 +0000
> > "Liu, Yi L" <yi.l.liu@intel.com> wrote:
> >   
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Tuesday, February 14, 2023 3:47 AM
> > > >
> > > > On Mon, 13 Feb 2023 07:13:33 -0800
> > > > Yi Liu <yi.l.liu@intel.com> wrote:
> > > >  
> > > > > Existing VFIO provides group-centric user APIs for userspace. Userspace
> > > > > opens the /dev/vfio/$group_id first before getting device fd and hence
> > > > > getting access to device. This is not the desired model for iommufd. Per
> > > > > the conclusion of community discussion[1], iommufd provides device-centric
> > > > > kAPIs and requires its consumer (like VFIO) to be device-centric user
> > > > > APIs. Such user APIs are used to associate device with iommufd and also
> > > > > the I/O address spaces managed by the iommufd.
> > > > >
> > > > > This series first introduces a per device file structure to be prepared
> > > > > for further enhancement and refactors the kvm-vfio code to be prepared
> > > > > for accepting device file from userspace. Then refactors the vfio to be
> > > > > able to handle iommufd binding. This refactor includes the mechanism of
> > > > > blocking device access before iommufd bind, making vfio_device_open() be
> > > > > exclusive between the group path and the cdev path. Eventually, adds the
> > > > > cdev support for vfio device, and makes group infrastructure optional as
> > > > > it is not needed when vfio device cdev is compiled.
> > > > >
> > > > > This is also a prerequisite for iommu nesting for vfio device[2].
> > > > >
> > > > > The complete code can be found in below branch, simple test done with the
> > > > > legacy group path and the cdev path. Draft QEMU branch can be found at[3]
> > > > >
> > > > > https://github.com/yiliu1765/iommufd/tree/vfio_device_cdev_v3
> > > > > (config CONFIG_IOMMUFD=y CONFIG_VFIO_DEVICE_CDEV=y)
> > > >
> > > > Even using your branch[1], it seems like this has not been tested
> > > > except with cdev support enabled:
> > > >
> > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_add’:
> > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:253:48: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> > > >   253 |                 ret = cdev_device_add(&device->cdev, &device->device);
> > > >       |                                                ^~~~
> > > >       |                                                dev
> > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function ‘vfio_device_del’:
> > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:262:42: error: ‘struct vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> > > >   262 |                 cdev_device_del(&device->cdev, &device->device);
> > > >       |                                          ^~~~
> > > >       |                                          dev
> > >
> > > Sorry for it. It is due to the cdev definition is under
> > > "#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)". While, in the code it
> > > uses "if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))".  I think for
> > > readability, it would be better to always define cdev in vfio_device,
> > > and keep the using of cdev in code. How about your taste?  
> > 
> > It seems necessary unless we want to litter the code with #ifdefs.  
> 
> I've moved it to the header file and call cdev_device_add()
> under #if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))".
> 
> > > > Additionally the VFIO_ENABLE_GROUP Kconfig option doesn't make much
> > > > sense to me, it seems entirely redundant to VFIO_GROUP.  
> > >
> > > The intention is to make the group code compiling match existing case.
> > > Currently, if VFIO is configured, group code is by default compiled.
> > > So VFIO_ENABLE_GROUP a hidden option, and VFIO_GROUP an option
> > > for user.  User needs to explicitly config VFIO_GROUP if VFIO_DEVICE_CDEV==y.
> > > If VFIO_DEVICE_CDEV==n, then no matter user configed VFIO_GROUP or
> > > not, the group code shall be compiled.  
> > 
> > I understand the mechanics, I still find VFIO_ENABLE_GROUP redundant
> > and unnecessary.  Also, Kconfig should not allow a configuration
> > without either VFIO_GROUP or VFIO_DEVICE_CDEV as this is not
> > functional.  Deselecting VFIO_GROUP should select VFIO_DEVICE_CDEV,
> > but  VFIO_DEVICE_CDEV should be an optional addition to VFIO_GROUP.  
> 
> How about below? As Jason's remark on patch 0003, cdev is not available
> for SPAPR.
> 
> diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> index 0476abf154f2..96535adc2301 100644
> --- a/drivers/vfio/Kconfig
> +++ b/drivers/vfio/Kconfig
> @@ -4,6 +4,8 @@ menuconfig VFIO
>  	select IOMMU_API
>  	depends on IOMMUFD || !IOMMUFD
>  	select INTERVAL_TREE
> +	select VFIO_GROUP if SPAPR_TCE_IOMMU
> +	select VFIO_DEVICE_CDEV if !VFIO_GROUP && (X86 || S390 || ARM || ARM64)
>  	select VFIO_CONTAINER if IOMMUFD=n
>  	help
>  	  VFIO provides a framework for secure userspace device drivers.
> @@ -14,7 +16,8 @@ menuconfig VFIO
>  if VFIO
>  config VFIO_DEVICE_CDEV
>  	bool "Support for the VFIO cdev /dev/vfio/devices/vfioX"
>  	depends on IOMMUFD && (X86 || S390 || ARM || ARM64)
> +	default !VFIO_GROUP
>  	help
>  	  The VFIO device cdev is another way for userspace to get device
>  	  access. Userspace gets device fd by opening device cdev under
> @@ -23,9 +26,21 @@ config VFIO_DEVICE_CDEV
>  
>  	  If you don't know what to do here, say N.
>  
> +config VFIO_GROUP
> +	bool "Support for the VFIO group /dev/vfio/$group_id"
> +	default y
> +	help
> +	   VFIO group is legacy interface for userspace. As the introduction
> +	   of VFIO device cdev interface, this can be N. For now, before
> +	   userspace applications are fully converted to new vfio device cdev
> +	   interface, this should be Y.
> +
> +	   If you don't know what to do here, say Y.
> +

I think this does the correct thing, but I'll reserve final judgment
until I can try to break it ;)

This message needs some tuning though, we're not far enough along the
path of cdev access to consider the group interface "legacy" (imo) or
expect that there are any userspace applications converted.  There are
also multiple setting recommendations to befuddle a layperson.  Perhaps:

	VFIO group support provides the traditional model for accessing
	devices through VFIO and is used by the majority of userspace
	applications and drivers making use of VFIO.

	If you don't know what to do here, say Y.

Thanks,
Alex

>  config VFIO_CONTAINER
>  	bool "Support for the VFIO container /dev/vfio/vfio"
>  	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
> +	depends on VFIO_GROUP
>  	default y
>  	help
>  	  The VFIO container is the classic interface to VFIO for establishing
> 
> Regards,
> Yi Liu



* RE: [PATCH v3 00/15] Add vfio_device cdev for iommufd support
  2023-02-15 20:09           ` [Intel-gfx] " Alex Williamson
@ 2023-02-16  2:53             ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-16  2:53 UTC (permalink / raw)
  To: Alex Williamson
  Cc: joro, jgg, Tian, Kevin, robin.murphy, cohuck, eric.auger,
	nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Thursday, February 16, 2023 4:09 AM
> 
> On Wed, 15 Feb 2023 07:54:31 +0000
> "Liu, Yi L" <yi.l.liu@intel.com> wrote:
> 
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Tuesday, February 14, 2023 11:47 PM
> > >
> > > On Tue, 14 Feb 2023 01:55:17 +0000
> > > "Liu, Yi L" <yi.l.liu@intel.com> wrote:
> > >
> > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > Sent: Tuesday, February 14, 2023 3:47 AM
> > > > >
> > > > > On Mon, 13 Feb 2023 07:13:33 -0800
> > > > > Yi Liu <yi.l.liu@intel.com> wrote:
> > > > >
> > > > > > > Existing VFIO provides group-centric user APIs for userspace. Userspace
> > > > > > > opens the /dev/vfio/$group_id first before getting device fd and hence
> > > > > > > getting access to device. This is not the desired model for iommufd. Per
> > > > > > > the conclusion of community discussion[1], iommufd provides device-centric
> > > > > > > kAPIs and requires its consumer (like VFIO) to be device-centric user
> > > > > > > APIs. Such user APIs are used to associate device with iommufd and also
> > > > > > > the I/O address spaces managed by the iommufd.
> > > > > > >
> > > > > > > This series first introduces a per device file structure to be prepared
> > > > > > > for further enhancement and refactors the kvm-vfio code to be prepared
> > > > > > > for accepting device file from userspace. Then refactors the vfio to be
> > > > > > > able to handle iommufd binding. This refactor includes the mechanism of
> > > > > > > blocking device access before iommufd bind, making vfio_device_open() be
> > > > > > > exclusive between the group path and the cdev path. Eventually, adds the
> > > > > > > cdev support for vfio device, and makes group infrastructure optional as
> > > > > > > it is not needed when vfio device cdev is compiled.
> > > > > > >
> > > > > > > This is also a prerequisite for iommu nesting for vfio device[2].
> > > > > > >
> > > > > > > The complete code can be found in below branch, simple test done with the
> > > > > > > legacy group path and the cdev path. Draft QEMU branch can be found at[3]
> > > > > > >
> > > > > > > https://github.com/yiliu1765/iommufd/tree/vfio_device_cdev_v3
> > > > > > > (config CONFIG_IOMMUFD=y CONFIG_VFIO_DEVICE_CDEV=y)
> > > > >
> > > > > Even using your branch[1], it seems like this has not been tested
> > > > > except with cdev support enabled:
> > > > >
> > > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function
> > > > > ‘vfio_device_add’:
> > > > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:253:48: error: ‘struct
> > > > > > vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> > > > > >   253 |                 ret = cdev_device_add(&device->cdev, &device->device);
> > > > > >       |                                                ^~~~
> > > > > >       |                                                dev
> > > > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c: In function
> > > > > > ‘vfio_device_del’:
> > > > > > /home/alwillia/Work/linux.git/drivers/vfio/vfio_main.c:262:42: error: ‘struct
> > > > > > vfio_device’ has no member named ‘cdev’; did you mean ‘dev’?
> > > > >   262 |                 cdev_device_del(&device->cdev, &device->device);
> > > > >       |                                          ^~~~
> > > > >       |                                          dev
> > > >
> > > > > Sorry about that. It happens because the cdev definition is under
> > > > > "#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)", while the code uses it
> > > > > under "if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))".  I think for
> > > > > readability, it would be better to always define cdev in vfio_device,
> > > > > and keep the uses of cdev in the code. What is your preference?
> > >
> > > It seems necessary unless we want to litter the code with #ifdefs.
> >
> > I've moved it to the header file and call cdev_device_add()
> > > under "#if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))".
> >
> > > > > > Additionally the VFIO_ENABLE_GROUP Kconfig option doesn't make much
> > > > > > sense to me, it seems entirely redundant to VFIO_GROUP.
> > > > >
> > > > > The intention is to make the group code compilation match the existing case.
> > > > > Currently, if VFIO is configured, group code is compiled by default.
> > > > > So VFIO_ENABLE_GROUP is a hidden option, and VFIO_GROUP is an option
> > > > > for the user.  The user needs to explicitly configure VFIO_GROUP if VFIO_DEVICE_CDEV==y.
> > > > > If VFIO_DEVICE_CDEV==n, then no matter whether the user configured VFIO_GROUP or
> > > > > not, the group code shall be compiled.
> > >
> > > I understand the mechanics, I still find VFIO_ENABLE_GROUP redundant
> > > and unnecessary.  Also, Kconfig should not allow a configuration
> > > without either VFIO_GROUP or VFIO_DEVICE_CDEV as this is not
> > > functional.  Deselecting VFIO_GROUP should select VFIO_DEVICE_CDEV,
> > > but  VFIO_DEVICE_CDEV should be an optional addition to VFIO_GROUP.
> >
> > How about below? As Jason's remark on patch 0003, cdev is not available
> > for SPAPR.
> >
> > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> > index 0476abf154f2..96535adc2301 100644
> > --- a/drivers/vfio/Kconfig
> > +++ b/drivers/vfio/Kconfig
> > @@ -4,6 +4,8 @@ menuconfig VFIO
> >  	select IOMMU_API
> >  	depends on IOMMUFD || !IOMMUFD
> >  	select INTERVAL_TREE
> > +	select VFIO_GROUP if SPAPR_TCE_IOMMU
> > +	select VFIO_DEVICE_CDEV if !VFIO_GROUP && (X86 || S390 || ARM
> || ARM64)
> >  	select VFIO_CONTAINER if IOMMUFD=n
> >  	help
> >  	  VFIO provides a framework for secure userspace device drivers.
> > @@ -14,7 +16,8 @@ menuconfig VFIO
> >  if VFIO
> >  config VFIO_DEVICE_CDEV
> >  	bool "Support for the VFIO cdev /dev/vfio/devices/vfioX"
> >  	depends on IOMMUFD && (X86 || S390 || ARM || ARM64)
> > +	default !VFIO_GROUP
> >  	help
> >  	  The VFIO device cdev is another way for userspace to get device
> >  	  access. Userspace gets device fd by opening device cdev under
> > @@ -23,9 +26,21 @@ config VFIO_DEVICE_CDEV
> >
> >  	  If you don't know what to do here, say N.
> >
> > +config VFIO_GROUP
> > +	bool "Support for the VFIO group /dev/vfio/$group_id"
> > +	default y
> > +	help
> > +	   VFIO group is legacy interface for userspace. As the introduction
> > +	   of VFIO device cdev interface, this can be N. For now, before
> > +	   userspace applications are fully converted to new vfio device cdev
> > +	   interface, this should be Y.
> > +
> > +	   If you don't know what to do here, say Y.
> > +
> 
> I think this does the correct thing, but I'll reserve final judgment
> until I can try to break it ;)
> 
> This message needs some tuning though, we're not far enough along the
> path of cdev access to consider the group interface "legacy" (imo) or
> expect that there are any userspace applications converted.  There are
> also multiple setting recommendations to befuddle a layperson.  Perhaps:
> 
> 	VFIO group support provides the traditional model for accessing
> 	devices through VFIO and is used by the majority of userspace
> 	applications and drivers making use of VFIO.
> 
> 	If you don't know what to do here, say Y.

Got it. I'll update it to my branch first. 😊

Regards,
Yi Liu



* Re: [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-13 15:13   ` Yi Liu
@ 2023-02-16  8:24     ` Yan Zhao
  -1 siblings, 0 replies; 135+ messages in thread
From: Yan Zhao @ 2023-02-16  8:24 UTC (permalink / raw)
  To: Yi Liu
  Cc: joro, alex.williamson, jgg, kevin.tian, robin.murphy, linux-s390,
	yi.y.sun, kvm, mjrosato, jasowang, cohuck, peterx, eric.auger,
	nicolinc, shameerali.kolothum.thodi, suravee.suthikulpanit,
	chao.p.peng, lulu, intel-gvt-dev, intel-gfx

On Mon, Feb 13, 2023 at 07:13:47AM -0800, Yi Liu wrote:
...

> +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> +				    unsigned long arg)
> +{
> +	struct vfio_device *device = df->device;
> +	struct vfio_device_bind_iommufd bind;
> +	struct iommufd_ctx *iommufd = NULL;
> +	struct fd f;
> +	unsigned long minsz;
> +	int ret;
> +
> +	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
> +
> +	if (copy_from_user(&bind, (void __user *)arg, minsz))
> +		return -EFAULT;
> +
> +	if (bind.argsz < minsz || bind.flags)
> +		return -EINVAL;
> +
> +	if (!device->ops->bind_iommufd)
> +		return -ENODEV;
> +
> +	ret = vfio_device_claim_group(device);
> +	if (ret)
> +		return ret;
> +
> +	mutex_lock(&device->dev_set->lock);
> +	/*
> +	 * If already been bound to an iommufd, or already set noiommu
> +	 * then fail it.
> +	 */
> +	if (df->iommufd || df->noiommu) {
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}
> +
> +	/* iommufd < 0 means noiommu mode */
> +	if (bind.iommufd < 0) {
> +		if (!capable(CAP_SYS_RAWIO)) {
> +			ret = -EPERM;
> +			goto out_unlock;
> +		}
> +		df->noiommu = true;
> +	} else {
> +		f = fdget(bind.iommufd);
Here, the iommufd file count + 1,

> +		if (!f.file) {
> +			ret = -EBADF;
> +			goto out_unlock;
> +		}
> +		iommufd = iommufd_ctx_from_file(f.file);
iommufd file count + 1, again

> +		if (IS_ERR(iommufd)) {
> +			ret = PTR_ERR(iommufd);
> +			goto out_put_file;
> +		}
> +	}
> +
> +	/*
> +	 * Before the device open, get the KVM pointer currently
> +	 * associated with the device file (if there is) and obtain a
> +	 * reference. This reference is held until device closed. Save
> +	 * the pointer in the device for use by drivers.
> +	 */
> +	vfio_device_get_kvm_safe(df);
> +
> +	df->iommufd = iommufd;
> +	ret = vfio_device_open(df, &bind.out_devid, NULL);
iommufd file count + 1 in iommufd_device_bind for first open.

> +	if (ret)
> +		goto out_put_kvm;
> +
> +	ret = copy_to_user((void __user *)arg +
> +			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
> +			   &bind.out_devid,
> +			   sizeof(bind.out_devid)) ? -EFAULT : 0;
> +	if (ret)
> +		goto out_close_device;
> +
> +	if (iommufd)
> +		fdput(f);
But only one file reference is put.

A pairing iommufd_ctx_put() is needed after a successful iommufd_ctx_from_file()
above; otherwise iommufd_fops_release() is never called.

e.g.

@@ -1222,11 +1226,13 @@ static long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
                        ret = -EBADF;
                        goto out_unlock;
                }
                iommufd = iommufd_ctx_from_file(f.file);
                if (IS_ERR(iommufd)) {
                        ret = PTR_ERR(iommufd);
                        goto out_put_file;
                }
+               iommufd_ctx_put(iommufd);
        }

        /* df->kvm is supposed to be set in vfio_device_file_set_kvm() */
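
To make the accounting above concrete, here is a small user-space toy model
(not kernel code). The names toy_file, toy_get(), toy_put() and
toy_leaked_refs() are hypothetical. It follows the counting in this message by
treating fdget() as always taking a reference, which is a simplification, and
it assumes the reference taken in iommufd_device_bind() is dropped by the
matching unbind when the device is closed.

```c
#include <assert.h>

/* Hypothetical stand-in for the iommufd file and its reference count. */
struct toy_file {
	int refs;
};

static void toy_get(struct toy_file *f)
{
	f->refs++;
}

static void toy_put(struct toy_file *f)
{
	assert(f->refs > 0);
	f->refs--;
}

/*
 * Replay the get/put sequence of one bind ioctl followed by device close.
 * ctx_puts is the number of iommufd_ctx_put() calls made along the way
 * (0 models the v3 patch). Returns the references still held once
 * userspace has also closed its own iommufd fd.
 */
int toy_leaked_refs(int ctx_puts)
{
	struct toy_file f = { .refs = 1 };	/* userspace's open fd */
	int i;

	toy_get(&f);		/* fdget(bind.iommufd) */
	toy_get(&f);		/* iommufd_ctx_from_file() */
	toy_get(&f);		/* iommufd_device_bind() on first open */
	toy_put(&f);		/* fdput(f) at the end of the ioctl */
	for (i = 0; i < ctx_puts; i++)
		toy_put(&f);	/* iommufd_ctx_put() */
	toy_put(&f);		/* unbind on device close (assumed) */
	toy_put(&f);		/* userspace closes the iommufd fd */

	return f.refs;
}
```

Under these assumptions toy_leaked_refs(0) leaves one reference behind, so the
release handler would never run, while toy_leaked_refs(1) ends at zero. The
model only counts references; it says nothing about where in the code the put
belongs.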

> +	else if (df->noiommu)
> +		dev_warn(device->dev, "vfio-noiommu device used by user "
> +			 "(%s:%d)\n", current->comm, task_pid_nr(current));
> +	mutex_unlock(&device->dev_set->lock);
> +	return 0;
> +
> +out_close_device:
> +	vfio_device_close(df);
> +out_put_kvm:
> +	df->iommufd = NULL;
> +	df->noiommu = false;
> +	vfio_device_put_kvm(device);
> +out_put_file:
> +	if (iommufd)
> +		fdput(f);
> +out_unlock:
> +	mutex_unlock(&device->dev_set->lock);
> +	vfio_device_release_group(device);
> +	return ret;
> +}
> +


* RE: [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-16  8:24     ` [Intel-gfx] " Yan Zhao
@ 2023-02-16  9:10       ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-16  9:10 UTC (permalink / raw)
  To: Zhao, Yan Y
  Cc: joro, alex.williamson, jgg, Tian, Kevin, robin.murphy,
	linux-s390, yi.y.sun, kvm, mjrosato, jasowang, cohuck, peterx,
	eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

> From: Zhao, Yan Y <yan.y.zhao@intel.com>
> Sent: Thursday, February 16, 2023 4:24 PM
> 
> On Mon, Feb 13, 2023 at 07:13:47AM -0800, Yi Liu wrote:
> ...
> 
> > +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> > +				    unsigned long arg)
> > +{
> > +	struct vfio_device *device = df->device;
> > +	struct vfio_device_bind_iommufd bind;
> > +	struct iommufd_ctx *iommufd = NULL;
> > +	struct fd f;
> > +	unsigned long minsz;
> > +	int ret;
> > +
> > +	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
> > +
> > +	if (copy_from_user(&bind, (void __user *)arg, minsz))
> > +		return -EFAULT;
> > +
> > +	if (bind.argsz < minsz || bind.flags)
> > +		return -EINVAL;
> > +
> > +	if (!device->ops->bind_iommufd)
> > +		return -ENODEV;
> > +
> > +	ret = vfio_device_claim_group(device);
> > +	if (ret)
> > +		return ret;
> > +
> > +	mutex_lock(&device->dev_set->lock);
> > +	/*
> > +	 * If already been bound to an iommufd, or already set noiommu
> > +	 * then fail it.
> > +	 */
> > +	if (df->iommufd || df->noiommu) {
> > +		ret = -EINVAL;
> > +		goto out_unlock;
> > +	}
> > +
> > +	/* iommufd < 0 means noiommu mode */
> > +	if (bind.iommufd < 0) {
> > +		if (!capable(CAP_SYS_RAWIO)) {
> > +			ret = -EPERM;
> > +			goto out_unlock;
> > +		}
> > +		df->noiommu = true;
> > +	} else {
> > +		f = fdget(bind.iommufd);
> Here, the iommufd file count + 1,
> 
> > +		if (!f.file) {
> > +			ret = -EBADF;
> > +			goto out_unlock;
> > +		}
> > +		iommufd = iommufd_ctx_from_file(f.file);
> iommufd file count + 1, again
> 
> > +		if (IS_ERR(iommufd)) {
> > +			ret = PTR_ERR(iommufd);
> > +			goto out_put_file;
> > +		}
> > +	}
> > +
> > +	/*
> > +	 * Before the device open, get the KVM pointer currently
> > +	 * associated with the device file (if there is) and obtain a
> > +	 * reference. This reference is held until device closed. Save
> > +	 * the pointer in the device for use by drivers.
> > +	 */
> > +	vfio_device_get_kvm_safe(df);
> > +
> > +	df->iommufd = iommufd;
> > +	ret = vfio_device_open(df, &bind.out_devid, NULL);
> iommufd file count + 1 in iommufd_device_bind for first open.
> 
> > +	if (ret)
> > +		goto out_put_kvm;
> > +
> > +	ret = copy_to_user((void __user *)arg +
> > +			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
> > +			   &bind.out_devid,
> > +			   sizeof(bind.out_devid)) ? -EFAULT : 0;
> > +	if (ret)
> > +		goto out_close_device;
> > +
> > +	if (iommufd)
> > +		fdput(f);
> But, only one file count is put.

Good catch! Yes, it is missing. A call to iommufd_ctx_put() is also needed
in vfio_device_cdev_close().

> A pairing iommufd_ctx_put() is needed after a successful iommufd_ctx_from_file()
> above; otherwise iommufd_fops_release() is never called.
> 
> e.g.
> 
> @@ -1222,11 +1226,13 @@ static long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
>                         ret = -EBADF;
>                         goto out_unlock;
>                 }
>                 iommufd = iommufd_ctx_from_file(f.file);
>                 if (IS_ERR(iommufd)) {
>                         ret = PTR_ERR(iommufd);
>                         goto out_put_file;
>                 }
> +               iommufd_ctx_put(iommufd);

Since iommufd is recorded in df, the refcount needs to be held until
df->iommufd is set to NULL.
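
A toy user-space sketch of that ownership rule, not the actual fix: whoever
stores the pointer keeps one reference for as long as the pointer stays
stored. struct toy_ctx, struct toy_df, toy_df_bind() and toy_df_close() are
made-up names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the iommufd context and the device file. */
struct toy_ctx {
	int refs;
};

struct toy_df {
	struct toy_ctx *iommufd;
};

/* Store the context in df and take the reference that df now owns. */
void toy_df_bind(struct toy_df *df, struct toy_ctx *ctx)
{
	assert(!df->iommufd);
	ctx->refs++;
	df->iommufd = ctx;
}

/* Clear df->iommufd and only then drop the reference df was holding. */
void toy_df_close(struct toy_df *df)
{
	struct toy_ctx *ctx = df->iommufd;

	if (!ctx)
		return;
	df->iommufd = NULL;
	assert(ctx->refs > 0);
	ctx->refs--;
}
```

In this model the context cannot reach a zero count while df->iommufd still
points at it, even if every other holder drops its reference first.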

Thanks,
Yi Liu

>         }
> 
>         /* df->kvm is supposed to be set in vfio_device_file_set_kvm() */
> 
> > +	else if (df->noiommu)
> > +		dev_warn(device->dev, "vfio-noiommu device used by user "
> > +			 "(%s:%d)\n", current->comm, task_pid_nr(current));
> > +	mutex_unlock(&device->dev_set->lock);
> > +	return 0;
> > +
> > +out_close_device:
> > +	vfio_device_close(df);
> > +out_put_kvm:
> > +	df->iommufd = NULL;
> > +	df->noiommu = false;
> > +	vfio_device_put_kvm(device);
> > +out_put_file:
> > +	if (iommufd)
> > +		fdput(f);
> > +out_unlock:
> > +	mutex_unlock(&device->dev_set->lock);
> > +	vfio_device_release_group(device);
> > +	return ret;
> > +}
> > +


* Re: [Intel-gfx] [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
@ 2023-02-16  9:10       ` Liu, Yi L
  0 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-16  9:10 UTC (permalink / raw)
  To: Zhao, Yan Y
  Cc: linux-s390, yi.y.sun, mjrosato, kvm, intel-gvt-dev, joro, cohuck,
	peterx, eric.auger, nicolinc, shameerali.kolothum.thodi, jgg,
	chao.p.peng, intel-gfx, suravee.suthikulpanit, lulu,
	robin.murphy, jasowang

> From: Zhao, Yan Y <yan.y.zhao@intel.com>
> Sent: Thursday, February 16, 2023 4:24 PM
> 
> On Mon, Feb 13, 2023 at 07:13:47AM -0800, Yi Liu wrote:
> ...
> 
> > +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> > +				    unsigned long arg)
> > +{
> > +	struct vfio_device *device = df->device;
> > +	struct vfio_device_bind_iommufd bind;
> > +	struct iommufd_ctx *iommufd = NULL;
> > +	struct fd f;
> > +	unsigned long minsz;
> > +	int ret;
> > +
> > +	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
> > +
> > +	if (copy_from_user(&bind, (void __user *)arg, minsz))
> > +		return -EFAULT;
> > +
> > +	if (bind.argsz < minsz || bind.flags)
> > +		return -EINVAL;
> > +
> > +	if (!device->ops->bind_iommufd)
> > +		return -ENODEV;
> > +
> > +	ret = vfio_device_claim_group(device);
> > +	if (ret)
> > +		return ret;
> > +
> > +	mutex_lock(&device->dev_set->lock);
> > +	/*
> > +	 * If already been bound to an iommufd, or already set noiommu
> > +	 * then fail it.
> > +	 */
> > +	if (df->iommufd || df->noiommu) {
> > +		ret = -EINVAL;
> > +		goto out_unlock;
> > +	}
> > +
> > +	/* iommufd < 0 means noiommu mode */
> > +	if (bind.iommufd < 0) {
> > +		if (!capable(CAP_SYS_RAWIO)) {
> > +			ret = -EPERM;
> > +			goto out_unlock;
> > +		}
> > +		df->noiommu = true;
> > +	} else {
> > +		f = fdget(bind.iommufd);
> Here, the iommufd file count + 1,
> 
> > +		if (!f.file) {
> > +			ret = -EBADF;
> > +			goto out_unlock;
> > +		}
> > +		iommufd = iommufd_ctx_from_file(f.file);
> iommufd file count + 1, again
> 
> > +		if (IS_ERR(iommufd)) {
> > +			ret = PTR_ERR(iommufd);
> > +			goto out_put_file;
> > +		}
> > +	}
> > +
> > +	/*
> > +	 * Before the device open, get the KVM pointer currently
> > +	 * associated with the device file (if there is) and obtain a
> > +	 * reference. This reference is held until device closed. Save
> > +	 * the pointer in the device for use by drivers.
> > +	 */
> > +	vfio_device_get_kvm_safe(df);
> > +
> > +	df->iommufd = iommufd;
> > +	ret = vfio_device_open(df, &bind.out_devid, NULL);
> iommufd file count + 1 in iommufd_device_bind for first open.
> 
> > +	if (ret)
> > +		goto out_put_kvm;
> > +
> > +	ret = copy_to_user((void __user *)arg +
> > +			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
> > +			   &bind.out_devid,
> > +			   sizeof(bind.out_devid)) ? -EFAULT : 0;
> > +	if (ret)
> > +		goto out_close_device;
> > +
> > +	if (iommufd)
> > +		fdput(f);
> But, only one file count is put.

Good catch! Yes, the put is missing. We also need to call iommufd_ctx_put()
in vfio_device_cdev_close().

> Need a pairing iommufd_ctx_put() after a successful
> iommufd_ctx_from_file() above to avoid iommufd_fops_release() never
> being called.
> 
> e.g.
> 
> @@ -1222,11 +1226,13 @@ static long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
>                         ret = -EBADF;
>                         goto out_unlock;
>                 }
>                 iommufd = iommufd_ctx_from_file(f.file);
>                 if (IS_ERR(iommufd)) {
>                         ret = PTR_ERR(iommufd);
>                         goto out_put_file;
>                 }
> +               iommufd_ctx_put(iommufd);

Since iommufd is recorded in df, the refcount needs to be held until
df->iommufd = NULL;

Thanks,
Yi Liu

>         }
> 
>         /* df->kvm is supposed to be set in vfio_device_file_set_kvm() */
> 
> > +	else if (df->noiommu)
> > +		dev_warn(device->dev, "vfio-noiommu device used by user "
> > +			 "(%s:%d)\n", current->comm, task_pid_nr(current));
> > +	mutex_unlock(&device->dev_set->lock);
> > +	return 0;
> > +
> > +out_close_device:
> > +	vfio_device_close(df);
> > +out_put_kvm:
> > +	df->iommufd = NULL;
> > +	df->noiommu = false;
> > +	vfio_device_put_kvm(device);
> > +out_put_file:
> > +	if (iommufd)
> > +		fdput(f);
> > +out_unlock:
> > +	mutex_unlock(&device->dev_set->lock);
> > +	vfio_device_release_group(device);
> > +	return ret;
> > +}
> > +

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-16  9:10       ` [Intel-gfx] " Liu, Yi L
@ 2023-02-16  9:23         ` Yan Zhao
  -1 siblings, 0 replies; 135+ messages in thread
From: Yan Zhao @ 2023-02-16  9:23 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: joro, alex.williamson, jgg, Tian, Kevin, robin.murphy,
	linux-s390, yi.y.sun, kvm, mjrosato, jasowang, cohuck, peterx,
	eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

On Thu, Feb 16, 2023 at 05:10:06PM +0800, Liu, Yi L wrote:
> > From: Zhao, Yan Y <yan.y.zhao@intel.com>
> > Sent: Thursday, February 16, 2023 4:24 PM
> > 
> > On Mon, Feb 13, 2023 at 07:13:47AM -0800, Yi Liu wrote:
> > ...
> > 
> > > +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> > > +				    unsigned long arg)
> > > +{
> > > +	struct vfio_device *device = df->device;
> > > +	struct vfio_device_bind_iommufd bind;
> > > +	struct iommufd_ctx *iommufd = NULL;
> > > +	struct fd f;
> > > +	unsigned long minsz;
> > > +	int ret;
> > > +
> > > +	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
> > > +
> > > +	if (copy_from_user(&bind, (void __user *)arg, minsz))
> > > +		return -EFAULT;
> > > +
> > > +	if (bind.argsz < minsz || bind.flags)
> > > +		return -EINVAL;
> > > +
> > > +	if (!device->ops->bind_iommufd)
> > > +		return -ENODEV;
> > > +
> > > +	ret = vfio_device_claim_group(device);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	mutex_lock(&device->dev_set->lock);
> > > +	/*
> > > +	 * If already been bound to an iommufd, or already set noiommu
> > > +	 * then fail it.
> > > +	 */
> > > +	if (df->iommufd || df->noiommu) {
> > > +		ret = -EINVAL;
> > > +		goto out_unlock;
> > > +	}
> > > +
> > > +	/* iommufd < 0 means noiommu mode */
> > > +	if (bind.iommufd < 0) {
> > > +		if (!capable(CAP_SYS_RAWIO)) {
> > > +			ret = -EPERM;
> > > +			goto out_unlock;
> > > +		}
> > > +		df->noiommu = true;
> > > +	} else {
> > > +		f = fdget(bind.iommufd);
> > Here, the iommufd file count + 1,
> > 
> > > +		if (!f.file) {
> > > +			ret = -EBADF;
> > > +			goto out_unlock;
> > > +		}
> > > +		iommufd = iommufd_ctx_from_file(f.file);
> > iommufd file count + 1, again
> > 
> > > +		if (IS_ERR(iommufd)) {
> > > +			ret = PTR_ERR(iommufd);
> > > +			goto out_put_file;
> > > +		}
> > > +	}
> > > +
> > > +	/*
> > > +	 * Before the device open, get the KVM pointer currently
> > > +	 * associated with the device file (if there is) and obtain a
> > > +	 * reference. This reference is held until device closed. Save
> > > +	 * the pointer in the device for use by drivers.
> > > +	 */
> > > +	vfio_device_get_kvm_safe(df);
> > > +
> > > +	df->iommufd = iommufd;
> > > +	ret = vfio_device_open(df, &bind.out_devid, NULL);
> > iommufd file count + 1 in iommufd_device_bind for first open.
> > 
> > > +	if (ret)
> > > +		goto out_put_kvm;
> > > +
> > > +	ret = copy_to_user((void __user *)arg +
> > > +			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
> > > +			   &bind.out_devid,
> > > +			   sizeof(bind.out_devid)) ? -EFAULT : 0;
> > > +	if (ret)
> > > +		goto out_close_device;
> > > +
> > > +	if (iommufd)
> > > +		fdput(f);
> > But, only one file count is put.
> 
> Good catch! Yes, the put is missing. We also need to call iommufd_ctx_put()
> in vfio_device_cdev_close().
If I read correctly, iommufd_device_bind() in the first open takes
the reference through iommufd_ctx_get(ictx), and
iommufd_device_destroy() in the last close does the iommufd_ctx_put().

As vfio_device_ioctl_bind_iommufd() doesn't pair with
vfio_device_cdev_close(), I think the fix below is simpler :)
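
The counting argument above can be replayed in a small userspace sketch.
It assumes each call changes the count exactly as described in this
thread; struct fake_file, get_ref(), put_ref() and bind_then_close() are
hypothetical names used only for illustration, not kernel APIs.

```c
#include <assert.h>

/* Hypothetical stand-in for the iommufd file: only its reference count. */
struct fake_file {
	int refs;
};

static void get_ref(struct fake_file *f)
{
	f->refs++;
}

static void put_ref(struct fake_file *f)
{
	assert(f->refs > 0);
	f->refs--;
}

/*
 * Replays the gets and puts of a successful bind followed by the last
 * close. 'put_after_lookup' selects whether the lookup reference is
 * dropped right away, as in the one-line fix proposed in this thread.
 * Returns the number of references left over (0 means balanced).
 */
static int bind_then_close(int put_after_lookup)
{
	struct fake_file file = { .refs = 1 };	/* held by the user's fd */
	int baseline = file.refs;

	get_ref(&file);		/* fdget() */
	get_ref(&file);		/* iommufd_ctx_from_file() */
	if (put_after_lookup)
		put_ref(&file);	/* iommufd_ctx_put() from the fix */
	get_ref(&file);		/* iommufd_device_bind(), first open */
	put_ref(&file);		/* fdput() */

	put_ref(&file);		/* iommufd_device_destroy(), last close */

	return file.refs - baseline;
}
```

Under these assumptions bind_then_close(0) leaves one reference behind,
which is the leak that keeps iommufd_fops_release() from running, while
bind_then_close(1) comes back balanced.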

> > Need a pairing iommufd_ctx_put() after a successful
> > iommufd_ctx_from_file() above to avoid iommufd_fops_release() never
> > being called.
> > 
> > e.g.
> > 
> > @@ -1222,11 +1226,13 @@ static long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> >                         ret = -EBADF;
> >                         goto out_unlock;
> >                 }
> >                 iommufd = iommufd_ctx_from_file(f.file);
> >                 if (IS_ERR(iommufd)) {
> >                         ret = PTR_ERR(iommufd);
> >                         goto out_put_file;
> >                 }
> > +               iommufd_ctx_put(iommufd);
> 
> Since iommufd is recorded in df, the refcount needs to be held until
> df->iommufd = NULL;
> 
> 
> >         }
> > 
> >         /* df->kvm is supposed to be set in vfio_device_file_set_kvm() */
> > 
> > > +	else if (df->noiommu)
> > > +		dev_warn(device->dev, "vfio-noiommu device used by user "
> > > +			 "(%s:%d)\n", current->comm, task_pid_nr(current));
> > > +	mutex_unlock(&device->dev_set->lock);
> > > +	return 0;
> > > +
> > > +out_close_device:
> > > +	vfio_device_close(df);
> > > +out_put_kvm:
> > > +	df->iommufd = NULL;
> > > +	df->noiommu = false;
> > > +	vfio_device_put_kvm(device);
> > > +out_put_file:
> > > +	if (iommufd)
> > > +		fdput(f);
> > > +out_unlock:
> > > +	mutex_unlock(&device->dev_set->lock);
> > > +	vfio_device_release_group(device);
> > > +	return ret;
> > > +}
> > > +

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-16  9:23         ` [Intel-gfx] " Yan Zhao
@ 2023-02-16 10:28           ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-16 10:28 UTC (permalink / raw)
  To: Zhao, Yan Y
  Cc: joro, alex.williamson, jgg, Tian, Kevin, robin.murphy,
	linux-s390, yi.y.sun, kvm, mjrosato, jasowang, cohuck, peterx,
	eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

> From: Zhao, Yan Y <yan.y.zhao@intel.com>
> Sent: Thursday, February 16, 2023 5:23 PM
> 
> On Thu, Feb 16, 2023 at 05:10:06PM +0800, Liu, Yi L wrote:
> > > From: Zhao, Yan Y <yan.y.zhao@intel.com>
> > > Sent: Thursday, February 16, 2023 4:24 PM
> > >
> > > On Mon, Feb 13, 2023 at 07:13:47AM -0800, Yi Liu wrote:
> > > ...
> > >
> > > > +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> > > > +				    unsigned long arg)
> > > > +{
> > > > +	struct vfio_device *device = df->device;
> > > > +	struct vfio_device_bind_iommufd bind;
> > > > +	struct iommufd_ctx *iommufd = NULL;
> > > > +	struct fd f;
> > > > +	unsigned long minsz;
> > > > +	int ret;
> > > > +
> > > > +	minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid);
> > > > +
> > > > +	if (copy_from_user(&bind, (void __user *)arg, minsz))
> > > > +		return -EFAULT;
> > > > +
> > > > +	if (bind.argsz < minsz || bind.flags)
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!device->ops->bind_iommufd)
> > > > +		return -ENODEV;
> > > > +
> > > > +	ret = vfio_device_claim_group(device);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	mutex_lock(&device->dev_set->lock);
> > > > +	/*
> > > > +	 * If already been bound to an iommufd, or already set noiommu
> > > > +	 * then fail it.
> > > > +	 */
> > > > +	if (df->iommufd || df->noiommu) {
> > > > +		ret = -EINVAL;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > > +	/* iommufd < 0 means noiommu mode */
> > > > +	if (bind.iommufd < 0) {
> > > > +		if (!capable(CAP_SYS_RAWIO)) {
> > > > +			ret = -EPERM;
> > > > +			goto out_unlock;
> > > > +		}
> > > > +		df->noiommu = true;
> > > > +	} else {
> > > > +		f = fdget(bind.iommufd);
> > > Here, the iommufd file count + 1,
> > >
> > > > +		if (!f.file) {
> > > > +			ret = -EBADF;
> > > > +			goto out_unlock;
> > > > +		}
> > > > +		iommufd = iommufd_ctx_from_file(f.file);
> > > iommufd file count + 1, again
> > >
> > > > +		if (IS_ERR(iommufd)) {
> > > > +			ret = PTR_ERR(iommufd);
> > > > +			goto out_put_file;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * Before the device open, get the KVM pointer currently
> > > > +	 * associated with the device file (if there is) and obtain a
> > > > +	 * reference. This reference is held until device closed. Save
> > > > +	 * the pointer in the device for use by drivers.
> > > > +	 */
> > > > +	vfio_device_get_kvm_safe(df);
> > > > +
> > > > +	df->iommufd = iommufd;
> > > > +	ret = vfio_device_open(df, &bind.out_devid, NULL);
> > > iommufd file count + 1 in iommufd_device_bind for first open.
> > >
> > > > +	if (ret)
> > > > +		goto out_put_kvm;
> > > > +
> > > > +	ret = copy_to_user((void __user *)arg +
> > > > +			   offsetofend(struct vfio_device_bind_iommufd, iommufd),
> > > > +			   &bind.out_devid,
> > > > +			   sizeof(bind.out_devid)) ? -EFAULT : 0;
> > > > +	if (ret)
> > > > +		goto out_close_device;
> > > > +
> > > > +	if (iommufd)
> > > > +		fdput(f);
> > > But, only one file count is put.
> >
> > Good catch! Yes, the put is missing. We also need to call iommufd_ctx_put()
> > in vfio_device_cdev_close().
> If I read correctly, iommufd_device_bind() in the first open will
> get the reference through iommufd_ctx_get(ictx) and
> iommufd_device_destroy() in the last close will do the iommufd_ctx_put().

Aha, functionally there is no problem. Even storing the iommufd_ctx in df
without an explicit get is fine, since iommufd_device_bind() takes an
implicit reference on the iommufd_ctx. But it appears to be better to
have an explicit get.

However, I have to admit that your fix reduces the number of references on
the iommufd file. This may avoid a file reference count overflow if there are
multiple devices assigned to an application. I'm not sure how likely that is,
though. 😊 I'll see if Alex or Jason have any preference.

Regards,
Yi Liu

> As vfio_device_ioctl_bind_iommufd() doesn't pair with
> vfio_device_cdev_close(), I think the fix below is simpler :)
> 
> > > Need a pairing iommufd_ctx_put() after a successful
> > > iommufd_ctx_from_file() above to avoid iommufd_fops_release() never
> > > being called.
> > >
> > > e.g.
> > >
> > > @@ -1222,11 +1226,13 @@ static long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> > >                         ret = -EBADF;
> > >                         goto out_unlock;
> > >                 }
> > >                 iommufd = iommufd_ctx_from_file(f.file);
> > >                 if (IS_ERR(iommufd)) {
> > >                         ret = PTR_ERR(iommufd);
> > >                         goto out_put_file;
> > >                 }
> > > +               iommufd_ctx_put(iommufd);
> >
> > Since iommufd is recorded in df, the refcount needs to be held until
> > df->iommufd = NULL;
> >
> >
> > >         }
> > >
> > >         /* df->kvm is supposed to be set in vfio_device_file_set_kvm() */
> > >
> > > > +	else if (df->noiommu)
> > > > +		dev_warn(device->dev, "vfio-noiommu device used by user "
> > > > +			 "(%s:%d)\n", current->comm, task_pid_nr(current));
> > > > +	mutex_unlock(&device->dev_set->lock);
> > > > +	return 0;
> > > > +
> > > > +out_close_device:
> > > > +	vfio_device_close(df);
> > > > +out_put_kvm:
> > > > +	df->iommufd = NULL;
> > > > +	df->noiommu = false;
> > > > +	vfio_device_put_kvm(device);
> > > > +out_put_file:
> > > > +	if (iommufd)
> > > > +		fdput(f);
> > > > +out_unlock:
> > > > +	mutex_unlock(&device->dev_set->lock);
> > > > +	vfio_device_release_group(device);
> > > > +	return ret;
> > > > +}
> > > > +

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd
  2023-02-16 10:28           ` [Intel-gfx] " Liu, Yi L
@ 2023-02-16 14:24             ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-16 14:24 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: Zhao, Yan Y, joro, alex.williamson, Tian, Kevin, robin.murphy,
	linux-s390, yi.y.sun, kvm, mjrosato, jasowang, cohuck, peterx,
	eric.auger, nicolinc, shameerali.kolothum.thodi,
	suravee.suthikulpanit, chao.p.peng, lulu, intel-gvt-dev,
	intel-gfx

On Thu, Feb 16, 2023 at 10:28:33AM +0000, Liu, Yi L wrote:

> However, I have to admit that your fix reduces the number of references on
> the iommufd file. This may avoid a file reference count overflow if there are
> multiple devices assigned to an application. I'm not sure how likely that is,
> though. 😊 I'll see if Alex or Jason have any preference.

Please write refcounting in a way that is easy to understand and don't
try to be clever.

Every stored pointer to a ref object should hold its own ref.
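
A minimal userspace sketch of that rule, for illustration only: struct
ctx, struct holder and the helper functions are hypothetical stand-ins
(loosely modelled on an iommufd context stored in a per-device file
structure), not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for an iommufd context. */
struct ctx {
	int refs;
};

/* Hypothetical holder standing in for a per-device file structure. */
struct holder {
	struct ctx *ctx;
};

static struct ctx *ctx_get(struct ctx *c)
{
	c->refs++;
	return c;
}

static void ctx_put(struct ctx *c)
{
	assert(c->refs > 0);
	c->refs--;
}

/* Storing the pointer takes a reference owned by the holder itself. */
static void holder_set_ctx(struct holder *h, struct ctx *c)
{
	assert(h->ctx == NULL);
	h->ctx = ctx_get(c);
}

/* Clearing the pointer drops exactly the reference taken when storing. */
static void holder_clear_ctx(struct holder *h)
{
	if (!h->ctx)
		return;
	ctx_put(h->ctx);
	h->ctx = NULL;
}
```

With this shape the get and the put sit next to the store and the clear,
so the lifetime of the stored pointer does not depend on a reference
taken somewhere else on its behalf.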

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-15  0:17           ` [Intel-gfx] " Jason Gunthorpe
@ 2023-02-17  5:34             ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-17  5:34 UTC (permalink / raw)
  To: Jason Gunthorpe, Alex Williamson
  Cc: joro, Tian, Kevin, robin.murphy, cohuck, eric.auger, nicolinc,
	kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390, Timothy Pearson,
	Michael Ellerman

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, February 15, 2023 8:18 AM
> 
> On Tue, Feb 14, 2023 at 04:42:35PM -0700, Alex Williamson wrote:
> 
> > A device file opened through a group could be passed through this
> > interface though, right?
> 
> Yes, I think so
> 
> > Do we just chalk that up to user error?  Maybe the SPAPR extension
> > at least needs to be documented as relying on registering groups
> > rather than devices.
> 
> The way these APIs work is you have to pass the same FD to all of
> them. The SPAPR stuff is no different, if you used a cdev with
> KVM_DEV_VFIO_GROUP_ADD then you have to use the same cdev fd with the
> SPAPR group_fd. Yi just didn't rename it.

This is because SPAPR cannot accept a cdev fd yet. It explicitly requires
a group fd and gets the iommu_group while handling it.

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-17  5:34             ` [Intel-gfx] " Liu, Yi L
@ 2023-02-17  5:48               ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-17  5:48 UTC (permalink / raw)
  To: Liu, Yi L, Jason Gunthorpe, Alex Williamson
  Cc: joro, Tian, Kevin, robin.murphy, cohuck, eric.auger, nicolinc,
	kvm, mjrosato, chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit,
	intel-gvt-dev, intel-gfx, linux-s390, Timothy Pearson,
	Michael Ellerman

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Friday, February 17, 2023 1:34 PM
> 
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, February 15, 2023 8:18 AM
> >
> > On Tue, Feb 14, 2023 at 04:42:35PM -0700, Alex Williamson wrote:
> >
> > > A device file opened through a group could be passed through this
> > > interface though, right?
> >
> > Yes, I think so
> >
> > > Do we just chalk that up to user error?  Maybe the SPAPR extension
> > > at least needs to be documented as relying on registering groups
> > > rather than devices.
> >
> > The way these APIs work is you have to pass the same FD to all of
> > them. The SPAPR stuff is no different, if you used a cdev with
> > KVM_DEV_VFIO_GROUP_ADD then you have to use the same cdev fd with the
> > SPAPR group_fd. Yi just didn't rename it.
> 
> This is because SPAPR cannot accept cdev fd yet. It explicitly requires
> group fd and get iommu_group during the handling.

Sorry, I misunderstood. I think this can be renamed to fds if there is
no objection, maybe as below, so that old userspace that uses group_fds
can still compile. I doubt a new flag is needed to identify whether the
provided fds are group or device fds. I guess not, since the pci hot
reset code does not really care about that. It cares more that the fd
is held by the application.

struct vfio_pci_hot_reset {
	__u32   argsz;
	__u32   flags;
	__u32   count;
	union {
		__s32   group_fds[];
		__s32   fds[];
	};
};
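
As a rough illustration of how userspace would size and fill such a
variable-length argument, here is a minimal sketch. The mirror struct
(hot_reset_arg) and the helper (hot_reset_arg_alloc) are hypothetical
names for this example only, not part of the uAPI. The sketch declares a
single flexible array, since ISO C does not allow a flexible array member
directly inside a union; how the header would express the group_fds/fds
alias is left open here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of the proposed layout, for illustration only. */
struct hot_reset_arg {
	uint32_t argsz;
	uint32_t flags;
	uint32_t count;
	int32_t  fds[];		/* group or device fds, per the proposal */
};

/*
 * Allocate and fill an argument buffer holding 'count' fds.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct hot_reset_arg *hot_reset_arg_alloc(const int32_t *fds,
						 uint32_t count)
{
	size_t sz = sizeof(struct hot_reset_arg) +
		    (size_t)count * sizeof(int32_t);
	struct hot_reset_arg *arg;

	assert(count == 0 || fds != NULL);

	arg = calloc(1, sz);
	if (!arg)
		return NULL;

	arg->argsz = (uint32_t)sz;	/* header plus the fd array */
	arg->count = count;
	if (count)
		memcpy(arg->fds, fds, (size_t)count * sizeof(int32_t));
	return arg;
}
```

A real caller would use the struct from the uAPI header rather than this
mirror; the point is only that argsz covers the fixed header plus count
entries, whatever the array ends up being called.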

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-14  7:19         ` [Intel-gfx] " Liu, Yi L
@ 2023-02-17 10:55           ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-17 10:55 UTC (permalink / raw)
  To: Liu, Yi L, Jason Gunthorpe, alex.williamson
  Cc: joro, alex.williamson, Tian, Kevin, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

Hi Alex, Jason,

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Tuesday, February 14, 2023 3:19 PM
> 
> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Tuesday, February 14, 2023 10:03 AM
> >
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Tuesday, February 14, 2023 7:44 AM
> > >
> > > On Mon, Feb 13, 2023 at 07:13:36AM -0800, Yi Liu wrote:
> > > > +static struct vfio_device *vfio_device_from_file(struct file *file)
> > > > +{
> > > > +	struct vfio_device_file *df = file->private_data;
> > > > +
> > > > +	if (file->f_op != &vfio_device_fops)
> > > > +		return NULL;
> > > > +	return df->device;
> > > > +}
> > > > +
> > > >  /**
> > > >   * vfio_file_is_valid - True if the file is usable with VFIO APIS
> > > >   * @file: VFIO group file or VFIO device file
> > > >   */
> > > >  bool vfio_file_is_valid(struct file *file)
> > > >  {
> > > > -	return vfio_group_from_file(file);
> > > > +	return vfio_group_from_file(file) ||
> > > > +	       vfio_device_from_file(file);
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(vfio_file_is_valid);
> > >
> > > This can only succeed on a device cdev that has been fully opened.
> >
> > Actually, we cannot. This is used in the kvm-vfio code to see if the
> > user-provided fd is vfio fds in the SET_KVM path. And we don't
> > have the device cdev fully opened until BIND_IOMMUFD. But we do
> > need to invoke SET_KVM before issuing BIND_IOMMUFD as the device
> > open needs kvm pointer. So if we cannot apply fully opened limit to this
> > interface. Maybe an updated function comment is needed.
> >
> > " vfio_file_is_valid - True if the file is vfio files (group or device)"
> 
> I guess your point is this is also called in the pci hot reset path. And
> in the reset path, the device referred by the device fd should be fully
> opened. vfio_file_is_valid() only checks f_ops, which is not enough to
> show the device is fully-opened for cdev fd. However, view the high-level
> flow, for cdev fd, the device access (neither VFIO_DEVICE_PCI_HOT_RESET
> nor VFIO_DEVICE_GET_PCI_HOT_RESET_INFO) is not allowed until the
> device is fully-opened (done in the bind_iommufd). So if the
> VFIO_DEVICE_PCI_HOT_RESET path goes such far to call vfio_file_is_valid(),
> the device should have been fully-opened.

One more thought on this. For a single device, my reply above holds: the
device must already be fully opened by the time its GET_PCI_HOT_RESET_INFO
and HOT_RESET paths are unblocked. However, when multiple devices are
affected by the hot reset, the user may have only one device fully opened
while the others are not yet. In that case, the existing
vfio_file_is_valid() is not enough. Shall we have another API for this
purpose? E.g. for a cdev fd, the new API returns true only when the
device is fully opened. Any suggestions?
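
The multi-device case can be sketched as a check over the whole set of
affected devices. This is a userspace model under stated assumptions, not
kernel code: struct dev_state, its fields, and hot_reset_set_ready() are
hypothetical names, and group-path entries are assumed to be acceptable
as-is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state, not the kernel's struct. */
struct dev_state {
	bool is_cdev;		/* opened via the device cdev, not a group */
	bool fully_opened;	/* for cdev: bind to iommufd has completed */
};

/*
 * Return true only if every cdev entry in the affected set is fully
 * opened. One opened device is not enough to unblock the reset.
 */
static bool hot_reset_set_ready(const struct dev_state *devs, size_t n)
{
	assert(n == 0 || devs != NULL);

	for (size_t i = 0; i < n; i++) {
		if (devs[i].is_cdev && !devs[i].fully_opened)
			return false;
	}
	return true;
}
```

With two cdev devices of which only one is bound, the check fails until
the second one is bound as well, which is the gap a type-only check
would miss.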

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-17 10:55           ` [Intel-gfx] " Liu, Yi L
@ 2023-02-17 15:59             ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-17 15:59 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: alex.williamson, joro, Tian, Kevin, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

On Fri, Feb 17, 2023 at 10:55:08AM +0000, Liu, Yi L wrote:

> One more thinking on this. For a single device, my above reply is true.
> The device should have been fully-opened when its GET_PCI_HOT_RESET_INFO
> and HOT_RESET path have been unblocked. However, when there are
> multiple devices that have been affected by the hotreset. User may only
> have one device that is fully opened while others are not yet. In such case,
> existing vfio_file_is_valid() is not enough. Shall we have another API for
> this purpose? E.g. if it's cdev fd, then the new API return true only when
> the device is fully opened. Any suggestion here?

I think what I heard is you need two APIs, one for pci and one for KVM
and the PCI one requires binding to succeed.

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* Re: [Intel-gfx] [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace
  2023-02-17  5:48               ` [Intel-gfx] " Liu, Yi L
@ 2023-02-17 16:00                 ` Jason Gunthorpe
  -1 siblings, 0 replies; 135+ messages in thread
From: Jason Gunthorpe @ 2023-02-17 16:00 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: linux-s390, yi.y.sun, mjrosato, kvm, Michael Ellerman,
	intel-gvt-dev, joro, cohuck, Timothy Pearson, peterx, eric.auger,
	nicolinc, shameerali.kolothum.thodi, suravee.suthikulpanit,
	intel-gfx, chao.p.peng, lulu, robin.murphy, jasowang

On Fri, Feb 17, 2023 at 05:48:57AM +0000, Liu, Yi L wrote:
> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Friday, February 17, 2023 1:34 PM
> > 
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Wednesday, February 15, 2023 8:18 AM
> > >
> > > On Tue, Feb 14, 2023 at 04:42:35PM -0700, Alex Williamson wrote:
> > >
> > > > A device file opened through a group could be passed through this
> > > > interface though, right?
> > >
> > > Yes, I think so
> > >
> > > > Do we just chalk that up to user error?  Maybe the SPAPR extension
> > > > at least needs to be documented as relying on registering groups
> > > > rather than devices.
> > >
> > > The way these APIs work is you have to pass the same FD to all of
> > > them. The SPAPR stuff is no different, if you used a cdev with
> > > KVM_DEV_VFIO_GROUP_ADD then you have to use the same cdev fd with the
> > > SPAPR group_fd. Yi just didn't rename it.
> > 
> > This is because SPAPR cannot accept cdev fd yet. It explicitly requires
> > group fd and get iommu_group during the handling.
> 
> Sorry I misunderstood it. I think this can be renamed to be fds if
> no objection. Maybe as below, so that old userspace that uses
> group_fds can still compile. I doubt if a new flag is needed to
> identify the provided fds are group or device fds. I guess no since
> the pci hot reset code does not really care about it. It cares more
> the fd is held by the application.

I wouldn't change it, even though it does work like this.

spapr requires the group fd because it doesn't work with
iommufd. No sense in confusing things.

Jason

^ permalink raw reply	[flat|nested] 135+ messages in thread

* RE: [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI
  2023-02-17 15:59             ` [Intel-gfx] " Jason Gunthorpe
@ 2023-02-18  2:54               ` Liu, Yi L
  -1 siblings, 0 replies; 135+ messages in thread
From: Liu, Yi L @ 2023-02-18  2:54 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: alex.williamson, joro, Tian, Kevin, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, intel-gvt-dev, intel-gfx, linux-s390

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, February 17, 2023 11:59 PM
> 
> On Fri, Feb 17, 2023 at 10:55:08AM +0000, Liu, Yi L wrote:
> 
> > One more thinking on this. For a single device, my above reply is true.
> > The device should have been fully-opened when its GET_PCI_HOT_RESET_INFO
> > and HOT_RESET path have been unblocked. However, when there are
> > multiple devices that have been affected by the hotreset. User may only
> > have one device that is fully opened while others are not yet. In such case,
> > existing vfio_file_is_valid() is not enough. Shall we have another API for
> > this purpose? E.g. if it's cdev fd, then the new API return true only when
> > the device is fully opened. Any suggestion here?
> 
> I think what I heard is you need two APIs, one for pci and one for KVM
> and the PCI one requires binding to succeed.

Yes.

One is vfio_file_is_valid(), for KVM.
The other is vfio_file_device_opened(), for PCI.
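
The split between the two checks can be sketched as below. This is a
userspace model only: the struct definitions are stand-ins for the kernel
types, the bound flag is a hypothetical marker for "bind has succeeded",
the sketch_ prefixed functions are illustrative rather than the actual
implementations, and the group file path is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for kernel types; everything here is illustrative. */
struct file_operations { int unused; };
struct vfio_device { int id; };
struct vfio_device_file {
	struct vfio_device *device;
	bool bound;	/* hypothetical: set once bind to iommufd succeeds */
};
struct file {
	const struct file_operations *f_op;
	void *private_data;
};

static const struct file_operations vfio_device_fops = { 0 };

/* Identify a device file by its f_op, as in the quoted patch. */
static struct vfio_device_file *device_file_from_file(struct file *file)
{
	assert(file != NULL);

	if (file->f_op != &vfio_device_fops)
		return NULL;
	return file->private_data;
}

/* KVM-style check: is this a VFIO device file at all? */
static bool sketch_file_is_valid(struct file *file)
{
	return device_file_from_file(file) != NULL;
}

/* PCI-style check: a VFIO device file whose bind has completed. */
static bool sketch_file_device_opened(struct file *file)
{
	struct vfio_device_file *df = device_file_from_file(file);

	return df && df->bound;
}
```

The first check can pass before bind, which is what the SET_KVM ordering
needs; the second only passes afterwards, which is what the hot reset
path needs.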

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 135+ messages in thread

end of thread, other threads:[~2023-02-18  2:55 UTC | newest]

Thread overview: 135+ messages (download: mbox.gz / follow: Atom feed)
2023-02-13 15:13 [PATCH v3 00/15] Add vfio_device cdev for iommufd support Yi Liu
2023-02-13 15:13 ` [Intel-gfx] " Yi Liu
2023-02-13 15:13 ` [PATCH v3 01/15] vfio: Allocate per device file structure Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-13 15:13 ` [PATCH v3 02/15] vfio: Refine vfio file kAPIs Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-13 15:13 ` [PATCH v3 03/15] vfio: Accept vfio device file in the driver facing kAPI Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-13 23:21   ` Alex Williamson
2023-02-13 23:21     ` [Intel-gfx] " Alex Williamson
2023-02-14  2:19     ` Liu, Yi L
2023-02-14  2:19       ` [Intel-gfx] " Liu, Yi L
2023-02-13 23:43   ` Jason Gunthorpe
2023-02-13 23:43     ` [Intel-gfx] " Jason Gunthorpe
2023-02-14  2:02     ` Liu, Yi L
2023-02-14  2:02       ` [Intel-gfx] " Liu, Yi L
2023-02-14  7:19       ` Liu, Yi L
2023-02-14  7:19         ` [Intel-gfx] " Liu, Yi L
2023-02-17 10:55         ` Liu, Yi L
2023-02-17 10:55           ` [Intel-gfx] " Liu, Yi L
2023-02-17 15:59           ` Jason Gunthorpe
2023-02-17 15:59             ` [Intel-gfx] " Jason Gunthorpe
2023-02-18  2:54             ` Liu, Yi L
2023-02-18  2:54               ` [Intel-gfx] " Liu, Yi L
2023-02-15 12:38       ` Jason Gunthorpe
2023-02-15 12:38         ` Jason Gunthorpe
2023-02-15 14:43         ` Liu, Yi L
2023-02-15 14:43           ` [Intel-gfx] " Liu, Yi L
2023-02-15 14:46           ` Jason Gunthorpe
2023-02-15 14:46             ` [Intel-gfx] " Jason Gunthorpe
2023-02-15 15:32             ` Alex Williamson
2023-02-15 15:32               ` [Intel-gfx] " Alex Williamson
2023-02-15 17:04               ` Jason Gunthorpe
2023-02-15 17:04                 ` [Intel-gfx] " Jason Gunthorpe
2023-02-15 17:19                 ` Alex Williamson
2023-02-15 17:19                   ` Alex Williamson
2023-02-15 17:33                   ` Jason Gunthorpe
2023-02-15 17:33                     ` [Intel-gfx] " Jason Gunthorpe
2023-02-13 15:13 ` [PATCH v3 04/15] kvm/vfio: Rename kvm_vfio_group to prepare for accepting vfio device fd Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-13 15:13 ` [PATCH v3 05/15] kvm/vfio: Accept vfio device file from userspace Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-14 22:26   ` Alex Williamson
2023-02-14 22:26     ` [Intel-gfx] " Alex Williamson
2023-02-14 23:25     ` Jason Gunthorpe
2023-02-14 23:25       ` [Intel-gfx] " Jason Gunthorpe
2023-02-14 23:42       ` Alex Williamson
2023-02-14 23:42         ` Alex Williamson
2023-02-15  0:17         ` Jason Gunthorpe
2023-02-15  0:17           ` [Intel-gfx] " Jason Gunthorpe
2023-02-15  0:27           ` Timothy Pearson
2023-02-15  0:27             ` [Intel-gfx] " Timothy Pearson
2023-02-17  5:34           ` Liu, Yi L
2023-02-17  5:34             ` [Intel-gfx] " Liu, Yi L
2023-02-17  5:48             ` Liu, Yi L
2023-02-17  5:48               ` [Intel-gfx] " Liu, Yi L
2023-02-17 16:00               ` Jason Gunthorpe
2023-02-17 16:00                 ` Jason Gunthorpe
2023-02-15  7:37       ` Liu, Yi L
2023-02-15  7:37         ` [Intel-gfx] " Liu, Yi L
2023-02-13 15:13 ` [PATCH v3 06/15] vfio: Pass struct vfio_device_file * to vfio_device_open/close() Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-13 15:13 ` [PATCH v3 07/15] vfio: Block device access via device fd until device is opened Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-14 22:46   ` Alex Williamson
2023-02-14 22:46     ` Alex Williamson
2023-02-15  6:12     ` Liu, Yi L
2023-02-15  6:12       ` [Intel-gfx] " Liu, Yi L
2023-02-13 15:13 ` [PATCH v3 08/15] vfio: Add infrastructure for bind_iommufd from userspace Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-13 15:13 ` [PATCH v3 09/15] vfio-iommufd: Add detach_ioas support for physical VFIO devices Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-14  8:05   ` Tian, Kevin
2023-02-14  8:05     ` [Intel-gfx] " Tian, Kevin
2023-02-13 15:13 ` [PATCH v3 10/15] vfio-iommufd: Add detach_ioas for emulated " Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-14  8:06   ` Tian, Kevin
2023-02-14  8:06     ` [Intel-gfx] " Tian, Kevin
2023-02-13 15:13 ` [PATCH v3 11/15] vfio: Add cdev_device_open_cnt to vfio_group Yi Liu
2023-02-13 15:13   ` [Intel-gfx] " Yi Liu
2023-02-14  8:18   ` Tian, Kevin
2023-02-14  8:18     ` [Intel-gfx] " Tian, Kevin
2023-02-13 15:13 ` [Intel-gfx] [PATCH v3 12/15] vfio: Make vfio_device_open() single open for device cdev path Yi Liu
2023-02-13 15:13   ` Yi Liu
2023-02-14  8:25   ` Tian, Kevin
2023-02-14  8:25     ` [Intel-gfx] " Tian, Kevin
2023-02-13 15:13 ` [Intel-gfx] [PATCH v3 13/15] vfio: Add cdev for vfio_device Yi Liu
2023-02-13 15:13   ` Yi Liu
2023-02-14  8:32   ` Tian, Kevin
2023-02-14  8:32     ` [Intel-gfx] " Tian, Kevin
2023-02-14  8:35     ` Liu, Yi L
2023-02-14  8:35       ` [Intel-gfx] " Liu, Yi L
2023-02-13 15:13 ` [Intel-gfx] [PATCH v3 14/15] vfio: Add ioctls for device cdev using iommufd Yi Liu
2023-02-13 15:13   ` Yi Liu
2023-02-14  8:53   ` Tian, Kevin
2023-02-14  8:53     ` [Intel-gfx] " Tian, Kevin
2023-02-14 23:39   ` Yan Zhao
2023-02-14 23:39     ` [Intel-gfx] " Yan Zhao
2023-02-15  2:04     ` Tian, Kevin
2023-02-15  2:04       ` [Intel-gfx] " Tian, Kevin
2023-02-15  7:37       ` Liu, Yi L
2023-02-15  7:37         ` [Intel-gfx] " Liu, Yi L
2023-02-16  8:24   ` Yan Zhao
2023-02-16  8:24     ` [Intel-gfx] " Yan Zhao
2023-02-16  9:10     ` Liu, Yi L
2023-02-16  9:10       ` [Intel-gfx] " Liu, Yi L
2023-02-16  9:23       ` Yan Zhao
2023-02-16  9:23         ` [Intel-gfx] " Yan Zhao
2023-02-16 10:28         ` Liu, Yi L
2023-02-16 10:28           ` [Intel-gfx] " Liu, Yi L
2023-02-16 14:24           ` Jason Gunthorpe
2023-02-16 14:24             ` [Intel-gfx] " Jason Gunthorpe
2023-02-13 15:13 ` [Intel-gfx] [PATCH v3 15/15] vfio: Compile group optionally Yi Liu
2023-02-13 15:13   ` Yi Liu
2023-02-13 15:30 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Add vfio_device cdev for iommufd support (rev2) Patchwork
2023-02-13 19:47 ` [PATCH v3 00/15] Add vfio_device cdev for iommufd support Alex Williamson
2023-02-13 19:47   ` [Intel-gfx] " Alex Williamson
2023-02-13 23:21   ` Jason Gunthorpe
2023-02-13 23:21     ` [Intel-gfx] " Jason Gunthorpe
2023-02-14 15:15     ` Liu, Yi L
2023-02-14 15:15       ` [Intel-gfx] " Liu, Yi L
2023-02-14 15:54       ` Alex Williamson
2023-02-14 15:54         ` [Intel-gfx] " Alex Williamson
2023-02-14 16:48         ` Jason Gunthorpe
2023-02-14 16:48           ` [Intel-gfx] " Jason Gunthorpe
2023-02-14  1:55   ` Liu, Yi L
2023-02-14  1:55     ` [Intel-gfx] " Liu, Yi L
2023-02-14 15:47     ` Alex Williamson
2023-02-14 15:47       ` Alex Williamson
2023-02-15  7:54       ` Liu, Yi L
2023-02-15  7:54         ` [Intel-gfx] " Liu, Yi L
2023-02-15 20:09         ` Alex Williamson
2023-02-15 20:09           ` [Intel-gfx] " Alex Williamson
2023-02-16  2:53           ` Liu, Yi L
2023-02-16  2:53             ` [Intel-gfx] " Liu, Yi L
