linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] iommufd: Add iommu hardware info reporting
@ 2023-03-09  7:53 Yi Liu
  2023-03-09  7:53 ` [PATCH v2 1/4] iommu: Move dev_iommu_ops() to private header Yi Liu
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Yi Liu @ 2023-03-09  7:53 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel,
	linux-kselftest

iommufd gives userspace the capability to manipulate iommu subsytem.
e.g. DMA map/unmap etc. In the near future, it will support iommu nested
translation. Different platform vendors have different implementation for
the nested translation. So before set up nested translation, userspace
needs to know the hardware iommu information. For example, Intel VT-d
supports using guest I/O page table as the stage-1 translation table. This
requires guest I/O page table be compatible with hardware IOMMU.

This series reports the iommu hardware information for a given iommufd_device
which has been bound to iommufd. It is preparation work for userspace to
allocate hwpt for given device. Like the nested translation support[1].

This series introduces an iommu op to report the iommu hardware info,
and an ioctl IOMMU_DEVICE_GET_HW_INFO is added to report such hardware
info to user. enum iommu_hw_info_type is defined to differentiate the
iommu hardware info reported to user hence user can decode them. This
series only adds the framework for iommu hw info reporting, the complete
reporting path needs vendor specific definition and driver support. The
full picture is available in [1] as well.

base-commit: 4c7e97cb6e65eab02991f60a5cc7a4fed5498c7a

[1] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting

Change log:

v2:
 - Drop patch 05 of v1 as it is already covered by other series
 - Rename the capability info to be iommu hardware info

v1: https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.com/

Regards,
	Yi Liu

Lu Baolu (1):
  iommu: Add new iommu op to get iommu hardware information

Nicolin Chen (1):
  iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl

Yi Liu (2):
  iommu: Move dev_iommu_ops() to private header
  iommufd: Add IOMMU_DEVICE_GET_HW_INFO

 drivers/iommu/iommu-priv.h                    | 11 +++
 drivers/iommu/iommu.c                         |  2 +
 drivers/iommu/iommufd/device.c                | 75 +++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h       |  1 +
 drivers/iommu/iommufd/iommufd_test.h          | 15 ++++
 drivers/iommu/iommufd/main.c                  |  3 +
 drivers/iommu/iommufd/selftest.c              | 16 ++++
 include/linux/iommu.h                         | 24 +++---
 include/uapi/linux/iommufd.h                  | 47 ++++++++++++
 tools/testing/selftests/iommu/iommufd.c       | 17 ++++-
 tools/testing/selftests/iommu/iommufd_utils.h | 26 +++++++
 11 files changed, 225 insertions(+), 12 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/4] iommu: Move dev_iommu_ops() to private header
  2023-03-09  7:53 [PATCH v2 0/4] iommufd: Add iommu hardware info reporting Yi Liu
@ 2023-03-09  7:53 ` Yi Liu
  2023-03-09 12:58   ` Baolu Lu
  2023-03-09  7:53 ` [PATCH v2 2/4] iommu: Add new iommu op to get iommu hardware information Yi Liu
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Yi Liu @ 2023-03-09  7:53 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel,
	linux-kselftest

dev_iommu_ops() is essentially only used in iommu subsystem, so
move to a private header to avoid being abused by other drivers.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommu-priv.h | 11 +++++++++++
 drivers/iommu/iommu.c      |  2 ++
 include/linux/iommu.h      | 11 -----------
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h
index 7c8011bfd153..a6e694f59f64 100644
--- a/drivers/iommu/iommu-priv.h
+++ b/drivers/iommu/iommu-priv.h
@@ -4,6 +4,17 @@
 
 #include <linux/iommu.h>
 
+static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
+{
+	/*
+	 * Assume that valid ops must be installed if iommu_probe_device()
+	 * has succeeded. The device ops are essentially for internal use
+	 * within the IOMMU subsystem itself, so we should be able to trust
+	 * ourselves not to misuse the helper.
+	 */
+	return dev->iommu->iommu_dev->ops;
+}
+
 int iommu_group_replace_domain(struct iommu_group *group,
 			       struct iommu_domain *new_domain);
 
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index fd8fe2cd7303..437476c36de5 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -37,6 +37,8 @@
 
 #include "iommu-sva.h"
 
+#include "iommu-priv.h"
+
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 6595454d4f48..7202d8ece343 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -444,17 +444,6 @@ static inline void iommu_iotlb_gather_init(struct iommu_iotlb_gather *gather)
 	};
 }
 
-static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
-{
-	/*
-	 * Assume that valid ops must be installed if iommu_probe_device()
-	 * has succeeded. The device ops are essentially for internal use
-	 * within the IOMMU subsystem itself, so we should be able to trust
-	 * ourselves not to misuse the helper.
-	 */
-	return dev->iommu->iommu_dev->ops;
-}
-
 extern int bus_iommu_probe(struct bus_type *bus);
 extern bool iommu_present(struct bus_type *bus);
 extern bool device_iommu_capable(struct device *dev, enum iommu_cap cap);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/4] iommu: Add new iommu op to get iommu hardware information
  2023-03-09  7:53 [PATCH v2 0/4] iommufd: Add iommu hardware info reporting Yi Liu
  2023-03-09  7:53 ` [PATCH v2 1/4] iommu: Move dev_iommu_ops() to private header Yi Liu
@ 2023-03-09  7:53 ` Yi Liu
  2023-03-16  8:16   ` Tian, Kevin
  2023-03-09  7:53 ` [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Yi Liu @ 2023-03-09  7:53 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel,
	linux-kselftest

From: Lu Baolu <baolu.lu@linux.intel.com>

Introduce a new iommu op to get the IOMMU hardware capabilities for
iommufd. This information will be used by any vIOMMU driver which is
owned by userspace.

This op chooses to make the special parameters opaque to the core. This
suits the current usage model where accessing any of the IOMMU device
special parameters does require a userspace driver that matches the kernel
driver. If a need for common parameters, implemented similarly by several
drivers, arises then there's room in the design to grow a generic parameter
set as well. No wrapper API is added as it is supposed to be used by
iommufd only.

Different IOMMU hardware would have different hardware information. So the
information reported differs as well. To let the external user understand
the difference. enum iommu_hw_info_type is defined. For the iommu drivers
that are capable to report hardware information, it should have a unique
iommu_hw_info_type. The iommu_hw_info_type is stored in struct iommu_ops.
For the driver oesn't report hardware information, just use
IOMMU_HW_INFO_TYPE_DEFAULT.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h        | 13 +++++++++++++
 include/uapi/linux/iommufd.h |  7 +++++++
 2 files changed, 20 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7202d8ece343..3ef84ee359d2 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -15,6 +15,7 @@
 #include <linux/of.h>
 #include <linux/ioasid.h>
 #include <uapi/linux/iommu.h>
+#include <uapi/linux/iommufd.h>
 
 #define IOMMU_READ	(1 << 0)
 #define IOMMU_WRITE	(1 << 1)
@@ -222,6 +223,11 @@ struct iommu_iotlb_gather {
 /**
  * struct iommu_ops - iommu ops and capabilities
  * @capable: check capability
+ * @hw_info: IOMMU hardware information. The type of the returned data is
+ *           defined in include/uapi/linux/iommufd.h. The data buffer is
+ *           allocated in the IOMMU driver and the caller should free it
+ *           after use. Return the data buffer if success, or ERR_PTR on
+ *           failure.
  * @domain_alloc: allocate iommu domain
  * @probe_device: Add device to iommu driver handling
  * @release_device: Remove device from iommu driver handling
@@ -246,11 +252,17 @@ struct iommu_iotlb_gather {
  * @remove_dev_pasid: Remove any translation configurations of a specific
  *                    pasid, so that any DMA transactions with this pasid
  *                    will be blocked by the hardware.
+ * @driver_type: One of enum iommu_hw_info_type. This is used in the hw_info
+ *               reporting path. For the drivers that supports it, a unique
+ *               type should be defined. For the driver that does not support
+ *               it, this field is the IOMMU_HW_INFO_TYPE_DEFAULT that is 0.
+ *               Hence, such drivers do not need to care this field.
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  * @owner: Driver module providing these ops
  */
 struct iommu_ops {
 	bool (*capable)(struct device *dev, enum iommu_cap);
+	void *(*hw_info)(struct device *dev, u32 *length);
 
 	/* Domain allocation and freeing by the iommu driver */
 	struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type);
@@ -279,6 +291,7 @@ struct iommu_ops {
 	void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid);
 
 	const struct iommu_domain_ops *default_domain_ops;
+	enum iommu_hw_info_type driver_type;
 	unsigned long pgsize_bitmap;
 	struct module *owner;
 };
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index ccd36acad36a..955cbef640da 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -370,4 +370,11 @@ struct iommu_hwpt_alloc {
 	__u32 __reserved;
 };
 #define IOMMU_HWPT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_ALLOC)
+
+/**
+ * enum iommu_hw_info_type - IOMMU Hardware Info Types
+ */
+enum iommu_hw_info_type {
+	IOMMU_HW_INFO_TYPE_DEFAULT,
+};
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-03-09  7:53 [PATCH v2 0/4] iommufd: Add iommu hardware info reporting Yi Liu
  2023-03-09  7:53 ` [PATCH v2 1/4] iommu: Move dev_iommu_ops() to private header Yi Liu
  2023-03-09  7:53 ` [PATCH v2 2/4] iommu: Add new iommu op to get iommu hardware information Yi Liu
@ 2023-03-09  7:53 ` Yi Liu
  2023-03-09 13:50   ` Baolu Lu
  2023-03-16  8:23   ` Tian, Kevin
  2023-03-09  7:53 ` [PATCH v2 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl Yi Liu
  2023-03-20 12:43 ` [PATCH v2 0/4] iommufd: Add iommu hardware info reporting Jason Gunthorpe
  4 siblings, 2 replies; 16+ messages in thread
From: Yi Liu @ 2023-03-09  7:53 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel,
	linux-kselftest

Under nested IOMMU translation, userspace owns the stage-1 translation
table (e.g. the stage-1 page table of Intel VT-d or the context table
of ARM SMMUv3, and etc.). Stage-1 translation tables are vendor specific,
and needs to be compatiable with the underlying IOMMU hardware. Hence,
userspace should know the IOMMU hardware capability before creating and
configuring the stage-1 translation table to kernel.

This adds IOMMU_DEVICE_GET_HW_INFO to query the IOMMU hardware information
for a given device. The returned data is vendor specific, userspace needs
to decode it with the structure mapped by the @out_data_type field.

As only physical devices have IOMMU hardware, so this will return error
if the given device is not a physical device.

Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/device.c          | 74 +++++++++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h |  1 +
 drivers/iommu/iommufd/main.c            |  3 +
 include/uapi/linux/iommufd.h            | 40 +++++++++++++
 4 files changed, 118 insertions(+)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index c10e02f6a0be..6948539488a5 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -257,6 +257,80 @@ struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev)
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_to_ictx, IOMMUFD);
 
+static int iommufd_zero_fill_user(u64 ptr, int bytes)
+{
+	int index = 0;
+
+	for (; index < bytes; index++) {
+		if (put_user(0, (uint8_t __user *)(ptr + index)))
+			return -EFAULT;
+	}
+	return 0;
+}
+
+int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
+{
+	struct iommu_hw_info *cmd = ucmd->cmd;
+	struct iommufd_device *idev;
+	const struct iommu_ops *ops;
+	void *data;
+	unsigned int length, data_len;
+	int rc;
+
+	if (cmd->flags || cmd->__reserved || !cmd->data_len)
+		return -EOPNOTSUPP;
+
+	idev = iommufd_get_device(ucmd, cmd->dev_id);
+	if (IS_ERR(idev))
+		return PTR_ERR(idev);
+
+	ops = dev_iommu_ops(idev->dev);
+	if (!ops || !ops->hw_info) {
+		rc = -EOPNOTSUPP;
+		goto out_put;
+	}
+
+	/* driver has hw_info callback should have a unique driver_type */
+	if (WARN_ON(ops->driver_type == IOMMU_HW_INFO_TYPE_DEFAULT)) {
+		rc = -EOPNOTSUPP;
+		goto out_put;
+	}
+
+	data = ops->hw_info(idev->dev, &data_len);
+	if (IS_ERR(data)) {
+		rc = PTR_ERR(data);
+		goto out_put;
+	}
+
+	length = min(cmd->data_len, data_len);
+	if (copy_to_user(u64_to_user_ptr(cmd->data_ptr), data, length)) {
+		rc = -EFAULT;
+		goto out_free_data;
+	}
+
+	/*
+	 * Zero the trailing bytes if the user buffer is bigger than the
+	 * data size kernel actually has.
+	 */
+	if (length < cmd->data_len) {
+		rc = iommufd_zero_fill_user(cmd->data_ptr + length,
+					    cmd->data_len - length);
+		if (rc)
+			goto out_free_data;
+	}
+
+	cmd->out_data_type = ops->driver_type;
+	cmd->data_len = length;
+
+	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+
+out_free_data:
+	kfree(data);
+out_put:
+	iommufd_put_object(&idev->obj);
+	return rc;
+}
+
 static int iommufd_group_setup_msi(struct iommufd_group *igroup,
 				   struct iommufd_hw_pagetable *hwpt)
 {
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index b18f843ad6a4..05b5ad66f716 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -307,6 +307,7 @@ iommufd_get_device(struct iommufd_ucmd *ucmd, u32 id)
 }
 
 void iommufd_device_destroy(struct iommufd_object *obj);
+int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd);
 
 struct iommufd_access {
 	struct iommufd_object obj;
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 694da191e4b1..f079c0bda46b 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -262,6 +262,7 @@ static int iommufd_option(struct iommufd_ucmd *ucmd)
 union ucmd_buffer {
 	struct iommu_destroy destroy;
 	struct iommu_hwpt_alloc hwpt;
+	struct iommu_hw_info info;
 	struct iommu_ioas_alloc alloc;
 	struct iommu_ioas_allow_iovas allow_iovas;
 	struct iommu_ioas_copy ioas_copy;
@@ -295,6 +296,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
 	IOCTL_OP(IOMMU_DESTROY, iommufd_destroy, struct iommu_destroy, id),
 	IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct iommu_hwpt_alloc,
 		 __reserved),
+	IOCTL_OP(IOMMU_DEVICE_GET_HW_INFO, iommufd_device_get_hw_info,
+		 struct iommu_hw_info, __reserved),
 	IOCTL_OP(IOMMU_IOAS_ALLOC, iommufd_ioas_alloc_ioctl,
 		 struct iommu_ioas_alloc, out_ioas_id),
 	IOCTL_OP(IOMMU_IOAS_ALLOW_IOVAS, iommufd_ioas_allow_iovas,
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 955cbef640da..4ac525897b82 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -46,6 +46,7 @@ enum {
 	IOMMUFD_CMD_OPTION,
 	IOMMUFD_CMD_VFIO_IOAS,
 	IOMMUFD_CMD_HWPT_ALLOC,
+	IOMMUFD_CMD_DEVICE_GET_HW_INFO,
 };
 
 /**
@@ -377,4 +378,43 @@ struct iommu_hwpt_alloc {
 enum iommu_hw_info_type {
 	IOMMU_HW_INFO_TYPE_DEFAULT,
 };
+
+/**
+ * struct iommu_hw_info - ioctl(IOMMU_DEVICE_GET_HW_INFO)
+ * @size: sizeof(struct iommu_hw_info)
+ * @flags: Must be 0
+ * @dev_id: The device being attached to the iommufd
+ * @data_len: Input the length of the user buffer in bytes. Output the
+ *            length of data filled to the user buffer.
+ * @data_ptr: Pointer to the type specific structure
+ * @out_data_type: Output the iommu hardware info type, it is one of
+ *                 enum iommu_hw_info_type.
+ * @__reserved: Must be 0
+ *
+ * Query the hardware iommu information for given device which has been
+ * bound to iommufd. @data_len is the size of the buffer which captures
+ * iommu type specific data and the data will be filled. Trailing bytes
+ * are zeroed if the user buffer is larger than the data kernel has.
+ *
+ * The type specific data would be used to sync capability between the
+ * vIOMMU and the hardware IOMMU. e.g. nested translation requires to
+ * check the hardware IOMMU capability, since a stage-1 translation table
+ * is owned by user but used by hardware IOMMU.
+ *
+ * The @out_data_type will be filled if the ioctl succeeds. It would
+ * be used to decode the data filled in the buffer pointed by @data_ptr.
+ *
+ * This is only available for the physical devices bound to iommufd as
+ * only physical devices can have hardware IOMMU.
+ */
+struct iommu_hw_info {
+	__u32 size;
+	__u32 flags;
+	__u32 dev_id;
+	__u32 data_len;
+	__aligned_u64 data_ptr;
+	__u32 out_data_type;
+	__u32 __reserved;
+};
+#define IOMMU_DEVICE_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_GET_HW_INFO)
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl
  2023-03-09  7:53 [PATCH v2 0/4] iommufd: Add iommu hardware info reporting Yi Liu
                   ` (2 preceding siblings ...)
  2023-03-09  7:53 ` [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
@ 2023-03-09  7:53 ` Yi Liu
  2023-03-20 12:43   ` Jason Gunthorpe
  2023-03-20 12:43 ` [PATCH v2 0/4] iommufd: Add iommu hardware info reporting Jason Gunthorpe
  4 siblings, 1 reply; 16+ messages in thread
From: Yi Liu @ 2023-03-09  7:53 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel,
	linux-kselftest

From: Nicolin Chen <nicolinc@nvidia.com>

Add a mock_domain_hw_info function and an iommu_hw_info_selftest data
structure. This allows to test the IOMMU_DEVICE_GET_HW_INFO ioctl by
passing the test_reg value for the mock_dev.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/device.c                |  1 +
 drivers/iommu/iommufd/iommufd_test.h          | 15 +++++++++++
 drivers/iommu/iommufd/selftest.c              | 16 ++++++++++++
 tools/testing/selftests/iommu/iommufd.c       | 17 +++++++++++-
 tools/testing/selftests/iommu/iommufd_utils.h | 26 +++++++++++++++++++
 5 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 6948539488a5..5c352807d946 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -8,6 +8,7 @@
 
 #include "io_pagetable.h"
 #include "iommufd_private.h"
+#include "iommufd_test.h"
 
 static bool allow_unsafe_interrupts;
 module_param(allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index da1898a9128f..578691602d94 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -100,4 +100,19 @@ struct iommu_test_cmd {
 };
 #define IOMMU_TEST_CMD _IO(IOMMUFD_TYPE, IOMMUFD_CMD_BASE + 32)
 
+/* Mock structs for IOMMU_DEVICE_GET_HW_INFO ioctl */
+#define IOMMU_HW_INFO_TYPE_SELFTEST	0xfeedbeef
+#define IOMMU_HW_INFO_SELFTEST_REGVAL	0xdeadbeef
+
+/**
+ * struct iommu_hw_info_selftest
+ *
+ * @flags: Must be set to 0
+ * @test_reg: Pass IOMMU_HW_INFO_SELFTEST_REGVAL to user selftest program
+ */
+struct iommu_hw_info_selftest {
+	__u32 flags;
+	__u32 test_reg;
+};
+
 #endif
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index d7832ffc3aa6..b50ec3528ec1 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -128,6 +128,20 @@ static struct iommu_domain mock_blocking_domain = {
 	.ops = &mock_blocking_ops,
 };
 
+static void *mock_domain_hw_info(struct device *dev, u32 *length)
+{
+	struct iommu_hw_info_selftest *info;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	info->test_reg = IOMMU_HW_INFO_SELFTEST_REGVAL;
+	*length = sizeof(*info);
+
+	return info;
+}
+
 static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type)
 {
 	struct mock_iommu_domain *mock;
@@ -279,6 +293,8 @@ static void mock_domain_set_plaform_dma_ops(struct device *dev)
 static const struct iommu_ops mock_ops = {
 	.owner = THIS_MODULE,
 	.pgsize_bitmap = MOCK_IO_PAGE_SIZE,
+	.driver_type = IOMMU_HW_INFO_TYPE_SELFTEST,
+	.hw_info = mock_domain_hw_info,
 	.domain_alloc = mock_domain_alloc,
 	.capable = mock_domain_capable,
 	.set_platform_dma_ops = mock_domain_set_plaform_dma_ops,
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 5c33b6b52c65..d2ce2ddbdc40 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -124,6 +124,7 @@ TEST_F(iommufd, cmd_length)
 	TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP);
 	TEST_LENGTH(iommu_option, IOMMU_OPTION);
 	TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS);
+	TEST_LENGTH(iommu_hw_info, IOMMU_DEVICE_GET_HW_INFO);
 #undef TEST_LENGTH
 }
 
@@ -188,6 +189,7 @@ FIXTURE(iommufd_ioas)
 	uint32_t ioas_id;
 	uint32_t stdev_id;
 	uint32_t hwpt_id;
+	uint32_t device_id;
 	uint64_t base_iova;
 };
 
@@ -214,7 +216,7 @@ FIXTURE_SETUP(iommufd_ioas)
 
 	for (i = 0; i != variant->mock_domains; i++) {
 		test_cmd_mock_domain(self->ioas_id, &self->stdev_id,
-				     &self->hwpt_id, NULL);
+				     &self->hwpt_id, &self->device_id);
 		self->base_iova = MOCK_APERTURE_START;
 	}
 }
@@ -293,6 +295,19 @@ TEST_F(iommufd_ioas, ioas_area_auto_destroy)
 	}
 }
 
+TEST_F(iommufd_ioas, device_get_hw_info)
+{
+	struct iommu_hw_info_selftest info;
+
+	if (self->device_id) {
+		test_cmd_device_get_hw_info(self->device_id, sizeof(info), &info);
+		assert(info.test_reg == IOMMU_HW_INFO_SELFTEST_REGVAL);
+	} else {
+		test_err_device_get_hw_info(ENOENT, self->device_id,
+					    sizeof(info), &info);
+	}
+}
+
 TEST_F(iommufd_ioas, area)
 {
 	int i;
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 399779acce23..b57e1e60f69d 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -345,3 +345,29 @@ static void teardown_iommufd(int fd, struct __test_metadata *_metadata)
 	})
 
 #endif
+
+static int _test_cmd_device_get_hw_info(int fd, __u32 device_id,
+					__u32 data_len, void *data)
+{
+	struct iommu_hw_info cmd = {
+		.size = sizeof(cmd),
+		.dev_id = device_id,
+		.data_len = data_len,
+		.data_ptr = (uint64_t)data,
+	};
+	int ret;
+
+	ret = ioctl(fd, IOMMU_DEVICE_GET_HW_INFO, &cmd);
+	if (ret)
+		return ret;
+	return 0;
+}
+
+#define test_cmd_device_get_hw_info(device_id, data_len, data)         \
+	ASSERT_EQ(0, _test_cmd_device_get_hw_info(self->fd, device_id, \
+						  data_len, data))
+
+#define test_err_device_get_hw_info(_errno, device_id, data_len, data) \
+	EXPECT_ERRNO(_errno,                                        \
+		     _test_cmd_device_get_hw_info(self->fd, device_id, \
+						  data_len, data))
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/4] iommu: Move dev_iommu_ops() to private header
  2023-03-09  7:53 ` [PATCH v2 1/4] iommu: Move dev_iommu_ops() to private header Yi Liu
@ 2023-03-09 12:58   ` Baolu Lu
  0 siblings, 0 replies; 16+ messages in thread
From: Baolu Lu @ 2023-03-09 12:58 UTC (permalink / raw)
  To: Yi Liu, joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: baolu.lu, cohuck, eric.auger, nicolinc, kvm, mjrosato,
	chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit, iommu,
	linux-kernel, linux-kselftest

On 2023/3/9 15:53, Yi Liu wrote:
> dev_iommu_ops() is essentially only used in iommu subsystem, so
> move to a private header to avoid being abused by other drivers.
> 
> Suggested-by: Jason Gunthorpe<jgg@nvidia.com>
> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
> Signed-off-by: Yi Liu<yi.l.liu@intel.com>

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>

Best regards,
baolu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-03-09  7:53 ` [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
@ 2023-03-09 13:50   ` Baolu Lu
  2023-03-09 17:20     ` Jason Gunthorpe
  2023-03-16  8:23   ` Tian, Kevin
  1 sibling, 1 reply; 16+ messages in thread
From: Baolu Lu @ 2023-03-09 13:50 UTC (permalink / raw)
  To: Yi Liu, joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: baolu.lu, cohuck, eric.auger, nicolinc, kvm, mjrosato,
	chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit, iommu,
	linux-kernel, linux-kselftest

On 2023/3/9 15:53, Yi Liu wrote:
> Under nested IOMMU translation, userspace owns the stage-1 translation
> table (e.g. the stage-1 page table of Intel VT-d or the context table
> of ARM SMMUv3, and etc.). Stage-1 translation tables are vendor specific,
> and needs to be compatiable with the underlying IOMMU hardware. Hence,
> userspace should know the IOMMU hardware capability before creating and
> configuring the stage-1 translation table to kernel.
> 
> This adds IOMMU_DEVICE_GET_HW_INFO to query the IOMMU hardware information
> for a given device. The returned data is vendor specific, userspace needs
> to decode it with the structure mapped by the @out_data_type field.
> 
> As only physical devices have IOMMU hardware, so this will return error
> if the given device is not a physical device.
> 
> Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>   drivers/iommu/iommufd/device.c          | 74 +++++++++++++++++++++++++
>   drivers/iommu/iommufd/iommufd_private.h |  1 +
>   drivers/iommu/iommufd/main.c            |  3 +
>   include/uapi/linux/iommufd.h            | 40 +++++++++++++
>   4 files changed, 118 insertions(+)
> 
> diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
> index c10e02f6a0be..6948539488a5 100644
> --- a/drivers/iommu/iommufd/device.c
> +++ b/drivers/iommu/iommufd/device.c
> @@ -257,6 +257,80 @@ struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev)
>   }
>   EXPORT_SYMBOL_NS_GPL(iommufd_device_to_ictx, IOMMUFD);
>   
> +static int iommufd_zero_fill_user(u64 ptr, int bytes)
> +{
> +	int index = 0;
> +
> +	for (; index < bytes; index++) {
> +		if (put_user(0, (uint8_t __user *)(ptr + index)))
> +			return -EFAULT;
> +	}
> +	return 0;
> +}
> +
> +int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_hw_info *cmd = ucmd->cmd;
> +	struct iommufd_device *idev;
> +	const struct iommu_ops *ops;
> +	void *data;
> +	unsigned int length, data_len;
> +	int rc;

Reverse christmas tree order for declarations.

> +
> +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> +		return -EOPNOTSUPP;
> +
> +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> +	if (IS_ERR(idev))
> +		return PTR_ERR(idev);
> +
> +	ops = dev_iommu_ops(idev->dev);
> +	if (!ops || !ops->hw_info) {

dev_iommu_ops() will never return a NULL.

Need below check

	dev->iommu && dev->iommu->iommu_dev

before dev_iommu_ops(). Perhaps something like below?

	if (!dev->iommu || !dev->iommu->iommu_dev)
		return -EINVAL;

	ops = dev_iommu_ops(idev->dev);
	if (!ops->hw_info)
		return -ENODEV;

> +		rc = -EOPNOTSUPP;
> +		goto out_put;
> +	}
> +
> +	/* driver has hw_info callback should have a unique driver_type */
> +	if (WARN_ON(ops->driver_type == IOMMU_HW_INFO_TYPE_DEFAULT)) {

If so, perhaps IOMMU_HW_INFO_TYPE_INVALID is more readable?

I'm not sure if a calltrace is really necessary here. If not, perhaps,

	if (ops->driver_type == IOMMU_HW_INFO_TYPE_DEFAULT) {
		pr_warn_ratelimited("iommu driver set an invalid type\n");
		rc = -ENODEV;
		goto out_put;
	}

> +		rc = -EOPNOTSUPP;
> +		goto out_put;
> +	}
> +
> +	data = ops->hw_info(idev->dev, &data_len);
> +	if (IS_ERR(data)) {
> +		rc = PTR_ERR(data);
> +		goto out_put;
> +	}
> +
> +	length = min(cmd->data_len, data_len);
> +	if (copy_to_user(u64_to_user_ptr(cmd->data_ptr), data, length)) {
> +		rc = -EFAULT;
> +		goto out_free_data;
> +	}
> +
> +	/*
> +	 * Zero the trailing bytes if the user buffer is bigger than the
> +	 * data size kernel actually has.
> +	 */
> +	if (length < cmd->data_len) {
> +		rc = iommufd_zero_fill_user(cmd->data_ptr + length,
> +					    cmd->data_len - length);
> +		if (rc)
> +			goto out_free_data;
> +	}
> +
> +	cmd->out_data_type = ops->driver_type;
> +	cmd->data_len = length;
> +
> +	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
> +
> +out_free_data:
> +	kfree(data);
> +out_put:
> +	iommufd_put_object(&idev->obj);
> +	return rc;
> +}
> +
>   static int iommufd_group_setup_msi(struct iommufd_group *igroup,
>   				   struct iommufd_hw_pagetable *hwpt)
>   {
> diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
> index b18f843ad6a4..05b5ad66f716 100644
> --- a/drivers/iommu/iommufd/iommufd_private.h
> +++ b/drivers/iommu/iommufd/iommufd_private.h
> @@ -307,6 +307,7 @@ iommufd_get_device(struct iommufd_ucmd *ucmd, u32 id)
>   }
>   
>   void iommufd_device_destroy(struct iommufd_object *obj);
> +int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd);
>   
>   struct iommufd_access {
>   	struct iommufd_object obj;
> diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
> index 694da191e4b1..f079c0bda46b 100644
> --- a/drivers/iommu/iommufd/main.c
> +++ b/drivers/iommu/iommufd/main.c
> @@ -262,6 +262,7 @@ static int iommufd_option(struct iommufd_ucmd *ucmd)
>   union ucmd_buffer {
>   	struct iommu_destroy destroy;
>   	struct iommu_hwpt_alloc hwpt;
> +	struct iommu_hw_info info;
>   	struct iommu_ioas_alloc alloc;
>   	struct iommu_ioas_allow_iovas allow_iovas;
>   	struct iommu_ioas_copy ioas_copy;
> @@ -295,6 +296,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
>   	IOCTL_OP(IOMMU_DESTROY, iommufd_destroy, struct iommu_destroy, id),
>   	IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct iommu_hwpt_alloc,
>   		 __reserved),
> +	IOCTL_OP(IOMMU_DEVICE_GET_HW_INFO, iommufd_device_get_hw_info,
> +		 struct iommu_hw_info, __reserved),
>   	IOCTL_OP(IOMMU_IOAS_ALLOC, iommufd_ioas_alloc_ioctl,
>   		 struct iommu_ioas_alloc, out_ioas_id),
>   	IOCTL_OP(IOMMU_IOAS_ALLOW_IOVAS, iommufd_ioas_allow_iovas,
> diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> index 955cbef640da..4ac525897b82 100644
> --- a/include/uapi/linux/iommufd.h
> +++ b/include/uapi/linux/iommufd.h
> @@ -46,6 +46,7 @@ enum {
>   	IOMMUFD_CMD_OPTION,
>   	IOMMUFD_CMD_VFIO_IOAS,
>   	IOMMUFD_CMD_HWPT_ALLOC,
> +	IOMMUFD_CMD_DEVICE_GET_HW_INFO,
>   };
>   
>   /**
> @@ -377,4 +378,43 @@ struct iommu_hwpt_alloc {
>   enum iommu_hw_info_type {
>   	IOMMU_HW_INFO_TYPE_DEFAULT,
>   };
> +
> +/**
> + * struct iommu_hw_info - ioctl(IOMMU_DEVICE_GET_HW_INFO)
> + * @size: sizeof(struct iommu_hw_info)
> + * @flags: Must be 0
> + * @dev_id: The device being attached to the iommufd
> + * @data_len: Input the length of the user buffer in bytes. Output the
> + *            length of data filled to the user buffer.
> + * @data_ptr: Pointer to the type specific structure
> + * @out_data_type: Output the iommu hardware info type, it is one of
> + *                 enum iommu_hw_info_type.
> + * @__reserved: Must be 0
> + *
> + * Query the hardware iommu information for given device which has been
> + * bound to iommufd. @data_len is the size of the buffer which captures
> + * iommu type specific data and the data will be filled. Trailing bytes
> + * are zeroed if the user buffer is larger than the data kernel has.
> + *
> + * The type specific data would be used to sync capability between the
> + * vIOMMU and the hardware IOMMU. e.g. nested translation requires to
> + * check the hardware IOMMU capability, since a stage-1 translation table
> + * is owned by user but used by hardware IOMMU.
> + *
> + * The @out_data_type will be filled if the ioctl succeeds. It would
> + * be used to decode the data filled in the buffer pointed by @data_ptr.
> + *
> + * This is only available for the physical devices bound to iommufd as
> + * only physical devices can have hardware IOMMU.
> + */
> +struct iommu_hw_info {
> +	__u32 size;
> +	__u32 flags;
> +	__u32 dev_id;
> +	__u32 data_len;
> +	__aligned_u64 data_ptr;
> +	__u32 out_data_type;
> +	__u32 __reserved;
> +};
> +#define IOMMU_DEVICE_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_GET_HW_INFO)
>   #endif

Other look good to me.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-03-09 13:50   ` Baolu Lu
@ 2023-03-09 17:20     ` Jason Gunthorpe
  2023-03-10  8:06       ` Liu, Yi L
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2023-03-09 17:20 UTC (permalink / raw)
  To: Baolu Lu
  Cc: Yi Liu, joro, alex.williamson, kevin.tian, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest

On Thu, Mar 09, 2023 at 09:50:18PM +0800, Baolu Lu wrote:

> > +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> > +		return -EOPNOTSUPP;
> > +
> > +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> > +	if (IS_ERR(idev))
> > +		return PTR_ERR(idev);
> > +
> > +	ops = dev_iommu_ops(idev->dev);
> > +	if (!ops || !ops->hw_info) {
> 
> dev_iommu_ops() will never return a NULL.
> 
> Need below check
> 
> 	dev->iommu && dev->iommu->iommu_dev
> 
> before dev_iommu_ops(). Perhaps something like below?
> 
> 	if (!dev->iommu || !dev->iommu->iommu_dev)
> 		return -EINVAL;

At this point the device has become owned through the ownership API,
it absolutely has to have an iommu and an ops. No need to check
anything.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-03-09 17:20     ` Jason Gunthorpe
@ 2023-03-10  8:06       ` Liu, Yi L
  0 siblings, 0 replies; 16+ messages in thread
From: Liu, Yi L @ 2023-03-10  8:06 UTC (permalink / raw)
  To: Jason Gunthorpe, Baolu Lu
  Cc: joro, alex.williamson, Tian, Kevin, robin.murphy, cohuck,
	eric.auger, nicolinc, kvm, mjrosato, chao.p.peng, yi.y.sun,
	peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, March 10, 2023 1:21 AM
> 
> On Thu, Mar 09, 2023 at 09:50:18PM +0800, Baolu Lu wrote:
> 
> > > +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> > > +	if (IS_ERR(idev))
> > > +		return PTR_ERR(idev);
> > > +
> > > +	ops = dev_iommu_ops(idev->dev);
> > > +	if (!ops || !ops->hw_info) {
> >
> > dev_iommu_ops() will never return a NULL.
> >
> > Need below check
> >
> > 	dev->iommu && dev->iommu->iommu_dev
> >
> > before dev_iommu_ops(). Perhaps something like below?
> >
> > 	if (!dev->iommu || !dev->iommu->iommu_dev)
> > 		return -EINVAL;
> 
> At this point the device has become owned through the ownership API,
> it absolutely has to have an iommu and an ops. No need to check
> anything.

ok. so just needs to check hw_info callback.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH v2 2/4] iommu: Add new iommu op to get iommu hardware information
  2023-03-09  7:53 ` [PATCH v2 2/4] iommu: Add new iommu op to get iommu hardware information Yi Liu
@ 2023-03-16  8:16   ` Tian, Kevin
  2023-03-16  8:30     ` Baolu Lu
  2023-03-29  9:46     ` Liu, Yi L
  0 siblings, 2 replies; 16+ messages in thread
From: Tian, Kevin @ 2023-03-16  8:16 UTC (permalink / raw)
  To: Liu, Yi L, joro, alex.williamson, jgg, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, March 9, 2023 3:54 PM
> @@ -222,6 +223,11 @@ struct iommu_iotlb_gather {
>  /**
>   * struct iommu_ops - iommu ops and capabilities
>   * @capable: check capability
> + * @hw_info: IOMMU hardware information. The type of the returned data
> is
> + *           defined in include/uapi/linux/iommufd.h. The data buffer is

"The type of the returned data is marked by @driver_type".

"defined in include/uapi/linux/iommufd.h" should belong to the comment
of @driver_type

> + *           allocated in the IOMMU driver and the caller should free it
> + *           after use. Return the data buffer if success, or ERR_PTR on
> + *           failure.
>   * @domain_alloc: allocate iommu domain
>   * @probe_device: Add device to iommu driver handling
>   * @release_device: Remove device from iommu driver handling
> @@ -246,11 +252,17 @@ struct iommu_iotlb_gather {
>   * @remove_dev_pasid: Remove any translation configurations of a specific
>   *                    pasid, so that any DMA transactions with this pasid
>   *                    will be blocked by the hardware.
> + * @driver_type: One of enum iommu_hw_info_type. This is used in the
> hw_info
> + *               reporting path. For the drivers that supports it, a unique
> + *               type should be defined. For the driver that does not support
> + *               it, this field is the IOMMU_HW_INFO_TYPE_DEFAULT that is 0.
> + *               Hence, such drivers do not need to care this field.

The meaning of "driver_type" is much broader than reporting hw_info.

let's be accurate to call it as "hw_info_type". and while we have two
separate fields for one feature where is the check enforced on whether
both are provided?

Is it simpler to return the type directly in @hw_info?

btw IOMMU_HW_INFO_TYPE_DEFAULT also sounds misleading.
'default' implies hw_info still available but in a default format.

probably it's clearer to call it IOMMU_HW_INFO_TYPE_NONE.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-03-09  7:53 ` [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
  2023-03-09 13:50   ` Baolu Lu
@ 2023-03-16  8:23   ` Tian, Kevin
  1 sibling, 0 replies; 16+ messages in thread
From: Tian, Kevin @ 2023-03-16  8:23 UTC (permalink / raw)
  To: Liu, Yi L, joro, alex.williamson, jgg, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, March 9, 2023 3:54 PM
> +
> +int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_hw_info *cmd = ucmd->cmd;
> +	struct iommufd_device *idev;
> +	const struct iommu_ops *ops;
> +	void *data;
> +	unsigned int length, data_len;
> +	int rc;
> +
> +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> +		return -EOPNOTSUPP;
> +
> +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> +	if (IS_ERR(idev))
> +		return PTR_ERR(idev);
> +
> +	ops = dev_iommu_ops(idev->dev);
> +	if (!ops || !ops->hw_info) {
> +		rc = -EOPNOTSUPP;
> +		goto out_put;
> +	}
> +
> +	/* driver has hw_info callback should have a unique driver_type */
> +	if (WARN_ON(ops->driver_type ==
> IOMMU_HW_INFO_TYPE_DEFAULT)) {
> +		rc = -EOPNOTSUPP;
> +		goto out_put;
> +	}

ok, here is where the check is done.

> +
> +	data = ops->hw_info(idev->dev, &data_len);

if we directly return type in @hw_info, this becomes:

	data = ops->hw_info(idev->dev, &data_len, &driver_type);

> +/**
> + * struct iommu_hw_info - ioctl(IOMMU_DEVICE_GET_HW_INFO)
> + * @size: sizeof(struct iommu_hw_info)
> + * @flags: Must be 0
> + * @dev_id: The device being attached to the iommufd

"The device bound to the iommufd"

> + * @data_len: Input the length of the user buffer in bytes. Output the
> + *            length of data filled to the user buffer.

s/to/in/

> + * @data_ptr: Pointer to the type specific structure

"Pointer to the user buffer"

> + * @out_data_type: Output the iommu hardware info type, it is one of
> + *                 enum iommu_hw_info_type.

s/it is one of/as defined by/

> + * @__reserved: Must be 0
> + *
> + * Query the hardware iommu information for given device which has been
> + * bound to iommufd. @data_len is the size of the buffer which captures
> + * iommu type specific data and the data will be filled. Trailing bytes
> + * are zeroed if the user buffer is larger than the data kernel has.
> + *
> + * The type specific data would be used to sync capability between the
> + * vIOMMU and the hardware IOMMU. e.g. nested translation requires to

s/vIOMMU/virtual IOMMU/

> + * check the hardware IOMMU capability, since a stage-1 translation table
> + * is owned by user but used by hardware IOMMU.

"check ... capability so guest stage-1 page table uses a format compatible
to the hardware IOMMU"

> + *
> + * The @out_data_type will be filled if the ioctl succeeds. It would
> + * be used to decode the data filled in the buffer pointed by @data_ptr.
> + *
> + * This is only available for the physical devices bound to iommufd as
> + * only physical devices can have hardware IOMMU.

not required. User doesn't know whether it's physical or emulated
device.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/4] iommu: Add new iommu op to get iommu hardware information
  2023-03-16  8:16   ` Tian, Kevin
@ 2023-03-16  8:30     ` Baolu Lu
  2023-03-17  0:08       ` Tian, Kevin
  2023-03-29  9:46     ` Liu, Yi L
  1 sibling, 1 reply; 16+ messages in thread
From: Baolu Lu @ 2023-03-16  8:30 UTC (permalink / raw)
  To: Tian, Kevin, Liu, Yi L, joro, alex.williamson, jgg, robin.murphy
  Cc: baolu.lu, cohuck, eric.auger, nicolinc, kvm, mjrosato,
	chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit, iommu,
	linux-kernel, linux-kselftest

On 2023/3/16 16:16, Tian, Kevin wrote:
>> + *           allocated in the IOMMU driver and the caller should free it
>> + *           after use. Return the data buffer if success, or ERR_PTR on
>> + *           failure.
>>    * @domain_alloc: allocate iommu domain
>>    * @probe_device: Add device to iommu driver handling
>>    * @release_device: Remove device from iommu driver handling
>> @@ -246,11 +252,17 @@ struct iommu_iotlb_gather {
>>    * @remove_dev_pasid: Remove any translation configurations of a specific
>>    *                    pasid, so that any DMA transactions with this pasid
>>    *                    will be blocked by the hardware.
>> + * @driver_type: One of enum iommu_hw_info_type. This is used in the
>> hw_info
>> + *               reporting path. For the drivers that supports it, a unique
>> + *               type should be defined. For the driver that does not support
>> + *               it, this field is the IOMMU_HW_INFO_TYPE_DEFAULT that is 0.
>> + *               Hence, such drivers do not need to care this field.
> The meaning of "driver_type" is much broader than reporting hw_info.
> 
> let's be accurate to call it as "hw_info_type". and while we have two
> separate fields for one feature where is the check enforced on whether
> both are provided?
> 
> Is it simpler to return the type directly in @hw_info?

If I remember correctly, the vendor iommu type and hardware info are
reported to user space separately.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH v2 2/4] iommu: Add new iommu op to get iommu hardware information
  2023-03-16  8:30     ` Baolu Lu
@ 2023-03-17  0:08       ` Tian, Kevin
  0 siblings, 0 replies; 16+ messages in thread
From: Tian, Kevin @ 2023-03-17  0:08 UTC (permalink / raw)
  To: Baolu Lu, Liu, Yi L, joro, alex.williamson, jgg, robin.murphy
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Thursday, March 16, 2023 4:30 PM
> 
> On 2023/3/16 16:16, Tian, Kevin wrote:
> >> + *           allocated in the IOMMU driver and the caller should free it
> >> + *           after use. Return the data buffer if success, or ERR_PTR on
> >> + *           failure.
> >>    * @domain_alloc: allocate iommu domain
> >>    * @probe_device: Add device to iommu driver handling
> >>    * @release_device: Remove device from iommu driver handling
> >> @@ -246,11 +252,17 @@ struct iommu_iotlb_gather {
> >>    * @remove_dev_pasid: Remove any translation configurations of a
> specific
> >>    *                    pasid, so that any DMA transactions with this pasid
> >>    *                    will be blocked by the hardware.
> >> + * @driver_type: One of enum iommu_hw_info_type. This is used in the
> >> hw_info
> >> + *               reporting path. For the drivers that supports it, a unique
> >> + *               type should be defined. For the driver that does not support
> >> + *               it, this field is the IOMMU_HW_INFO_TYPE_DEFAULT that is 0.
> >> + *               Hence, such drivers do not need to care this field.
> > The meaning of "driver_type" is much broader than reporting hw_info.
> >
> > let's be accurate to call it as "hw_info_type". and while we have two
> > separate fields for one feature where is the check enforced on whether
> > both are provided?
> >
> > Is it simpler to return the type directly in @hw_info?
> 
> If I remember correctly, the vendor iommu type and hardware info are
> reported to user space separately.
> 

there is only one IOMMU_DEVICE_GET_HW_INFO cmd. It's written as:

	data = ops->hw_info(idev->dev, &data_len);
	copy_to_user(u64_to_user_ptr(cmd->data_ptr), data, length);
	cmd->out_data_type = ops->driver_type;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl
  2023-03-09  7:53 ` [PATCH v2 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl Yi Liu
@ 2023-03-20 12:43   ` Jason Gunthorpe
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2023-03-20 12:43 UTC (permalink / raw)
  To: Yi Liu
  Cc: joro, alex.williamson, kevin.tian, robin.murphy, baolu.lu,
	cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest

On Wed, Mar 08, 2023 at 11:53:58PM -0800, Yi Liu wrote:
> diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
> index da1898a9128f..578691602d94 100644
> --- a/drivers/iommu/iommufd/iommufd_test.h
> +++ b/drivers/iommu/iommufd/iommufd_test.h
> @@ -100,4 +100,19 @@ struct iommu_test_cmd {
>  };
>  #define IOMMU_TEST_CMD _IO(IOMMUFD_TYPE, IOMMUFD_CMD_BASE + 32)
>  
> +/* Mock structs for IOMMU_DEVICE_GET_HW_INFO ioctl */
> +#define IOMMU_HW_INFO_TYPE_SELFTEST	0xfeedbeef
> +#define IOMMU_HW_INFO_SELFTEST_REGVAL	0xdeadbeef
> +
> +/**
> + * struct iommu_hw_info_selftest
> + *
> + * @flags: Must be set to 0
> + * @test_reg: Pass IOMMU_HW_INFO_SELFTEST_REGVAL to user selftest program
> + */

Probably don't need the comment, it is misleading

> +struct iommu_hw_info_selftest {

struct iommu_test_hw_info

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 0/4] iommufd: Add iommu hardware info reporting
  2023-03-09  7:53 [PATCH v2 0/4] iommufd: Add iommu hardware info reporting Yi Liu
                   ` (3 preceding siblings ...)
  2023-03-09  7:53 ` [PATCH v2 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl Yi Liu
@ 2023-03-20 12:43 ` Jason Gunthorpe
  4 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2023-03-20 12:43 UTC (permalink / raw)
  To: Yi Liu
  Cc: joro, alex.williamson, kevin.tian, robin.murphy, baolu.lu,
	cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest

On Wed, Mar 08, 2023 at 11:53:54PM -0800, Yi Liu wrote:
> iommufd gives userspace the capability to manipulate iommu subsytem.
> e.g. DMA map/unmap etc. In the near future, it will support iommu nested
> translation. Different platform vendors have different implementation for
> the nested translation. So before set up nested translation, userspace
> needs to know the hardware iommu information. For example, Intel VT-d
> supports using guest I/O page table as the stage-1 translation table. This
> requires guest I/O page table be compatible with hardware IOMMU.
> 
> This series reports the iommu hardware information for a given iommufd_device
> which has been bound to iommufd. It is preparation work for userspace to
> allocate hwpt for given device. Like the nested translation support[1].
> 
> This series introduces an iommu op to report the iommu hardware info,
> and an ioctl IOMMU_DEVICE_GET_HW_INFO is added to report such hardware
> info to user. enum iommu_hw_info_type is defined to differentiate the
> iommu hardware info reported to user hence user can decode them. This
> series only adds the framework for iommu hw info reporting, the complete
> reporting path needs vendor specific definition and driver support. The
> full picture is available in [1] as well.

Other than the small notes this looks pretty good to me

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH v2 2/4] iommu: Add new iommu op to get iommu hardware information
  2023-03-16  8:16   ` Tian, Kevin
  2023-03-16  8:30     ` Baolu Lu
@ 2023-03-29  9:46     ` Liu, Yi L
  1 sibling, 0 replies; 16+ messages in thread
From: Liu, Yi L @ 2023-03-29  9:46 UTC (permalink / raw)
  To: Tian, Kevin, joro, alex.williamson, jgg, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi, lulu,
	suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest

> From: Tian, Kevin <kevin.tian@intel.com>
> Sent: Thursday, March 16, 2023 4:17 PM
> 
> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Thursday, March 9, 2023 3:54 PM
> > @@ -222,6 +223,11 @@ struct iommu_iotlb_gather {
> >  /**
> >   * struct iommu_ops - iommu ops and capabilities
> >   * @capable: check capability
> > + * @hw_info: IOMMU hardware information. The type of the returned
> data
> > is
> > + *           defined in include/uapi/linux/iommufd.h. The data buffer is
> 
> "The type of the returned data is marked by @driver_type".
> 
> "defined in include/uapi/linux/iommufd.h" should belong to the comment
> of @driver_type

Sure.

> 
> > + *           allocated in the IOMMU driver and the caller should free it
> > + *           after use. Return the data buffer if success, or ERR_PTR on
> > + *           failure.
> >   * @domain_alloc: allocate iommu domain
> >   * @probe_device: Add device to iommu driver handling
> >   * @release_device: Remove device from iommu driver handling
> > @@ -246,11 +252,17 @@ struct iommu_iotlb_gather {
> >   * @remove_dev_pasid: Remove any translation configurations of a
> specific
> >   *                    pasid, so that any DMA transactions with this pasid
> >   *                    will be blocked by the hardware.
> > + * @driver_type: One of enum iommu_hw_info_type. This is used in the
> > hw_info
> > + *               reporting path. For the drivers that supports it, a unique
> > + *               type should be defined. For the driver that does not support
> > + *               it, this field is the IOMMU_HW_INFO_TYPE_DEFAULT that is 0.
> > + *               Hence, such drivers do not need to care this field.
> 
> The meaning of "driver_type" is much broader than reporting hw_info.
> 
> let's be accurate to call it as "hw_info_type". and while we have two
> separate fields for one feature where is the check enforced on whether
> both are provided?

It is filled in the uapi structure by referring ops->driver_type in next
patch.

> Is it simpler to return the type directly in @hw_info?

Per the current description, if the iommu driver doesn't implement .hw_info
callback, then it will not set driver_type field neither. Then this field is 0
(IOMMU_HW_INFO_TYPE_NONE). The GET_HW_INFO ioctl in next patch
would fail as well. Under this implementation, returning the driver_type
(a.k.a hw_info_type per your comment) in the hw_info callback may be
simpler.

But I plan to update the implementation per the below remark from Jason.
The GET_HW_INFO needs to succeed even if the underlying iommu driver
does not implement hw_info callback. If so, it's still much more convenient
to get the type by referring ops->driver_type.

https://lore.kernel.org/kvm/ZAcwJSK%2F9UVI9LXu@nvidia.com/

Also, per Nic's other remark, there would be a bitmap named hwpt_types
field added to iommu_ops. Then it is also easier to referring it by
ops->hwpt_types.

https://lore.kernel.org/linux-iommu/ZArgAXMUpNjDfFgZ@Asurada-Nvidia/#t

Surely, we also have another alternative. We can enforce all the iommu
drivers to implement a minimum hw_info callback which just returns the
driver_type if it does not have driver-specific data to report to the user
yet.

> btw IOMMU_HW_INFO_TYPE_DEFAULT also sounds misleading.
> 'default' implies hw_info still available but in a default format.
> 
> probably it's clearer to call it IOMMU_HW_INFO_TYPE_NONE.

Sure. Makes sense. So _NONE means no driver specific info is
Reported back to user.

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-03-29  9:47 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-09  7:53 [PATCH v2 0/4] iommufd: Add iommu hardware info reporting Yi Liu
2023-03-09  7:53 ` [PATCH v2 1/4] iommu: Move dev_iommu_ops() to private header Yi Liu
2023-03-09 12:58   ` Baolu Lu
2023-03-09  7:53 ` [PATCH v2 2/4] iommu: Add new iommu op to get iommu hardware information Yi Liu
2023-03-16  8:16   ` Tian, Kevin
2023-03-16  8:30     ` Baolu Lu
2023-03-17  0:08       ` Tian, Kevin
2023-03-29  9:46     ` Liu, Yi L
2023-03-09  7:53 ` [PATCH v2 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
2023-03-09 13:50   ` Baolu Lu
2023-03-09 17:20     ` Jason Gunthorpe
2023-03-10  8:06       ` Liu, Yi L
2023-03-16  8:23   ` Tian, Kevin
2023-03-09  7:53 ` [PATCH v2 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl Yi Liu
2023-03-20 12:43   ` Jason Gunthorpe
2023-03-20 12:43 ` [PATCH v2 0/4] iommufd: Add iommu hardware info reporting Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).