From: Jason Gunthorpe <jgg@nvidia.com>
To: unlisted-recipients:; (no To-header on input)
Cc: Alex Williamson <alex.williamson@redhat.com>,
Lu Baolu <baolu.lu@linux.intel.com>,
Chaitanya Kulkarni <chaitanyak@nvidia.com>,
Cornelia Huck <cohuck@redhat.com>,
Daniel Jordan <daniel.m.jordan@oracle.com>,
David Gibson <david@gibson.dropbear.id.au>,
Eric Auger <eric.auger@redhat.com>,
iommu@lists.linux-foundation.org,
Jason Wang <jasowang@redhat.com>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
Joao Martins <joao.m.martins@oracle.com>,
Kevin Tian <kevin.tian@intel.com>,
kvm@vger.kernel.org, Matthew Rosato <mjrosato@linux.ibm.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Nicolin Chen <nicolinc@nvidia.com>,
Niklas Schnelle <schnelle@linux.ibm.com>,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
Yi Liu <yi.l.liu@intel.com>, Keqian Zhu <zhukeqian1@huawei.com>
Subject: [PATCH RFC 11/12] iommufd: vfio container FD ioctl compatibility
Date: Fri, 18 Mar 2022 14:27:36 -0300 [thread overview]
Message-ID: <11-v1-e79cd8d168e8+6-iommufd_jgg@nvidia.com> (raw)
In-Reply-To: <0-v1-e79cd8d168e8+6-iommufd_jgg@nvidia.com>
iommufd can directly implement the /dev/vfio/vfio container IOCTLs by
mapping them into io_pagetable operations. Doing so allows the use of
iommufd by symlinking /dev/vfio/vfio to /dev/iommufd. Allowing VFIO to
SET_CONTAINER using an iommufd instead of a container fd is a followup
series.
Internally the compatibility API uses a normal IOAS object that, like
vfio, is automatically allocated when the first device is
attached.
Userspace can also query or set this IOAS object directly using the
IOMMU_VFIO_IOAS ioctl. This allows mixing and matching new iommufd-only
features while still using the VFIO-style map/unmap ioctls.
While this is enough to operate qemu, it is still a bit of a WIP with a
few gaps to be resolved:
- Only the TYPE1v2 mode is supported where unmap cannot punch holes or
split areas. The old mode can be implemented with a new operation to
split an iopt_area into two without disturbing the iopt_pages or the
domains, then unmapping a whole area as normal.
- Resource limits rely on memory cgroups to bound what userspace can do
instead of the module parameter dma_entry_limit.
- VFIO P2P is not implemented. Avoiding the follow_pfn() mis-design will
require some additional work to properly expose PFN lifecycle between
VFIO and iommufd.
- Various components of the mdev API are not completed yet
- Indefinite suspend of SW access (VFIO_DMA_MAP_FLAG_VADDR) is not
implemented.
- The 'dirty tracking' is not implemented
- A full audit for pedantic compatibility details (eg errnos, etc) has
not yet been done
- powerpc SPAPR is left out, as it is not connected to the iommu_domain
framework. My hope is that SPAPR will be moved into the iommu_domain
framework as a special HW-specific type, and I would expect power to
support the generic interface through a normal iommu_domain.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommufd/Makefile | 3 +-
drivers/iommu/iommufd/iommufd_private.h | 6 +
drivers/iommu/iommufd/main.c | 16 +-
drivers/iommu/iommufd/vfio_compat.c | 401 ++++++++++++++++++++++++
include/uapi/linux/iommufd.h | 36 +++
5 files changed, 456 insertions(+), 6 deletions(-)
create mode 100644 drivers/iommu/iommufd/vfio_compat.c
diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile
index ca28a135b9675f..2fdff04000b326 100644
--- a/drivers/iommu/iommufd/Makefile
+++ b/drivers/iommu/iommufd/Makefile
@@ -5,6 +5,7 @@ iommufd-y := \
io_pagetable.o \
ioas.o \
main.o \
- pages.o
+ pages.o \
+ vfio_compat.o
obj-$(CONFIG_IOMMUFD) += iommufd.o
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index e5c717231f851e..31628591591c17 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -67,6 +67,8 @@ void iopt_remove_reserved_iova(struct io_pagetable *iopt, void *owner);
struct iommufd_ctx {
struct file *filp;
struct xarray objects;
+
+ struct iommufd_ioas *vfio_ioas;
};
struct iommufd_ctx *iommufd_fget(int fd);
@@ -78,6 +80,9 @@ struct iommufd_ucmd {
void *cmd;
};
+int iommufd_vfio_ioctl(struct iommufd_ctx *ictx, unsigned int cmd,
+ unsigned long arg);
+
/* Copy the response in ucmd->cmd back to userspace. */
static inline int iommufd_ucmd_respond(struct iommufd_ucmd *ucmd,
size_t cmd_len)
@@ -186,6 +191,7 @@ int iommufd_ioas_iova_ranges(struct iommufd_ucmd *ucmd);
int iommufd_ioas_map(struct iommufd_ucmd *ucmd);
int iommufd_ioas_copy(struct iommufd_ucmd *ucmd);
int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd);
+int iommufd_vfio_ioas(struct iommufd_ucmd *ucmd);
/*
* A HW pagetable is called an iommu_domain inside the kernel. This user object
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 6a895489fb5b82..f746fcff8145cb 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -122,6 +122,8 @@ bool iommufd_object_destroy_user(struct iommufd_ctx *ictx,
return false;
}
__xa_erase(&ictx->objects, obj->id);
+ if (ictx->vfio_ioas && &ictx->vfio_ioas->obj == obj)
+ ictx->vfio_ioas = NULL;
xa_unlock(&ictx->objects);
iommufd_object_ops[obj->type].destroy(obj);
@@ -219,27 +221,31 @@ static struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
__reserved),
IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct iommu_ioas_unmap,
length),
+ IOCTL_OP(IOMMU_VFIO_IOAS, iommufd_vfio_ioas, struct iommu_vfio_ioas,
+ __reserved),
};
static long iommufd_fops_ioctl(struct file *filp, unsigned int cmd,
unsigned long arg)
{
+ struct iommufd_ctx *ictx = filp->private_data;
struct iommufd_ucmd ucmd = {};
struct iommufd_ioctl_op *op;
union ucmd_buffer buf;
unsigned int nr;
int ret;
- ucmd.ictx = filp->private_data;
+ nr = _IOC_NR(cmd);
+ if (nr < IOMMUFD_CMD_BASE ||
+ (nr - IOMMUFD_CMD_BASE) >= ARRAY_SIZE(iommufd_ioctl_ops))
+ return iommufd_vfio_ioctl(ictx, cmd, arg);
+
+ ucmd.ictx = ictx;
ucmd.ubuffer = (void __user *)arg;
ret = get_user(ucmd.user_size, (u32 __user *)ucmd.ubuffer);
if (ret)
return ret;
- nr = _IOC_NR(cmd);
- if (nr < IOMMUFD_CMD_BASE ||
- (nr - IOMMUFD_CMD_BASE) >= ARRAY_SIZE(iommufd_ioctl_ops))
- return -ENOIOCTLCMD;
op = &iommufd_ioctl_ops[nr - IOMMUFD_CMD_BASE];
if (op->ioctl_num != cmd)
return -ENOIOCTLCMD;
diff --git a/drivers/iommu/iommufd/vfio_compat.c b/drivers/iommu/iommufd/vfio_compat.c
new file mode 100644
index 00000000000000..5c996bc9b44d48
--- /dev/null
+++ b/drivers/iommu/iommufd/vfio_compat.c
@@ -0,0 +1,401 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
+ */
+#include <linux/file.h>
+#include <linux/interval_tree.h>
+#include <linux/iommu.h>
+#include <linux/iommufd.h>
+#include <linux/slab.h>
+#include <linux/vfio.h>
+#include <uapi/linux/vfio.h>
+#include <uapi/linux/iommufd.h>
+
+#include "iommufd_private.h"
+
+static struct iommufd_ioas *get_compat_ioas(struct iommufd_ctx *ictx)
+{
+ struct iommufd_ioas *ioas = ERR_PTR(-ENODEV);
+
+ xa_lock(&ictx->objects);
+ if (!ictx->vfio_ioas || !iommufd_lock_obj(&ictx->vfio_ioas->obj))
+ goto out_unlock;
+ ioas = ictx->vfio_ioas;
+out_unlock:
+ xa_unlock(&ictx->objects);
+ return ioas;
+}
+
+/*
+ * Only attaching a group should cause a default creation of the internal
+ * ioas; this returns the existing ioas if it has already been assigned.
+ * FIXME: maybe_unused
+ */
+static __maybe_unused struct iommufd_ioas *
+create_compat_ioas(struct iommufd_ctx *ictx)
+{
+ struct iommufd_ioas *ioas = NULL;
+ struct iommufd_ioas *out_ioas;
+
+ ioas = iommufd_ioas_alloc(ictx);
+ if (IS_ERR(ioas))
+ return ioas;
+
+ xa_lock(&ictx->objects);
+ if (ictx->vfio_ioas && iommufd_lock_obj(&ictx->vfio_ioas->obj))
+ out_ioas = ictx->vfio_ioas;
+ else
+ out_ioas = ioas;
+ xa_unlock(&ictx->objects);
+
+ if (out_ioas != ioas) {
+ iommufd_object_abort(ictx, &ioas->obj);
+ return out_ioas;
+ }
+ if (!iommufd_lock_obj(&ioas->obj))
+ WARN_ON(true);
+ iommufd_object_finalize(ictx, &ioas->obj);
+ return ioas;
+}
+
+int iommufd_vfio_ioas(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_vfio_ioas *cmd = ucmd->cmd;
+ struct iommufd_ioas *ioas;
+
+ if (cmd->__reserved)
+ return -EOPNOTSUPP;
+ switch (cmd->op) {
+ case IOMMU_VFIO_IOAS_GET:
+ ioas = get_compat_ioas(ucmd->ictx);
+ if (IS_ERR(ioas))
+ return PTR_ERR(ioas);
+ cmd->ioas_id = ioas->obj.id;
+ iommufd_put_object(&ioas->obj);
+ return iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+
+ case IOMMU_VFIO_IOAS_SET:
+ ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
+ if (IS_ERR(ioas))
+ return PTR_ERR(ioas);
+ xa_lock(&ucmd->ictx->objects);
+ ucmd->ictx->vfio_ioas = ioas;
+ xa_unlock(&ucmd->ictx->objects);
+ iommufd_put_object(&ioas->obj);
+ return 0;
+
+ case IOMMU_VFIO_IOAS_CLEAR:
+ xa_lock(&ucmd->ictx->objects);
+ ucmd->ictx->vfio_ioas = NULL;
+ xa_unlock(&ucmd->ictx->objects);
+ return 0;
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static int iommufd_vfio_map_dma(struct iommufd_ctx *ictx, unsigned int cmd,
+ void __user *arg)
+{
+ u32 supported_flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+ size_t minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
+ struct vfio_iommu_type1_dma_map map;
+ int iommu_prot = IOMMU_CACHE;
+ struct iommufd_ioas *ioas;
+ unsigned long iova;
+ int rc;
+
+ if (copy_from_user(&map, arg, minsz))
+ return -EFAULT;
+
+ if (map.argsz < minsz || map.flags & ~supported_flags)
+ return -EINVAL;
+
+ if (map.flags & VFIO_DMA_MAP_FLAG_READ)
+ iommu_prot |= IOMMU_READ;
+ if (map.flags & VFIO_DMA_MAP_FLAG_WRITE)
+ iommu_prot |= IOMMU_WRITE;
+
+ ioas = get_compat_ioas(ictx);
+ if (IS_ERR(ioas))
+ return PTR_ERR(ioas);
+
+ iova = map.iova;
+ rc = iopt_map_user_pages(&ioas->iopt, &iova,
+ u64_to_user_ptr(map.vaddr), map.size,
+ iommu_prot, 0);
+ iommufd_put_object(&ioas->obj);
+ return rc;
+}
+
+static int iommufd_vfio_unmap_dma(struct iommufd_ctx *ictx, unsigned int cmd,
+ void __user *arg)
+{
+ size_t minsz = offsetofend(struct vfio_iommu_type1_dma_unmap, size);
+ u32 supported_flags = VFIO_DMA_UNMAP_FLAG_ALL;
+ struct vfio_iommu_type1_dma_unmap unmap;
+ struct iommufd_ioas *ioas;
+ int rc;
+
+ if (copy_from_user(&unmap, arg, minsz))
+ return -EFAULT;
+
+ if (unmap.argsz < minsz || unmap.flags & ~supported_flags)
+ return -EINVAL;
+
+ ioas = get_compat_ioas(ictx);
+ if (IS_ERR(ioas))
+ return PTR_ERR(ioas);
+
+ if (unmap.flags & VFIO_DMA_UNMAP_FLAG_ALL)
+ rc = iopt_unmap_all(&ioas->iopt);
+ else
+ rc = iopt_unmap_iova(&ioas->iopt, unmap.iova, unmap.size);
+ iommufd_put_object(&ioas->obj);
+ return rc;
+}
+
+static int iommufd_vfio_check_extension(unsigned long type)
+{
+ switch (type) {
+ case VFIO_TYPE1v2_IOMMU:
+ case VFIO_UNMAP_ALL:
+ return 1;
+ /*
+ * FIXME: The type1 iommu allows splitting of maps, which can fail. This
+ * is doable but is a bunch of extra code only for supporting this case.
+ */
+ case VFIO_TYPE1_IOMMU:
+ /*
+ * FIXME: No idea what VFIO_TYPE1_NESTING_IOMMU does as far as the uAPI
+ * is concerned. It seems it was never completed; it only does
+ * something on ARM, but I can't figure out what or how to use it. Can't
+ * find any user implementation either.
+ */
+ case VFIO_TYPE1_NESTING_IOMMU:
+ /*
+ * FIXME: Easy to support, but needs rework in the Intel iommu driver
+ * to expose the no snoop squashing to iommufd
+ */
+ case VFIO_DMA_CC_IOMMU:
+ /*
+ * FIXME: VFIO_DMA_MAP_FLAG_VADDR
+ * https://lore.kernel.org/kvm/1611939252-7240-1-git-send-email-steven.sistare@oracle.com/
+ * Wow, what a wild feature. This should have been implemented by
+ * allowing a iopt_pages to be associated with a memfd. It can then
+ * source mapping requests directly from a memfd without going through a
+ * mm_struct and thus doesn't care that the original qemu exec'd itself.
+ * The idea that userspace can flip a flag and cause kernel users to
+ * block indefinitely is unacceptable.
+ *
+ * For VFIO compat we should implement this in a slightly different way:
+ * creating an access_user that spans the whole area will immediately
+ * stop new faults as they will be handled from the xarray. We can then
+ * reparent the iopt_pages to the new mm_struct and undo the
+ * access_user. No blockage of kernel users required, does require
+ * filling the xarray with pages though.
+ */
+ case VFIO_UPDATE_VADDR:
+ default:
+ return 0;
+ }
+
+ /*
+ * FIXME: VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP: everything with dirty
+ * tracking should be in its own ioctl, not muddled into unmap; an atomic
+ * unmap-and-get-bitmap belongs as a flag there. Overall dirty tracking
+ * needs a careful review alongside the HW drivers implementing it.
+ */
+}
+
+static int iommufd_vfio_set_iommu(struct iommufd_ctx *ictx, unsigned long type)
+{
+ struct iommufd_ioas *ioas = NULL;
+
+ if (type != VFIO_TYPE1v2_IOMMU)
+ return -EINVAL;
+
+ /* VFIO fails the set_iommu if there is no group */
+ ioas = get_compat_ioas(ictx);
+ if (IS_ERR(ioas))
+ return PTR_ERR(ioas);
+ iommufd_put_object(&ioas->obj);
+ return 0;
+}
+
+static u64 iommufd_get_pagesizes(struct iommufd_ioas *ioas)
+{
+ /* FIXME: See vfio_update_pgsize_bitmap(), for compat this should return
+ * the high bits too, and we need to decide if we should report that
+ * iommufd supports less than PAGE_SIZE alignment or stick to strict
+ * compatibility. qemu only cares about the first set bit.
+ */
+ return ioas->iopt.iova_alignment;
+}
+
+static int iommufd_fill_cap_iova(struct iommufd_ioas *ioas,
+ struct vfio_info_cap_header __user *cur,
+ size_t avail)
+{
+ struct vfio_iommu_type1_info_cap_iova_range __user *ucap_iovas =
+ container_of(cur, struct vfio_iommu_type1_info_cap_iova_range,
+ header);
+ struct vfio_iommu_type1_info_cap_iova_range cap_iovas = {
+ .header = {
+ .id = VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE,
+ .version = 1,
+ },
+ };
+ struct interval_tree_span_iter span;
+
+ for (interval_tree_span_iter_first(
+ &span, &ioas->iopt.reserved_iova_itree, 0, ULONG_MAX);
+ !interval_tree_span_iter_done(&span);
+ interval_tree_span_iter_next(&span)) {
+ struct vfio_iova_range range;
+
+ if (!span.is_hole)
+ continue;
+ range.start = span.start_hole;
+ range.end = span.last_hole;
+ if (avail >= struct_size(&cap_iovas, iova_ranges,
+ cap_iovas.nr_iovas + 1) &&
+ copy_to_user(&ucap_iovas->iova_ranges[cap_iovas.nr_iovas],
+ &range, sizeof(range)))
+ return -EFAULT;
+ cap_iovas.nr_iovas++;
+ }
+ if (avail >= struct_size(&cap_iovas, iova_ranges, cap_iovas.nr_iovas) &&
+ copy_to_user(ucap_iovas, &cap_iovas, sizeof(cap_iovas)))
+ return -EFAULT;
+ return struct_size(&cap_iovas, iova_ranges, cap_iovas.nr_iovas);
+}
+
+static int iommufd_fill_cap_dma_avail(struct iommufd_ioas *ioas,
+ struct vfio_info_cap_header __user *cur,
+ size_t avail)
+{
+ struct vfio_iommu_type1_info_dma_avail cap_dma = {
+ .header = {
+ .id = VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL,
+ .version = 1,
+ },
+ /* iommufd has no limit, return the same value as VFIO. */
+ .avail = U16_MAX,
+ };
+
+ if (avail >= sizeof(cap_dma) &&
+ copy_to_user(cur, &cap_dma, sizeof(cap_dma)))
+ return -EFAULT;
+ return sizeof(cap_dma);
+}
+
+static int iommufd_vfio_iommu_get_info(struct iommufd_ctx *ictx,
+ void __user *arg)
+{
+ typedef int (*fill_cap_fn)(struct iommufd_ioas *ioas,
+ struct vfio_info_cap_header __user *cur,
+ size_t avail);
+ static const fill_cap_fn fill_fns[] = {
+ iommufd_fill_cap_iova,
+ iommufd_fill_cap_dma_avail,
+ };
+ size_t minsz = offsetofend(struct vfio_iommu_type1_info, iova_pgsizes);
+ struct vfio_info_cap_header __user *last_cap = NULL;
+ struct vfio_iommu_type1_info info;
+ struct iommufd_ioas *ioas;
+ size_t total_cap_size;
+ int rc;
+ int i;
+
+ if (copy_from_user(&info, arg, minsz))
+ return -EFAULT;
+
+ if (info.argsz < minsz)
+ return -EINVAL;
+ minsz = min_t(size_t, info.argsz, sizeof(info));
+
+ ioas = get_compat_ioas(ictx);
+ if (IS_ERR(ioas))
+ return PTR_ERR(ioas);
+
+ down_read(&ioas->iopt.iova_rwsem);
+ info.flags = VFIO_IOMMU_INFO_PGSIZES;
+ info.iova_pgsizes = iommufd_get_pagesizes(ioas);
+ info.cap_offset = 0;
+
+ total_cap_size = sizeof(info);
+ for (i = 0; i != ARRAY_SIZE(fill_fns); i++) {
+ int cap_size;
+
+ if (info.argsz > total_cap_size)
+ cap_size = fill_fns[i](ioas, arg + total_cap_size,
+ info.argsz - total_cap_size);
+ else
+ cap_size = fill_fns[i](ioas, NULL, 0);
+ if (cap_size < 0) {
+ rc = cap_size;
+ goto out_put;
+ }
+ if (last_cap && info.argsz >= total_cap_size &&
+ put_user(total_cap_size, &last_cap->next)) {
+ rc = -EFAULT;
+ goto out_put;
+ }
+ last_cap = arg + total_cap_size;
+ total_cap_size += cap_size;
+ }
+
+ /*
+ * If the user did not provide enough space then only some caps are
+ * returned and the argsz will be updated to the correct amount to get
+ * all caps.
+ */
+ if (info.argsz >= total_cap_size)
+ info.cap_offset = sizeof(info);
+ info.argsz = total_cap_size;
+ info.flags |= VFIO_IOMMU_INFO_CAPS;
+ rc = 0;
+ if (copy_to_user(arg, &info, minsz))
+ rc = -EFAULT;
+
+out_put:
+ up_read(&ioas->iopt.iova_rwsem);
+ iommufd_put_object(&ioas->obj);
+ return rc;
+}
+
+/* FIXME TODO:
+PowerPC SPAPR only:
+#define VFIO_IOMMU_ENABLE _IO(VFIO_TYPE, VFIO_BASE + 15)
+#define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16)
+#define VFIO_IOMMU_SPAPR_TCE_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
+#define VFIO_IOMMU_SPAPR_REGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 17)
+#define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 18)
+#define VFIO_IOMMU_SPAPR_TCE_CREATE _IO(VFIO_TYPE, VFIO_BASE + 19)
+#define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20)
+*/
+
+int iommufd_vfio_ioctl(struct iommufd_ctx *ictx, unsigned int cmd,
+ unsigned long arg)
+{
+ void __user *uarg = (void __user *)arg;
+
+ switch (cmd) {
+ case VFIO_GET_API_VERSION:
+ return VFIO_API_VERSION;
+ case VFIO_SET_IOMMU:
+ return iommufd_vfio_set_iommu(ictx, arg);
+ case VFIO_CHECK_EXTENSION:
+ return iommufd_vfio_check_extension(arg);
+ case VFIO_IOMMU_GET_INFO:
+ return iommufd_vfio_iommu_get_info(ictx, uarg);
+ case VFIO_IOMMU_MAP_DMA:
+ return iommufd_vfio_map_dma(ictx, cmd, uarg);
+ case VFIO_IOMMU_UNMAP_DMA:
+ return iommufd_vfio_unmap_dma(ictx, cmd, uarg);
+ case VFIO_IOMMU_DIRTY_PAGES:
+ default:
+ return -ENOIOCTLCMD;
+ }
+ return -ENOIOCTLCMD;
+}
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index ba7b17ec3002e3..2c0f5ced417335 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -42,6 +42,7 @@ enum {
IOMMUFD_CMD_IOAS_MAP,
IOMMUFD_CMD_IOAS_COPY,
IOMMUFD_CMD_IOAS_UNMAP,
+ IOMMUFD_CMD_VFIO_IOAS,
};
/**
@@ -184,4 +185,39 @@ struct iommu_ioas_unmap {
__aligned_u64 length;
};
#define IOMMU_IOAS_UNMAP _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_UNMAP)
+
+/**
+ * enum iommufd_vfio_ioas_op
+ * @IOMMU_VFIO_IOAS_GET: Get the current compatibility IOAS
+ * @IOMMU_VFIO_IOAS_SET: Change the current compatibility IOAS
+ * @IOMMU_VFIO_IOAS_CLEAR: Disable VFIO compatibility
+ */
+enum iommufd_vfio_ioas_op {
+ IOMMU_VFIO_IOAS_GET = 0,
+ IOMMU_VFIO_IOAS_SET = 1,
+ IOMMU_VFIO_IOAS_CLEAR = 2,
+};
+
+/**
+ * struct iommu_vfio_ioas - ioctl(IOMMU_VFIO_IOAS)
+ * @size: sizeof(struct iommu_vfio_ioas)
+ * @ioas_id: For IOMMU_VFIO_IOAS_SET the input IOAS ID to set
+ * For IOMMU_VFIO_IOAS_GET will output the IOAS ID
+ * @op: One of enum iommufd_vfio_ioas_op
+ * @__reserved: Must be 0
+ *
+ * The VFIO compatibility support uses a single ioas because VFIO APIs do not
+ * support the ID field. Set or Get the IOAS that VFIO compatibility will use.
+ * When VFIO_GROUP_SET_CONTAINER is used on an iommufd it will get the
+ * compatibility ioas, either by taking what is already set, or auto-creating
+ * one. From then on VFIO will continue to use that ioas and is not affected by
+ * this ioctl. SET or CLEAR does not destroy any auto-created IOAS.
+ */
+struct iommu_vfio_ioas {
+ __u32 size;
+ __u32 ioas_id;
+ __u16 op;
+ __u16 __reserved;
+};
+#define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
#endif
--
2.35.1