All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
@ 2018-10-19 18:11 Jean-Philippe Brucker
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-10-19 18:11 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

This is a first prototype adding auxiliary domain support to Arm SMMUv3,
following Lu Baolu's latest proposal for IOMMU aware mediated devices
[1]. It works, but the attach() API still doesn't feel right. See (2)
below.

Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
Patches 2-4 rework the PASID allocator to make it usable for SVA and
AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.


When a device can have multiple address space, for instance with PCI
PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
address space. I distinguish auxiliary from "main" domain, which
represents the non-PASID address space but also (at least for SMMUv3)
the whole device context, PASID tables etc.

Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by any
other device driver that wants to use PASID for private address spaces
(as opposed to SVA [2]). The following API is available to device
drivers:

(1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD
    flag on the IOMMU data associated to a device.

    For my own convenience I've been using the SVA infrastructure since
    I already had the locking and IOMMU ops in place. The proposed
    interface is also missing min_pasid and max_pasid parameters, which
    could be needed by device drivers to enforce PASID limits.
    iommu_sva_init_device() without arguments already enables PASID, so
    I just added an AUXD flag to SVA features:

      iommu_sva_init_device(dev, IOMMU_SVA_FEAT_AUXD,
                            min_pasid, max_pasid, NULL)
      iommu_sva_shutdown_device(dev)

    Or as proposed in [1]:

      iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE, NULL)
      iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE, NULL)

    Finding a compromise for this interface should be easy.

(2) Allocate a domain and attach it to the device.

      dom = iommu_domain_alloc()
      iommu_attach_device(dom, dev)

    I still have concerns about this part, which are highlighted by the
    messy changes of patch 1. I think it would make more sense to
    introduce new attach/detach_dev_aux() functions instead of reusing
    attach/detach_dev()

    Can we reconsider this and avoid unnecessary complications in IOMMU
    core and drivers? Does the VFIO code greatly benefit from using the
    same attach() function? It could as well use a different one for
    devices in AUXD mode, which the mediating driver could tell by
    adding a flag in mdev_set_iommu_device(), for example.

    And I don't think other users of AUXD would benefit from using the
    same attach() function, since they will know whether they want to be
    using main or auxiliary domain when doing attach().

(3) Get the PASID, and program it in the device

      iommu_domain_get_attr(dom, DOMAIN_ATTR_AUXD_ID, &pasid)

(4) Create DMA mappings

      iommu_map(dom, ...)
      iommu_unmap(dom, ...)

    Ultimately it would be nice to add PASID support to the DMA API as
    well. For now drivers allocate IOVAs and pages themselves.


For vfio-mdev, a driver that wants to create mdevs only performs steps (1)
and (3):

* When initializing the parent device, enable AUXD (1)
* In mdev_parent_ops::create(), call mdev_set_iommu_device(mdev_dev(mdev),
  mdev_parent_dev(mdev)).
* In mdev_parent_ops::open(), get the PASID (3) and install it in the
  parent device.
* In mdev_parent_ops::close(), clear the PASID


This code and the many patches it depends on can be found on my
iommu/auxd branch:
	git://linux-arm.org/linux-jpb.git iommu/auxd

[1] [PATCH v3 0/8] vfio/mdev: IOMMU aware mediated device
    https://lwn.net/ml/linux-kernel/20181012051632.26064-1-baolu.lu-VuQAYsv1563Yd54FQh9/CA@public.gmane.org/
[2] [PATCH v3 00/10] Shared Virtual Addressing for the IOMMU
    https://www.spinics.net/lists/iommu/msg30076.html

Jean-Philippe Brucker (6):
  iommu: Adapt attach/detach_dev() for auxiliary domains
  drivers core: Add I/O ASID allocator
  iommu/sva: Use external PASID allocator
  iommu/sva: Support AUXD feature
  iommu/arm-smmu-v3: Implement detach_dev op
  iommu/arm-smmu-v3: Add support for auxiliary domains

 drivers/base/Kconfig        |   7 ++
 drivers/base/Makefile       |   1 +
 drivers/base/ioasid.c       | 140 ++++++++++++++++++++++++++++++
 drivers/iommu/Kconfig       |   1 +
 drivers/iommu/arm-smmu-v3.c | 164 ++++++++++++++++++++++++++++++++++--
 drivers/iommu/iommu-sva.c   | 113 +++++++++++++++++--------
 drivers/iommu/iommu.c       |  58 +++++++++----
 include/linux/ioasid.h      |  45 ++++++++++
 include/linux/iommu.h       |  12 +++
 9 files changed, 479 insertions(+), 62 deletions(-)
 create mode 100644 drivers/base/ioasid.c
 create mode 100644 include/linux/ioasid.h

-- 
2.19.1

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [RFC PATCH 1/6] iommu: Adapt attach/detach_dev() for auxiliary domains
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
@ 2018-10-19 18:11   ` Jean-Philippe Brucker
       [not found]     ` <20181019181158.2395-2-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
  2018-10-19 18:11   ` [RFC PATCH 2/6] drivers core: Add I/O ASID allocator Jean-Philippe Brucker
                     ` (8 subsequent siblings)
  9 siblings, 1 reply; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-10-19 18:11 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

The same set of functions, iommu_attach/detach_device/group, is used
both to change a device's domain (let's call it "main domain") and to
add auxiliary domains. The former is used by vfio-pci for example, to
assign the whole device to userspace. The latter is used by vfio-mdev to
assign partitions of devices using PASIDs. Device drivers set or unset
the 'AUXD' device attribute to switch between these modes.

Supporting both auxiliary and main domain through the same callback
requires some core changes. For an IOMMU driver that supports both, the
new semantics of attach_dev() and detach_dev() ops are:

attach_dev(dev, new_domain)
 (1) When auxd is disabled, detach from the current domain and attach to
     the new domain.

 (2) When auxd is enabled, keep the current domain and attach to the new
     domain.

detach_dev(dev, new_domain)
 (3) When auxd is disabled, detach from the domain. Previously the core
     didn't call detach_dev explicitly when a device has a default domain,
     since case (1) already takes care of it. But in order to also handle
     case (4), we now need to always call detach_dev.

 (4) When auxd is enabled, detach from the given domain, but stay attached
     to the default domain.

attach_dev() now returns a value greater than 0 if the domain is in
auxiliary mode. This allows the core to know if the main domain
(group->domain) is affected. detach_dev() should probably do the same
(currently group->domain is always cleared on detach), but that requires
changing all IOMMU drivers.

Since the core doesn't know about auxiliary domains when attaching, we
have to relax attach_dev a bit: we can't check if the user is trying to
replace the main domain with a new unmanaged domain anymore.

IOMMU drivers that don't support auxiliary domains are also affected by
the change in (3). Those that support default_domain now get
detach_dev() followed by attach_dev() callback on iommu_detach_device(),
instead of just attach_dev().

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/iommu.c | 58 ++++++++++++++++++++++++++++++-------------
 1 file changed, 41 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index a8a144dded52..667ccfc6d3fd 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1553,7 +1553,7 @@ static int __iommu_attach_device(struct iommu_domain *domain,
 		return -ENODEV;
 
 	ret = domain->ops->attach_dev(domain, dev);
-	if (!ret)
+	if (ret >= 0)
 		trace_attach_device_to_domain(dev);
 	return ret;
 }
@@ -1726,6 +1726,12 @@ struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
 
+struct iommu_attach_group_info {
+	struct iommu_domain *domain;
+	unsigned long count;
+	int ret;
+};
+
 /*
  * IOMMU groups are really the natrual working unit of the IOMMU, but
  * the IOMMU API works on domains and devices.  Bridge that gap by
@@ -1738,20 +1744,31 @@ EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
  */
 static int iommu_group_do_attach_device(struct device *dev, void *data)
 {
-	struct iommu_domain *domain = data;
+	int ret;
+	struct iommu_attach_group_info *info = data;
 
-	return __iommu_attach_device(domain, dev);
+	ret = __iommu_attach_device(info->domain, dev);
+	if (ret < 0)
+		return ret;
+
+	/* Can't mix auxiliary and main domain within a group */
+	if (info->count++ && ret != info->ret)
+		/* TODO: rollback */
+		return -ENODEV;
+	info->ret = ret;
+
+	return 0;
 }
 
 static int __iommu_attach_group(struct iommu_domain *domain,
 				struct iommu_group *group)
 {
 	int ret;
+	struct iommu_attach_group_info info = {
+		.domain = domain,
+	};
 
-	if (group->default_domain && group->domain != group->default_domain)
-		return -EBUSY;
-
-	ret = __iommu_group_for_each_dev(group, domain,
+	ret = __iommu_group_for_each_dev(group, &info,
 					 iommu_group_do_attach_device);
 	if (ret == 0)
 		group->domain = domain;
@@ -1784,23 +1801,30 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 				 struct iommu_group *group)
 {
 	int ret;
+	struct iommu_attach_group_info info = {
+		.domain = group->default_domain,
+	};
 
+	__iommu_group_for_each_dev(group, domain, iommu_group_do_detach_device);
 	if (!group->default_domain) {
-		__iommu_group_for_each_dev(group, domain,
-					   iommu_group_do_detach_device);
+		/*
+		 * If detach returned, then we're not removing the main domain
+		 * but an auxiliary one.
+		 *
+		 * FIXME: don't clear domain!
+		 */
 		group->domain = NULL;
 		return;
 	}
 
-	if (group->domain == group->default_domain)
-		return;
-
-	/* Detach by re-attaching to the default domain */
-	ret = __iommu_group_for_each_dev(group, group->default_domain,
+	/*
+	 * Detach by re-attaching to the default domain. If the device is
+	 * already attached, then @domain was in auxiliary mode and the driver
+	 * returns a value > 0.
+	 */
+	ret = __iommu_group_for_each_dev(group, &info,
 					 iommu_group_do_attach_device);
-	if (ret != 0)
-		WARN_ON(1);
-	else
+	if (ret == 0)
 		group->domain = group->default_domain;
 }
 
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
  2018-10-19 18:11   ` [RFC PATCH 1/6] iommu: Adapt attach/detach_dev() for auxiliary domains Jean-Philippe Brucker
@ 2018-10-19 18:11   ` Jean-Philippe Brucker
       [not found]     ` <20181019181158.2395-3-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
  2018-10-19 18:11   ` [RFC PATCH 3/6] iommu/sva: Use external PASID allocator Jean-Philippe Brucker
                     ` (7 subsequent siblings)
  9 siblings, 1 reply; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-10-19 18:11 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

Some devices might support multiple DMA address spaces, in particular
those that have the PCI PASID feature. PASID (Process Address Space ID)
allows to share process address spaces with devices (SVA), partition a
device into VM-assignable entities (VFIO mdev) or simply provide
multiple DMA address space to kernel drivers. Add a global PASID
allocator usable by different drivers at the same time. Name it I/O ASID
to avoid confusion with ASIDs allocated by arch code, which are usually
a separate ID space.

The IOASID space is global. Each device can have its own PASID space,
but by convention the IOMMU ended up having a global PASID space, so
that with SVA, each mm_struct is associated to a single PASID.

The allocator doesn't really belong in drivers/iommu because some
drivers would like to allocate PASIDs for devices that aren't managed by
an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
drivers/pci either since platform device also support PASID. Add the
allocator in drivers/base.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
---
 drivers/base/Kconfig   |   7 +++
 drivers/base/Makefile  |   1 +
 drivers/base/ioasid.c  | 140 +++++++++++++++++++++++++++++++++++++++++
 include/linux/ioasid.h |  45 +++++++++++++
 4 files changed, 193 insertions(+)
 create mode 100644 drivers/base/ioasid.c
 create mode 100644 include/linux/ioasid.h

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 3e63a900b330..9e41653467d7 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -182,6 +182,13 @@ config DMA_SHARED_BUFFER
 	  APIs extension; the file's descriptor can then be passed on to other
 	  driver.
 
+config IOASID
+	bool
+	default n
+	help
+	  Enable the I/O Address Space ID allocator. A single ID space shared
+	  between different users.
+
 config DMA_FENCE_TRACE
 	bool "Enable verbose DMA_FENCE_TRACE messages"
 	depends on DMA_SHARED_BUFFER
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 704f44295810..79e83a1b344b 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_PINCTRL) += pinctrl.o
 obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
 obj-$(CONFIG_GENERIC_MSI_IRQ_DOMAIN) += platform-msi.o
 obj-$(CONFIG_GENERIC_ARCH_TOPOLOGY) += arch_topology.o
+obj-$(CONFIG_IOASID) += ioasid.o
 
 obj-y			+= test/
 
diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
new file mode 100644
index 000000000000..71ab7ee73ecb
--- /dev/null
+++ b/drivers/base/ioasid.c
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * I/O Address Space ID allocator. There is one global IOASID space, split into
+ * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
+ * free IOASIDs with ioasid_alloc and ioasid_free.
+ */
+#include <linux/idr.h>
+#include <linux/ioasid.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+struct ioasid_data {
+	ioasid_t id;
+	struct ioasid_set *set;
+	void *private;
+};
+
+static DEFINE_IDR(ioasid_idr);
+
+/**
+ * ioasid_alloc - Allocate an IOASID
+ * @set: the IOASID set
+ * @min: the minimum ID (inclusive)
+ * @max: the maximum ID (exclusive)
+ * @private: data private to the caller
+ *
+ * Allocate an ID between @min and @max (or %0 and %INT_MAX). Return the
+ * allocated ID on success, or INVALID_IOASID on failure. The @private pointer
+ * is stored internally and can be retrieved with ioasid_find().
+ */
+ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
+		      void *private)
+{
+	int id;
+	struct ioasid_data *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return INVALID_IOASID;
+
+	data->set = set;
+	data->private = private;
+
+	idr_preload(GFP_KERNEL);
+	idr_lock(&ioasid_idr);
+	data->id = id = idr_alloc(&ioasid_idr, data, min, max, GFP_ATOMIC);
+	idr_unlock(&ioasid_idr);
+	idr_preload_end();
+
+	if (id < 0) {
+		kfree(data);
+		return INVALID_IOASID;
+	}
+
+	return data->id;
+}
+EXPORT_SYMBOL_GPL(ioasid_alloc);
+
+/**
+ * ioasid_free - Free an IOASID
+ * @ioasid: the ID to remove
+ */
+void ioasid_free(ioasid_t ioasid)
+{
+	struct ioasid_data *ioasid_data;
+
+	idr_lock(&ioasid_idr);
+	ioasid_data = idr_remove(&ioasid_idr, ioasid);
+	idr_unlock(&ioasid_idr);
+
+	kfree(ioasid_data);
+}
+EXPORT_SYMBOL_GPL(ioasid_free);
+
+struct ioasid_iter_data {
+	struct ioasid_set *set;
+	ioasid_iter_t func;
+	void *data;
+};
+
+static int ioasid_iter(int ioasid, void *p, void *data)
+{
+	struct ioasid_iter_data *iter_data = data;
+	struct ioasid_data *ioasid_data = p;
+
+	if (iter_data->set != ioasid_data->set)
+		return 0;
+
+	return iter_data->func(ioasid, ioasid_data->private, iter_data->data);
+}
+
+/**
+ * ioasid_for_each - Iterate over IOASIDs for this set.
+ * @set: the IOASID set
+ * @func: called for each matching IOASID.
+ * @data: data passed to the callback
+ *
+ * Execute @func for all IOASIDs in the given set. If @func returns anything
+ * other than %0, the iteration stops and that value is returned from this
+ * function.
+ */
+int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func, void *data)
+{
+	int ret;
+	struct ioasid_iter_data iter_data = {
+		.set	= set,
+		.func	= func,
+		.data	= data,
+	};
+
+	idr_lock(&ioasid_idr);
+	ret = idr_for_each(&ioasid_idr, ioasid_iter, &iter_data);
+	idr_unlock(&ioasid_idr);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(ioasid_for_each);
+
+/**
+ * ioasid_find - Find IOASID data
+ * @set: the IOASID set
+ * @ioasid: the IOASID to find
+ *
+ * If the IOASID has been allocated for this set, return the private pointer
+ * passed to ioasid_alloc. Otherwise return NULL.
+ */
+void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
+{
+	void *priv = NULL;
+	struct ioasid_data *ioasid_data;
+
+	idr_lock(&ioasid_idr);
+	ioasid_data = idr_find(&ioasid_idr, ioasid);
+	if (ioasid_data && ioasid_data->set == set)
+		priv = ioasid_data->private;
+	idr_unlock(&ioasid_idr);
+
+	return priv;
+}
+EXPORT_SYMBOL_GPL(ioasid_find);
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
new file mode 100644
index 000000000000..cf6fb3496692
--- /dev/null
+++ b/include/linux/ioasid.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_IOASID_H
+#define __LINUX_IOASID_H
+
+#define INVALID_IOASID ((ioasid_t)-1)
+typedef unsigned int ioasid_t;
+typedef int (*ioasid_iter_t)(ioasid_t ioasid, void *private, void *data);
+
+struct ioasid_set {
+	int dummy;
+};
+
+#define DECLARE_IOASID_SET(name) struct ioasid_set name = { 0 }
+
+#ifdef CONFIG_IOASID
+ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
+		      void *private);
+void ioasid_free(ioasid_t ioasid);
+
+int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func, void *data);
+void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid);
+
+#else /* !CONFIG_IOASID */
+static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
+				    ioasid_t max, void *private)
+{
+	return INVALID_IOASID;
+}
+
+static inline void ioasid_free(ioasid_t ioasid)
+{
+}
+
+static inline int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func,
+				  void *data)
+{
+	return -ESRCH;
+}
+
+static inline void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
+{
+	return -ESRCH;
+}
+#endif /* CONFIG_IOASID */
+#endif /* __LINUX_IOASID_H */
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH 3/6] iommu/sva: Use external PASID allocator
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
  2018-10-19 18:11   ` [RFC PATCH 1/6] iommu: Adapt attach/detach_dev() for auxiliary domains Jean-Philippe Brucker
  2018-10-19 18:11   ` [RFC PATCH 2/6] drivers core: Add I/O ASID allocator Jean-Philippe Brucker
@ 2018-10-19 18:11   ` Jean-Philippe Brucker
  2018-10-19 18:11   ` [RFC PATCH 4/6] iommu/sva: Support AUXD feature Jean-Philippe Brucker
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-10-19 18:11 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

Now that the IOASID allocator is in place, use it to allocate shared
PASIDs.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/Kconfig     |  1 +
 drivers/iommu/iommu-sva.c | 80 ++++++++++++++++++++++-----------------
 2 files changed, 46 insertions(+), 35 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index e89f5a97d3c4..83a62634bbc4 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -116,6 +116,7 @@ config IOMMU_SVA
 	bool
 	select IOMMU_API
 	select MMU_NOTIFIER
+	select IOASID
 
 config IOMMU_PAGE_FAULT
 	bool
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index ea34e7383ab4..28eaa617b4f0 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -6,6 +6,7 @@
  */
 
 #include <linux/idr.h>
+#include <linux/ioasid.h>
 #include <linux/iommu.h>
 #include <linux/mmu_notifier.h>
 #include <linux/sched/mm.h>
@@ -115,12 +116,7 @@ struct iommu_bond {
 	void			*drvdata;
 };
 
-/*
- * Because we're using an IDR, PASIDs are limited to 31 bits (the sign bit is
- * used for returning errors). In practice implementations will use at most 20
- * bits, which is the PCI limit.
- */
-static DEFINE_IDR(iommu_pasid_idr);
+static DECLARE_IOASID_SET(ioasid_shared);
 
 /*
  * For the moment this is an all-purpose lock. It serializes
@@ -163,18 +159,13 @@ static struct io_mm *io_mm_alloc(struct device *dev, struct mm_struct *mm,
 	INIT_LIST_HEAD(&io_mm->devices);
 	/* Leave kref as zero until the io_mm is fully initialized */
 
-	idr_preload(GFP_KERNEL);
-	spin_lock(&iommu_sva_lock);
-	pasid = idr_alloc(&iommu_pasid_idr, io_mm, param->min_pasid,
-			  param->max_pasid + 1, GFP_ATOMIC);
-	io_mm->pasid = pasid;
-	spin_unlock(&iommu_sva_lock);
-	idr_preload_end();
-
+	pasid = ioasid_alloc(&ioasid_shared, param->min_pasid,
+			     param->max_pasid + 1, io_mm);
 	if (pasid < 0) {
 		ret = pasid;
 		goto err_free_mm;
 	}
+	io_mm->pasid = pasid;
 
 	ret = mmu_notifier_register(&io_mm->notifier, mm);
 	if (ret)
@@ -201,7 +192,7 @@ static struct io_mm *io_mm_alloc(struct device *dev, struct mm_struct *mm,
 	 * 0 so no user could get a reference to it. Free it manually.
 	 */
 	spin_lock(&iommu_sva_lock);
-	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+	ioasid_free(pasid);
 	spin_unlock(&iommu_sva_lock);
 
 err_free_mm:
@@ -231,7 +222,7 @@ static void io_mm_release(struct kref *kref)
 	io_mm = container_of(kref, struct io_mm, kref);
 	WARN_ON(!list_empty(&io_mm->devices));
 
-	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+	ioasid_free(io_mm->pasid);
 
 	/*
 	 * If we're being released from mm exit, the notifier callback ->release
@@ -491,15 +482,44 @@ static struct mmu_notifier_ops iommu_mmu_notifier = {
 	.invalidate_range	= iommu_notifier_invalidate_range,
 };
 
+struct io_mm_bond_info {
+	struct device *dev;
+	struct mm_struct *mm;
+	struct io_mm *io_mm;
+};
+
+static int io_mm_compare_bond(ioasid_t pasid, void *priv, void *data)
+{
+	struct iommu_bond *bond;
+	struct io_mm *io_mm = priv;
+	struct io_mm_bond_info *info = data;
+
+	if (io_mm->mm != info->mm || !io_mm_get_locked(io_mm))
+		/* Keep looking */
+		return 0;
+
+	/* Is it already bound to this device? */
+	list_for_each_entry(bond, &io_mm->devices, mm_head) {
+		if (bond->dev == info->dev) {
+			io_mm_put_locked(io_mm);
+			return -EEXIST;
+		}
+	}
+	info->io_mm = io_mm;
+	return 1;
+}
+
 int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
 			    unsigned long flags, void *drvdata)
 {
-	int i;
 	int ret = 0;
-	struct iommu_bond *bond;
-	struct io_mm *io_mm = NULL;
+	struct io_mm *io_mm;
 	struct iommu_sva_param *param;
 	const struct iommu_ops *ops = dev->bus->iommu_ops;
+	struct io_mm_bond_info bond_info = {
+		.dev = dev,
+		.mm = mm,
+	};
 
 	if (!ops || !dev->iommu_param)
 		return -ENODEV;
@@ -511,25 +531,15 @@ int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid
 		goto out_unlock;
 	}
 
-	/* If an io_mm already exists, use it */
+	/* If an io_mm already exists, get a reference and use it */
 	spin_lock(&iommu_sva_lock);
-	idr_for_each_entry(&iommu_pasid_idr, io_mm, i) {
-		if (io_mm->mm == mm && io_mm_get_locked(io_mm)) {
-			/* ... Unless it's already bound to this device */
-			list_for_each_entry(bond, &io_mm->devices, mm_head) {
-				if (bond->dev == dev) {
-					ret = -EEXIST;
-					io_mm_put_locked(io_mm);
-					break;
-				}
-			}
-			break;
-		}
-	}
+	ret = ioasid_for_each(&ioasid_shared, io_mm_compare_bond, &bond_info);
 	spin_unlock(&iommu_sva_lock);
-	if (ret)
+	if (ret < 0)
 		goto out_unlock;
 
+	io_mm = bond_info.io_mm;
+
 	/* Require identical features within an io_mm for now */
 	if (io_mm && (flags != io_mm->flags)) {
 		io_mm_put(io_mm);
@@ -640,7 +650,7 @@ struct mm_struct *iommu_sva_find(int pasid)
 	struct mm_struct *mm = NULL;
 
 	spin_lock(&iommu_sva_lock);
-	io_mm = idr_find(&iommu_pasid_idr, pasid);
+	io_mm = ioasid_find(&ioasid_shared, pasid);
 	if (io_mm && io_mm_get_locked(io_mm)) {
 		if (mmget_not_zero(io_mm->mm))
 			mm = io_mm->mm;
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH 4/6] iommu/sva: Support AUXD feature
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
                     ` (2 preceding siblings ...)
  2018-10-19 18:11   ` [RFC PATCH 3/6] iommu/sva: Use external PASID allocator Jean-Philippe Brucker
@ 2018-10-19 18:11   ` Jean-Philippe Brucker
  2018-10-19 18:11   ` [RFC PATCH 5/6] iommu/arm-smmu-v3: Implement detach_dev op Jean-Philippe Brucker
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-10-19 18:11 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

Add IOMMU_SVA_FEAT_AUXD and small helpers to SVA, to setup PASID with
the sva_init_device() IOMMU op and allocate PASIDs.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/iommu-sva.c | 33 ++++++++++++++++++++++++++++++++-
 include/linux/iommu.h     | 12 ++++++++++++
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 28eaa617b4f0..7c5176e2e113 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -117,6 +117,7 @@ struct iommu_bond {
 };
 
 static DECLARE_IOASID_SET(ioasid_shared);
+static DECLARE_IOASID_SET(ioasid_private);
 
 /*
  * For the moment this is an all-purpose lock. It serializes
@@ -663,6 +664,36 @@ struct mm_struct *iommu_sva_find(int pasid)
 }
 EXPORT_SYMBOL_GPL(iommu_sva_find);
 
+int iommu_sva_alloc_pasid(struct device *dev)
+{
+	int ret;
+	struct iommu_sva_param *param;
+
+	if (!dev->iommu_param)
+		return -EINVAL;
+
+	mutex_lock(&dev->iommu_param->sva_lock);
+	param = dev->iommu_param->sva_param;
+	if (!param) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	ret = ioasid_alloc(&ioasid_private, param->min_pasid,
+			   param->max_pasid + 1, NULL);
+out_unlock:
+	mutex_unlock(&dev->iommu_param->sva_lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
+
+void iommu_sva_free_pasid(int pasid)
+{
+	ioasid_free(pasid);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_free_pasid);
+
 /**
  * iommu_sva_init_device() - Initialize Shared Virtual Addressing for a device
  * @dev: the device
@@ -711,7 +742,7 @@ int iommu_sva_init_device(struct device *dev, unsigned long features,
 	if (!dev->iommu_param || !ops || !ops->sva_init_device)
 		return -ENODEV;
 
-	if (features & ~IOMMU_SVA_FEAT_IOPF)
+	if (features & ~(IOMMU_SVA_FEAT_IOPF | IOMMU_SVA_FEAT_AUXD))
 		return -EINVAL;
 
 	param = kzalloc(sizeof(*param), GFP_KERNEL);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 67bbc5320c9d..bfa70b1f8897 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -66,6 +66,7 @@ typedef int (*iommu_mm_exit_handler_t)(struct device *dev, int pasid, void *);
 #define IOMMU_PASID_INVALID		(-1)
 
 #define IOMMU_SVA_FEAT_IOPF		(1 << 0)
+#define IOMMU_SVA_FEAT_AUXD		(1 << 1)
 
 struct iommu_domain_geometry {
 	dma_addr_t aperture_start; /* First address that can be mapped    */
@@ -1035,6 +1036,8 @@ extern int __iommu_sva_bind_device(struct device *dev, struct mm_struct *mm,
 extern int __iommu_sva_unbind_device(struct device *dev, int pasid);
 extern void iommu_sva_unbind_device_all(struct device *dev);
 extern struct mm_struct *iommu_sva_find(int pasid);
+extern int iommu_sva_alloc_pasid(struct device *dev);
+extern void iommu_sva_free_pasid(int pasid);
 
 #else /* CONFIG_IOMMU_SVA */
 static inline int iommu_sva_init_device(struct device *dev,
@@ -1070,6 +1073,15 @@ static inline struct mm_struct *iommu_sva_find(int pasid)
 {
 	return NULL;
 }
+
+static inline int iommu_sva_alloc_pasid(struct device *dev)
+{
+	return -ENODEV;
+}
+
+static inline void iommu_sva_free_pasid(int pasid)
+{
+}
 #endif /* CONFIG_IOMMU_SVA */
 
 #ifdef CONFIG_IOMMU_PAGE_FAULT
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH 5/6] iommu/arm-smmu-v3: Implement detach_dev op
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
                     ` (3 preceding siblings ...)
  2018-10-19 18:11   ` [RFC PATCH 4/6] iommu/sva: Support AUXD feature Jean-Philippe Brucker
@ 2018-10-19 18:11   ` Jean-Philippe Brucker
  2018-10-19 18:11   ` [RFC PATCH 6/6] iommu/arm-smmu-v3: Add support for auxiliary domains Jean-Philippe Brucker
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-10-19 18:11 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

To prepare for auxiliary domains, add detach_dev() to the IOMMU ops.
There shouldn't be any functional change.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu-v3.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 1da41fb8111e..665365b5f02e 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2202,7 +2202,7 @@ static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 	}
 }
 
-static void arm_smmu_detach_dev(struct device *dev)
+static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	unsigned long flags;
 	struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;
@@ -2241,8 +2241,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	ste = &master->ste;
 
 	/* Already attached to a different domain? */
-	if (ste->assigned)
-		arm_smmu_detach_dev(dev);
+	if (master->domain)
+		arm_smmu_detach_dev(&master->domain->domain, dev);
 
 	mutex_lock(&smmu_domain->init_mutex);
 
@@ -2833,7 +2833,7 @@ static void arm_smmu_remove_device(struct device *dev)
 	smmu = master->smmu;
 	iopf_queue_remove_device(dev);
 	if (master->ste.assigned)
-		arm_smmu_detach_dev(dev);
+		arm_smmu_detach_dev(&master->domain->domain, dev);
 	iommu_group_remove_device(dev);
 	arm_smmu_remove_master(smmu, master);
 	iommu_device_unlink(&smmu->iommu, dev);
@@ -2947,6 +2947,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.domain_alloc		= arm_smmu_domain_alloc,
 	.domain_free		= arm_smmu_domain_free,
 	.attach_dev		= arm_smmu_attach_dev,
+	.detach_dev		= arm_smmu_detach_dev,
 	.sva_init_device	= arm_smmu_sva_init,
 	.sva_shutdown_device	= arm_smmu_sva_shutdown,
 	.mm_alloc		= arm_smmu_mm_alloc,
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH 6/6] iommu/arm-smmu-v3: Add support for auxiliary domains
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
                     ` (4 preceding siblings ...)
  2018-10-19 18:11   ` [RFC PATCH 5/6] iommu/arm-smmu-v3: Implement detach_dev op Jean-Philippe Brucker
@ 2018-10-19 18:11   ` Jean-Philippe Brucker
  2018-10-20  3:36   ` [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3 Xu Zaibo
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-10-19 18:11 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

Allow device driver to create PASID contexts by setting a device in
'auxiliary' mode and allocating new domains. Each domain corrsponds to a
PASID.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu-v3.c | 157 ++++++++++++++++++++++++++++++++++--
 1 file changed, 151 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 665365b5f02e..ae10c78d20a8 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -539,6 +539,7 @@ struct arm_smmu_s1_cfg {
 	struct iommu_pasid_table_cfg	tables;
 	struct iommu_pasid_table_ops	*ops;
 	struct iommu_pasid_entry	*cd0; /* Default context */
+	int				pasid;
 };
 
 struct arm_smmu_s2_cfg {
@@ -656,6 +657,7 @@ struct arm_smmu_master_data {
 	struct device			*dev;
 	size_t				ssid_bits;
 	bool				can_fault;
+	bool				auxd;
 };
 
 /* SMMU private data for an IOMMU domain */
@@ -715,6 +717,13 @@ static struct arm_smmu_mm *to_smmu_mm(struct io_mm *io_mm)
 	return container_of(io_mm, struct arm_smmu_mm, io_mm);
 }
 
+static struct arm_smmu_master_data *dev_to_master(struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+
+	return fwspec ? fwspec->iommu_priv : NULL;
+}
+
 static void parse_driver_options(struct arm_smmu_device *smmu)
 {
 	int i = 0;
@@ -2007,10 +2016,12 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
 		struct iommu_pasid_table_ops *ops = smmu_domain->s1_cfg.ops;
 
-		if (ops) {
+		if (smmu_domain->s1_cfg.pasid)
+			iommu_sva_free_pasid(smmu_domain->s1_cfg.pasid);
+		if (smmu_domain->s1_cfg.cd0)
 			iommu_free_pasid_entry(smmu_domain->s1_cfg.cd0);
+		if (ops)
 			iommu_free_pasid_ops(ops);
-		}
 	} else {
 		struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
 		if (cfg->vmid)
@@ -2068,6 +2079,28 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	return ret;
 }
 
+static int arm_smmu_domain_finalise_aux(struct arm_smmu_domain *smmu_domain,
+					struct arm_smmu_master_data *master,
+					struct io_pgtable_cfg *pgtbl_cfg)
+{
+	struct iommu_pasid_entry *entry;
+	struct iommu_pasid_table_ops *ops;
+
+	if (!master->ste.s1_cfg)
+		return -EINVAL;
+
+	ops = master->ste.s1_cfg->ops;
+	if (!ops)
+		return -EINVAL;
+
+	entry = ops->alloc_priv_entry(ops, ARM_64_LPAE_S1, pgtbl_cfg);
+	if (IS_ERR(entry))
+		return PTR_ERR(entry);
+
+	smmu_domain->s1_cfg.cd0 = entry; /* FIXME: cd, not cd0 */
+	return 0;
+}
+
 static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
 				       struct arm_smmu_master_data *master,
 				       struct io_pgtable_cfg *pgtbl_cfg)
@@ -2130,6 +2163,13 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain,
 		return -EINVAL;
 	}
 
+	if (master->auxd) {
+		if (WARN_ON(smmu_domain->stage != ARM_SMMU_DOMAIN_S1))
+			return -EINVAL;
+		/* We're initializing an auxiliary domain */
+		finalise_stage_fn = arm_smmu_domain_finalise_aux;
+	}
+
 	pgtbl_cfg = (struct io_pgtable_cfg) {
 		.pgsize_bitmap	= smmu->pgsize_bitmap,
 		.ias		= ias,
@@ -2202,11 +2242,36 @@ static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 	}
 }
 
+static void arm_smmu_detach_dev_aux(struct arm_smmu_domain *smmu_domain,
+				    struct arm_smmu_master_data *master)
+{
+	int pasid;
+	struct iommu_pasid_table_ops *ops = master->ste.s1_cfg->ops;
+
+	if (WARN_ON(!ops))
+		return;
+
+	pasid = smmu_domain->s1_cfg.pasid;
+	if (WARN_ON(!pasid))
+		return;
+	ops->clear_entry(ops, pasid, smmu_domain->s1_cfg.cd0);
+	arm_smmu_atc_inv_master_all(master, pasid);
+	/*
+	 * We don't count the number of attached devices, so free the
+	 * PASID in domain_free.
+	 */
+}
+
 static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	unsigned long flags;
 	struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;
-	struct arm_smmu_domain *smmu_domain = master->domain;
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	if (master->auxd) {
+		arm_smmu_detach_dev_aux(smmu_domain, master);
+		return;
+	}
 
 	if (smmu_domain) {
 		iommu_sva_unbind_device_all(dev);
@@ -2224,6 +2289,39 @@ static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
 	arm_smmu_install_ste_for_dev(dev->iommu_fwspec);
 }
 
+static int arm_smmu_attach_dev_aux(struct arm_smmu_domain *smmu_domain,
+				   struct arm_smmu_master_data *master)
+{
+	int pasid, ret;
+	struct iommu_pasid_table_ops *ops = master->ste.s1_cfg->ops;
+
+	if (!ops)
+		return -EINVAL;
+
+	if (smmu_domain->s1_cfg.pasid) {
+		/*
+		 * Reuse PASID if already allocated for another device. As the
+		 * PASID space is global and this is the same SMMU, we know that
+		 * the below set_entry will work.
+		 *
+		 * TODO: check device boundaries (sva_param)
+		 */
+		pasid = smmu_domain->s1_cfg.pasid;
+	} else {
+		pasid = iommu_sva_alloc_pasid(master->dev);
+		if (pasid < 0)
+			return pasid;
+
+		smmu_domain->s1_cfg.pasid = pasid;
+	}
+
+	ret = ops->set_entry(ops, pasid, smmu_domain->s1_cfg.cd0);
+	if (ret)
+		return ret;
+
+	return 1; /* Auxd needs special value. */
+}
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret = 0;
@@ -2240,8 +2338,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	smmu = master->smmu;
 	ste = &master->ste;
 
+	/* Spurious attach_dev when doing detach on an aux domain... */
+	if (master->domain == smmu_domain)
+		return !!master->auxd;
+
 	/* Already attached to a different domain? */
-	if (master->domain)
+	if (master->domain && !master->auxd)
 		arm_smmu_detach_dev(&master->domain->domain, dev);
 
 	mutex_lock(&smmu_domain->init_mutex);
@@ -2262,6 +2364,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		goto out_unlock;
 	}
 
+	if (master->auxd) {
+		/* Now install the context descriptor */
+		ret = arm_smmu_attach_dev_aux(smmu_domain, master);
+		goto out_unlock;
+	}
+
 	ste->assigned = true;
 	master->domain = smmu_domain;
 
@@ -2411,7 +2519,7 @@ static int arm_smmu_sva_init(struct device *dev, struct iommu_sva_param *param)
 	if (!master->ssid_bits)
 		return -EINVAL;
 
-	if (param->features & ~IOMMU_SVA_FEAT_IOPF)
+	if (param->features & ~(IOMMU_SVA_FEAT_IOPF | IOMMU_SVA_FEAT_AUXD))
 		return -EINVAL;
 
 	if (param->features & IOMMU_SVA_FEAT_IOPF) {
@@ -2426,6 +2534,9 @@ static int arm_smmu_sva_init(struct device *dev, struct iommu_sva_param *param)
 			goto err_disable_pri;
 	}
 
+	if (param->features & IOMMU_SVA_FEAT_AUXD)
+		master->auxd = true;
+
 	if (!param->max_pasid)
 		param->max_pasid = 0xfffffU;
 
@@ -2443,7 +2554,13 @@ static int arm_smmu_sva_init(struct device *dev, struct iommu_sva_param *param)
 
 static void arm_smmu_sva_shutdown(struct device *dev)
 {
-	arm_smmu_disable_pri(dev->iommu_fwspec->iommu_priv);
+	struct arm_smmu_master_data *master = dev_to_master(dev);
+
+	if (!master)
+		return;
+
+	master->auxd = false;
+	arm_smmu_disable_pri(master);
 	iopf_queue_remove_device(dev);
 }
 
@@ -2874,6 +2991,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_AUXD_ID:
+		if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+			return -EINVAL;
+		*(int *)data = smmu_domain->s1_cfg.pasid;
+		return 0;
 	default:
 		return -ENODEV;
 	}
@@ -2912,6 +3034,28 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 	return ret;
 }
 
+static int arm_smmu_get_dev_attr(struct device *dev, enum iommu_dev_attr attr,
+				 void *data)
+{
+	bool *auxd;
+	struct arm_smmu_master_data *master = dev_to_master(dev);
+
+	if (!master || !data)
+		return -ENODEV;
+
+	switch (attr) {
+	case IOMMU_DEV_ATTR_AUXD_CAPABILITY:
+		auxd = data;
+		/* PASID supported */
+		*auxd = !!master->ssid_bits;
+		break;
+	default:
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
 	return iommu_fwspec_add_ids(dev, args->args, 1);
@@ -2966,6 +3110,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.device_group		= arm_smmu_device_group,
 	.domain_get_attr	= arm_smmu_domain_get_attr,
 	.domain_set_attr	= arm_smmu_domain_set_attr,
+	.get_dev_attr		= arm_smmu_get_dev_attr,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= arm_smmu_put_resv_regions,
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
                     ` (5 preceding siblings ...)
  2018-10-19 18:11   ` [RFC PATCH 6/6] iommu/arm-smmu-v3: Add support for auxiliary domains Jean-Philippe Brucker
@ 2018-10-20  3:36   ` Xu Zaibo
  2018-10-22  6:53   ` Tian, Kevin
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 60+ messages in thread
From: Xu Zaibo @ 2018-10-20  3:36 UTC (permalink / raw)
  To: Jean-Philippe Brucker, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo

Hi Jean,

On 2018/10/20 2:11, Jean-Philippe Brucker wrote:
> This is a first prototype adding auxiliary domain support to Arm SMMUv3,
> following Lu Baolu's latest proposal for IOMMU aware mediated devices
> [1]. It works, but the attach() API still doesn't feel right. See (2)
> below.
>
> Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
> Patches 2-4 rework the PASID allocator to make it usable for SVA and
> AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
>
>
> When a device can have multiple address space, for instance with PCI
> PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
> address space. I distinguish auxiliary from "main" domain, which
> represents the non-PASID address space but also (at least for SMMUv3)
> the whole device context, PASID tables etc.
>
> Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by any
> other device driver that wants to use PASID for private address spaces
> (as opposed to SVA [2]). The following API is available to device
> drivers:
>
> (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD
>      flag on the IOMMU data associated to a device.
>
>      For my own convenience I've been using the SVA infrastructure since
>      I already had the locking and IOMMU ops in place. The proposed
>      interface is also missing min_pasid and max_pasid parameters, which
>      could be needed by device drivers to enforce PASID limits.
>      iommu_sva_init_device() without arguments already enables PASID, so
>      I just added an AUXD flag to SVA features:
>
>        iommu_sva_init_device(dev, IOMMU_SVA_FEAT_AUXD,
>                              min_pasid, max_pasid, NULL)
>        iommu_sva_shutdown_device(dev)
>
>      Or as proposed in [1]:
>
>        iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE, NULL)
>        iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE, NULL)
>
>      Finding a compromise for this interface should be easy.

Personal idea, from the respective vendors, PASID limit should be set as 
enabling.

>
> (2) Allocate a domain and attach it to the device.
>
>        dom = iommu_domain_alloc()
>        iommu_attach_device(dom, dev)
>
>      I still have concerns about this part, which are highlighted by the
>      messy changes of patch 1. I think it would make more sense to
>      introduce new attach/detach_dev_aux() functions instead of reusing
>      attach/detach_dev()
>
>      Can we reconsider this and avoid unnecessary complications in IOMMU
>      core and drivers? Does the VFIO code greatly benefit from using the
>      same attach() function? It could as well use a different one for
>      devices in AUXD mode, which the mediating driver could tell by
>      adding a flag in mdev_set_iommu_device(), for example.
>
>      And I don't think other users of AUXD would benefit from using the
>      same attach() function, since they will know whether they want to be
>      using main or auxiliary domain when doing attach().

It seems that SVA/AUXD have no better sulotion for users applcations 
except VFIO.
But VFIO seems caring for VMs' scenario more than general user space 
processes at present. :)

>
> (3) Get the PASID, and program it in the device
>
>        iommu_domain_get_attr(dom, DOMAIN_ATTR_AUXD_ID, &pasid)
>
> (4) Create DMA mappings
>
>        iommu_map(dom, ...)
>        iommu_unmap(dom, ...)
>
>      Ultimately it would be nice to add PASID support to the DMA API as
>      well. For now drivers allocate IOVAs and pages themselves.
>
>
> For vfio-mdev, a driver that wants to create mdevs only performs steps (1)
> and (3):
>
> * When initializing the parent device, enable AUXD (1)
> * In mdev_parent_ops::create(), call mdev_set_iommu_device(mdev_dev(mdev),
>    mdev_parent_dev(mdev)).
> * In mdev_parent_ops::open(), get the PASID (3) and install it in the
>    parent device.
> * In mdev_parent_ops::close(), clear the PASID

This may be a good solution to avoid exposing PASID to user space. 
Personally, VFIO-PCI is
not in the range of this using scenario for partial device while 
PASID/SVA/AUXD are enabled.

Cheers,
Zaibo

.

>
> This code and the many patches it depends on can be found on my
> iommu/auxd branch:
> 	git://linux-arm.org/linux-jpb.git iommu/auxd
>
> [1] [PATCH v3 0/8] vfio/mdev: IOMMU aware mediated device
>      https://lwn.net/ml/linux-kernel/20181012051632.26064-1-baolu.lu-VuQAYsv1563Yd54FQh9/CA@public.gmane.org/
> [2] [PATCH v3 00/10] Shared Virtual Addressing for the IOMMU
>      https://www.spinics.net/lists/iommu/msg30076.html
>
> Jean-Philippe Brucker (6):
>    iommu: Adapt attach/detach_dev() for auxiliary domains
>    drivers core: Add I/O ASID allocator
>    iommu/sva: Use external PASID allocator
>    iommu/sva: Support AUXD feature
>    iommu/arm-smmu-v3: Implement detach_dev op
>    iommu/arm-smmu-v3: Add support for auxiliary domains
>
>   drivers/base/Kconfig        |   7 ++
>   drivers/base/Makefile       |   1 +
>   drivers/base/ioasid.c       | 140 ++++++++++++++++++++++++++++++
>   drivers/iommu/Kconfig       |   1 +
>   drivers/iommu/arm-smmu-v3.c | 164 ++++++++++++++++++++++++++++++++++--
>   drivers/iommu/iommu-sva.c   | 113 +++++++++++++++++--------
>   drivers/iommu/iommu.c       |  58 +++++++++----
>   include/linux/ioasid.h      |  45 ++++++++++
>   include/linux/iommu.h       |  12 +++
>   9 files changed, 479 insertions(+), 62 deletions(-)
>   create mode 100644 drivers/base/ioasid.c
>   create mode 100644 include/linux/ioasid.h
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 1/6] iommu: Adapt attach/detach_dev() for auxiliary domains
       [not found]     ` <20181019181158.2395-2-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
@ 2018-10-22  2:32       ` Lu Baolu
  0 siblings, 0 replies; 60+ messages in thread
From: Lu Baolu @ 2018-10-22  2:32 UTC (permalink / raw)
  To: Jean-Philippe Brucker, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

Hi,

On 10/20/18 2:11 AM, Jean-Philippe Brucker wrote:
> The same set of functions, iommu_attach/detach_device/group, is used
> both to change a device's domain (let's call it "main domain") and to
> add auxiliary domains. The former is used by vfio-pci for example, to
> assign the whole device to userspace. The latter is used by vfio-mdev to
> assign partitions of devices using PASIDs. Device drivers set or unset
> the 'AUXD' device attribute to switch between these modes.
> 
> Supporting both auxiliary and main domain through the same callback
> requires some core changes. For an IOMMU driver that supports both, the
> new semantics of attach_dev() and detach_dev() ops are:
> 
> attach_dev(dev, new_domain)
>   (1) When auxd is disabled, detach from the current domain and attach to
>       the new domain.
> 
>   (2) When auxd is enabled, keep the current domain and attach to the new
>       domain.

When auxd is enabled, the "main domain" is not affected. An auxiliary
domain is added.

> 
> detach_dev(dev, new_domain)
>   (3) When auxd is disabled, detach from the domain. Previously the core
>       didn't call detach_dev explicitly when a device has a default domain,
>       since case (1) already takes care of it. But in order to also handle
>       case (4), we now need to always call detach_dev.
> 
>   (4) When auxd is enabled, detach from the given domain, but stay attached
>       to the default domain.

When auxd is enabled, the "main domain" is not affected. An auxiliary
domain is removed.

> 
> attach_dev() now returns a value greater than 0 if the domain is in
> auxiliary mode. This allows the core to know if the main domain
> (group->domain) is affected. detach_dev() should probably do the same
> (currently group->domain is always cleared on detach), but that requires
> changing all IOMMU drivers.
> 
> Since the core doesn't know about auxiliary domains when attaching, we
> have to relax attach_dev a bit: we can't check if the user is trying to
> replace the main domain with a new unmanaged domain anymore.

The core can know about the auxiliary domain by calling:

iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLED)

> 
> IOMMU drivers that don't support auxiliary domains are also affected by
> the change in (3). Those that support default_domain now get
> detach_dev() followed by attach_dev() callback on iommu_detach_device(),
> instead of just attach_dev().
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
> ---
>   drivers/iommu/iommu.c | 58 ++++++++++++++++++++++++++++++-------------
>   1 file changed, 41 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index a8a144dded52..667ccfc6d3fd 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1553,7 +1553,7 @@ static int __iommu_attach_device(struct iommu_domain *domain,
>   		return -ENODEV;
>   
>   	ret = domain->ops->attach_dev(domain, dev);
> -	if (!ret)
> +	if (ret >= 0)
>   		trace_attach_device_to_domain(dev);
>   	return ret;
>   }
> @@ -1726,6 +1726,12 @@ struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
>   }
>   EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
>   
> +struct iommu_attach_group_info {
> +	struct iommu_domain *domain;
> +	unsigned long count;
> +	int ret;
> +};
> +
>   /*
>    * IOMMU groups are really the natrual working unit of the IOMMU, but
>    * the IOMMU API works on domains and devices.  Bridge that gap by
> @@ -1738,20 +1744,31 @@ EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
>    */
>   static int iommu_group_do_attach_device(struct device *dev, void *data)
>   {
> -	struct iommu_domain *domain = data;
> +	int ret;
> +	struct iommu_attach_group_info *info = data;
>   
> -	return __iommu_attach_device(domain, dev);
> +	ret = __iommu_attach_device(info->domain, dev);
> +	if (ret < 0)
> +		return ret;
> +
> +	/* Can't mix auxiliary and main domain within a group */
> +	if (info->count++ && ret != info->ret)
> +		/* TODO: rollback */
> +		return -ENODEV;

To make it simple, we can disable the AUXD feature on devices which
share its group with other devices.

Best regards,
Lu Baolu


> +	info->ret = ret;
> +
> +	return 0;
>   }
>   
>   static int __iommu_attach_group(struct iommu_domain *domain,
>   				struct iommu_group *group)
>   {
>   	int ret;
> +	struct iommu_attach_group_info info = {
> +		.domain = domain,
> +	};
>   
> -	if (group->default_domain && group->domain != group->default_domain)
> -		return -EBUSY;
> -
> -	ret = __iommu_group_for_each_dev(group, domain,
> +	ret = __iommu_group_for_each_dev(group, &info,
>   					 iommu_group_do_attach_device);
>   	if (ret == 0)
>   		group->domain = domain;
> @@ -1784,23 +1801,30 @@ static void __iommu_detach_group(struct iommu_domain *domain,
>   				 struct iommu_group *group)
>   {
>   	int ret;
> +	struct iommu_attach_group_info info = {
> +		.domain = group->default_domain,
> +	};
>   
> +	__iommu_group_for_each_dev(group, domain, iommu_group_do_detach_device);
>   	if (!group->default_domain) {
> -		__iommu_group_for_each_dev(group, domain,
> -					   iommu_group_do_detach_device);
> +		/*
> +		 * If detach returned, then we're not removing the main domain
> +		 * but an auxiliary one.
> +		 *
> +		 * FIXME: don't clear domain!
> +		 */
>   		group->domain = NULL;
>   		return;
>   	}
>   
> -	if (group->domain == group->default_domain)
> -		return;
> -
> -	/* Detach by re-attaching to the default domain */
> -	ret = __iommu_group_for_each_dev(group, group->default_domain,
> +	/*
> +	 * Detach by re-attaching to the default domain. If the device is
> +	 * already attached, then @domain was in auxiliary mode and the driver
> +	 * returns a value > 0.
> +	 */
> +	ret = __iommu_group_for_each_dev(group, &info,
>   					 iommu_group_do_attach_device);
> -	if (ret != 0)
> -		WARN_ON(1);
> -	else
> +	if (ret == 0)
>   		group->domain = group->default_domain;
>   }
>   
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]     ` <20181019181158.2395-3-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
@ 2018-10-22  4:49       ` Lu Baolu
       [not found]         ` <9c6cd6c1-3569-4251-8344-fc9df0e743bc-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
  2018-11-07  4:53       ` Lu Baolu
  2018-11-12 14:40       ` Joerg Roedel
  2 siblings, 1 reply; 60+ messages in thread
From: Lu Baolu @ 2018-10-22  4:49 UTC (permalink / raw)
  To: Jean-Philippe Brucker, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

Hi,

On 10/20/18 2:11 AM, Jean-Philippe Brucker wrote:
> Some devices might support multiple DMA address spaces, in particular
> those that have the PCI PASID feature. PASID (Process Address Space ID)
> allows to share process address spaces with devices (SVA), partition a
> device into VM-assignable entities (VFIO mdev) or simply provide
> multiple DMA address space to kernel drivers. Add a global PASID
> allocator usable by different drivers at the same time. Name it I/O ASID
> to avoid confusion with ASIDs allocated by arch code, which are usually
> a separate ID space.
> 
> The IOASID space is global. Each device can have its own PASID space,
> but by convention the IOMMU ended up having a global PASID space, so
> that with SVA, each mm_struct is associated to a single PASID.
> 
> The allocator doesn't really belong in drivers/iommu because some
> drivers would like to allocate PASIDs for devices that aren't managed by
> an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
> drivers/pci either since platform device also support PASID. Add the
> allocator in drivers/base.

One concern of moving pasid allocator here is about paravirtual
allocation of pasid.

Since there is only a single set of pasid tables which is controlled by
the host, the pasid is a system wide resource. When a driver running in
a guest VM wants to consume a pasid, it must be intercepted by the
simulation software and routed the allocation to the host via VFIO. Some
iommu arch's provide mechanisms to aid this, for example, the virtual
command interfaces defined in vt-d 3.0. Any pasid used in guest VM
should go through the virtual command interfaces.

Best regards,
Lu Baolu

> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
> ---
>   drivers/base/Kconfig   |   7 +++
>   drivers/base/Makefile  |   1 +
>   drivers/base/ioasid.c  | 140 +++++++++++++++++++++++++++++++++++++++++
>   include/linux/ioasid.h |  45 +++++++++++++
>   4 files changed, 193 insertions(+)
>   create mode 100644 drivers/base/ioasid.c
>   create mode 100644 include/linux/ioasid.h
> 
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 3e63a900b330..9e41653467d7 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -182,6 +182,13 @@ config DMA_SHARED_BUFFER
>   	  APIs extension; the file's descriptor can then be passed on to other
>   	  driver.
>   
> +config IOASID
> +	bool
> +	default n
> +	help
> +	  Enable the I/O Address Space ID allocator. A single ID space shared
> +	  between different users.
> +
>   config DMA_FENCE_TRACE
>   	bool "Enable verbose DMA_FENCE_TRACE messages"
>   	depends on DMA_SHARED_BUFFER
> diff --git a/drivers/base/Makefile b/drivers/base/Makefile
> index 704f44295810..79e83a1b344b 100644
> --- a/drivers/base/Makefile
> +++ b/drivers/base/Makefile
> @@ -23,6 +23,7 @@ obj-$(CONFIG_PINCTRL) += pinctrl.o
>   obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
>   obj-$(CONFIG_GENERIC_MSI_IRQ_DOMAIN) += platform-msi.o
>   obj-$(CONFIG_GENERIC_ARCH_TOPOLOGY) += arch_topology.o
> +obj-$(CONFIG_IOASID) += ioasid.o
>   
>   obj-y			+= test/
>   
> diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
> new file mode 100644
> index 000000000000..71ab7ee73ecb
> --- /dev/null
> +++ b/drivers/base/ioasid.c
> @@ -0,0 +1,140 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * I/O Address Space ID allocator. There is one global IOASID space, split into
> + * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
> + * free IOASIDs with ioasid_alloc and ioasid_free.
> + */
> +#include <linux/idr.h>
> +#include <linux/ioasid.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +struct ioasid_data {
> +	ioasid_t id;
> +	struct ioasid_set *set;
> +	void *private;
> +};
> +
> +static DEFINE_IDR(ioasid_idr);
> +
> +/**
> + * ioasid_alloc - Allocate an IOASID
> + * @set: the IOASID set
> + * @min: the minimum ID (inclusive)
> + * @max: the maximum ID (exclusive)
> + * @private: data private to the caller
> + *
> + * Allocate an ID between @min and @max (or %0 and %INT_MAX). Return the
> + * allocated ID on success, or INVALID_IOASID on failure. The @private pointer
> + * is stored internally and can be retrieved with ioasid_find().
> + */
> +ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
> +		      void *private)
> +{
> +	int id;
> +	struct ioasid_data *data;
> +
> +	data = kzalloc(sizeof(*data), GFP_KERNEL);
> +	if (!data)
> +		return INVALID_IOASID;
> +
> +	data->set = set;
> +	data->private = private;
> +
> +	idr_preload(GFP_KERNEL);
> +	idr_lock(&ioasid_idr);
> +	data->id = id = idr_alloc(&ioasid_idr, data, min, max, GFP_ATOMIC);
> +	idr_unlock(&ioasid_idr);
> +	idr_preload_end();
> +
> +	if (id < 0) {
> +		kfree(data);
> +		return INVALID_IOASID;
> +	}
> +
> +	return data->id;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_alloc);
> +
> +/**
> + * ioasid_free - Free an IOASID
> + * @ioasid: the ID to remove
> + */
> +void ioasid_free(ioasid_t ioasid)
> +{
> +	struct ioasid_data *ioasid_data;
> +
> +	idr_lock(&ioasid_idr);
> +	ioasid_data = idr_remove(&ioasid_idr, ioasid);
> +	idr_unlock(&ioasid_idr);
> +
> +	kfree(ioasid_data);
> +}
> +EXPORT_SYMBOL_GPL(ioasid_free);
> +
> +struct ioasid_iter_data {
> +	struct ioasid_set *set;
> +	ioasid_iter_t func;
> +	void *data;
> +};
> +
> +static int ioasid_iter(int ioasid, void *p, void *data)
> +{
> +	struct ioasid_iter_data *iter_data = data;
> +	struct ioasid_data *ioasid_data = p;
> +
> +	if (iter_data->set != ioasid_data->set)
> +		return 0;
> +
> +	return iter_data->func(ioasid, ioasid_data->private, iter_data->data);
> +}
> +
> +/**
> + * ioasid_for_each - Iterate over IOASIDs for this set.
> + * @set: the IOASID set
> + * @func: called for each matching IOASID.
> + * @data: data passed to the callback
> + *
> + * Execute @func for all IOASIDs in the given set. If @func returns anything
> + * other than %0, the iteration stops and that value is returned from this
> + * function.
> + */
> +int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func, void *data)
> +{
> +	int ret;
> +	struct ioasid_iter_data iter_data = {
> +		.set	= set,
> +		.func	= func,
> +		.data	= data,
> +	};
> +
> +	idr_lock(&ioasid_idr);
> +	ret = idr_for_each(&ioasid_idr, ioasid_iter, &iter_data);
> +	idr_unlock(&ioasid_idr);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_for_each);
> +
> +/**
> + * ioasid_find - Find IOASID data
> + * @set: the IOASID set
> + * @ioasid: the IOASID to find
> + *
> + * If the IOASID has been allocated for this set, return the private pointer
> + * passed to ioasid_alloc. Otherwise return NULL.
> + */
> +void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
> +{
> +	void *priv = NULL;
> +	struct ioasid_data *ioasid_data;
> +
> +	idr_lock(&ioasid_idr);
> +	ioasid_data = idr_find(&ioasid_idr, ioasid);
> +	if (ioasid_data && ioasid_data->set == set)
> +		priv = ioasid_data->private;
> +	idr_unlock(&ioasid_idr);
> +
> +	return priv;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_find);
> diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
> new file mode 100644
> index 000000000000..cf6fb3496692
> --- /dev/null
> +++ b/include/linux/ioasid.h
> @@ -0,0 +1,45 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __LINUX_IOASID_H
> +#define __LINUX_IOASID_H
> +
> +#define INVALID_IOASID ((ioasid_t)-1)
> +typedef unsigned int ioasid_t;
> +typedef int (*ioasid_iter_t)(ioasid_t ioasid, void *private, void *data);
> +
> +struct ioasid_set {
> +	int dummy;
> +};
> +
> +#define DECLARE_IOASID_SET(name) struct ioasid_set name = { 0 }
> +
> +#ifdef CONFIG_IOASID
> +ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
> +		      void *private);
> +void ioasid_free(ioasid_t ioasid);
> +
> +int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func, void *data);
> +void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid);
> +
> +#else /* !CONFIG_IOASID */
> +static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
> +				    ioasid_t max, void *private)
> +{
> +	return INVALID_IOASID;
> +}
> +
> +static inline void ioasid_free(ioasid_t ioasid)
> +{
> +}
> +
> +static inline int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func,
> +				  void *data)
> +{
> +	return -ESRCH;
> +}
> +
> +static inline void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
> +{
> +	return -ESRCH;
> +}
> +#endif /* CONFIG_IOASID */
> +#endif /* __LINUX_IOASID_H */
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
                     ` (6 preceding siblings ...)
  2018-10-20  3:36   ` [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3 Xu Zaibo
@ 2018-10-22  6:53   ` Tian, Kevin
       [not found]     ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE0E176-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2018-10-22 10:07   ` Raj, Ashok
  2018-10-22 16:48   ` Jordan Crouse
  9 siblings, 1 reply; 60+ messages in thread
From: Tian, Kevin @ 2018-10-22  6:53 UTC (permalink / raw)
  To: Jean-Philippe Brucker, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

> From: Jean-Philippe Brucker [mailto:jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org]
> Sent: Saturday, October 20, 2018 2:12 AM
> 
> This is a first prototype adding auxiliary domain support to Arm SMMUv3,
> following Lu Baolu's latest proposal for IOMMU aware mediated devices
> [1]. It works, but the attach() API still doesn't feel right. See (2)
> below.
> 
> Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
> Patches 2-4 rework the PASID allocator to make it usable for SVA and
> AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
> 
> 
> When a device can have multiple address space, for instance with PCI
> PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
> address space. I distinguish auxiliary from "main" domain, which
> represents the non-PASID address space but also (at least for SMMUv3)
> the whole device context, PASID tables etc.

I'd like to clarify a bit. :-)

a domain can always represent one or more address spaces, where an 
address space could be for IOVA or GPA and/or other address spaces for 
SVA. Address spaces may be chained together in so-called nested mode.

a domain can be associated with a full device (in BDF granular), and/or 
with a partition of a device (say in PASID granular). In the former case,
the domain becomes the master (or primary) domain of the device. In
the latter case, the domain becomes the auxiliary domain of the device.

vendor domain structure may include vendor-specific states which
are applied to device context at attach time, e.g. PASID table (SMMUv3) 
if representing as master domain, or PASID value (VT-d scalable mode)
if representing as auxiliary domain.

so the domain definition is never changed with the introduction of
AUXD. 'auxiliary' is a per-device attribute which takes effect when at
domain attach time. A domain is perfectly sane to represent as a
master domain to deviceA and an auxiliary domain to deviceB at the
same time (say when device A and a mdev on deviceB are assigned to
same VM), while supporting one or more address spaces.

I explained above concept in my KVM forum session:

https://events.linuxfoundation.org/wp-content/uploads/2017/12/Hardware-Assisted-Mediated-Pass-Through-with-VFIO-Kevin-Tian-Intel.pdf (slide 16/17)

> 
> Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by
> any
> other device driver that wants to use PASID for private address spaces
> (as opposed to SVA [2]). The following API is available to device
> drivers:
> 
> (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD
>     flag on the IOMMU data associated to a device.
> 
>     For my own convenience I've been using the SVA infrastructure since
>     I already had the locking and IOMMU ops in place. The proposed
>     interface is also missing min_pasid and max_pasid parameters, which
>     could be needed by device drivers to enforce PASID limits.
>     iommu_sva_init_device() without arguments already enables PASID, so
>     I just added an AUXD flag to SVA features:
> 
>       iommu_sva_init_device(dev, IOMMU_SVA_FEAT_AUXD,
>                             min_pasid, max_pasid, NULL)
>       iommu_sva_shutdown_device(dev)
> 
>     Or as proposed in [1]:
> 
>       iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE, NULL)
>       iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE, NULL)
> 
>     Finding a compromise for this interface should be easy.
> 
> (2) Allocate a domain and attach it to the device.
> 
>       dom = iommu_domain_alloc()
>       iommu_attach_device(dom, dev)
> 
>     I still have concerns about this part, which are highlighted by the
>     messy changes of patch 1. I think it would make more sense to
>     introduce new attach/detach_dev_aux() functions instead of reusing
>     attach/detach_dev()
> 
>     Can we reconsider this and avoid unnecessary complications in IOMMU
>     core and drivers? Does the VFIO code greatly benefit from using the
>     same attach() function? It could as well use a different one for
>     devices in AUXD mode, which the mediating driver could tell by
>     adding a flag in mdev_set_iommu_device(), for example.

Baolu gave some recommendations to patch 1. Please check whether it can
help reduce the mess.

IMO using same API is conceptually clearer to VFIO... let's gather in KVM
forum to have a conclusion, with Alex being there.

Thanks
Kevin

> 
>     And I don't think other users of AUXD would benefit from using the
>     same attach() function, since they will know whether they want to be
>     using main or auxiliary domain when doing attach().
> 
> (3) Get the PASID, and program it in the device
> 
>       iommu_domain_get_attr(dom, DOMAIN_ATTR_AUXD_ID, &pasid)
> 
> (4) Create DMA mappings
> 
>       iommu_map(dom, ...)
>       iommu_unmap(dom, ...)
> 
>     Ultimately it would be nice to add PASID support to the DMA API as
>     well. For now drivers allocate IOVAs and pages themselves.
> 
> 
> For vfio-mdev, a driver that wants to create mdevs only performs steps (1)
> and (3):
> 
> * When initializing the parent device, enable AUXD (1)
> * In mdev_parent_ops::create(), call
> mdev_set_iommu_device(mdev_dev(mdev),
>   mdev_parent_dev(mdev)).
> * In mdev_parent_ops::open(), get the PASID (3) and install it in the
>   parent device.
> * In mdev_parent_ops::close(), clear the PASID
> 
> 
> This code and the many patches it depends on can be found on my
> iommu/auxd branch:
> 	git://linux-arm.org/linux-jpb.git iommu/auxd
> 
> [1] [PATCH v3 0/8] vfio/mdev: IOMMU aware mediated device
>     https://lwn.net/ml/linux-kernel/20181012051632.26064-1-
> baolu.lu-VuQAYsv1563Yd54FQh9/CA@public.gmane.org/
> [2] [PATCH v3 00/10] Shared Virtual Addressing for the IOMMU
>     https://www.spinics.net/lists/iommu/msg30076.html
> 
> Jean-Philippe Brucker (6):
>   iommu: Adapt attach/detach_dev() for auxiliary domains
>   drivers core: Add I/O ASID allocator
>   iommu/sva: Use external PASID allocator
>   iommu/sva: Support AUXD feature
>   iommu/arm-smmu-v3: Implement detach_dev op
>   iommu/arm-smmu-v3: Add support for auxiliary domains
> 
>  drivers/base/Kconfig        |   7 ++
>  drivers/base/Makefile       |   1 +
>  drivers/base/ioasid.c       | 140 ++++++++++++++++++++++++++++++
>  drivers/iommu/Kconfig       |   1 +
>  drivers/iommu/arm-smmu-v3.c | 164
> ++++++++++++++++++++++++++++++++++--
>  drivers/iommu/iommu-sva.c   | 113 +++++++++++++++++--------
>  drivers/iommu/iommu.c       |  58 +++++++++----
>  include/linux/ioasid.h      |  45 ++++++++++
>  include/linux/iommu.h       |  12 +++
>  9 files changed, 479 insertions(+), 62 deletions(-)
>  create mode 100644 drivers/base/ioasid.c
>  create mode 100644 include/linux/ioasid.h
> 
> --
> 2.19.1

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
                     ` (7 preceding siblings ...)
  2018-10-22  6:53   ` Tian, Kevin
@ 2018-10-22 10:07   ` Raj, Ashok
       [not found]     ` <20181021194426.GA11201-7RUrO8UaCDyr4tA6zuQqW9h3ngVCH38I@public.gmane.org>
  2018-10-22 16:48   ` Jordan Crouse
  9 siblings, 1 reply; 60+ messages in thread
From: Raj, Ashok @ 2018-10-22 10:07 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, Ashok Raj,
	rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robin.murphy-5wv7dgnIgG8, christian.koenig-5C7GfCeVMHo

On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote:
> This is a first prototype adding auxiliary domain support to Arm SMMUv3,
> following Lu Baolu's latest proposal for IOMMU aware mediated devices
> [1]. It works, but the attach() API still doesn't feel right. See (2)
> below.
> 
> Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
> Patches 2-4 rework the PASID allocator to make it usable for SVA and
> AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
> 
> 
> When a device can have multiple address space, for instance with PCI
> PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
> address space. I distinguish auxiliary from "main" domain, which
> represents the non-PASID address space but also (at least for SMMUv3)
> the whole device context, PASID tables etc.
> 
> Auxiliary domains will be used by VFIO for IOMMU-aware mdev, and by any
> other device driver that wants to use PASID for private address spaces
> (as opposed to SVA [2]). The following API is available to device
> drivers:
> 
> (1) Enable AUXD for a device. Enable PASID if necessary and set an AUXD
>     flag on the IOMMU data associated to a device.
> 
>     For my own convenience I've been using the SVA infrastructure since
>     I already had the locking and IOMMU ops in place. The proposed
>     interface is also missing min_pasid and max_pasid parameters, which
>     could be needed by device drivers to enforce PASID limits.

Variable PASID limits per-device is hard to manage. I suppose we can play
some games to get this right, but I suspect that wont be very useful in 
the long run.

#1: Given that PASID is a system wide resource, during boot iommu driver needs 
to scan and set the max PASID to be no more than min(iommu_supported_max_pasid) 
of all available IOMMU's. 

#2: In addition if any device supports less than the choosen system max PASID
we should simply disable PASID on that device.

The reason is if the process were to bind to 2 or more accelerators and
the second device has a limit smaller than the first that the application
already attached, the second attemt to bind would fail. (Because
we would use the same PASID for a single process)

In addition for Intel IOMMU's PASID is a system wide resource. We have
a virtual capability for vIOMMU's to allocate PASID's. At the time of
allocation we don't know what device this PASID is even being allocated 
for. Certainly we could add other intelligence to pass a hint for 
max_pasid in the virtiual interface. 

I suppose when this becomes real, most serious accelerators will 
support the full width. Hence doing #1 and #2 above should be good
for most cases.

Cheers,
Ashok

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]         ` <9c6cd6c1-3569-4251-8344-fc9df0e743bc-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
@ 2018-10-22 10:22           ` Raj, Ashok
       [not found]             ` <20181022102254.GA25399-7RUrO8UaCDyr4tA6zuQqW9h3ngVCH38I@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: Raj, Ashok @ 2018-10-22 10:22 UTC (permalink / raw)
  To: Lu Baolu
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, Ashok Raj,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robin.murphy-5wv7dgnIgG8, christian.koenig-5C7GfCeVMHo

On Mon, Oct 22, 2018 at 12:49:47PM +0800, Lu Baolu wrote:
> Hi,
> 
> On 10/20/18 2:11 AM, Jean-Philippe Brucker wrote:
> > Some devices might support multiple DMA address spaces, in particular
> > those that have the PCI PASID feature. PASID (Process Address Space ID)
> > allows to share process address spaces with devices (SVA), partition a
> > device into VM-assignable entities (VFIO mdev) or simply provide
> > multiple DMA address space to kernel drivers. Add a global PASID
> > allocator usable by different drivers at the same time. Name it I/O ASID
> > to avoid confusion with ASIDs allocated by arch code, which are usually
> > a separate ID space.
> > 
> > The IOASID space is global. Each device can have its own PASID space,
> > but by convention the IOMMU ended up having a global PASID space, so
> > that with SVA, each mm_struct is associated to a single PASID.
> > 
> > The allocator doesn't really belong in drivers/iommu because some
> > drivers would like to allocate PASIDs for devices that aren't managed by
> > an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
> > drivers/pci either since platform device also support PASID. Add the
> > allocator in drivers/base.
> 
> One concern of moving pasid allocator here is about paravirtual
> allocation of pasid.
> 
> Since there is only a single set of pasid tables which is controlled by

Minor correction: Single system wide PASID namespace, but PASID tables
would be created ideally per-bdf for isolation purposes. 

I'm sure you meant name space, but didn't want that to be mis-interpreted.


> the host, the pasid is a system wide resource. When a driver running in
> a guest VM wants to consume a pasid, it must be intercepted by the
> simulation software and routed the allocation to the host via VFIO. Some
> iommu arch's provide mechanisms to aid this, for example, the virtual
> command interfaces defined in vt-d 3.0. Any pasid used in guest VM
> should go through the virtual command interfaces.
> 

Cheers,
Ashok

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]     ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE0E176-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2018-10-22 11:50       ` Robin Murphy
       [not found]         ` <11f88122-afd3-a34c-3cd4-db681bf5498b-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: Robin Murphy @ 2018-10-22 11:50 UTC (permalink / raw)
  To: Tian, Kevin, Jean-Philippe Brucker,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA

On 22/10/2018 07:53, Tian, Kevin wrote:
>> From: Jean-Philippe Brucker [mailto:jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org]
>> Sent: Saturday, October 20, 2018 2:12 AM
>>
>> This is a first prototype adding auxiliary domain support to Arm SMMUv3,
>> following Lu Baolu's latest proposal for IOMMU aware mediated devices
>> [1]. It works, but the attach() API still doesn't feel right. See (2)
>> below.
>>
>> Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
>> Patches 2-4 rework the PASID allocator to make it usable for SVA and
>> AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
>>
>>
>> When a device can have multiple address space, for instance with PCI
>> PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
>> address space. I distinguish auxiliary from "main" domain, which
>> represents the non-PASID address space but also (at least for SMMUv3)
>> the whole device context, PASID tables etc.
> 
> I'd like to clarify a bit. :-)
> 
> a domain can always represent one or more address spaces, where an
> address space could be for IOVA or GPA and/or other address spaces for
> SVA. Address spaces may be chained together in so-called nested mode.
> 
> a domain can be associated with a full device (in BDF granular), and/or
> with a partition of a device (say in PASID granular). In the former case,
> the domain becomes the master (or primary) domain of the device. In
> the latter case, the domain becomes the auxiliary domain of the device.
> 
> vendor domain structure may include vendor-specific states which
> are applied to device context at attach time, e.g. PASID table (SMMUv3)
> if representing as master domain, or PASID value (VT-d scalable mode)
> if representing as auxiliary domain.
> 
> so the domain definition is never changed with the introduction of
> AUXD. 'auxiliary' is a per-device attribute which takes effect when at
> domain attach time. A domain is perfectly sane to represent as a
> master domain to deviceA and an auxiliary domain to deviceB at the
> same time (say when device A and a mdev on deviceB are assigned to
> same VM), while supporting one or more address spaces.

To me, that sounds like a very good argument for having separate "attach 
as primary domain" and "attach as aux domain" APIs. Say a driver wants 
to use multiple aux domains itself to isolate logically-separate 
transaction streams, but still also needs to control the main domain for 
transactions without PASID (interrupts?) - having to juggle some 
invisible device state around multiple iommu_{attach,detach}_group() 
calls which look identical but are expected to behave differently at 
runtime sounds like a recipe for unmaintainable code.

I don't think it's necessarily safe to assume that 
one-struct-device-per-PASID will be the only usage model.

Robin.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]         ` <11f88122-afd3-a34c-3cd4-db681bf5498b-5wv7dgnIgG8@public.gmane.org>
@ 2018-10-22 15:35           ` Jordan Crouse
  2018-10-22 18:01           ` Jean-Philippe Brucker
  2018-11-06 16:25           ` joro-zLv9SwRftAIdnm+yROfE0A
  2 siblings, 0 replies; 60+ messages in thread
From: Jordan Crouse @ 2018-10-22 15:35 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tian, Kevin, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	Jean-Philippe Brucker, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA

On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote:
> On 22/10/2018 07:53, Tian, Kevin wrote:
> >>From: Jean-Philippe Brucker [mailto:jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org]
> >>Sent: Saturday, October 20, 2018 2:12 AM
> >>
> >>This is a first prototype adding auxiliary domain support to Arm SMMUv3,
> >>following Lu Baolu's latest proposal for IOMMU aware mediated devices
> >>[1]. It works, but the attach() API still doesn't feel right. See (2)
> >>below.
> >>
> >>Patch 1 adapts iommu.c to the current proposal for auxiliary domains.
> >>Patches 2-4 rework the PASID allocator to make it usable for SVA and
> >>AUXD simultaneously. Patches 5-6 add AUXD support to SMMUv3.
> >>
> >>
> >>When a device can have multiple address space, for instance with PCI
> >>PASID, an auxiliary domain (AUXD) is the IOMMU representation of one
> >>address space. I distinguish auxiliary from "main" domain, which
> >>represents the non-PASID address space but also (at least for SMMUv3)
> >>the whole device context, PASID tables etc.
> >
> >I'd like to clarify a bit. :-)
> >
> >a domain can always represent one or more address spaces, where an
> >address space could be for IOVA or GPA and/or other address spaces for
> >SVA. Address spaces may be chained together in so-called nested mode.
> >
> >a domain can be associated with a full device (in BDF granular), and/or
> >with a partition of a device (say in PASID granular). In the former case,
> >the domain becomes the master (or primary) domain of the device. In
> >the latter case, the domain becomes the auxiliary domain of the device.
> >
> >vendor domain structure may include vendor-specific states which
> >are applied to device context at attach time, e.g. PASID table (SMMUv3)
> >if representing as master domain, or PASID value (VT-d scalable mode)
> >if representing as auxiliary domain.
> >
> >so the domain definition is never changed with the introduction of
> >AUXD. 'auxiliary' is a per-device attribute which takes effect when at
> >domain attach time. A domain is perfectly sane to represent as a
> >master domain to deviceA and an auxiliary domain to deviceB at the
> >same time (say when device A and a mdev on deviceB are assigned to
> >same VM), while supporting one or more address spaces.
> 
> To me, that sounds like a very good argument for having separate
> "attach as primary domain" and "attach as aux domain" APIs. Say a
> driver wants to use multiple aux domains itself to isolate
> logically-separate transaction streams, but still also needs to
> control the main domain for transactions without PASID (interrupts?)
> - having to juggle some invisible device state around multiple
> iommu_{attach,detach}_group() calls which look identical but are
> expected to behave differently at runtime sounds like a recipe for
> unmaintainable code.
> 
> I don't think it's necessarily safe to assume that
> one-struct-device-per-PASID will be the only usage model.

This feels exactly like the QCOM GPU model - multiple aux domains
for individual process pagetables but the main domain for TTBR1
(global) allocations.

Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]     ` <20181021194426.GA11201-7RUrO8UaCDyr4tA6zuQqW9h3ngVCH38I@public.gmane.org>
@ 2018-10-22 16:03       ` Jean-Philippe Brucker
       [not found]         ` <d45c5222-68e9-1d6e-730b-bb8dbc060586-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-10-22 16:03 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, Ashok Raj,
	rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo

On 22/10/2018 11:07, Raj, Ashok wrote:
>>     For my own convenience I've been using the SVA infrastructure since
>>     I already had the locking and IOMMU ops in place. The proposed
>>     interface is also missing min_pasid and max_pasid parameters, which
>>     could be needed by device drivers to enforce PASID limits.
> 
> Variable PASID limits per-device is hard to manage. I suppose we can play
> some games to get this right, but I suspect that wont be very useful in 
> the long run.

Devices may have PASID limits that are not discoverable with the PCI
PASID capability (https://patchwork.kernel.org/patch/9989307/#21389571).
Even if we simply reject devices that don't support enough PASIDs, I
think it's still better to return an error on bind or init_device than
to return a valid PASID that they can't use

> #1: Given that PASID is a system wide resource, during boot iommu driver needs 
> to scan and set the max PASID to be no more than min(iommu_supported_max_pasid) 
> of all available IOMMU's. 
> 
> #2: In addition if any device supports less than the choosen system max PASID
> we should simply disable PASID on that device.
> 
> The reason is if the process were to bind to 2 or more accelerators and
> the second device has a limit smaller than the first that the application
> already attached, the second attemt to bind would fail. (Because
> we would use the same PASID for a single process)

But that's not reason enough to completely disable PASID for the device,
it could be the only one bound to that process, or PASID could be only
used privately by the host device driver.

> In addition for Intel IOMMU's PASID is a system wide resource. We have
> a virtual capability for vIOMMU's to allocate PASID's. At the time of
> allocation we don't know what device this PASID is even being allocated 
> for. 

Ah, I had missed that part, I thought the PV allocation had the device
ID as well. That's a significant change.

I was still hoping that we could go back to per-device PASID spaces in
the host, since I still haven't seen any convincing argument in favor of
system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus
support more PASIDs in total. Until now PASID was a per-device resource
except for choices made when writing the IOMMU driver.

> Certainly we could add other intelligence to pass a hint for 
> max_pasid in the virtiual interface. 
> 
> I suppose when this becomes real, most serious accelerators will 
> support the full width.

I don't know if that's obvious from the device perspective. If I had to
implement one, I would simply size my PASIDs to the number of contexts
supported by my device (which might be especially small in the embedded
space), given that technically nothing prevents software from supporting
that and the specification allows me to choose a width.

This was the intended model for PCI, but thankfully version 4.0 added an
implementation note (6.20.2.1 - PASID Field) advising against this
approach, and to instead support 20 bits for interoperability. Maybe it
will set a trend for non-PCI devices as well.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
                     ` (8 preceding siblings ...)
  2018-10-22 10:07   ` Raj, Ashok
@ 2018-10-22 16:48   ` Jordan Crouse
       [not found]     ` <20181022164834.GH26762-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
  9 siblings, 1 reply; 60+ messages in thread
From: Jordan Crouse @ 2018-10-22 16:48 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	christian.koenig-5C7GfCeVMHo,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robin.murphy-5wv7dgnIgG8

On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote:

> (2) Allocate a domain and attach it to the device.
> 
>       dom = iommu_domain_alloc()
>       iommu_attach_device(dom, dev)
> 
>     I still have concerns about this part, which are highlighted by the
>     messy changes of patch 1. I think it would make more sense to
>     introduce new attach/detach_dev_aux() functions instead of reusing
>     attach/detach_dev()
> 
>     Can we reconsider this and avoid unnecessary complications in IOMMU
>     core and drivers? Does the VFIO code greatly benefit from using the
>     same attach() function? It could as well use a different one for
>     devices in AUXD mode, which the mediating driver could tell by
>     adding a flag in mdev_set_iommu_device(), for example.
> 
>     And I don't think other users of AUXD would benefit from using the
>     same attach() function, since they will know whether they want to be
>     using main or auxiliary domain when doing attach().

I second this. For the arm-smmu-v2 multiple-pagetable model we would either need
new API functions or do something along the lines of 

dom = iommu_domain_alloc()
iommu_set_attr(dom, DOMAIN_ATTR_I_WANT_TO_BE_AUX)
iommu_attach_device(dom,dev)

Either that or do some some dummy device magic that I haven't started to work
out in my head. Having aux specific functions just for this step would make the
arm-smmu-v2 version work out much cleaner.

Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]         ` <11f88122-afd3-a34c-3cd4-db681bf5498b-5wv7dgnIgG8@public.gmane.org>
  2018-10-22 15:35           ` Jordan Crouse
@ 2018-10-22 18:01           ` Jean-Philippe Brucker
  2018-11-06 16:25           ` joro-zLv9SwRftAIdnm+yROfE0A
  2 siblings, 0 replies; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-10-22 18:01 UTC (permalink / raw)
  To: Robin Murphy, Tian, Kevin,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo, rafael-DgEjT+Ai2ygdnm+yROfE0A

On 22/10/2018 12:50, Robin Murphy wrote:
> To me, that sounds like a very good argument for having separate "attach
> as primary domain" and "attach as aux domain" APIs. Say a driver wants
> to use multiple aux domains itself to isolate logically-separate
> transaction streams, but still also needs to control the main domain for
> transactions without PASID (interrupts?)

Yes, MSIs don't have a PASID (well, shouldn't). So in Arm systems, where
the address of the IRQ chip is translated by the SMMU, they are mapped
in the main domain.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]             ` <20181022102254.GA25399-7RUrO8UaCDyr4tA6zuQqW9h3ngVCH38I@public.gmane.org>
@ 2018-10-23  6:56               ` Lu Baolu
       [not found]                 ` <02006e4f-2acf-6ff8-b695-c54c99509b46-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: Lu Baolu @ 2018-10-23  6:56 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, Ashok Raj,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robin.murphy-5wv7dgnIgG8, christian.koenig-5C7GfCeVMHo

Hi,

On 10/22/18 6:22 PM, Raj, Ashok wrote:
> On Mon, Oct 22, 2018 at 12:49:47PM +0800, Lu Baolu wrote:
>> Hi,
>>
>> On 10/20/18 2:11 AM, Jean-Philippe Brucker wrote:
>>> Some devices might support multiple DMA address spaces, in particular
>>> those that have the PCI PASID feature. PASID (Process Address Space ID)
>>> allows to share process address spaces with devices (SVA), partition a
>>> device into VM-assignable entities (VFIO mdev) or simply provide
>>> multiple DMA address space to kernel drivers. Add a global PASID
>>> allocator usable by different drivers at the same time. Name it I/O ASID
>>> to avoid confusion with ASIDs allocated by arch code, which are usually
>>> a separate ID space.
>>>
>>> The IOASID space is global. Each device can have its own PASID space,
>>> but by convention the IOMMU ended up having a global PASID space, so
>>> that with SVA, each mm_struct is associated to a single PASID.
>>>
>>> The allocator doesn't really belong in drivers/iommu because some
>>> drivers would like to allocate PASIDs for devices that aren't managed by
>>> an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
>>> drivers/pci either since platform device also support PASID. Add the
>>> allocator in drivers/base.
>>
>> One concern of moving pasid allocator here is about paravirtual
>> allocation of pasid.
>>
>> Since there is only a single set of pasid tables which is controlled by
> 
> Minor correction: Single system wide PASID namespace, but PASID tables
> would be created ideally per-bdf for isolation purposes.
> 
> I'm sure you meant name space, but didn't want that to be mis-interpreted.

Yes, confirmed.

Best regards,
Lu Baolu

> 
> 
>> the host, the pasid is a system wide resource. When a driver running in
>> a guest VM wants to consume a pasid, it must be intercepted by the
>> simulation software and routed the allocation to the host via VFIO. Some
>> iommu arch's provide mechanisms to aid this, for example, the virtual
>> command interfaces defined in vt-d 3.0. Any pasid used in guest VM
>> should go through the virtual command interfaces.
>>
> 
> Cheers,
> Ashok
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]         ` <d45c5222-68e9-1d6e-730b-bb8dbc060586-5wv7dgnIgG8@public.gmane.org>
@ 2018-10-23 17:16           ` Raj, Ashok
       [not found]             ` <1540314963.21962.20.camel-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2018-10-26  3:00           ` Lu Baolu
  1 sibling, 1 reply; 60+ messages in thread
From: Raj, Ashok @ 2018-10-23 17:16 UTC (permalink / raw)
  To: jean-philippe.brucker-5wv7dgnIgG8
  Cc: Tian, Kevin, Raj, Ashok, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo

On Mon, 2018-10-22 at 17:03 +0100, Jean-Philippe Brucker wrote:
> On 22/10/2018 11:07, Raj, Ashok wrote:
> > >     For my own convenience I've been using the SVA infrastructure
> > > since
> > >     I already had the locking and IOMMU ops in place. The
> > > proposed
> > >     interface is also missing min_pasid and max_pasid parameters,
> > > which
> > >     could be needed by device drivers to enforce PASID limits.
> > 
> > Variable PASID limits per-device is hard to manage. I suppose we
> > can play
> > some games to get this right, but I suspect that wont be very
> > useful in 
> > the long run.
> 
> Devices may have PASID limits that are not discoverable with the PCI
> PASID capability (https://patchwork.kernel.org/patch/9989307/#2138957
> 1).
> Even if we simply reject devices that don't support enough PASIDs, I
> think it's still better to return an error on bind or init_device
> than
> to return a valid PASID that they can't use

That sounds reasonable. What I meant was that trying to do similar
things like what we do for IOVA (Reserving from high etc) may not work
when a process might need to share the same PASID between 2
accelerators.


> 
> > #1: Given that PASID is a system wide resource, during boot iommu
> > driver needs 
> > to scan and set the max PASID to be no more than
> > min(iommu_supported_max_pasid) 
> > of all available IOMMU's. 
> > 
> > #2: In addition if any device supports less than the choosen system
> > max PASID
> > we should simply disable PASID on that device.
> > 
> > The reason is if the process were to bind to 2 or more accelerators
> > and
> > the second device has a limit smaller than the first that the
> > application
> > already attached, the second attemt to bind would fail. (Because
> > we would use the same PASID for a single process)
> 
> But that's not reason enough to completely disable PASID for the
> device,
> it could be the only one bound to that process, or PASID could be
> only
> used privately by the host device driver.

Agree, there could be other use cases. 

If the device was already attached during boot the driver comes early
to get the low PASID space. If the device was hot-added and the PASID
supported by device wasn't available its going to fail.

Enforcing something that will always work will be more reliable. But i
agree it maybe too strict for some cases. 

Maybe its a IOMMU enforced limit for the platform on the minimum
requirement for consistency. 

> 
> > In addition for Intel IOMMU's PASID is a system wide resource. We
> > have
> > a virtual capability for vIOMMU's to allocate PASID's. At the time
> > of
> > allocation we don't know what device this PASID is even being
> > allocated 
> > for. 
> 
> Ah, I had missed that part, I thought the PV allocation had the
> device
> ID as well. That's a significant change.
> 
> I was still hoping that we could go back to per-device PASID spaces
> in
> the host, since I still haven't seen any convincing argument in favor
> of
> system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus
> support more PASIDs in total. Until now PASID was a per-device
> resource
> except for choices made when writing the IOMMU driver.

A global shared PASID table is designed with a future ISA extension for
PASID isolation. 
> 
> > Certainly we could add other intelligence to pass a hint for 
> > max_pasid in the virtiual interface. 
> > 
> > I suppose when this becomes real, most serious accelerators will 
> > support the full width.
> 
> I don't know if that's obvious from the device perspective. If I had
> to
> implement one, I would simply size my PASIDs to the number of
> contexts
> supported by my device (which might be especially small in the
> embedded
> space), given that technically nothing prevents software from
> supporting
> that and the specification allows me to choose a width.

That's very reasonable. But supporting a smaller number of contexts
is different from supporting smaller number of PASID's. A device could
support the full PASID width, but limit the number of contexts to a
smaller number. Certainly if the device vendors don't know upfront that
could be an issue. 

> 
> This was the intended model for PCI, but thankfully version 4.0 added
> an
> implementation note (6.20.2.1 - PASID Field) advising against this
> approach, and to instead support 20 bits for interoperability. Maybe
> it
> will set a trend for non-PCI devices as well.

That's a great suggestion. In fact we have already made this suggestion
 to our PCI WG rep. For Scalable IOV devices the white-paper we
published should hint that we expect devices to support the full 20 bit
PASID width.

> 
> Thanks,
> Jean

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]             ` <1540314963.21962.20.camel-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2018-10-23 22:08               ` Tian, Kevin
  0 siblings, 0 replies; 60+ messages in thread
From: Tian, Kevin @ 2018-10-23 22:08 UTC (permalink / raw)
  To: Raj, Ashok, jean-philippe.brucker-5wv7dgnIgG8
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo

> From: Raj, Ashok
> Sent: Wednesday, October 24, 2018 1:17 AM
> 
> >
> > But that's not reason enough to completely disable PASID for the
> > device,
> > it could be the only one bound to that process, or PASID could be
> > only
> > used privately by the host device driver.
> 
> Agree, there could be other use cases.
> 
> If the device was already attached during boot the driver comes early
> to get the low PASID space. If the device was hot-added and the PASID
> supported by device wasn't available its going to fail.
> 
> Enforcing something that will always work will be more reliable. But i
> agree it maybe too strict for some cases.
> 
> Maybe its a IOMMU enforced limit for the platform on the minimum
> requirement for consistency.
> 

what about keeping the flexibility in common logic (e.g. allowing 
pasid_min and pasid_max in PASID allocator, detecting PASID
width limitation at binding time, etc.), while letting vendor driver
to vote disabling PASID on device based on its own need (e.g. if
not supporting 20bit width).

Thanks
kevin

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]                 ` <02006e4f-2acf-6ff8-b695-c54c99509b46-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
@ 2018-10-23 22:13                   ` Tian, Kevin
  0 siblings, 0 replies; 60+ messages in thread
From: Tian, Kevin @ 2018-10-23 22:13 UTC (permalink / raw)
  To: Lu Baolu, Raj, Ashok
  Cc: Raj, Ashok, rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robin.murphy-5wv7dgnIgG8, christian.koenig-5C7GfCeVMHo

> From: Lu Baolu [mailto:baolu.lu-VuQAYsv1563Yd54FQh9/CA@public.gmane.org]
> Sent: Tuesday, October 23, 2018 2:57 PM
> 
> Hi,
> 
> On 10/22/18 6:22 PM, Raj, Ashok wrote:
> > On Mon, Oct 22, 2018 at 12:49:47PM +0800, Lu Baolu wrote:
> >> Hi,
> >>
> >> On 10/20/18 2:11 AM, Jean-Philippe Brucker wrote:
> >>> Some devices might support multiple DMA address spaces, in particular
> >>> those that have the PCI PASID feature. PASID (Process Address Space ID)
> >>> allows to share process address spaces with devices (SVA), partition a
> >>> device into VM-assignable entities (VFIO mdev) or simply provide
> >>> multiple DMA address space to kernel drivers. Add a global PASID
> >>> allocator usable by different drivers at the same time. Name it I/O ASID
> >>> to avoid confusion with ASIDs allocated by arch code, which are usually
> >>> a separate ID space.
> >>>
> >>> The IOASID space is global. Each device can have its own PASID space,
> >>> but by convention the IOMMU ended up having a global PASID space,
> so
> >>> that with SVA, each mm_struct is associated to a single PASID.
> >>>
> >>> The allocator doesn't really belong in drivers/iommu because some
> >>> drivers would like to allocate PASIDs for devices that aren't managed by
> >>> an IOMMU, using the same ID space as IOMMU. It doesn't really
> belong in
> >>> drivers/pci either since platform device also support PASID. Add the
> >>> allocator in drivers/base.
> >>
> >> One concern of moving pasid allocator here is about paravirtual
> >> allocation of pasid.
> >>
> >> Since there is only a single set of pasid tables which is controlled by
> >
> > Minor correction: Single system wide PASID namespace, but PASID tables
> > would be created ideally per-bdf for isolation purposes.
> >
> > I'm sure you meant name space, but didn't want that to be mis-
> interpreted.
> 
> Yes, confirmed.
> 

Above essentially means that multiple IOASID allocators may exist.
There needs a way for IOMMU driver to choose which one to use, e.g.
for current two usages the core allocator will be used on host and
also in VM (non-VTd cases), and pv allocator will be used in VM w/
virtual VTd.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]         ` <d45c5222-68e9-1d6e-730b-bb8dbc060586-5wv7dgnIgG8@public.gmane.org>
  2018-10-23 17:16           ` Raj, Ashok
@ 2018-10-26  3:00           ` Lu Baolu
  1 sibling, 0 replies; 60+ messages in thread
From: Lu Baolu @ 2018-10-26  3:00 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Raj, Ashok
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, Ashok Raj,
	rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robin.murphy-5wv7dgnIgG8, christian.koenig-5C7GfCeVMHo

Hi,

On 10/23/18 12:03 AM, Jean-Philippe Brucker wrote:
>> #1: Given that PASID is a system wide resource, during boot iommu driver needs
>> to scan and set the max PASID to be no more than min(iommu_supported_max_pasid)
>> of all available IOMMU's.
>>
>> #2: In addition if any device supports less than the choosen system max PASID
>> we should simply disable PASID on that device.
>>
>> The reason is if the process were to bind to 2 or more accelerators and
>> the second device has a limit smaller than the first that the application
>> already attached, the second attemt to bind would fail. (Because
>> we would use the same PASID for a single process)
> But that's not reason enough to completely disable PASID for the device,
> it could be the only one bound to that process, or PASID could be only
> used privately by the host device driver.
> 
>> In addition for Intel IOMMU's PASID is a system wide resource. We have
>> a virtual capability for vIOMMU's to allocate PASID's. At the time of
>> allocation we don't know what device this PASID is even being allocated
>> for.
> Ah, I had missed that part, I thought the PV allocation had the device
> ID as well. That's a significant change.
> 
> I was still hoping that we could go back to per-device PASID spaces in
> the host, since I still haven't seen any convincing argument in favor of
> system-wide PASID. Get rid of #1 in order to solve #2, and as a bonus
> support more PASIDs in total. Until now PASID was a per-device resource
> except for choices made when writing the IOMMU driver.
> 

System wide PASID makes things simple and workable in many use cases
although it looks a bit coarse grained.

For an example, in a use case of one application manipulating multiple
devices with a single PASID, it will be complicated to find a PASID
suitable for all the per-device PASID spaces. It will become more
complicated when it comes to PV allocation since PV context might have
no device information when it requires a PASID.

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]     ` <20181022164834.GH26762-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
@ 2018-11-02  3:19       ` Lu Baolu
  0 siblings, 0 replies; 60+ messages in thread
From: Lu Baolu @ 2018-11-02  3:19 UTC (permalink / raw)
  To: Jean-Philippe Brucker,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	xuzaibo-hv44wF8Li93QT0dZR+AlfA,
	kevin.tian-ral2JQCrhuEAvxtiuMwx3w,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	robin.murphy-5wv7dgnIgG8, andrew.murray-5wv7dgnIgG8,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, christian.koenig-5C7GfCeVMHo,
	jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA

Hi,

On 10/23/18 12:48 AM, Jordan Crouse wrote:
> On Fri, Oct 19, 2018 at 07:11:52PM +0100, Jean-Philippe Brucker wrote:
> 
>> (2) Allocate a domain and attach it to the device.
>>
>>        dom = iommu_domain_alloc()
>>        iommu_attach_device(dom, dev)
>>
>>      I still have concerns about this part, which are highlighted by the
>>      messy changes of patch 1. I think it would make more sense to
>>      introduce new attach/detach_dev_aux() functions instead of reusing
>>      attach/detach_dev()
>>
>>      Can we reconsider this and avoid unnecessary complications in IOMMU
>>      core and drivers? Does the VFIO code greatly benefit from using the
>>      same attach() function? It could as well use a different one for
>>      devices in AUXD mode, which the mediating driver could tell by
>>      adding a flag in mdev_set_iommu_device(), for example.
>>
>>      And I don't think other users of AUXD would benefit from using the
>>      same attach() function, since they will know whether they want to be
>>      using main or auxiliary domain when doing attach().
> 
> I second this. For the arm-smmu-v2 multiple-pagetable model we would either need
> new API functions or do something along the lines of
> 
> dom = iommu_domain_alloc()
> iommu_set_attr(dom, DOMAIN_ATTR_I_WANT_TO_BE_AUX)
> iommu_attach_device(dom,dev)
> 
> Either that or do some some dummy device magic that I haven't started to work
> out in my head. Having aux specific functions just for this step would make the
> arm-smmu-v2 version work out much cleaner.

Okay! If there are no more comments, I will prepare a new version with
aux specific functions.

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]         ` <11f88122-afd3-a34c-3cd4-db681bf5498b-5wv7dgnIgG8@public.gmane.org>
  2018-10-22 15:35           ` Jordan Crouse
  2018-10-22 18:01           ` Jean-Philippe Brucker
@ 2018-11-06 16:25           ` joro-zLv9SwRftAIdnm+yROfE0A
       [not found]             ` <20181106162539.4gmkvg57mja3bn7k-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2 siblings, 1 reply; 60+ messages in thread
From: joro-zLv9SwRftAIdnm+yROfE0A @ 2018-11-06 16:25 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tian, Kevin, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	christian.koenig-5C7GfCeVMHo

On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote:
> To me, that sounds like a very good argument for having separate "attach as
> primary domain" and "attach as aux domain" APIs.

I agree with that, overloading iommu_attach_device() to support
aux-domains is not going to be a maintainable solution. I'd like this
to be a two-level approach, where the aux-domains are sub-domains of a
primary domain (naming is still to-be-improved):


	struct iommu_domain *domain;

	domain = iommu_alloc_aux_domain();

	iommu_attach_device(pdev, domain); /* Fails if device has not
					      support for this
					      domain-type */

	/* Bind a process address space to the aux-domain */
	sva_handle = iommu_sva_bind_mm(domain, current, ...);
	pasid = iommu_sva_get_pasid(sva_handle);

	mdev_handle = iommu_mdev_alloc_domain(domain);
	iommu_mdev_map(mdev_handle, ...);
	iommu_mdev_unmap(mdev_handle, ...);

	/* Do more work */

	iommu_sva_unbind_mm(sva_handle);

	iommu_mdev_free_domain(mdev_handle);

	iommu_detach_device(domain, pdev);


Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]             ` <20181106162539.4gmkvg57mja3bn7k-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2018-11-07  3:40               ` Lu Baolu
       [not found]                 ` <e22e3631-2ecb-664d-5666-8e0f865dec83-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
  2018-11-08 18:29               ` Jean-Philippe Brucker
  1 sibling, 1 reply; 60+ messages in thread
From: Lu Baolu @ 2018-11-07  3:40 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, Robin Murphy
  Cc: Tian, Kevin, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	christian.koenig-5C7GfCeVMHo

Hi Joerg,

On 11/7/18 12:25 AM, joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org wrote:
> On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote:
>> To me, that sounds like a very good argument for having separate "attach as
>> primary domain" and "attach as aux domain" APIs.
> 
> I agree with that, overloading iommu_attach_device() to support
> aux-domains is not going to be a maintainable solution. I'd like this
> to be a two-level approach, where the aux-domains are sub-domains of a
> primary domain (naming is still to-be-improved):
> 
> 
> 	struct iommu_domain *domain;
> 
> 	domain = iommu_alloc_aux_domain();
> 
> 	iommu_attach_device(pdev, domain); /* Fails if device has not
> 					      support for this
> 					      domain-type */
> 
> 	/* Bind a process address space to the aux-domain */
> 	sva_handle = iommu_sva_bind_mm(domain, current, ...);
> 	pasid = iommu_sva_get_pasid(sva_handle);
> 
> 	mdev_handle = iommu_mdev_alloc_domain(domain);
> 	iommu_mdev_map(mdev_handle, ...);
> 	iommu_mdev_unmap(mdev_handle, ...);
> 
> 	/* Do more work */
> 
> 	iommu_sva_unbind_mm(sva_handle);
> 
> 	iommu_mdev_free_domain(mdev_handle);
> 
> 	iommu_detach_device(domain, pdev);
> 

Aux-domain is not used for sva case. Please allow me to introduce some
backgrounds of the aux-domain.

Previously we have RID (Request ID) granular DMA translation, hence only
a single domain might be attached to a device. The same domain could be
attached to multiple devices if they are configured to be assigned to a
same application (or VM).

    .--------------.                            .--------------.
    | PCI device A |    .------------------.    | PCI device B |
    |              |<---|   Domain         |--->|              |
    |              |    '------------------'    |              |
    |              |                            |              |
    |              |                            |              |
    |              |                            |              |
    |              |                            |              |
    '--------------'                            '--------------'

Nowadays, we can find PASID granular DMA translation on both ARM and x86
platforms. With PASID granular DMA translation supported in system 
iommu, if a device divides itself into multiple subsets and tag the DMA
transfers of each subset with a unique PASID, each subset become
assignable. We call the assignable subset as an ADI (Assignable Device
Interface). As the result, each ADI could be attached with a domain.

So for a device described above, a primary domain will be attached to it
for DMA transfers without PASID, and an aux-domain might be attached to
each ADI for transfers tagged with a PASID#xyz.

      .--------------.     .------------------.
      | PCI device A |<----| Primary Domain   |
      |              |     '------------------'
      .--------------.     .------------------.
      | ADI(PASID#x) |<----| Aux Domain1      |
      .--------------.     .------------------.
      | ADI(PASID#y) |<----| Aux Domain2      |
      .--------------.     .------------------.
      | ADI(PASID#z) |<----| Aux Domain3      |
      '--------------'     '------------------'
      |              |
      '--------------'

Conceptually, if we look a domain as an address boundary of each
assignable element, there is no difference between a primary domain and
an aux domain. The difference exists only in the vendor specific iommu
driver, say primary domains and aux domains use different translation
tables.

Further more, a single domain might be attached to an ADI of device A,
while attached to another legacy device B which doesn't support PASID
features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in
aux mode and attached to device B in primary mode.

      .--------------.     .------------------.     .---------------.
      | PCI device A |<----| Primary Domain   |     | PCI device B  |
      |              |     '------------------'     |               |
      .--------------.     .------------------.     |               |
      | ADI(PASID#x) |<----|   Domain 4       |---->|               |
      .--------------.     .------------------.     |               |
      | ADI(PASID#y) |<----| Aux Domain2      |     |               |
      .--------------.     .------------------.     |               |
      | ADI(PASID#z) |<----| Aux Domain3      |     |               |
      '--------------'     '------------------'     |               |
      |              |                              |               |
      '--------------'                              '---------------'

Based on this, actually we can reuse all existing domain APIs for aux
domain.

     iommu_domain_alloc()
     iommu_domain_free()
     iommu_attach_device()
     iommu_detach_device()
     iommu_map()
     iommu_unmap()

During discussion, we found that reusing iommu_at(de)tach_device() will
cause some changes in iommu core due to the implementation of default
domain. So Jean suggested in this thread that we could consider to use
aux domain specific interfaces for domain attaching and detaching.
We followed Jean's suggestion and submitted a new version with aux
domain specific interfaces for domain attaching and detaching.

https://lkml.org/lkml/2018/11/5/209

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]     ` <20181019181158.2395-3-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
  2018-10-22  4:49       ` Lu Baolu
@ 2018-11-07  4:53       ` Lu Baolu
       [not found]         ` <fb2bd5fe-5742-fcd8-b8f0-1885040e8d4f-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
  2018-11-12 14:40       ` Joerg Roedel
  2 siblings, 1 reply; 60+ messages in thread
From: Lu Baolu @ 2018-11-07  4:53 UTC (permalink / raw)
  To: Jean-Philippe Brucker, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

Hi,

On 10/20/18 2:11 AM, Jean-Philippe Brucker wrote:
> +static inline void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
> +{
> +	return -ESRCH;

return NULL;

Best regards,
Lu Baolu

> +}
> +#endif /* CONFIG_IOASID */
> +#endif /* __LINUX_IOASID_H */

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                 ` <e22e3631-2ecb-664d-5666-8e0f865dec83-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
@ 2018-11-07 16:43                   ` joro-zLv9SwRftAIdnm+yROfE0A
       [not found]                     ` <20181107164323.GA19831-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: joro-zLv9SwRftAIdnm+yROfE0A @ 2018-11-07 16:43 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Tian, Kevin, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi,

On Wed, Nov 07, 2018 at 11:40:30AM +0800, Lu Baolu wrote:
> Hi Joerg,
> 
> On 11/7/18 12:25 AM, joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org wrote:
> Nowadays, we can find PASID granular DMA translation on both ARM and x86
> platforms. With PASID granular DMA translation supported in system iommu, if
> a device divides itself into multiple subsets and tag the DMA
> transfers of each subset with a unique PASID, each subset become
> assignable. We call the assignable subset as an ADI (Assignable Device
> Interface). As the result, each ADI could be attached with a domain.

Yeah, I know the background. The point is, the IOMMU-API as of today
implements a strict one-to-one relationship between domains and devices,
every device can only have one domain assigned at a given time. If we
assign a new domain to a device, the old gets unassigned.

If we allow to attach multiple domains to a single device we
fundamentally break that semantic.

Therefore I think it is better is support the ADI devices with
subdomains and a new set of functions in the API to handle only these
sub-domains.

> Further more, a single domain might be attached to an ADI of device A,
> while attached to another legacy device B which doesn't support PASID
> features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in
> aux mode and attached to device B in primary mode.

This will of course not work with subdomains, but is that really
necessary? VFIO already allocates a separate domain for each device
anyway (iirc), so it is unlikely that is uses the same domain for a
legacy and an ADI device.


Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                     ` <20181107164323.GA19831-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2018-11-07 17:23                       ` Alex Williamson
  2018-11-21  4:40                       ` Lu Baolu
  1 sibling, 0 replies; 60+ messages in thread
From: Alex Williamson @ 2018-11-07 17:23 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: Tian, Kevin, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	Jean-Philippe Brucker, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

On Wed, 7 Nov 2018 17:43:23 +0100
"joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org" <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org> wrote:

> Hi,
> 
> On Wed, Nov 07, 2018 at 11:40:30AM +0800, Lu Baolu wrote:
> > Hi Joerg,
> > 
> > On 11/7/18 12:25 AM, joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org wrote:
> > Nowadays, we can find PASID granular DMA translation on both ARM and x86
> > platforms. With PASID granular DMA translation supported in system iommu, if
> > a device divides itself into multiple subsets and tag the DMA
> > transfers of each subset with a unique PASID, each subset become
> > assignable. We call the assignable subset as an ADI (Assignable Device
> > Interface). As the result, each ADI could be attached with a domain.  
> 
> Yeah, I know the background. The point is, the IOMMU-API as of today
> implements a strict one-to-one relationship between domains and devices,
> every device can only have one domain assigned at a given time. If we
> assign a new domain to a device, the old gets unassigned.
> 
> If we allow to attach multiple domains to a single device we
> fundamentally break that semantic.
> 
> Therefore I think it is better is support the ADI devices with
> subdomains and a new set of functions in the API to handle only these
> sub-domains.
> 
> > Further more, a single domain might be attached to an ADI of device A,
> > while attached to another legacy device B which doesn't support PASID
> > features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in
> > aux mode and attached to device B in primary mode.  
> 
> This will of course not work with subdomains, but is that really
> necessary? VFIO already allocates a separate domain for each device
> anyway (iirc), so it is unlikely that is uses the same domain for a
> legacy and an ADI device.

VFIO will attempt to use the same domain for all groups within a
container, only falling back to separate domains if there's an
incompatibility.  Multiple domains means that we need to mirror mapping
and unmapping across each domain, so there's more work to do and
theoretically more overhead in the IOMMU implementation assuming those
domains land on the same IOMMU driver.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]             ` <20181106162539.4gmkvg57mja3bn7k-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2018-11-07  3:40               ` Lu Baolu
@ 2018-11-08 18:29               ` Jean-Philippe Brucker
       [not found]                 ` <42949d93-e22c-dd4d-cd49-46efc0b73cdb-5wv7dgnIgG8@public.gmane.org>
  1 sibling, 1 reply; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-11-08 18:29 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, Robin Murphy
  Cc: Tian, Kevin, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	christian.koenig-5C7GfCeVMHo

Hi,

On 06/11/2018 16:25, joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org wrote:
> On Mon, Oct 22, 2018 at 12:50:56PM +0100, Robin Murphy wrote:
>> To me, that sounds like a very good argument for having separate "attach as
>> primary domain" and "attach as aux domain" APIs.
> 
> I agree with that, overloading iommu_attach_device() to support
> aux-domains is not going to be a maintainable solution. I'd like this
> to be a two-level approach, where the aux-domains are sub-domains of a
> primary domain (naming is still to-be-improved):
> 
> 
> 	struct iommu_domain *domain;
> 
> 	domain = iommu_alloc_aux_domain();
> 
> 	iommu_attach_device(pdev, domain); /* Fails if device has not
> 					      support for this
> 					      domain-type */

I'm still not sure how this should interact with MSI on Arm. On x86 the
MSI doorbell is a reserved IOVA region in the IOMMU, but for Arm it's a
MMIO register in a separate IRQ chip, upstream of the SMMU. From the
SMMU point of view, MSI is a normal DMA without PASID, and the
corresponding IOMMU mapping must be present.

SMMU mappings for the MSI doorbell are created by the IRQ driver, when
the device driver requests an IRQ. Roughly speaking it goes like:

 request_irq()
  irq_chip_compose_msi_msg()
   iommu_dma_map_msi_msg()
    iommu_map(iommu_get_domain_for_dev(dev), ...)

Currently for "normal" device drivers, this will create the MSI mapping
on the default_domain (which is created before the drivers' probe()).
Device drivers that attach an unmanaged domain *and* use MSIs - only
VFIO so far - check if the MSI needs to be mapped, and then call
iommu_get_msi_cookie() before requesting IRQs.

With my previous SVA proposals the default_domain was used for SVA, so
we didn't need any change. But now, depending on the order between the
IRQ request and this attach(), iommu_get_domain_for_dev() will return
either default_domain or the new domain.

(1) My initial approach would have been to use the same page tables for
the default_domain and this new domain, but it might be precisely what
you were trying to avoid with this new proposal: a device attached to
two domains at the same time.

(2) Another solution is to enforce an order between request_irq() and
iommu_attach_device() for all device drivers that intend to use SVA, and
have them do the MSI mapping themselves between attaching the domain and
requesting IRQs.

I don't really like either but to make things easier for device drivers,
I would go with option (1).

Thanks,
Jean

> 
> 	/* Bind a process address space to the aux-domain */
> 	sva_handle = iommu_sva_bind_mm(domain, current, ...);
> 	pasid = iommu_sva_get_pasid(sva_handle);
> 
> 	mdev_handle = iommu_mdev_alloc_domain(domain);
> 	iommu_mdev_map(mdev_handle, ...);
> 	iommu_mdev_unmap(mdev_handle, ...);
> 
> 	/* Do more work */
> 
> 	iommu_sva_unbind_mm(sva_handle);
> 
> 	iommu_mdev_free_domain(mdev_handle);
> 
> 	iommu_detach_device(domain, pdev);
> 
> 
> Regards,
> 
> 	Joerg
> _______________________________________________
> iommu mailing list
> iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]         ` <fb2bd5fe-5742-fcd8-b8f0-1885040e8d4f-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
@ 2018-11-08 18:51           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-11-08 18:51 UTC (permalink / raw)
  To: Lu Baolu, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Will Deacon,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

On 07/11/2018 04:53, Lu Baolu wrote:
> Hi,
> 
> On 10/20/18 2:11 AM, Jean-Philippe Brucker wrote:
>> +static inline void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
>> +{
>> +	return -ESRCH;
> 
> return NULL;

I'll fix it, thanks

Jean

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]     ` <20181019181158.2395-3-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
  2018-10-22  4:49       ` Lu Baolu
  2018-11-07  4:53       ` Lu Baolu
@ 2018-11-12 14:40       ` Joerg Roedel
       [not found]         ` <20181112144039.GA25808-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2 siblings, 1 reply; 60+ messages in thread
From: Joerg Roedel @ 2018-11-12 14:40 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	robin.murphy-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	christian.koenig-5C7GfCeVMHo

Hi Jean-Philippe,

On Fri, Oct 19, 2018 at 07:11:54PM +0100, Jean-Philippe Brucker wrote:
> The allocator doesn't really belong in drivers/iommu because some
> drivers would like to allocate PASIDs for devices that aren't managed by
> an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
> drivers/pci either since platform device also support PASID. Add the
> allocator in drivers/base.

I don't really buy this argument, in the end it is the IOMMU that
enforces the PASID mappings for a device. Why should a device not behind
an IOMMU need to allocate a pasid? Do you have examples of such devices
which also don't have their own iommu built-in?

> +int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func, void *data)
> +{
> +	int ret;
> +	struct ioasid_iter_data iter_data = {
> +		.set	= set,
> +		.func	= func,
> +		.data	= data,
> +	};
> +
> +	idr_lock(&ioasid_idr);
> +	ret = idr_for_each(&ioasid_idr, ioasid_iter, &iter_data);
> +	idr_unlock(&ioasid_idr);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_for_each);

What is the intended use-case for this function?

> +void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
> +{
> +	void *priv = NULL;
> +	struct ioasid_data *ioasid_data;
> +
> +	idr_lock(&ioasid_idr);
> +	ioasid_data = idr_find(&ioasid_idr, ioasid);
> +	if (ioasid_data && ioasid_data->set == set)
> +		priv = ioasid_data->private;
> +	idr_unlock(&ioasid_idr);
> +
> +	return priv;
> +}

I think using the idr_lock here will become a performance problem, as we
need this function in the SVA page-fault path to retrieve the mm_struct
belonging to a PASID.

The head comment in lib/idr.c states that it should be safe to use
rcu_readlock() on read-only iterations. If so, this should be used here
so that concurrent lookups are possible.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                 ` <42949d93-e22c-dd4d-cd49-46efc0b73cdb-5wv7dgnIgG8@public.gmane.org>
@ 2018-11-12 14:55                   ` joro-zLv9SwRftAIdnm+yROfE0A
       [not found]                     ` <20181112145541.GB25808-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: joro-zLv9SwRftAIdnm+yROfE0A @ 2018-11-12 14:55 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Tian, Kevin, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi Jean-Philippe,

On Thu, Nov 08, 2018 at 06:29:42PM +0000, Jean-Philippe Brucker wrote:
> (1) My initial approach would have been to use the same page tables for
> the default_domain and this new domain, but it might be precisely what
> you were trying to avoid with this new proposal: a device attached to
> two domains at the same time.

My request was about the initial assumptions a device driver can make
about a device. This assumptions is that DMA-API is set up and
initialized for it, for everything else (like SVA) the device driver
needs to take special action, like allocating an SVA domain and attaching
the device to it.

This should of course not break any IRQ setups for the device, and also
enforcing an ordering is not a good and maintainable solution.

So what you could do here is to either:

	1) Save the needed IRQ mappings in some extra datastructure and
	   duplicate it in the SVA domain. This also makes it easier to
	   use the same SVA domain with multiple devices.

	2) Just re-use the 'page-tables' from the default domain in the
	   sva-domain. This needs to happen at attach-time, because at
	   allocation time you don't know the device yet.

I think 1) is the best solution, what do you think?

Btw, things would be different if we could expose SVA through the
DMA-API to drivers. In this case we could just make the default domain
of type SVA and be done, but we are not there yet.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                     ` <20181107164323.GA19831-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2018-11-07 17:23                       ` Alex Williamson
@ 2018-11-21  4:40                       ` Lu Baolu
       [not found]                         ` <758bb120-5bc0-1e2d-ccd0-9be0bcc5d8bc-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
  1 sibling, 1 reply; 60+ messages in thread
From: Lu Baolu @ 2018-11-21  4:40 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: Tian, Kevin, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi Joerg,

On 11/8/18 12:43 AM, joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org wrote:
> Hi,
> 
> On Wed, Nov 07, 2018 at 11:40:30AM +0800, Lu Baolu wrote:
>> Hi Joerg,
>>
>> On 11/7/18 12:25 AM, joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org wrote:
>> Nowadays, we can find PASID granular DMA translation on both ARM and x86
>> platforms. With PASID granular DMA translation supported in system iommu, if
>> a device divides itself into multiple subsets and tag the DMA
>> transfers of each subset with a unique PASID, each subset become
>> assignable. We call the assignable subset as an ADI (Assignable Device
>> Interface). As the result, each ADI could be attached with a domain.
> 
> Yeah, I know the background. The point is, the IOMMU-API as of today
> implements a strict one-to-one relationship between domains and devices,
> every device can only have one domain assigned at a given time. If we
> assign a new domain to a device, the old gets unassigned.
> 
> If we allow to attach multiple domains to a single device we
> fundamentally break that semantic.

Yes. In the latest v4 submission, we use the aux-domain specific APIs to
attach/detach the domain to a device. Do you see other APIs that will
possibly break this semantic?

> 
> Therefore I think it is better is support the ADI devices with
> subdomains and a new set of functions in the API to handle only these
> sub-domains.

Can you please elaborate a bit more about the concept of subdomains?
 From my point of view, an aux-domain is a normal un-managed domain which
has a PASID and could be attached to any ADIs through the aux-domain
specific attach/detach APIs. It could also be attached to a device with
the existing domain_attach/detach_device() APIs at the same time, hence
mdev and pci devices in a vfio container could share a domain.

Best regards,
Lu Baolu

> 
>> Further more, a single domain might be attached to an ADI of device A,
>> while attached to another legacy device B which doesn't support PASID
>> features. In this case, we say "Domain 4" is attached to ADI(PASID#x) in
>> aux mode and attached to device B in primary mode.
> 
> This will of course not work with subdomains, but is that really
> necessary? VFIO already allocates a separate domain for each device
> anyway (iirc), so it is unlikely that is uses the same domain for a
> legacy and an ADI device.
> 
> 
> Regards,
> 
> 	Joerg
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]         ` <20181112144039.GA25808-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2018-11-21 11:16           ` Jean-Philippe Brucker
       [not found]             ` <8f17757a-c657-aab9-a6a0-fb0cc9c610a8-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-11-21 11:16 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo

On 12/11/2018 14:40, Joerg Roedel wrote:
> Hi Jean-Philippe,
> 
> On Fri, Oct 19, 2018 at 07:11:54PM +0100, Jean-Philippe Brucker wrote:
>> The allocator doesn't really belong in drivers/iommu because some
>> drivers would like to allocate PASIDs for devices that aren't managed by
>> an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
>> drivers/pci either since platform device also support PASID. Add the
>> allocator in drivers/base.
> 
> I don't really buy this argument, in the end it is the IOMMU that
> enforces the PASID mappings for a device. Why should a device not behind
> an IOMMU need to allocate a pasid? Do you have examples of such devices
> which also don't have their own iommu built-in?

I misunderstood that requirement. Reading through the thread again
(https://www.mail-archive.com/iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org/msg25640.html)
it's more about a device using PASIDs as context IDs. Some contexts are
not bound to processes but they still need their ID in the same PASID
space as the contexts that are bound to process address spaces. So we
can keep that code in drivers/iommu

>> +int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func, void *data)
>> +{
>> +	int ret;
>> +	struct ioasid_iter_data iter_data = {
>> +		.set	= set,
>> +		.func	= func,
>> +		.data	= data,
>> +	};
>> +
>> +	idr_lock(&ioasid_idr);
>> +	ret = idr_for_each(&ioasid_idr, ioasid_iter, &iter_data);
>> +	idr_unlock(&ioasid_idr);
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(ioasid_for_each);
> 
> What is the intended use-case for this function?

I'm using it in sva_bind(), to see if there already exists an io_mm
associated to the mm_struct given as argument

>> +void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
>> +{
>> +	void *priv = NULL;
>> +	struct ioasid_data *ioasid_data;
>> +
>> +	idr_lock(&ioasid_idr);
>> +	ioasid_data = idr_find(&ioasid_idr, ioasid);
>> +	if (ioasid_data && ioasid_data->set == set)
>> +		priv = ioasid_data->private;
>> +	idr_unlock(&ioasid_idr);
>> +
>> +	return priv;
>> +}
> 
> I think using the idr_lock here will become a performance problem, as we
> need this function in the SVA page-fault path to retrieve the mm_struct
> belonging to a PASID.
> 
> The head comment in lib/idr.c states that it should be safe to use
> rcu_readlock() on read-only iterations. If so, this should be used here
> so that concurrent lookups are possible.

Agreed. I have a patch to use the RCU which looks simple enough, but
since I haven't used RCU before I wasn't comfortable squashing it right
away.

Thanks,
Jean

diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
index 5eb3abbf69ac..fa3b48c9c044 100644
--- a/drivers/base/ioasid.c
+++ b/drivers/base/ioasid.c
@@ -13,6 +13,7 @@ struct ioasid_data {
 	int id;
 	struct ioasid_set *set;
 	void *private;
+	struct rcu_head rcu;
 };

 static DEFINE_IDR(ioasid_idr);
@@ -56,7 +57,8 @@ void ioasid_free(ioasid_t ioasid)
 	ioasid_data = idr_remove(&ioasid_idr, ioasid);
 	idr_unlock(&ioasid_idr);

-	kfree(ioasid_data);
+	if (ioasid_data)
+		kfree_rcu(ioasid_data, rcu);
 }
 EXPORT_SYMBOL_GPL(ioasid_free);

@@ -95,9 +97,9 @@ int ioasid_for_each(struct ioasid_set *set,
ioasid_iter_t func, void *data)
 		.data	= data,
 	};

-	idr_lock(&ioasid_idr);
+	rcu_read_lock();
 	ret = idr_for_each(&ioasid_idr, ioasid_iter, &iter_data);
-	idr_unlock(&ioasid_idr);
+	rcu_read_unlock();

 	return ret;
 }
@@ -111,11 +113,11 @@ void *ioasid_find(struct ioasid_set *set, ioasid_t
ioasid)
 	void *priv = NULL;
 	struct ioasid_data *ioasid_data;

-	idr_lock(&ioasid_idr);
+	rcu_read_lock();
 	ioasid_data = idr_find(&ioasid_idr, ioasid);
 	if (ioasid_data && ioasid_data->set == set)
 		priv = ioasid_data->private;
-	idr_unlock(&ioasid_idr);
+	rcu_read_unlock();

 	return priv;
 }

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                     ` <20181112145541.GB25808-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2018-11-21 19:05                       ` Jean-Philippe Brucker
       [not found]                         ` <5dcf9238-62b2-8df6-b378-183ee09c5951-5wv7dgnIgG8@public.gmane.org>
  2018-11-22  8:39                       ` Tian, Kevin
  1 sibling, 1 reply; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-11-21 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: Tian, Kevin, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

On 12/11/2018 14:55, joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org wrote:
> Hi Jean-Philippe,
> 
> On Thu, Nov 08, 2018 at 06:29:42PM +0000, Jean-Philippe Brucker wrote:
>> (1) My initial approach would have been to use the same page tables for
>> the default_domain and this new domain, but it might be precisely what
>> you were trying to avoid with this new proposal: a device attached to
>> two domains at the same time.
> 
> My request was about the initial assumptions a device driver can make
> about a device. This assumptions is that DMA-API is set up and
> initialized for it, for everything else (like SVA) the device driver
> needs to take special action, like allocating an SVA domain and attaching
> the device to it.
> 
> This should of course not break any IRQ setups for the device, and also
> enforcing an ordering is not a good and maintainable solution.
> 
> So what you could do here is to either:
> 
> 	1) Save the needed IRQ mappings in some extra datastructure and
> 	   duplicate it in the SVA domain. This also makes it easier to
> 	   use the same SVA domain with multiple devices.
> 
> 	2) Just re-use the 'page-tables' from the default domain in the
> 	   sva-domain. This needs to happen at attach-time, because at
> 	   allocation time you don't know the device yet.
> 
> I think 1) is the best solution, what do you think?

Right now 2) seems more feasible, because changes will be limited to the
IOMMU driver and don't spill in the dma-iommu layer. Maybe it will be
clearer to me once I write a prototype.

> Btw, things would be different if we could expose SVA through the
> DMA-API to drivers. In this case we could just make the default domain
> of type SVA and be done, but we are not there yet.

For the moment though, I think we should allow device drivers to use the
DMA-API at the same time as SVA. If a device driver has to map a
management ring buffer for example, it's much nicer to use dma_alloc as
usual rather than have them write their own DMA allocation routines.
Which is another reason to implement 2) above: the DMA-API would still
act on the default_domain, and attaching an "extended" domain augments
it with SVA features. Detaching from the device doesn't require copying
mappings back to the default domain. Does that sound OK?

Then if vfio-pci wants to use SVA, maybe we can make the UNMANAGED
domain support SVA by default? Otherwise introduce a UNMANAGED+ext
domain. Anyway, that's for later.

This is what I currently plan to add:

	struct io_mm *io_mm;
	struct iommu_domain *domain;

	domain = iommu_alloc_ext_domain(bus);

	/* Set an mm-exit callback to disable PASIDs. More attributes
	   could be added later to change the capabilities of the ext
	   domain */
	iommu_domain_set_attr(domain, DOMAIN_ATTR_MM_EXIT_CB, &mm_exit);

	/* Fails if the device doesn't support this domain type */
	iommu_attach_device(domain, dev);

	/* Same as SVA v3, except a domain instead of dev as argument */
	io_mm = iommu_sva_bind_mm(domain, current->mm, ...);

	/* on success, install the PASID in the device */
	pasid = io_mm->pasid;

	/* Do more work */

	iommu_sva_unbind_mm(io_mm);
	iommu_detach_device(domain, dev);


Thanks,
Jean

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]             ` <8f17757a-c657-aab9-a6a0-fb0cc9c610a8-5wv7dgnIgG8@public.gmane.org>
@ 2018-11-21 19:10               ` Koenig, Christian
       [not found]                 ` <62f05552-df46-6e12-10ed-820429dfda59-5C7GfCeVMHo@public.gmane.org>
  2018-11-22  8:44               ` Joerg Roedel
  1 sibling, 1 reply; 60+ messages in thread
From: Koenig, Christian @ 2018-11-21 19:10 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Joerg Roedel
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8

Am 21.11.18 um 12:16 schrieb Jean-Philippe Brucker:
> On 12/11/2018 14:40, Joerg Roedel wrote:
>> Hi Jean-Philippe,
>>
>> On Fri, Oct 19, 2018 at 07:11:54PM +0100, Jean-Philippe Brucker wrote:
>>> The allocator doesn't really belong in drivers/iommu because some
>>> drivers would like to allocate PASIDs for devices that aren't managed by
>>> an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
>>> drivers/pci either since platform device also support PASID. Add the
>>> allocator in drivers/base.
>> I don't really buy this argument, in the end it is the IOMMU that
>> enforces the PASID mappings for a device. Why should a device not behind
>> an IOMMU need to allocate a pasid? Do you have examples of such devices
>> which also don't have their own iommu built-in?
> I misunderstood that requirement. Reading through the thread again
> (https://www.mail-archive.com/iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org/msg25640.html)
> it's more about a device using PASIDs as context IDs. Some contexts are
> not bound to processes but they still need their ID in the same PASID
> space as the contexts that are bound to process address spaces. So we
> can keep that code in drivers/iommu

The problem with that approach is that we also need this allocator when 
IOMMU is completely disabled.

In other words PASIDs are used as contexts IDs by the hardware for 
things like signaling which application has caused an interrupt/event 
even when they are not used by IOMMU later on.

Additional to that we have some MMUs which are internal to the devices 
(GPUVM on AMD GPUs for example, similar exists for NVidia) which uses 
PASIDs for address translation internally in the device.

All of that is completely unrelated to IOMMU, but when IOMMU is enabled 
you need to use the same allocator because all use cases use the same ID 
space.

Regards,
Christian.

>
>>> +int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func, void *data)
>>> +{
>>> +	int ret;
>>> +	struct ioasid_iter_data iter_data = {
>>> +		.set	= set,
>>> +		.func	= func,
>>> +		.data	= data,
>>> +	};
>>> +
>>> +	idr_lock(&ioasid_idr);
>>> +	ret = idr_for_each(&ioasid_idr, ioasid_iter, &iter_data);
>>> +	idr_unlock(&ioasid_idr);
>>> +
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(ioasid_for_each);
>> What is the intended use-case for this function?
> I'm using it in sva_bind(), to see if there already exists an io_mm
> associated to the mm_struct given as argument
>
>>> +void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
>>> +{
>>> +	void *priv = NULL;
>>> +	struct ioasid_data *ioasid_data;
>>> +
>>> +	idr_lock(&ioasid_idr);
>>> +	ioasid_data = idr_find(&ioasid_idr, ioasid);
>>> +	if (ioasid_data && ioasid_data->set == set)
>>> +		priv = ioasid_data->private;
>>> +	idr_unlock(&ioasid_idr);
>>> +
>>> +	return priv;
>>> +}
>> I think using the idr_lock here will become a performance problem, as we
>> need this function in the SVA page-fault path to retrieve the mm_struct
>> belonging to a PASID.
>>
>> The head comment in lib/idr.c states that it should be safe to use
>> rcu_readlock() on read-only iterations. If so, this should be used here
>> so that concurrent lookups are possible.
> Agreed. I have a patch to use the RCU which looks simple enough, but
> since I haven't used RCU before I wasn't comfortable squashing it right
> away.
>
> Thanks,
> Jean
>
> diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
> index 5eb3abbf69ac..fa3b48c9c044 100644
> --- a/drivers/base/ioasid.c
> +++ b/drivers/base/ioasid.c
> @@ -13,6 +13,7 @@ struct ioasid_data {
>   	int id;
>   	struct ioasid_set *set;
>   	void *private;
> +	struct rcu_head rcu;
>   };
>
>   static DEFINE_IDR(ioasid_idr);
> @@ -56,7 +57,8 @@ void ioasid_free(ioasid_t ioasid)
>   	ioasid_data = idr_remove(&ioasid_idr, ioasid);
>   	idr_unlock(&ioasid_idr);
>
> -	kfree(ioasid_data);
> +	if (ioasid_data)
> +		kfree_rcu(ioasid_data, rcu);
>   }
>   EXPORT_SYMBOL_GPL(ioasid_free);
>
> @@ -95,9 +97,9 @@ int ioasid_for_each(struct ioasid_set *set,
> ioasid_iter_t func, void *data)
>   		.data	= data,
>   	};
>
> -	idr_lock(&ioasid_idr);
> +	rcu_read_lock();
>   	ret = idr_for_each(&ioasid_idr, ioasid_iter, &iter_data);
> -	idr_unlock(&ioasid_idr);
> +	rcu_read_unlock();
>
>   	return ret;
>   }
> @@ -111,11 +113,11 @@ void *ioasid_find(struct ioasid_set *set, ioasid_t
> ioasid)
>   	void *priv = NULL;
>   	struct ioasid_data *ioasid_data;
>
> -	idr_lock(&ioasid_idr);
> +	rcu_read_lock();
>   	ioasid_data = idr_find(&ioasid_idr, ioasid);
>   	if (ioasid_data && ioasid_data->set == set)
>   		priv = ioasid_data->private;
> -	idr_unlock(&ioasid_idr);
> +	rcu_read_unlock();
>
>   	return priv;
>   }
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]                 ` <62f05552-df46-6e12-10ed-820429dfda59-5C7GfCeVMHo@public.gmane.org>
@ 2018-11-22  6:59                   ` Tian, Kevin
  2018-11-22  8:38                   ` Joerg Roedel
  1 sibling, 0 replies; 60+ messages in thread
From: Tian, Kevin @ 2018-11-22  6:59 UTC (permalink / raw)
  To: Koenig, Christian, Jean-Philippe Brucker, Joerg Roedel
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robin.murphy-5wv7dgnIgG8

> From: Koenig, Christian
> Sent: Thursday, November 22, 2018 3:10 AM
> 
> Am 21.11.18 um 12:16 schrieb Jean-Philippe Brucker:
> > On 12/11/2018 14:40, Joerg Roedel wrote:
> >> Hi Jean-Philippe,
> >>
> >> On Fri, Oct 19, 2018 at 07:11:54PM +0100, Jean-Philippe Brucker wrote:
> >>> The allocator doesn't really belong in drivers/iommu because some
> >>> drivers would like to allocate PASIDs for devices that aren't managed by
> >>> an IOMMU, using the same ID space as IOMMU. It doesn't really
> belong in
> >>> drivers/pci either since platform device also support PASID. Add the
> >>> allocator in drivers/base.
> >> I don't really buy this argument, in the end it is the IOMMU that
> >> enforces the PASID mappings for a device. Why should a device not
> behind
> >> an IOMMU need to allocate a pasid? Do you have examples of such
> devices
> >> which also don't have their own iommu built-in?
> > I misunderstood that requirement. Reading through the thread again
> > (https://www.mail-archive.com/iommu-cunTk1MwBs/ROKNJybVBZg@public.gmane.org
> foundation.org/msg25640.html)
> > it's more about a device using PASIDs as context IDs. Some contexts are
> > not bound to processes but they still need their ID in the same PASID
> > space as the contexts that are bound to process address spaces. So we
> > can keep that code in drivers/iommu
> 
> The problem with that approach is that we also need this allocator when
> IOMMU is completely disabled.
> 
> In other words PASIDs are used as contexts IDs by the hardware for
> things like signaling which application has caused an interrupt/event
> even when they are not used by IOMMU later on.
> 
> Additional to that we have some MMUs which are internal to the devices
> (GPUVM on AMD GPUs for example, similar exists for NVidia) which uses
> PASIDs for address translation internally in the device.
> 
> All of that is completely unrelated to IOMMU, but when IOMMU is enabled
> you need to use the same allocator because all use cases use the same ID
> space.
> 

imo context ID is a resource used by device internal while PASID is sth.
informational to external world (as defined in PCI bus package). ideally two 
things are separate in hardware but can be associated (e.g. PASID as an 
optional parameter per context). That is what I know for Intel GPUs.

For your device, is "PASIDs are used as contexts IDs" a software 
implementation policy or actual hardware requirement (i.e. no dedicated
PASID recording)?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]                 ` <62f05552-df46-6e12-10ed-820429dfda59-5C7GfCeVMHo@public.gmane.org>
  2018-11-22  6:59                   ` Tian, Kevin
@ 2018-11-22  8:38                   ` Joerg Roedel
  1 sibling, 0 replies; 60+ messages in thread
From: Joerg Roedel @ 2018-11-22  8:38 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	Jean-Philippe Brucker, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	will.deacon-5wv7dgnIgG8, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	robin.murphy-5wv7dgnIgG8

On Wed, Nov 21, 2018 at 07:10:20PM +0000, Koenig, Christian wrote:
> All of that is completely unrelated to IOMMU, but when IOMMU is enabled 
> you need to use the same allocator because all use cases use the same ID 
> space.

Okay, I see. Is there ever a case where we can have multiple PASIDs for
one mm_struct?


Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                     ` <20181112145541.GB25808-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2018-11-21 19:05                       ` Jean-Philippe Brucker
@ 2018-11-22  8:39                       ` Tian, Kevin
       [not found]                         ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE5A7A7-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 60+ messages in thread
From: Tian, Kevin @ 2018-11-22  8:39 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, Jean-Philippe Brucker
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

> From: joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org [mailto:joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org]
> Sent: Monday, November 12, 2018 10:56 PM
> 
> Hi Jean-Philippe,
> 
> On Thu, Nov 08, 2018 at 06:29:42PM +0000, Jean-Philippe Brucker wrote:
> > (1) My initial approach would have been to use the same page tables for
> > the default_domain and this new domain, but it might be precisely what
> > you were trying to avoid with this new proposal: a device attached to
> > two domains at the same time.
> 
> My request was about the initial assumptions a device driver can make
> about a device. This assumptions is that DMA-API is set up and
> initialized for it, for everything else (like SVA) the device driver
> needs to take special action, like allocating an SVA domain and attaching
> the device to it.

Hi, Joerg,

I agree special action needs to be taken for everything else (other than
DMA-API), but the point that I didn't get is why the action must be based
a new SVA-type domain, instead of extending default domain with SVA
capability (including related paging structures which are accessed through 
new SVA APIs). In the latter case domain-wise attribute (e.g. IRQ mapping) 
is naturally shared between capabilities (DMA-API and SVA). There is no 
need to create cross-domain connections as two options that you listed 
below.

Can you help elaborate more about the motivation behind proposal?

P.S. as you may see from other threads, we support same IOMMU 
capabilities (both IOVA and SVA) on vfio-mdev (i.e. aux domain) as ones 
on vfio-pci. which leads to below situation following your proposal, if
'AUX' is treated as a separate capability as SVA:

|-default domain (DMA-API)
	|-sva domain1 (SVA)
	|-sva domain2 (SVA)
	|-...
	|-sva domainN (AUX, guest DMA-API)
	|	|- sva domainN1 (AUX, guest SVA)
	|	|- sva domainN2 (AUX, guest SVA)
	|	|-...
	|-sva domainM (AUX, guest DMA-API)
	|	|- sva domainM1 (AUX, guest SVA)
	|	|- sva domainM2 (AUX, guest SVA)
	|	|-...

Is that what is in your mind? would it cause unnecessary complexities
on handling those layers?

The hierarchy could be simplified if we allow aux domain to carry both 
DMA-API and SVA capability:

|-default domain (DMA-API)
	|-sva domain1 (SVA)
	|-sva domain2 (SVA)
	|-...
	|-sva domainN (AUX, guest DMA-API, guest SVA)
	|-sva domainM (AUX, guest DMA-API, guest SVA)

However doing so cause inconsistency on domain definition. why cannot
default domain have full capability but aux domain can?

Further moving along that road, we could have below (as proposed in
this series):

|-default domain (DMA-API, SVA)
	|-sva domainN (AUX, guest DMA-API, guest SVA)
	|-sva domainM (AUX, guest DMA-API, guest SVA)
(whether putting aux domains as sub-domain or same level is not big deal)

There each domain could support full DMA translation capabilities (IOVA, SVA, 
GPA, GIOVA, etc) with supporting structures together (1st level, 2nd level, irq 
mapping, etc.), and new APIs are invented to utilized new capabilities beyond 
existing DMA-API in each domain. 'AUX' here becomes a orthogonal way to 
those DMA translation  capabilities, and is chosen at domain attach time based 
on device configuration. means same domain can be default on deviceA while
being aux on deviceB (when assigning pci and mdev to same VM).

Thoughts? :-)

> 
> This should of course not break any IRQ setups for the device, and also
> enforcing an ordering is not a good and maintainable solution.
> 
> So what you could do here is to either:
> 
> 	1) Save the needed IRQ mappings in some extra datastructure and
> 	   duplicate it in the SVA domain. This also makes it easier to
> 	   use the same SVA domain with multiple devices.
> 
> 	2) Just re-use the 'page-tables' from the default domain in the
> 	   sva-domain. This needs to happen at attach-time, because at
> 	   allocation time you don't know the device yet.
> 
> I think 1) is the best solution, what do you think?
> 
> Btw, things would be different if we could expose SVA through the
> DMA-API to drivers. In this case we could just make the default domain
> of type SVA and be done, but we are not there yet.
> 
> Regards,

Thanks
Kevin

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]             ` <8f17757a-c657-aab9-a6a0-fb0cc9c610a8-5wv7dgnIgG8@public.gmane.org>
  2018-11-21 19:10               ` Koenig, Christian
@ 2018-11-22  8:44               ` Joerg Roedel
       [not found]                 ` <20181122084429.GB1586-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  1 sibling, 1 reply; 60+ messages in thread
From: Joerg Roedel @ 2018-11-22  8:44 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
	christian.koenig-5C7GfCeVMHo

Hi Jean-Philippe,

On Wed, Nov 21, 2018 at 11:16:07AM +0000, Jean-Philippe Brucker wrote:
> On 12/11/2018 14:40, Joerg Roedel wrote:
> > What is the intended use-case for this function?
> 
> I'm using it in sva_bind(), to see if there already exists an io_mm
> associated to the mm_struct given as argument

Can we store the pasid in the mm_struct directly? We still need a
reverse mapping for the page-fault path, but storing information in
mm_struct makes at least this path faster.

> diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
> index 5eb3abbf69ac..fa3b48c9c044 100644
> --- a/drivers/base/ioasid.c
> +++ b/drivers/base/ioasid.c
> @@ -13,6 +13,7 @@ struct ioasid_data {
>  	int id;
>  	struct ioasid_set *set;
>  	void *private;
> +	struct rcu_head rcu;
>  };
> 
>  static DEFINE_IDR(ioasid_idr);
> @@ -56,7 +57,8 @@ void ioasid_free(ioasid_t ioasid)
>  	ioasid_data = idr_remove(&ioasid_idr, ioasid);
>  	idr_unlock(&ioasid_idr);
> 
> -	kfree(ioasid_data);
> +	if (ioasid_data)
> +		kfree_rcu(ioasid_data, rcu);
>  }
>  EXPORT_SYMBOL_GPL(ioasid_free);
> 
> @@ -95,9 +97,9 @@ int ioasid_for_each(struct ioasid_set *set,
> ioasid_iter_t func, void *data)
>  		.data	= data,
>  	};
> 
> -	idr_lock(&ioasid_idr);
> +	rcu_read_lock();
>  	ret = idr_for_each(&ioasid_idr, ioasid_iter, &iter_data);
> -	idr_unlock(&ioasid_idr);
> +	rcu_read_unlock();
> 
>  	return ret;
>  }
> @@ -111,11 +113,11 @@ void *ioasid_find(struct ioasid_set *set, ioasid_t
> ioasid)
>  	void *priv = NULL;
>  	struct ioasid_data *ioasid_data;
> 
> -	idr_lock(&ioasid_idr);
> +	rcu_read_lock();
>  	ioasid_data = idr_find(&ioasid_idr, ioasid);
>  	if (ioasid_data && ioasid_data->set == set)
>  		priv = ioasid_data->private;
> -	idr_unlock(&ioasid_idr);
> +	rcu_read_unlock();
> 
>  	return priv;
>  }

Looks good to me.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator
       [not found]                 ` <20181122084429.GB1586-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2018-11-22 11:17                   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-11-22 11:17 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: kevin.tian-ral2JQCrhuEAvxtiuMwx3w, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Will Deacon,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

On 22/11/2018 08:44, Joerg Roedel wrote:
> Hi Jean-Philippe,
> 
> On Wed, Nov 21, 2018 at 11:16:07AM +0000, Jean-Philippe Brucker wrote:
>> On 12/11/2018 14:40, Joerg Roedel wrote:
>>> What is the intended use-case for this function?
>>
>> I'm using it in sva_bind(), to see if there already exists an io_mm
>> associated to the mm_struct given as argument
> 
> Can we store the pasid in the mm_struct directly? We still need a
> reverse mapping for the page-fault path, but storing information in
> mm_struct makes at least this path faster.

The bind() path doesn't really need to be fast. So on its own, I don't
think it justifies changing the mm_struct until we have real-world
numbers indicating that it's really too slow (as the worst case of
iterating over 2**20 values will be)

Thanks,
Jean

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                         ` <758bb120-5bc0-1e2d-ccd0-9be0bcc5d8bc-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
@ 2018-11-23 11:21                           ` joro-zLv9SwRftAIdnm+yROfE0A
       [not found]                             ` <20181123112125.GF1586-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
       [not found]                             ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE68777@SHSMSX101.ccr.corp.intel.com>
  0 siblings, 2 replies; 60+ messages in thread
From: joro-zLv9SwRftAIdnm+yROfE0A @ 2018-11-23 11:21 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Tian, Kevin, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote:
> Can you please elaborate a bit more about the concept of subdomains?
> From my point of view, an aux-domain is a normal un-managed domain which
> has a PASID and could be attached to any ADIs through the aux-domain
> specific attach/detach APIs. It could also be attached to a device with
> the existing domain_attach/detach_device() APIs at the same time, hence
> mdev and pci devices in a vfio container could share a domain.

Okay, let's think a bit about having aux-specific attach/detach
functions, in the end I don't insist on my proposal as long as the
IOMMU-API extensions are clean, consistent, and have well defined
semantics.

If we have aux-domain specific attach/detach functions like
iommu_aux_domain_attach/detach(), what happens when the primary domain
of the device is changed with iommu_attach/detach()?

	1) Will the aux-domains stay in place? If yes, how does this
	   work in setups where the pasid-bound page-tables are
	   guest-owned and translated through the primary domain
	   page-tables?

	2) Will the aux-domains be unbound too? In that case, if the
	   primary domain is re-used, will the aux-domains be implicitly
	   bound too when iommu_device_attach() is called?

	3) Do we just disallow changing the primary domain through that
	   interface as long as there are aux-domains or mm_structs
	   bound to the device?

Using option 2) or 3) would mean that the aux-domains are still bound to
the primary domain, but that is not reflected in the API. Further, if an
aux-domain is just a normal domain (struct iommu_domain), what happens
when a domain that was used as a primary domain and has bound
aux-domains to it, is bound with iommu_aux_domain_attach() to another
domain?

As you can see, this design decission raises a lot of questions, but
maybe we can work it out and find a solution we all agree on.


Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                         ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE5A7A7-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2018-11-23 12:14                           ` joro-zLv9SwRftAIdnm+yROfE0A
  0 siblings, 0 replies; 60+ messages in thread
From: joro-zLv9SwRftAIdnm+yROfE0A @ 2018-11-23 12:14 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi Kevin,

On Thu, Nov 22, 2018 at 08:39:19AM +0000, Tian, Kevin wrote:
> I agree special action needs to be taken for everything else (other than
> DMA-API), but the point that I didn't get is why the action must be based
> a new SVA-type domain, instead of extending default domain with SVA
> capability (including related paging structures which are accessed through 
> new SVA APIs). In the latter case domain-wise attribute (e.g. IRQ mapping) 
> is naturally shared between capabilities (DMA-API and SVA). There is no 
> need to create cross-domain connections as two options that you listed 
> below.
> 
> Can you help elaborate more about the motivation behind proposal?

Yeah, thinking more about it, there are no real reasons against allowing
aux and sva bindings on the default domain. It allows the host to use a
device through the DMA-API and assign parts of it to a guest or a
user-space process, for example.

> |-default domain (DMA-API)
> 	|-sva domain1 (SVA)
> 	|-sva domain2 (SVA)
> 	|-...
> 	|-sva domainN (AUX, guest DMA-API)
> 	|	|- sva domainN1 (AUX, guest SVA)
> 	|	|- sva domainN2 (AUX, guest SVA)
> 	|	|-...
> 	|-sva domainM (AUX, guest DMA-API)
> 	|	|- sva domainM1 (AUX, guest SVA)
> 	|	|- sva domainM2 (AUX, guest SVA)
> 	|	|-...

In my proposal the sva-domains are no sub-domains of the default-domain,
they exist on the same level. A device is detached from its default
domain and attached to an sva-domain.

> means same domain can be default on deviceA while being aux on deviceB
> (when assigning pci and mdev to same VM).

As already written in another mail, this raises a couple of questions,
like what happens when the aux-domain itself has other aux-domains bound
to it?


Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                         ` <5dcf9238-62b2-8df6-b378-183ee09c5951-5wv7dgnIgG8@public.gmane.org>
@ 2018-11-23 12:50                           ` joro-zLv9SwRftAIdnm+yROfE0A
  0 siblings, 0 replies; 60+ messages in thread
From: joro-zLv9SwRftAIdnm+yROfE0A @ 2018-11-23 12:50 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Tian, Kevin, rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi Jean-Philippe,

On Wed, Nov 21, 2018 at 07:05:13PM +0000, Jean-Philippe Brucker wrote:
> For the moment though, I think we should allow device drivers to use the
> DMA-API at the same time as SVA.

Yeah, that makes sense.

> If a device driver has to map a management ring buffer for example,
> it's much nicer to use dma_alloc as usual rather than have them write
> their own DMA allocation routines.  Which is another reason to
> implement 2) above: the DMA-API would still act on the default_domain,
> and attaching an "extended" domain augments it with SVA features.
> Detaching from the device doesn't require copying mappings back to the
> default domain. Does that sound OK?

Yes, as I just wrote to Kevin, this sounds good.

> 	struct io_mm *io_mm;
> 	struct iommu_domain *domain;
> 
> 	domain = iommu_alloc_ext_domain(bus);
> 
> 	/* Set an mm-exit callback to disable PASIDs. More attributes
> 	   could be added later to change the capabilities of the ext
> 	   domain */
> 	iommu_domain_set_attr(domain, DOMAIN_ATTR_MM_EXIT_CB, &mm_exit);
> 
> 	/* Fails if the device doesn't support this domain type */
> 	iommu_attach_device(domain, dev);
> 
> 	/* Same as SVA v3, except a domain instead of dev as argument */
> 	io_mm = iommu_sva_bind_mm(domain, current->mm, ...);
> 
> 	/* on success, install the PASID in the device */
> 	pasid = io_mm->pasid;
> 
> 	/* Do more work */
> 
> 	iommu_sva_unbind_mm(io_mm);
> 	iommu_detach_device(domain, dev);

Okay, we can still discuss the naming and a few details, but that goes
into the right directions. One open questions is, for example, where the
pasid-allocator comes into play. As it looks the amdgpu driver needs it
even without an iommu enabled.


Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                             ` <20181123112125.GF1586-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2018-11-25  6:51                               ` Lu Baolu
  2018-11-26  3:01                               ` Tian, Kevin
  1 sibling, 0 replies; 60+ messages in thread
From: Lu Baolu @ 2018-11-25  6:51 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A
  Cc: Tian, Kevin, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi Joerg,

On 11/23/18 7:21 PM, joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org wrote:
> On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote:
>> Can you please elaborate a bit more about the concept of subdomains?
>>  From my point of view, an aux-domain is a normal un-managed domain which
>> has a PASID and could be attached to any ADIs through the aux-domain
>> specific attach/detach APIs. It could also be attached to a device with
>> the existing domain_attach/detach_device() APIs at the same time, hence
>> mdev and pci devices in a vfio container could share a domain.
> 
> Okay, let's think a bit about having aux-specific attach/detach
> functions, in the end I don't insist on my proposal as long as the
> IOMMU-API extensions are clean, consistent, and have well defined
> semantics.
> 
> If we have aux-domain specific attach/detach functions like
> iommu_aux_domain_attach/detach(), what happens when the primary domain
> of the device is changed with iommu_attach/detach()?
> 
> 	1) Will the aux-domains stay in place? If yes, how does this
> 	   work in setups where the pasid-bound page-tables are
> 	   guest-owned and translated through the primary domain
> 	   page-tables?

Are you thinking about guest SVA? In this case, guest SVA will be
translated through the aux domain page-tables. So, we can safely change
the primary domain with aux-domains staying there.

Conceptually, the primary domain means the existing domain which is for
all Request ID based (a.k.a. transfers without PASID) translations. Any
transfer with PASID will not go through the primary domain.

> 
> 	2) Will the aux-domains be unbound too? In that case, if the
> 	   primary domain is re-used, will the aux-domains be implicitly
> 	   bound too when iommu_device_attach() is called?
> 
> 	3) Do we just disallow changing the primary domain through that
> 	   interface as long as there are aux-domains or mm_structs
> 	   bound to the device?
> 
> Using option 2) or 3) would mean that the aux-domains are still bound to
> the primary domain, but that is not reflected in the API. Further, if an
> aux-domain is just a normal domain (struct iommu_domain), what happens
> when a domain that was used as a primary domain and has bound
> aux-domains to it, is bound with iommu_aux_domain_attach() to another
> domain?
> 
> As you can see, this design decission raises a lot of questions, but
> maybe we can work it out and find a solution we all agree on.
> 
> 
> Regards,
> 
> 	Joerg

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                             ` <20181123112125.GF1586-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2018-11-25  6:51                               ` Lu Baolu
@ 2018-11-26  3:01                               ` Tian, Kevin
  1 sibling, 0 replies; 60+ messages in thread
From: Tian, Kevin @ 2018-11-26  3:01 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, Lu Baolu
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	Robin Murphy, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	christian.koenig-5C7GfCeVMHo

> From: joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org [mailto:joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org]
> Sent: Friday, November 23, 2018 7:21 PM
> 
> On Wed, Nov 21, 2018 at 12:40:44PM +0800, Lu Baolu wrote:
> > Can you please elaborate a bit more about the concept of subdomains?
> > From my point of view, an aux-domain is a normal un-managed domain
> which
> > has a PASID and could be attached to any ADIs through the aux-domain
> > specific attach/detach APIs. It could also be attached to a device with
> > the existing domain_attach/detach_device() APIs at the same time, hence
> > mdev and pci devices in a vfio container could share a domain.
> 
> Okay, let's think a bit about having aux-specific attach/detach
> functions, in the end I don't insist on my proposal as long as the
> IOMMU-API extensions are clean, consistent, and have well defined
> semantics.
> 
> If we have aux-domain specific attach/detach functions like
> iommu_aux_domain_attach/detach(), what happens when the primary
> domain
> of the device is changed with iommu_attach/detach()?
> 
> 	1) Will the aux-domains stay in place? If yes, how does this
> 	   work in setups where the pasid-bound page-tables are
> 	   guest-owned and translated through the primary domain
> 	   page-tables?
> 
> 	2) Will the aux-domains be unbound too? In that case, if the
> 	   primary domain is re-used, will the aux-domains be implicitly
> 	   bound too when iommu_device_attach() is called?
> 
> 	3) Do we just disallow changing the primary domain through that
> 	   interface as long as there are aux-domains or mm_structs
> 	   bound to the device?

3) sounds like a right option. 

IMO Primary domain represents the primary ownership of DMA capability 
of the whole device. The owner (say host driver) may lend partial capability 
(e.g. allocating some queues) to a sub-owner (through aux-domain to a VM 
or mm_struct to a process), when ownership is still claimed by this owner.

Primary domain switch means ownership change (e.g. from host driver to 
guest driver if type is changed from DMA-API to UNMANAGED), which usually
implies driver load/unload thus all resource usages from previous owner must 
be cleaned up before actual switch happens, which include things allocated 
from DMA-API, aux domain API, and SVA APIs. 

> 
> Using option 2) or 3) would mean that the aux-domains are still bound to
> the primary domain, but that is not reflected in the API. Further, if an

Do you suggest some reflection in API interface definition, or specific API 
implementation? If the former I didn't get how to do it. If the latter, yes 
it's missed in current RFC. We should add check in necessary code path.

> aux-domain is just a normal domain (struct iommu_domain), what
> happens
> when a domain that was used as a primary domain and has bound
> aux-domains to it, is bound with iommu_aux_domain_attach() to another
> domain?

aux domain is essentially a way to lend partial DMA capability to less-
privileged entity (process or VM), which implies that part of DMA 
unmanaged by kernel driver. If we can make such assumption that aux 
only applies to UNMANAGED domain (thus disallow BLOCKED/IDENTITY/
DMA from using aux), then above open doesn't exist, because if primary 
domain is UNMANAGED type it essentially means the whole device fully 
owned by user space or guest thus no host driver entity to ever create 
aux domain. If primary domain is not UNMANAGED type, it will fail in 
aux API. with this design, each device will have at most one primary
and multiple aux, i.e. just two layers. no aux-bind-to-aux thing.

this assumption should be true in discussed usages around aux domain. 
but I may overlook something... Jean?

> 
> As you can see, this design decission raises a lot of questions, but
> maybe we can work it out and find a solution we all agree on.
> 

Thanks for raising those good opens!

Thanks
Kevin

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                               ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE68777-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2018-11-26  7:29                                 ` Tian, Kevin
       [not found]                                   ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE689B8-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: Tian, Kevin @ 2018-11-26  7:29 UTC (permalink / raw)
  To: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org', Lu Baolu
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	Robin Murphy, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	christian.koenig-5C7GfCeVMHo

> From: Tian, Kevin
> Sent: Monday, November 26, 2018 11:01 AM
[...]
> > aux-domain is just a normal domain (struct iommu_domain), what
> > happens
> > when a domain that was used as a primary domain and has bound
> > aux-domains to it, is bound with iommu_aux_domain_attach() to another
> > domain?
> 
> aux domain is essentially a way to lend partial DMA capability to less-
> privileged entity (process or VM), which implies that part of DMA
> unmanaged by kernel driver. If we can make such assumption that aux
> only applies to UNMANAGED domain (thus disallow BLOCKED/IDENTITY/
> DMA from using aux), then above open doesn't exist, because if primary
> domain is UNMANAGED type it essentially means the whole device fully
> owned by user space or guest thus no host driver entity to ever create
> aux domain. If primary domain is not UNMANAGED type, it will fail in
> aux API. with this design, each device will have at most one primary
> and multiple aux, i.e. just two layers. no aux-bind-to-aux thing.
> 
> this assumption should be true in discussed usages around aux domain.
> but I may overlook something... Jean?
> 

btw Baolu just reminded me one thing which is worthy of noting here.
'primary' vs. 'aux' concept makes sense only when we look from a device
p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded
cross devices. every domain must be explicitly attached to other devices
(instead of implicitly attached being above example), and new primary/aux
attribute on another device will be decided at attach time.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                   ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE689B8-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2018-12-07 10:29                                     ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
       [not found]                                       ` <20181207102926.GM16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' @ 2018-12-07 10:29 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi,

On Mon, Nov 26, 2018 at 07:29:45AM +0000, Tian, Kevin wrote:
> btw Baolu just reminded me one thing which is worthy of noting here.
> 'primary' vs. 'aux' concept makes sense only when we look from a device
> p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded
> cross devices. every domain must be explicitly attached to other devices
> (instead of implicitly attached being above example), and new primary/aux
> attribute on another device will be decided at attach time.

Okay, so after all the discussions we had I learned a few more things
about the scalable mode feature and thought a bit longer about how to
best support it in the IOMMU-API.

The concept of sub-domains I initially proposed certainly makes no
sense, but scalable-mode specific attach/detach functions do. So instead
of a sub-domain mode, I'd like to propose device-feature sets.

The posted patch-set already includes this as device-attributes, but I
don't like this naming as we are really talking about additional
feature sets of a device. So how about we introduce this:

	enum iommu_dev_features {
		/* ... */
		IOMMU_DEV_FEAT_AUX,
		IOMMU_DEV_FEAT_SVA,
		/* ... */
	};

	/* Check if a device supports a given feature of the IOMMU-API */
	bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features *feat);

	/*
	 * Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
	 * returns true
	 *
	 * Also, as long as domains are attached to a device through
	 * this interface, any trys to call iommu_attach_device() should
	 * fail (iommu_detach_device() can't fail, so we fail on the
	 * tryint to re-attach). This should make us safe against a
	 * device being attached to a guest as a whole while there are
	 * still pasid users on it (aux and sva).
	 */
	int iommu_aux_attach_device(struct iommu_domain *domain,
				    struct device *dev);

	int iommu_aux_detach_device(struct iommu_domain *domain,
				    struct device *dev);
	/*
	 * I know we are targeting a system-wide pasid-space, so that
	 * the pasid would be the same for one domain on all devices,
	 * let's just keep the option open to have different
	 * pasid-spaces in one system. Also this way we can use it to
	 * check whether the domain is attached to this device at all.
	 *
	 * returns pasid or <0 if domain has no pasid on that device.
	 */
	int iommu_aux_get_pasid(struct iommu_domain *domain,
				struct device *dev);

	/* So we need a iommu_aux_detach_all()? */

This concept can also be easily extended for supporting SVA in parallel
on the same device, with the same constraints regarding the behavior of
iommu_attach_device()/iommu_detach_device().

So what do you think about that approach?

Regards,

	Joerg

	

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                       ` <20181207102926.GM16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2018-12-10  2:06                                         ` Tian, Kevin
       [not found]                                           ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE95394-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2018-12-10  2:57                                         ` Lu Baolu
  2018-12-11 13:35                                         ` Jean-Philippe Brucker
  2 siblings, 1 reply; 60+ messages in thread
From: Tian, Kevin @ 2018-12-10  2:06 UTC (permalink / raw)
  To: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
  Cc: alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

> From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' [mailto:joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org]
> Sent: Friday, December 7, 2018 6:29 PM
> 
> Hi,
> 
> On Mon, Nov 26, 2018 at 07:29:45AM +0000, Tian, Kevin wrote:
> > btw Baolu just reminded me one thing which is worthy of noting here.
> > 'primary' vs. 'aux' concept makes sense only when we look from a device
> > p.o.v. That binding relationship is not (*should not be*) carry-and-
> forwarded
> > cross devices. every domain must be explicitly attached to other devices
> > (instead of implicitly attached being above example), and new
> primary/aux
> > attribute on another device will be decided at attach time.
> 
> Okay, so after all the discussions we had I learned a few more things
> about the scalable mode feature and thought a bit longer about how to
> best support it in the IOMMU-API.

Thanks for thinking through this.

> 
> The concept of sub-domains I initially proposed certainly makes no
> sense, but scalable-mode specific attach/detach functions do. So instead
> of a sub-domain mode, I'd like to propose device-feature sets.

Can I interpret above as that you agree with the aux domain concept (i.e. one
device can be linked to multiple domains) in general, and now we're just trying
to address the remaining open in API level?

> 
> The posted patch-set already includes this as device-attributes, but I
> don't like this naming as we are really talking about additional
> feature sets of a device. So how about we introduce this:
> 
> 	enum iommu_dev_features {
> 		/* ... */
> 		IOMMU_DEV_FEAT_AUX,
> 		IOMMU_DEV_FEAT_SVA,
> 		/* ... */
> 	};
> 

Does above represent whether a device implements aux/sva features,
or whether a device has been enabled by driver to support aux/sva 
features?

> 	/* Check if a device supports a given feature of the IOMMU-API */
> 	bool iommu_dev_has_feature(struct device *dev, enum
> iommu_dev_features *feat);

If the latter we also need iommu_dev_set_feature so driver can poke
it based on its own configuration. 

> 
> 	/*
> 	 * Only works if iommu_dev_has_feature(dev,
> IOMMU_DEV_FEAT_AUX)
> 	 * returns true
> 	 *
> 	 * Also, as long as domains are attached to a device through
> 	 * this interface, any trys to call iommu_attach_device() should
> 	 * fail (iommu_detach_device() can't fail, so we fail on the
> 	 * tryint to re-attach). This should make us safe against a
> 	 * device being attached to a guest as a whole while there are
> 	 * still pasid users on it (aux and sva).

yes, it makes sense.

> 	 */
> 	int iommu_aux_attach_device(struct iommu_domain *domain,
> 				    struct device *dev);
> 
> 	int iommu_aux_detach_device(struct iommu_domain *domain,
> 				    struct device *dev);
> 	/*
> 	 * I know we are targeting a system-wide pasid-space, so that
> 	 * the pasid would be the same for one domain on all devices,
> 	 * let's just keep the option open to have different
> 	 * pasid-spaces in one system. Also this way we can use it to
> 	 * check whether the domain is attached to this device at all.
> 	 *
> 	 * returns pasid or <0 if domain has no pasid on that device.
> 	 */
> 	int iommu_aux_get_pasid(struct iommu_domain *domain,
> 				struct device *dev);
> 
> 	/* So we need a iommu_aux_detach_all()? */

for what scenario?

> 
> This concept can also be easily extended for supporting SVA in parallel
> on the same device, with the same constraints regarding the behavior of
> iommu_attach_device()/iommu_detach_device().
> 
> So what do you think about that approach?
> 
> Regards,
> 
> 	Joerg
> 
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                       ` <20181207102926.GM16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2018-12-10  2:06                                         ` Tian, Kevin
@ 2018-12-10  2:57                                         ` Lu Baolu
       [not found]                                           ` <bf1ee4a3-6d3f-e0db-a02a-1db819843a60-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
  2018-12-11 13:35                                         ` Jean-Philippe Brucker
  2 siblings, 1 reply; 60+ messages in thread
From: Lu Baolu @ 2018-12-10  2:57 UTC (permalink / raw)
  To: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org', Tian, Kevin
  Cc: alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi Joerg,

On 12/7/18 6:29 PM, 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' wrote:
> Hi,
> 
> On Mon, Nov 26, 2018 at 07:29:45AM +0000, Tian, Kevin wrote:
>> btw Baolu just reminded me one thing which is worthy of noting here.
>> 'primary' vs. 'aux' concept makes sense only when we look from a device
>> p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded
>> cross devices. every domain must be explicitly attached to other devices
>> (instead of implicitly attached being above example), and new primary/aux
>> attribute on another device will be decided at attach time.
> 
> Okay, so after all the discussions we had I learned a few more things
> about the scalable mode feature and thought a bit longer about how to
> best support it in the IOMMU-API.

Thanks for thinking about this.

> 
> The concept of sub-domains I initially proposed certainly makes no
> sense, but scalable-mode specific attach/detach functions do. So instead
> of a sub-domain mode, I'd like to propose device-feature sets.
> 
> The posted patch-set already includes this as device-attributes, but I
> don't like this naming as we are really talking about additional
> feature sets of a device. So how about we introduce this:
> 
> 	enum iommu_dev_features {
> 		/* ... */
> 		IOMMU_DEV_FEAT_AUX,
> 		IOMMU_DEV_FEAT_SVA,
> 		/* ... */
> 	};
> 
> 	/* Check if a device supports a given feature of the IOMMU-API */
> 	bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features *feat);

Here we pass in a pointer of "enum iommu_dev_features", do we want
to return anything here?

Best regards,
Lu Baolu

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                           ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE95394-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2018-12-10  8:57                                             ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
       [not found]                                               ` <20181210085745.GN16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' @ 2018-12-10  8:57 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi Kevin,

On Mon, Dec 10, 2018 at 02:06:44AM +0000, Tian, Kevin wrote:
> Can I interpret above as that you agree with the aux domain concept (i.e. one
> device can be linked to multiple domains) in general, and now we're just trying
> to address the remaining open in API level?

Yes, I thouht about alternatives, but in the end they are all harder to
use than the aux-domain concept. We just need to make sure that we have
a clear definition of what the API extension do and how they impact the
behavior of existing functions.

> > 	enum iommu_dev_features {
> > 		/* ... */
> > 		IOMMU_DEV_FEAT_AUX,
> > 		IOMMU_DEV_FEAT_SVA,
> > 		/* ... */
> > 	};
> > 
> 
> Does above represent whether a device implements aux/sva features,
> or whether a device has been enabled by driver to support aux/sva 
> features?

These represent whether the device together with the IOMMU support them,
basically whether these features are usable via the IOMMU-API.


> 
> > 	/* Check if a device supports a given feature of the IOMMU-API */
> > 	bool iommu_dev_has_feature(struct device *dev, enum
> > iommu_dev_features *feat);
> 
> If the latter we also need iommu_dev_set_feature so driver can poke
> it based on its own configuration.

Do you mean we need an explict enable-API for the features? I thought
about that too, but some features clearly don't need it and I didn't
want to complicate the usage. My assumption was that when the feature is
available, it is also enabled.

> > 	/* So we need a iommu_aux_detach_all()? */
> 
> for what scenario?

Maybe for detaching any PASID users from a device so that it can be
assigned to a VM as a whole. But I am not sure it is needed.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                           ` <bf1ee4a3-6d3f-e0db-a02a-1db819843a60-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
@ 2018-12-10  8:59                                             ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
  0 siblings, 0 replies; 60+ messages in thread
From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' @ 2018-12-10  8:59 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Tian, Kevin, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi Lu Baolu,

On Mon, Dec 10, 2018 at 10:57:22AM +0800, Lu Baolu wrote:
> > 	/* Check if a device supports a given feature of the IOMMU-API */
> > 	bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features *feat);
> 
> Here we pass in a pointer of "enum iommu_dev_features", do we want
> to return anything here?

Yeah, this is a mistake on my side, The enum parameter should be passed
by-value, so not pointer there. It doesn't need to return anything in
this parameter.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                       ` <20181207102926.GM16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2018-12-10  2:06                                         ` Tian, Kevin
  2018-12-10  2:57                                         ` Lu Baolu
@ 2018-12-11 13:35                                         ` Jean-Philippe Brucker
       [not found]                                           ` <fc173d9f-57e2-dd87-95d0-1c615f2e14e3-5wv7dgnIgG8@public.gmane.org>
  2 siblings, 1 reply; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-12-11 13:35 UTC (permalink / raw)
  To: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org', Tian, Kevin
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A, Robin Murphy, Will Deacon,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	christian.koenig-5C7GfCeVMHo

On 07/12/2018 10:29, 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' wrote:
> Hi,
> 
> On Mon, Nov 26, 2018 at 07:29:45AM +0000, Tian, Kevin wrote:
>> btw Baolu just reminded me one thing which is worthy of noting here.
>> 'primary' vs. 'aux' concept makes sense only when we look from a device
>> p.o.v. That binding relationship is not (*should not be*) carry-and-forwarded
>> cross devices. every domain must be explicitly attached to other devices
>> (instead of implicitly attached being above example), and new primary/aux
>> attribute on another device will be decided at attach time.
> 
> Okay, so after all the discussions we had I learned a few more things
> about the scalable mode feature and thought a bit longer about how to
> best support it in the IOMMU-API.
> 
> The concept of sub-domains I initially proposed certainly makes no
> sense, but scalable-mode specific attach/detach functions do. So instead
> of a sub-domain mode, I'd like to propose device-feature sets.
> 
> The posted patch-set already includes this as device-attributes, but I
> don't like this naming as we are really talking about additional
> feature sets of a device. So how about we introduce this:
> 
> 	enum iommu_dev_features {
> 		/* ... */
> 		IOMMU_DEV_FEAT_AUX,
> 		IOMMU_DEV_FEAT_SVA,
> 		/* ... */
> 	};
> 
> 	/* Check if a device supports a given feature of the IOMMU-API */
> 	bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features *feat);
> 
> 	/*
> 	 * Only works if iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
> 	 * returns true
> 	 *
> 	 * Also, as long as domains are attached to a device through
> 	 * this interface, any trys to call iommu_attach_device() should
> 	 * fail (iommu_detach_device() can't fail, so we fail on the
> 	 * tryint to re-attach). This should make us safe against a
> 	 * device being attached to a guest as a whole while there are
> 	 * still pasid users on it (aux and sva).
> 	 */
> 	int iommu_aux_attach_device(struct iommu_domain *domain,
> 				    struct device *dev);
> 
> 	int iommu_aux_detach_device(struct iommu_domain *domain,
> 				    struct device *dev);
> 	/*
> 	 * I know we are targeting a system-wide pasid-space, so that
> 	 * the pasid would be the same for one domain on all devices,
> 	 * let's just keep the option open to have different
> 	 * pasid-spaces in one system. Also this way we can use it to
> 	 * check whether the domain is attached to this device at all.
> 	 *
> 	 * returns pasid or <0 if domain has no pasid on that device.
> 	 */
> 	int iommu_aux_get_pasid(struct iommu_domain *domain,
> 				struct device *dev);>
> 	/* So we need a iommu_aux_detach_all()? */

This could be useful for device drivers that want to do bulk cleanup on
device removal. If they rely on Function Level Reset to disable PASID
states for example, they could also cleanup with a detach_all(). But
most will probably need to clean up individual PASID states (for example
free all mdevs) and therefore don't need detach_all()

> This concept can also be easily extended for supporting SVA in parallel
> on the same device, with the same constraints regarding the behavior of
> iommu_attach_device()/iommu_detach_device().

So this would be without the initial step of attaching an "ext" domain
to the device in order to enable SVA/PASID?

If I understood it correctly, I agree with the
iommu_attach/detach_device() behavior for SVA as well. If a device is
bound to an mm, then the device cannot be attached to another domain
with iommu_attach_device(). This prevents leaks and forces the device
driver to clean up PASID states when switching to a different driver
(e.g. vfio-pci)

Thanks,
Jean

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                               ` <20181210085745.GN16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2018-12-11 18:34                                                 ` Jean-Philippe Brucker
       [not found]                                                   ` <4be63d12-fa1c-b180-761b-5e8ceed58545-5wv7dgnIgG8@public.gmane.org>
  2018-12-12  9:31                                                 ` Tian, Kevin
  1 sibling, 1 reply; 60+ messages in thread
From: Jean-Philippe Brucker @ 2018-12-11 18:34 UTC (permalink / raw)
  To: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org', Tian, Kevin
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A, Robin Murphy, Will Deacon,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	christian.koenig-5C7GfCeVMHo

On 10/12/2018 08:57, 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' wrote:
> Hi Kevin,
> 
> On Mon, Dec 10, 2018 at 02:06:44AM +0000, Tian, Kevin wrote:
>> Can I interpret above as that you agree with the aux domain concept (i.e. one
>> device can be linked to multiple domains) in general, and now we're just trying
>> to address the remaining open in API level?
> 
> Yes, I thouht about alternatives, but in the end they are all harder to
> use than the aux-domain concept. We just need to make sure that we have
> a clear definition of what the API extension do and how they impact the
> behavior of existing functions.
> 
>>> 	enum iommu_dev_features {
>>> 		/* ... */
>>> 		IOMMU_DEV_FEAT_AUX,
>>> 		IOMMU_DEV_FEAT_SVA,
>>> 		/* ... */
>>> 	};
>>>
>>
>> Does above represent whether a device implements aux/sva features,
>> or whether a device has been enabled by driver to support aux/sva 
>> features?
> 
> These represent whether the device together with the IOMMU support them,
> basically whether these features are usable via the IOMMU-API.
> 
> 
>>
>>> 	/* Check if a device supports a given feature of the IOMMU-API */
>>> 	bool iommu_dev_has_feature(struct device *dev, enum
>>> iommu_dev_features *feat);
>>
>> If the latter we also need iommu_dev_set_feature so driver can poke
>> it based on its own configuration.
> 
> Do you mean we need an explict enable-API for the features? I thought
> about that too, but some features clearly don't need it and I didn't
> want to complicate the usage. My assumption was that when the feature is
> available, it is also enabled.

The cost of enabling those features for a device does seem negligible.
For the SMMU we need to allocate about 70k of additional memory for the
initial PASID table, but enabling the PASID cap shouldn't add any
overhead otherwise. Same for PRI, shouldn't add any overhead.

As for ATS, it supposedly makes things faster, although it does mean
more invalidation requests. There currently is a global pci=noats
parameter, but maybe we need something with finer granularity to
enable/disable it per device?

Thanks,
Jean

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                                   ` <4be63d12-fa1c-b180-761b-5e8ceed58545-5wv7dgnIgG8@public.gmane.org>
@ 2018-12-12  9:22                                                     ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
  0 siblings, 0 replies; 60+ messages in thread
From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' @ 2018-12-12  9:22 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Tian, Kevin, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Will Deacon,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

On Tue, Dec 11, 2018 at 06:34:23PM +0000, Jean-Philippe Brucker wrote:
> The cost of enabling those features for a device does seem negligible.
> For the SMMU we need to allocate about 70k of additional memory for the
> initial PASID table, but enabling the PASID cap shouldn't add any
> overhead otherwise. Same for PRI, shouldn't add any overhead.

Okay, the memory requirements are a pretty good case for an
enable/disable part in the API.

> As for ATS, it supposedly makes things faster, although it does mean
> more invalidation requests. There currently is a global pci=noats
> parameter, but maybe we need something with finer granularity to
> enable/disable it per device?

Yeah, I don't object against this, but I think this is a matter of PCI
code. The IOMMUs should just enable ATS when it is available. PCI core
can fail to enable it if requested by the user.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                           ` <fc173d9f-57e2-dd87-95d0-1c615f2e14e3-5wv7dgnIgG8@public.gmane.org>
@ 2018-12-12  9:29                                             ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
  0 siblings, 0 replies; 60+ messages in thread
From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' @ 2018-12-12  9:29 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Tian, Kevin, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	rafael-DgEjT+Ai2ygdnm+yROfE0A,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Will Deacon,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

On Tue, Dec 11, 2018 at 01:35:23PM +0000, Jean-Philippe Brucker wrote:
> > 	/* So we need a iommu_aux_detach_all()? */
> 
> This could be useful for device drivers that want to do bulk cleanup on
> device removal. If they rely on Function Level Reset to disable PASID
> states for example, they could also cleanup with a detach_all(). But
> most will probably need to clean up individual PASID states (for example
> free all mdevs) and therefore don't need detach_all()

Yeah, so the more I think about it, the more dangerous it seems to have
this function. It creates subtle error cases for the users of SVA and
aux-domains because their mapping goes away without notice. It is
certainly better to force an orderly shutdown of all users before the
device can be re-assigned.

> > This concept can also be easily extended for supporting SVA in parallel
> > on the same device, with the same constraints regarding the behavior of
> > iommu_attach_device()/iommu_detach_device().
> 
> So this would be without the initial step of attaching an "ext" domain
> to the device in order to enable SVA/PASID?

No, I don't think anymore we should introduce a special domain type to
enable these features. Separate enable/disable functions per feature
seem to be a better choice.

> If I understood it correctly, I agree with the
> iommu_attach/detach_device() behavior for SVA as well. If a device is
> bound to an mm, then the device cannot be attached to another domain
> with iommu_attach_device(). This prevents leaks and forces the device
> driver to clean up PASID states when switching to a different driver
> (e.g. vfio-pci)

Right, the API should ensure we are safe on that side.

Thanks,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                               ` <20181210085745.GN16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  2018-12-11 18:34                                                 ` Jean-Philippe Brucker
@ 2018-12-12  9:31                                                 ` Tian, Kevin
       [not found]                                                   ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE9D6CA-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 60+ messages in thread
From: Tian, Kevin @ 2018-12-12  9:31 UTC (permalink / raw)
  To: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

> From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
> Sent: Monday, December 10, 2018 4:58 PM
> 
> Hi Kevin,
> 
> On Mon, Dec 10, 2018 at 02:06:44AM +0000, Tian, Kevin wrote:
> > Can I interpret above as that you agree with the aux domain concept (i.e.
> one
> > device can be linked to multiple domains) in general, and now we're just
> trying
> > to address the remaining open in API level?
> 
> Yes, I thouht about alternatives, but in the end they are all harder to
> use than the aux-domain concept. We just need to make sure that we have
> a clear definition of what the API extension do and how they impact the
> behavior of existing functions.

sounds great!

> 
> > > 	enum iommu_dev_features {
> > > 		/* ... */
> > > 		IOMMU_DEV_FEAT_AUX,
> > > 		IOMMU_DEV_FEAT_SVA,
> > > 		/* ... */
> > > 	};
> > >
> >
> > Does above represent whether a device implements aux/sva features,
> > or whether a device has been enabled by driver to support aux/sva
> > features?
> 
> These represent whether the device together with the IOMMU support
> them,
> basically whether these features are usable via the IOMMU-API.

"device together with IOMMU" or just "IOMMU itself"? 

the former might be OK for sva. As Jean replied in another mail, enabling
it in both IOMMU and device side just consumes some resources, while 
not impacting other usages on that device.

however there is a problem with aux. A device may implement both
SR-IOV and Scalable IOV capabilities, but at any time only one of them
can be enabled. Driver will provide interfaces for end user to choose.
In such case we cannot assume that device-side Scalable-IOV can be
always enabled while IOMMU is in scalable mode.

It works better if we position those features just representing IOMMU 
support only. In that case, aux is related only to scalable mode of IOMMU
itself, which is orthogonal to whether device side supports SIOV.

> 
> 
> >
> > > 	/* Check if a device supports a given feature of the IOMMU-API */
> > > 	bool iommu_dev_has_feature(struct device *dev, enum
> > > iommu_dev_features *feat);
> >
> > If the latter we also need iommu_dev_set_feature so driver can poke
> > it based on its own configuration.
> 
> Do you mean we need an explict enable-API for the features? I thought
> about that too, but some features clearly don't need it and I didn't
> want to complicate the usage. My assumption was that when the feature is
> available, it is also enabled.
> 
> > > 	/* So we need a iommu_aux_detach_all()? */
> >
> > for what scenario?
> 
> Maybe for detaching any PASID users from a device so that it can be
> assigned to a VM as a whole. But I am not sure it is needed.
> 

yes, possibly useful in reset path as Jean pointed out.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                                   ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE9D6CA-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2018-12-12  9:54                                                     ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
       [not found]                                                       ` <20181212095403.GU16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 60+ messages in thread
From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' @ 2018-12-12  9:54 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

Hi Kevin,

On Wed, Dec 12, 2018 at 09:31:27AM +0000, Tian, Kevin wrote:
> > From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
> > Sent: Monday, December 10, 2018 4:58 PM
> > These represent whether the device together with the IOMMU support
> > them,
> > basically whether these features are usable via the IOMMU-API.
> 
> "device together with IOMMU" or just "IOMMU itself"?

No, it should mean device together with IOMMU support it. It is a way
for users of the IOMMU-API to check whether they can successfully use
the aux-specific functions.

> however there is a problem with aux. A device may implement both
> SR-IOV and Scalable IOV capabilities, but at any time only one of them
> can be enabled. Driver will provide interfaces for end user to choose.
> In such case we cannot assume that device-side Scalable-IOV can be
> always enabled while IOMMU is in scalable mode.
> 
> It works better if we position those features just representing IOMMU 
> support only. In that case, aux is related only to scalable mode of IOMMU
> itself, which is orthogonal to whether device side supports SIOV.

Yeah, but we don't make that decision in the IOMMU code. Whether the
device exposes SR-IOV or PASID based isolation is decided in PCI code
based on user input (SR-IOV is also enabled in PCI code and IOMMU just
uses the new devices that appear).

Only if the user enabled scalable mode on the device and the IOMMU
supports it too, the feature-check function returns true for the aux
feature.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
       [not found]                                                       ` <20181212095403.GU16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2018-12-12 10:03                                                         ` Tian, Kevin
  0 siblings, 0 replies; 60+ messages in thread
From: Tian, Kevin @ 2018-12-12 10:03 UTC (permalink / raw)
  To: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
  Cc: rafael-DgEjT+Ai2ygdnm+yROfE0A, Jean-Philippe Brucker,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, Robin Murphy,
	christian.koenig-5C7GfCeVMHo

> From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org' [mailto:joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org]
> Sent: Wednesday, December 12, 2018 5:54 PM
> 
> Hi Kevin,
> 
> On Wed, Dec 12, 2018 at 09:31:27AM +0000, Tian, Kevin wrote:
> > > From: 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
> > > Sent: Monday, December 10, 2018 4:58 PM
> > > These represent whether the device together with the IOMMU support
> > > them,
> > > basically whether these features are usable via the IOMMU-API.
> >
> > "device together with IOMMU" or just "IOMMU itself"?
> 
> No, it should mean device together with IOMMU support it. It is a way
> for users of the IOMMU-API to check whether they can successfully use
> the aux-specific functions.
> 
> > however there is a problem with aux. A device may implement both
> > SR-IOV and Scalable IOV capabilities, but at any time only one of them
> > can be enabled. Driver will provide interfaces for end user to choose.
> > In such case we cannot assume that device-side Scalable-IOV can be
> > always enabled while IOMMU is in scalable mode.
> >
> > It works better if we position those features just representing IOMMU
> > support only. In that case, aux is related only to scalable mode of IOMMU
> > itself, which is orthogonal to whether device side supports SIOV.
> 
> Yeah, but we don't make that decision in the IOMMU code. Whether the
> device exposes SR-IOV or PASID based isolation is decided in PCI code
> based on user input (SR-IOV is also enabled in PCI code and IOMMU just
> uses the new devices that appear).
> 
> Only if the user enabled scalable mode on the device and the IOMMU
> supports it too, the feature-check function returns true for the aux
> feature.
> 

Then this is another proof for an enable/disable part in API. :-)

Thanks
kevin

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2018-12-12 10:03 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-19 18:11 [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3 Jean-Philippe Brucker
     [not found] ` <20181019181158.2395-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-10-19 18:11   ` [RFC PATCH 1/6] iommu: Adapt attach/detach_dev() for auxiliary domains Jean-Philippe Brucker
     [not found]     ` <20181019181158.2395-2-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-10-22  2:32       ` Lu Baolu
2018-10-19 18:11   ` [RFC PATCH 2/6] drivers core: Add I/O ASID allocator Jean-Philippe Brucker
     [not found]     ` <20181019181158.2395-3-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-10-22  4:49       ` Lu Baolu
     [not found]         ` <9c6cd6c1-3569-4251-8344-fc9df0e743bc-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-10-22 10:22           ` Raj, Ashok
     [not found]             ` <20181022102254.GA25399-7RUrO8UaCDyr4tA6zuQqW9h3ngVCH38I@public.gmane.org>
2018-10-23  6:56               ` Lu Baolu
     [not found]                 ` <02006e4f-2acf-6ff8-b695-c54c99509b46-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-10-23 22:13                   ` Tian, Kevin
2018-11-07  4:53       ` Lu Baolu
     [not found]         ` <fb2bd5fe-5742-fcd8-b8f0-1885040e8d4f-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-11-08 18:51           ` Jean-Philippe Brucker
2018-11-12 14:40       ` Joerg Roedel
     [not found]         ` <20181112144039.GA25808-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2018-11-21 11:16           ` Jean-Philippe Brucker
     [not found]             ` <8f17757a-c657-aab9-a6a0-fb0cc9c610a8-5wv7dgnIgG8@public.gmane.org>
2018-11-21 19:10               ` Koenig, Christian
     [not found]                 ` <62f05552-df46-6e12-10ed-820429dfda59-5C7GfCeVMHo@public.gmane.org>
2018-11-22  6:59                   ` Tian, Kevin
2018-11-22  8:38                   ` Joerg Roedel
2018-11-22  8:44               ` Joerg Roedel
     [not found]                 ` <20181122084429.GB1586-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2018-11-22 11:17                   ` Jean-Philippe Brucker
2018-10-19 18:11   ` [RFC PATCH 3/6] iommu/sva: Use external PASID allocator Jean-Philippe Brucker
2018-10-19 18:11   ` [RFC PATCH 4/6] iommu/sva: Support AUXD feature Jean-Philippe Brucker
2018-10-19 18:11   ` [RFC PATCH 5/6] iommu/arm-smmu-v3: Implement detach_dev op Jean-Philippe Brucker
2018-10-19 18:11   ` [RFC PATCH 6/6] iommu/arm-smmu-v3: Add support for auxiliary domains Jean-Philippe Brucker
2018-10-20  3:36   ` [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3 Xu Zaibo
2018-10-22  6:53   ` Tian, Kevin
     [not found]     ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE0E176-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2018-10-22 11:50       ` Robin Murphy
     [not found]         ` <11f88122-afd3-a34c-3cd4-db681bf5498b-5wv7dgnIgG8@public.gmane.org>
2018-10-22 15:35           ` Jordan Crouse
2018-10-22 18:01           ` Jean-Philippe Brucker
2018-11-06 16:25           ` joro-zLv9SwRftAIdnm+yROfE0A
     [not found]             ` <20181106162539.4gmkvg57mja3bn7k-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2018-11-07  3:40               ` Lu Baolu
     [not found]                 ` <e22e3631-2ecb-664d-5666-8e0f865dec83-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-11-07 16:43                   ` joro-zLv9SwRftAIdnm+yROfE0A
     [not found]                     ` <20181107164323.GA19831-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2018-11-07 17:23                       ` Alex Williamson
2018-11-21  4:40                       ` Lu Baolu
     [not found]                         ` <758bb120-5bc0-1e2d-ccd0-9be0bcc5d8bc-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-11-23 11:21                           ` joro-zLv9SwRftAIdnm+yROfE0A
     [not found]                             ` <20181123112125.GF1586-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2018-11-25  6:51                               ` Lu Baolu
2018-11-26  3:01                               ` Tian, Kevin
     [not found]                             ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE68777@SHSMSX101.ccr.corp.intel.com>
     [not found]                               ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE68777-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2018-11-26  7:29                                 ` Tian, Kevin
     [not found]                                   ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE689B8-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2018-12-07 10:29                                     ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
     [not found]                                       ` <20181207102926.GM16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2018-12-10  2:06                                         ` Tian, Kevin
     [not found]                                           ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE95394-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2018-12-10  8:57                                             ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
     [not found]                                               ` <20181210085745.GN16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2018-12-11 18:34                                                 ` Jean-Philippe Brucker
     [not found]                                                   ` <4be63d12-fa1c-b180-761b-5e8ceed58545-5wv7dgnIgG8@public.gmane.org>
2018-12-12  9:22                                                     ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
2018-12-12  9:31                                                 ` Tian, Kevin
     [not found]                                                   ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE9D6CA-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2018-12-12  9:54                                                     ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
     [not found]                                                       ` <20181212095403.GU16835-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2018-12-12 10:03                                                         ` Tian, Kevin
2018-12-10  2:57                                         ` Lu Baolu
     [not found]                                           ` <bf1ee4a3-6d3f-e0db-a02a-1db819843a60-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2018-12-10  8:59                                             ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
2018-12-11 13:35                                         ` Jean-Philippe Brucker
     [not found]                                           ` <fc173d9f-57e2-dd87-95d0-1c615f2e14e3-5wv7dgnIgG8@public.gmane.org>
2018-12-12  9:29                                             ` 'joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org'
2018-11-08 18:29               ` Jean-Philippe Brucker
     [not found]                 ` <42949d93-e22c-dd4d-cd49-46efc0b73cdb-5wv7dgnIgG8@public.gmane.org>
2018-11-12 14:55                   ` joro-zLv9SwRftAIdnm+yROfE0A
     [not found]                     ` <20181112145541.GB25808-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2018-11-21 19:05                       ` Jean-Philippe Brucker
     [not found]                         ` <5dcf9238-62b2-8df6-b378-183ee09c5951-5wv7dgnIgG8@public.gmane.org>
2018-11-23 12:50                           ` joro-zLv9SwRftAIdnm+yROfE0A
2018-11-22  8:39                       ` Tian, Kevin
     [not found]                         ` <AADFC41AFE54684AB9EE6CBC0274A5D19BE5A7A7-0J0gbvR4kThpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2018-11-23 12:14                           ` joro-zLv9SwRftAIdnm+yROfE0A
2018-10-22 10:07   ` Raj, Ashok
     [not found]     ` <20181021194426.GA11201-7RUrO8UaCDyr4tA6zuQqW9h3ngVCH38I@public.gmane.org>
2018-10-22 16:03       ` Jean-Philippe Brucker
     [not found]         ` <d45c5222-68e9-1d6e-730b-bb8dbc060586-5wv7dgnIgG8@public.gmane.org>
2018-10-23 17:16           ` Raj, Ashok
     [not found]             ` <1540314963.21962.20.camel-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2018-10-23 22:08               ` Tian, Kevin
2018-10-26  3:00           ` Lu Baolu
2018-10-22 16:48   ` Jordan Crouse
     [not found]     ` <20181022164834.GH26762-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
2018-11-02  3:19       ` Lu Baolu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.