iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 0/5] iommu: Add support to change default domain of a group
@ 2020-02-16 21:57 Sai Praneeth Prakhya
  2020-02-16 21:57 ` [PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops Sai Praneeth Prakhya
                   ` (5 more replies)
  0 siblings, 6 replies; 28+ messages in thread
From: Sai Praneeth Prakhya @ 2020-02-16 21:57 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Presently, the default domain of a group is allocated during boot time and it
cannot be changed later. So, the device would typically be either in identity
(pass_through) mode or the device would be in DMA mode as long as the
system is up and running. There is no way to change the default domain type
dynamically i.e. after booting, a device cannot switch between identity mode and
DMA mode.

Assume a use case wherein the privileged user would want to use the device in
pass-through mode when the device is used for host so that it would be high
performing. Presently, this is not supported. Hence add support to change the
default domain of a group dynamically.

Support this through a sysfs file, namely "/sys/kernel/iommu_groups/<grp_id>/type".

Hi Joerg,
Sorry! for _huge_ delay in posting a V2 of this patch set. Was stuck with some
internal PoC work. Will be consistent from now.

Changes from V1:
----------------
1. V1 patch set wasn't updating dma_ops for some vendors (Eg: AMD), hence,
   change the logic of updating default domain as below (because adding a device
   to iommu_group automatically updates dma_ops)
   a. Allocate a new domain
   b. For every device in the group
	i. Remove the device from the group
	ii. Add the device back to the group
   c. Free previous domain
2. Drop 1st patch of V1 (iommu/vt-d: Modify device_def_domain_type() to use at
   runtime) because "iommu=pt" has no effect on this function anymore.
3. Added a patch to take/release lock while reading iommu_group->default_domain->type
   because it can be changed any time by user.
4. Before changing default domain type of a group, check if the group is
   directly assigned for user level access. If so, abort.
5. Sanitize return path (using ternary operator) in iommu_group_store_type()
6. Split 2nd patch of V1 (iommu: Add device_def_domain_type() call back function
   to iommu_ops) into two patches such that iommu generic changes are now in 1st
   patch of V2 and vt-d specific changes are in 2nd patch of V2.
7. Rename device_def_domain_type() to dev_def_domain_type()
8. Remove example from documentation
9. Change the value written to file "/sys/kernel/iommu_groups/<grp_id>/type"
   from "dma" to "DMA".

Changes from RFC:
-----------------
1. Added support for "auto" type, so that kernel selects one among identity or
   dma mode.
2. Use "system_state" in device_def_domain_type() instead of an argument.

Testing:
--------
Tested by dynamically changing storage device (nvme) from
1. identity mode to dma and making sure file transfer works
2. dma mode to identity mode and making sure file transfer works
Tested only for intel_iommu/vt-d.

Based on iommu maintainer's 'next' branch.

Sai Praneeth Prakhya (5):
  iommu: Add dev_def_domain_type() call back function to iommu_ops
  iommu/vt-d: Rename device_def_domain_type() to    
    intel_iommu_dev_def_domain_type()
  iommu: Add support to change default domain of an iommu_group
  iommu: Take lock before reading iommu_group default domain type
  iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file

 .../ABI/testing/sysfs-kernel-iommu_groups          |  29 +++
 drivers/iommu/intel-iommu.c                        |   8 +-
 drivers/iommu/iommu.c                              | 229 ++++++++++++++++++++-
 include/linux/iommu.h                              |   3 +
 4 files changed, 265 insertions(+), 4 deletions(-)

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>

-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops
  2020-02-16 21:57 [PATCH V2 0/5] iommu: Add support to change default domain of a group Sai Praneeth Prakhya
@ 2020-02-16 21:57 ` Sai Praneeth Prakhya
  2020-02-22 23:37   ` Lu Baolu
  2020-02-16 21:57 ` [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type() Sai Praneeth Prakhya
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: Sai Praneeth Prakhya @ 2020-02-16 21:57 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

When user requests kernel to change the default domain type of a group
through sysfs, kernel has to make sure that it's ok to change the domain
type of every device in the group to the requested domain (every device may
not support both the domain types i.e. DMA and identity). Hence, add a call
back function that could be implemented per architecture that performs the
above check.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
---
 include/linux/iommu.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d1b5f4d98569..3f4aaad0aeb7 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -248,6 +248,7 @@ struct iommu_iotlb_gather {
  * @cache_invalidate: invalidate translation caches
  * @sva_bind_gpasid: bind guest pasid and mm
  * @sva_unbind_gpasid: unbind guest pasid and mm
+ * @dev_def_domain_type: Return the required default domain type for a device
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  * @owner: Driver module providing these ops
  */
@@ -318,6 +319,8 @@ struct iommu_ops {
 
 	int (*sva_unbind_gpasid)(struct device *dev, int pasid);
 
+	int (*dev_def_domain_type)(struct device *dev);
+
 	unsigned long pgsize_bitmap;
 	struct module *owner;
 };
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type()
  2020-02-16 21:57 [PATCH V2 0/5] iommu: Add support to change default domain of a group Sai Praneeth Prakhya
  2020-02-16 21:57 ` [PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops Sai Praneeth Prakhya
@ 2020-02-16 21:57 ` Sai Praneeth Prakhya
  2020-02-22 23:42   ` Lu Baolu
  2020-02-16 21:57 ` [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: Sai Praneeth Prakhya @ 2020-02-16 21:57 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

The functionality needed for iommu_ops->dev_def_domain_type() is already
provided by device_def_domain_type() in intel_iommu.c. But, every call back
function in intel_iommu_ops starts with intel_iommu prefix, hence rename
device_def_domain_type() to intel_iommu_dev_def_domain_type() so that it
follows the same semantics.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
---
 drivers/iommu/intel-iommu.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 64ddccf1d5fe..68f10d271ac0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3034,7 +3034,7 @@ static bool device_is_rmrr_locked(struct device *dev)
  *  - IOMMU_DOMAIN_IDENTITY: device requires an identical mapping domain
  *  - 0: both identity and dynamic domains work for this device
  */
-static int device_def_domain_type(struct device *dev)
+static int intel_iommu_dev_def_domain_type(struct device *dev)
 {
 	if (dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(dev);
@@ -5796,7 +5796,8 @@ static int intel_iommu_add_device(struct device *dev)
 	domain = iommu_get_domain_for_dev(dev);
 	dmar_domain = to_dmar_domain(domain);
 	if (domain->type == IOMMU_DOMAIN_DMA) {
-		if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
+		if (intel_iommu_dev_def_domain_type(dev) ==
+		    IOMMU_DOMAIN_IDENTITY) {
 			ret = iommu_request_dm_for_dev(dev);
 			if (ret) {
 				dmar_remove_one_dev_info(dev);
@@ -5807,7 +5808,7 @@ static int intel_iommu_add_device(struct device *dev)
 			}
 		}
 	} else {
-		if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
+		if (intel_iommu_dev_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
 			ret = iommu_request_dma_domain_for_dev(dev);
 			if (ret) {
 				dmar_remove_one_dev_info(dev);
@@ -6194,6 +6195,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.dev_enable_feat	= intel_iommu_dev_enable_feat,
 	.dev_disable_feat	= intel_iommu_dev_disable_feat,
 	.is_attach_deferred	= intel_iommu_is_attach_deferred,
+	.dev_def_domain_type	= intel_iommu_dev_def_domain_type,
 	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
 };
 
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-16 21:57 [PATCH V2 0/5] iommu: Add support to change default domain of a group Sai Praneeth Prakhya
  2020-02-16 21:57 ` [PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops Sai Praneeth Prakhya
  2020-02-16 21:57 ` [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type() Sai Praneeth Prakhya
@ 2020-02-16 21:57 ` Sai Praneeth Prakhya
  2020-02-23  1:20   ` Lu Baolu
  2020-03-02 15:08   ` Joerg Roedel
  2020-02-16 21:57 ` [PATCH V2 4/5] iommu: Take lock before reading iommu_group default domain type Sai Praneeth Prakhya
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 28+ messages in thread
From: Sai Praneeth Prakhya @ 2020-02-16 21:57 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Presently, the default domain of an iommu_group is allocated during boot
time (i.e. when a device is being added to a group) and it cannot be
changed later. So, the device would typically be either in identity (also
known as pass_through) mode (controlled by "iommu=pt" kernel command line
argument) or the device would be in DMA mode as long as the machine is up
and running. There is no way to change the default domain type dynamically
i.e. after booting, a device cannot switch between identity mode and DMA
mode.

But, assume a use case wherein the user trusts the device and also believes
that the OS is secure enough and hence wants *only* this device to bypass
IOMMU (so that it could be high performing) whereas all the other devices
to go through IOMMU (so that the system is protected). Presently, this use
case is not supported. Hence it will be helpful if there is some way to
change the default domain of a B:D.F dynamically. Since, linux iommu
subsystem prefers to deal at iommu_group level instead of B:D.F level, it
might be helpful if there is some way to change the default domain of a
*iommu_group* dynamically. Hence, add such support.

A privileged user could request the kernel to change the default domain
type of a iommu_group by writing to
"/sys/kernel/iommu_groups/<grp_id>/type"
file. Presently, only three values are supported
1. identity: all the DMA transactions from the devices in this group are
	     *not* translated by the iommu
2. DMA: all the DMA transactions from the devices in this group are
	translated by the iommu
3. auto: kernel decides one among DMA/identity

Also please note that a group type could be modified only when *all* the
devices in the group are not binded to any device driver. Please see
"Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more information.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
---
 drivers/iommu/iommu.c | 227 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 226 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3e3528436e0b..56a29076871f 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -225,6 +225,8 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 				struct iommu_group *group);
 static void __iommu_detach_group(struct iommu_domain *domain,
 				 struct iommu_group *group);
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+				      const char *buf, size_t count);
 
 static int __init iommu_set_def_domain_type(char *str)
 {
@@ -448,7 +450,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
 static IOMMU_GROUP_ATTR(reserved_regions, 0444,
 			iommu_group_show_resv_regions, NULL);
 
-static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
+static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
+			iommu_group_store_type);
 
 static void iommu_group_release(struct kobject *kobj)
 {
@@ -2654,3 +2657,225 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
 	return ops->sva_get_pasid(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * Changes the default domain of a group
+ *
+ * @group: The group for which the default domain should be changed
+ * @prv_dom: The previous domain that is being switched from
+ * @type: The type of the new default domain that gets associated with the group
+ *
+ * Returns 0 on success and error code on failure
+ */
+static int iommu_group_change_def_domain(struct iommu_group *group,
+					 struct iommu_domain *prv_dom,
+					 int type)
+{
+	struct group_device *grp_dev, *temp;
+	struct iommu_domain *new_domain;
+	int ret;
+
+	/*
+	 * iommu_domain_alloc() takes "struct bus_type" as an argument which is
+	 * a member in "struct device". Changing a group's default domain type
+	 * deals at iommu_group level rather than device level and hence there
+	 * is no straight forward way to get "bus_type" of an iommu_group that
+	 * could be passed to iommu_domain_alloc(). So, instead of directly
+	 * calling iommu_domain_alloc(), use iommu_ops from previous default
+	 * domain.
+	 */
+	if (!prv_dom || !prv_dom->ops || !prv_dom->ops->domain_alloc || !type)
+		return -EINVAL;
+
+	/* Allocate a new domain of requested type */
+	new_domain = prv_dom->ops->domain_alloc(type);
+	if (!new_domain) {
+		pr_err("Unable to allocate memory for the new domain\n");
+		return -ENOMEM;
+	}
+
+	new_domain->type = type;
+	new_domain->ops = prv_dom->ops;
+	new_domain->pgsize_bitmap = prv_dom->pgsize_bitmap;
+
+	/*
+	 * Set these upfront here because iommu_group_add_device() and
+	 * iommu_group_create_direct_mappings() needs these to be set
+	 */
+	mutex_lock(&group->mutex);
+	group->default_domain = new_domain;
+	group->domain = new_domain;
+	mutex_unlock(&group->mutex);
+
+	iommu_group_ref_get(group);
+
+	list_for_each_entry_safe(grp_dev, temp, &group->devices, list) {
+		struct device *dev;
+
+		dev = grp_dev->dev;
+		iommu_release_device(dev);
+
+		ret = iommu_group_add_device(group, dev);
+		if (ret)
+			dev_err(dev, "Failed to add to iommu group %d: %d\n",
+				group->id, ret);
+
+		ret = prv_dom->ops->add_device(dev);
+		if (ret)
+			dev_err(dev, "Error adding to iommu: %d\n", ret);
+	}
+
+	iommu_group_put(group);
+	iommu_domain_free(prv_dom);
+	return 0;
+}
+
+static int is_driver_bound(struct device *dev, void *not_used)
+{
+	int ret = 0;
+
+	device_lock(dev);
+	if (device_is_bound(dev))
+		ret = 1;
+	device_unlock(dev);
+	return ret;
+}
+
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+				      const char *buf, size_t count)
+{
+	int ret = 0, req_type = 0, req_auto = 0;
+	struct iommu_domain *prv_dom;
+	struct group_device *grp_dev;
+	const struct iommu_ops *ops;
+	struct device *dev;
+
+	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
+		return -EACCES;
+
+	if (WARN_ON(!group))
+		return -EINVAL;
+
+	if (sysfs_streq(buf, "identity"))
+		req_type = IOMMU_DOMAIN_IDENTITY;
+	else if (sysfs_streq(buf, "DMA"))
+		req_type = IOMMU_DOMAIN_DMA;
+	else if (sysfs_streq(buf, "auto"))
+		req_auto = 1;
+	else
+		return -EINVAL;
+
+	/*
+	 * Check if any device in the group still has a driver binded to it.
+	 * This might race with device driver probing code and unfortunately
+	 * there is no clean way out of that either, locking all devices in the
+	 * group and then do the re-attach will introduce a lock-inversion with
+	 * group->mutex - Joerg.
+	 */
+	if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
+		pr_err("Active drivers exist for devices in the group\n");
+		return -EBUSY;
+	}
+
+	mutex_lock(&group->mutex);
+	prv_dom = group->default_domain;
+	if (!prv_dom || !prv_dom->ops || !prv_dom->ops->dev_def_domain_type) {
+		pr_err("'dev_def_domain_type' call back isn't registered\n");
+		ret = -EPERM;
+		goto out;
+	}
+
+	/*
+	 * Check if any user level driver (that doesn't use kernel driver like
+	 * VFIO) is directly using the group. VFIO use case is detected by
+	 * is_driver_bound() check above
+	 */
+	if (prv_dom != group->domain) {
+		pr_err("Group assigned to user level for direct access\n");
+		ret = -EBUSY;
+		goto out;
+	}
+
+	/*
+	 * If user has requested "auto", kernel has to decide the default domain
+	 * type of a group. Hence, find out individual preferences of a device.
+	 */
+	ops = prv_dom->ops;
+	if (req_auto) {
+		int dma_devs = 0, idt_devs = 0, dev_def_dom;
+
+		list_for_each_entry(grp_dev, &group->devices, list) {
+			dev = grp_dev->dev;
+			dev_def_dom = ops->dev_def_domain_type(dev);
+			if (dev_def_dom == IOMMU_DOMAIN_IDENTITY)
+				idt_devs++;
+			if (dev_def_dom == IOMMU_DOMAIN_DMA)
+				dma_devs++;
+		}
+
+		/*
+		 * Default domain type of a group is undefined if some devices
+		 * in the group should be in dma domain while other devices
+		 * should be in identity domain
+		 */
+		if (idt_devs && dma_devs) {
+			pr_err("Some devices need identity domain while other need dma domain\n");
+			ret = -EPERM;
+			goto out;
+		}
+
+		/*
+		 * Default domain type of a group is identity if
+		 * 1. All the devices in the group need to be in identity domain
+		 * 2. Some devices should be in identity domain while other
+		 *    devices could be in either of dma or identity domain
+		 */
+		if (idt_devs && !dma_devs)
+			req_type = IOMMU_DOMAIN_IDENTITY;
+
+		/*
+		 * Default domain type of a group is dma if
+		 * 1. All the devices in the group need to be in dma domain
+		 * 2. Some devices should be in dma domain while other devices
+		 *    could be in either of dma or identity domain
+		 * 3. All the devices could be in either of the domains (namely
+		 *    dma and identity)
+		 */
+		if (!idt_devs)
+			req_type = IOMMU_DOMAIN_DMA;
+	}
+
+	/*
+	 * Switch to a new domain only if the requested domain type is different
+	 * from the existing default domain type
+	 */
+	if (prv_dom->type == req_type)
+		goto out;
+
+	/*
+	 * Every device may not support both the domain types (namely DMA and
+	 * identity), so check if it's ok to change domain type of every device
+	 * in the group to the requested domain. This check is not required if
+	 * user has requested "auto" because it's already done above implicitly.
+	 */
+	if (!req_auto) {
+		list_for_each_entry(grp_dev, &group->devices, list) {
+			int allowed_types;
+
+			dev = grp_dev->dev;
+			allowed_types = ops->dev_def_domain_type(dev);
+			if (allowed_types && allowed_types != req_type) {
+				dev_err(dev, "Cannot be in %s domain\n", buf);
+				ret = -EPERM;
+				goto out;
+			}
+		}
+	}
+
+	mutex_unlock(&group->mutex);
+	ret = iommu_group_change_def_domain(group, prv_dom, req_type);
+	return ret ?: count;
+out:
+	mutex_unlock(&group->mutex);
+	return ret ?: count;
+}
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 4/5] iommu: Take lock before reading iommu_group default domain type
  2020-02-16 21:57 [PATCH V2 0/5] iommu: Add support to change default domain of a group Sai Praneeth Prakhya
                   ` (2 preceding siblings ...)
  2020-02-16 21:57 ` [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
@ 2020-02-16 21:57 ` Sai Praneeth Prakhya
  2020-02-16 21:57 ` [PATCH V2 5/5] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file Sai Praneeth Prakhya
  2020-02-22 23:40 ` [PATCH V2 0/5] iommu: Add support to change default domain of a group Prakhya, Sai Praneeth
  5 siblings, 0 replies; 28+ messages in thread
From: Sai Praneeth Prakhya @ 2020-02-16 21:57 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Since user could change default domain type of an iommu group through
sysfs, reading the default domain type now requires taking the lock first.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
---
 drivers/iommu/iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 56a29076871f..10e14cb9c32b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -424,6 +424,7 @@ static ssize_t iommu_group_show_type(struct iommu_group *group,
 {
 	char *type = "unknown\n";
 
+	mutex_lock(&group->mutex);
 	if (group->default_domain) {
 		switch (group->default_domain->type) {
 		case IOMMU_DOMAIN_BLOCKED:
@@ -440,6 +441,7 @@ static ssize_t iommu_group_show_type(struct iommu_group *group,
 			break;
 		}
 	}
+	mutex_unlock(&group->mutex);
 	strcpy(buf, type);
 
 	return strlen(type);
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH V2 5/5] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file
  2020-02-16 21:57 [PATCH V2 0/5] iommu: Add support to change default domain of a group Sai Praneeth Prakhya
                   ` (3 preceding siblings ...)
  2020-02-16 21:57 ` [PATCH V2 4/5] iommu: Take lock before reading iommu_group default domain type Sai Praneeth Prakhya
@ 2020-02-16 21:57 ` Sai Praneeth Prakhya
  2020-02-23  1:38   ` Lu Baolu
  2020-02-22 23:40 ` [PATCH V2 0/5] iommu: Add support to change default domain of a group Prakhya, Sai Praneeth
  5 siblings, 1 reply; 28+ messages in thread
From: Sai Praneeth Prakhya @ 2020-02-16 21:57 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

The default domain type of an iommu group can be changed using file
"/sys/kernel/iommu_groups/<grp_id>/type". Hence, document it's usage and
more importantly spell out it's limitations.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
---
 .../ABI/testing/sysfs-kernel-iommu_groups          | 29 ++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..808a9507b281 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,32 @@ Description:    In case an RMRR is used only by graphics or USB devices
 		it is now exposed as "direct-relaxable" instead of "direct".
 		In device assignment use case, for instance, those RMRR
 		are considered to be relaxable and safe.
+
+What:		/sys/kernel/iommu_groups/<grp_id>/type
+Date:		February 2020
+KernelVersion:	v5.6
+Contact:	Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
+Description:	Lets the user know the type of default domain in use by iommu
+		for this group. A privileged user could request kernel to change
+		the group type by writing to this file. Presently, only three
+		types are supported
+		1. DMA: All the DMA transactions from the devices in this group
+			are translated by the iommu.
+		2. identity: All the DMA transactions from the devices in this
+			     group are *not* translated by the iommu.
+		3. auto: Kernel decides one among DMA/identity mode and hence
+			 when the user reads the file he would never see "auto".
+			 This is just a write only value.
+		Note:
+		-----
+		A group type could be modified only when *all* the devices in
+		the group are not binded to any device driver. So, the user has
+		to first unbind the appropriate drivers and then change the
+		default domain type.
+		Caution:
+		--------
+		Unbinding a device driver will take away the drivers control
+		over the device and if done on devices that host root file
+		system could lead to catastrophic effects (the user might
+		need to reboot the machine to get it to normal state). So, it's
+		expected that the user understands what he is doing.
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops
  2020-02-16 21:57 ` [PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops Sai Praneeth Prakhya
@ 2020-02-22 23:37   ` Lu Baolu
  2020-02-22 23:39     ` Prakhya, Sai Praneeth
  0 siblings, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-02-22 23:37 UTC (permalink / raw)
  To: Sai Praneeth Prakhya, iommu
  Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Hi Sai,

On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
> When user requests kernel to change the default domain type of a group
> through sysfs, kernel has to make sure that it's ok to change the domain
> type of every device in the group to the requested domain (every device may
> not support both the domain types i.e. DMA and identity). Hence, add a call
> back function that could be implemented per architecture that performs the
> above check.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: Sohil Mehta <sohil.mehta@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
> ---
>   include/linux/iommu.h | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index d1b5f4d98569..3f4aaad0aeb7 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -248,6 +248,7 @@ struct iommu_iotlb_gather {
>    * @cache_invalidate: invalidate translation caches
>    * @sva_bind_gpasid: bind guest pasid and mm
>    * @sva_unbind_gpasid: unbind guest pasid and mm
> + * @dev_def_domain_type: Return the required default domain type for a device

Can you please define the return value of this callback?

Best regards,
baolu

>    * @pgsize_bitmap: bitmap of all possible supported page sizes
>    * @owner: Driver module providing these ops
>    */
> @@ -318,6 +319,8 @@ struct iommu_ops {
>   
>   	int (*sva_unbind_gpasid)(struct device *dev, int pasid);
>   
> +	int (*dev_def_domain_type)(struct device *dev);
> +
>   	unsigned long pgsize_bitmap;
>   	struct module *owner;
>   };
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops
  2020-02-22 23:37   ` Lu Baolu
@ 2020-02-22 23:39     ` Prakhya, Sai Praneeth
  0 siblings, 0 replies; 28+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-02-22 23:39 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

> 
> On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
> > When user requests kernel to change the default domain type of a group
> > through sysfs, kernel has to make sure that it's ok to change the
> > domain type of every device in the group to the requested domain
> > (every device may not support both the domain types i.e. DMA and
> > identity). Hence, add a call back function that could be implemented
> > per architecture that performs the above check.
> >
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Cc: Ashok Raj <ashok.raj@intel.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: Lu Baolu <baolu.lu@linux.intel.com>
> > Cc: Sohil Mehta <sohil.mehta@intel.com>
> > Cc: Robin Murphy <robin.murphy@arm.com>
> > Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
> > ---
> >   include/linux/iommu.h | 3 +++
> >   1 file changed, 3 insertions(+)
> >
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h index
> > d1b5f4d98569..3f4aaad0aeb7 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -248,6 +248,7 @@ struct iommu_iotlb_gather {
> >    * @cache_invalidate: invalidate translation caches
> >    * @sva_bind_gpasid: bind guest pasid and mm
> >    * @sva_unbind_gpasid: unbind guest pasid and mm
> > + * @dev_def_domain_type: Return the required default domain type for
> > + a device
> 
> Can you please define the return value of this callback?

Sure! I will add it in the next version

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH V2 0/5] iommu: Add support to change default domain of a group
  2020-02-16 21:57 [PATCH V2 0/5] iommu: Add support to change default domain of a group Sai Praneeth Prakhya
                   ` (4 preceding siblings ...)
  2020-02-16 21:57 ` [PATCH V2 5/5] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file Sai Praneeth Prakhya
@ 2020-02-22 23:40 ` Prakhya, Sai Praneeth
  5 siblings, 0 replies; 28+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-02-22 23:40 UTC (permalink / raw)
  To: iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

Hi Joerg,

> Presently, the default domain of a group is allocated during boot time and it
> cannot be changed later. So, the device would typically be either in identity
> (pass_through) mode or the device would be in DMA mode as long as the system
> is up and running. There is no way to change the default domain type
> dynamically i.e. after booting, a device cannot switch between identity mode
> and DMA mode.
> 
> Assume a use case wherein the privileged user would want to use the device in
> pass-through mode when the device is used for host so that it would be high
> performing. Presently, this is not supported. Hence add support to change the
> default domain of a group dynamically.
> 
> Support this through a sysfs file, namely
> "/sys/kernel/iommu_groups/<grp_id>/type".
> 
> Hi Joerg,
> Sorry! for _huge_ delay in posting a V2 of this patch set. Was stuck with some
> internal PoC work. Will be consistent from now.
> 
> Changes from V1:
> ----------------
> 1. V1 patch set wasn't updating dma_ops for some vendors (Eg: AMD), hence,
>    change the logic of updating default domain as below (because adding a
> device
>    to iommu_group automatically updates dma_ops)
>    a. Allocate a new domain
>    b. For every device in the group
> 	i. Remove the device from the group
> 	ii. Add the device back to the group
>    c. Free previous domain
> 2. Drop 1st patch of V1 (iommu/vt-d: Modify device_def_domain_type() to use
> at
>    runtime) because "iommu=pt" has no effect on this function anymore.
> 3. Added a patch to take/release lock while reading iommu_group-
> >default_domain->type
>    because it can be changed any time by user.
> 4. Before changing default domain type of a group, check if the group is
>    directly assigned for user level access. If so, abort.
> 5. Sanitize return path (using ternary operator) in iommu_group_store_type() 6.
> Split 2nd patch of V1 (iommu: Add device_def_domain_type() call back function
>    to iommu_ops) into two patches such that iommu generic changes are now in
> 1st
>    patch of V2 and vt-d specific changes are in 2nd patch of V2.
> 7. Rename device_def_domain_type() to dev_def_domain_type() 8. Remove
> example from documentation 9. Change the value written to file
> "/sys/kernel/iommu_groups/<grp_id>/type"
>    from "dma" to "DMA".

Just a gentle reminder, could you please review this patch set and let me know your comments?

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type()
  2020-02-16 21:57 ` [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type() Sai Praneeth Prakhya
@ 2020-02-22 23:42   ` Lu Baolu
  2020-02-22 23:59     ` Prakhya, Sai Praneeth
  0 siblings, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-02-22 23:42 UTC (permalink / raw)
  To: Sai Praneeth Prakhya, iommu
  Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Hi,

On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
> The functionality needed for iommu_ops->dev_def_domain_type() is already
> provided by device_def_domain_type() in intel_iommu.c. But, every call back
> function in intel_iommu_ops starts with intel_iommu prefix, hence rename
> device_def_domain_type() to intel_iommu_dev_def_domain_type() so that it
> follows the same semantics.

How about keep device_def_domain_type() and call it in the new
intel_iommu_dev_def_domain_type()?

Best regards,
baolu

> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: Sohil Mehta <sohil.mehta@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
> ---
>   drivers/iommu/intel-iommu.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 64ddccf1d5fe..68f10d271ac0 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -3034,7 +3034,7 @@ static bool device_is_rmrr_locked(struct device *dev)
>    *  - IOMMU_DOMAIN_IDENTITY: device requires an identical mapping domain
>    *  - 0: both identity and dynamic domains work for this device
>    */
> -static int device_def_domain_type(struct device *dev)
> +static int intel_iommu_dev_def_domain_type(struct device *dev)
>   {
>   	if (dev_is_pci(dev)) {
>   		struct pci_dev *pdev = to_pci_dev(dev);
> @@ -5796,7 +5796,8 @@ static int intel_iommu_add_device(struct device *dev)
>   	domain = iommu_get_domain_for_dev(dev);
>   	dmar_domain = to_dmar_domain(domain);
>   	if (domain->type == IOMMU_DOMAIN_DMA) {
> -		if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
> +		if (intel_iommu_dev_def_domain_type(dev) ==
> +		    IOMMU_DOMAIN_IDENTITY) {
>   			ret = iommu_request_dm_for_dev(dev);
>   			if (ret) {
>   				dmar_remove_one_dev_info(dev);
> @@ -5807,7 +5808,7 @@ static int intel_iommu_add_device(struct device *dev)
>   			}
>   		}
>   	} else {
> -		if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
> +		if (intel_iommu_dev_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
>   			ret = iommu_request_dma_domain_for_dev(dev);
>   			if (ret) {
>   				dmar_remove_one_dev_info(dev);
> @@ -6194,6 +6195,7 @@ const struct iommu_ops intel_iommu_ops = {
>   	.dev_enable_feat	= intel_iommu_dev_enable_feat,
>   	.dev_disable_feat	= intel_iommu_dev_disable_feat,
>   	.is_attach_deferred	= intel_iommu_is_attach_deferred,
> +	.dev_def_domain_type	= intel_iommu_dev_def_domain_type,
>   	.pgsize_bitmap		= INTEL_IOMMU_PGSIZES,
>   };
>   
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type()
  2020-02-22 23:42   ` Lu Baolu
@ 2020-02-22 23:59     ` Prakhya, Sai Praneeth
  2020-02-23  1:50       ` Lu Baolu
  0 siblings, 1 reply; 28+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-02-22 23:59 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

> 
> On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
> > The functionality needed for iommu_ops->dev_def_domain_type() is
> > already provided by device_def_domain_type() in intel_iommu.c. But,
> > every call back function in intel_iommu_ops starts with intel_iommu
> > prefix, hence rename
> > device_def_domain_type() to intel_iommu_dev_def_domain_type() so that
> > it follows the same semantics.
> 
> How about keep device_def_domain_type() and call it in the new
> intel_iommu_dev_def_domain_type()?

Sure! I could but could you please explain the advantages we might get by doing so?
Less number of changes is what I could think of.. any other reasons?

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-16 21:57 ` [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
@ 2020-02-23  1:20   ` Lu Baolu
  2020-02-24  3:20     ` Prakhya, Sai Praneeth
  2020-03-02 15:08   ` Joerg Roedel
  1 sibling, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-02-23  1:20 UTC (permalink / raw)
  To: Sai Praneeth Prakhya, iommu
  Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Hi,

On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
> Presently, the default domain of an iommu_group is allocated during boot
> time (i.e. when a device is being added to a group) and it cannot be
> changed later. So, the device would typically be either in identity (also
> known as pass_through) mode (controlled by "iommu=pt" kernel command line

The default domain type is initialized according to the kernel build
option, and could be overrided by several kernel commands like iommu=pt
and iommu.passthrough.

> argument) or the device would be in DMA mode as long as the machine is up
> and running. There is no way to change the default domain type dynamically
> i.e. after booting, a device cannot switch between identity mode and DMA
> mode.
> 
> But, assume a use case wherein the user trusts the device and also believes
> that the OS is secure enough and hence wants *only* this device to bypass
> IOMMU (so that it could be high performing) whereas all the other devices
> to go through IOMMU (so that the system is protected). Presently, this use
> case is not supported. Hence it will be helpful if there is some way to
> change the default domain of a B:D.F dynamically. Since, linux iommu
> subsystem prefers to deal at iommu_group level instead of B:D.F level, it
> might be helpful if there is some way to change the default domain of a
> *iommu_group* dynamically. Hence, add such support.
> 
> A privileged user could request the kernel to change the default domain
> type of a iommu_group by writing to
> "/sys/kernel/iommu_groups/<grp_id>/type"
> file. Presently, only three values are supported
> 1. identity: all the DMA transactions from the devices in this group are
> 	     *not* translated by the iommu
> 2. DMA: all the DMA transactions from the devices in this group are
> 	translated by the iommu
> 3. auto: kernel decides one among DMA/identity
> 
> Also please note that a group type could be modified only when *all* the
> devices in the group are not binded to any device driver. Please see
> "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more information.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: Sohil Mehta <sohil.mehta@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
> ---
>   drivers/iommu/iommu.c | 227 +++++++++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 226 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 3e3528436e0b..56a29076871f 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -225,6 +225,8 @@ static int __iommu_attach_group(struct iommu_domain *domain,
>   				struct iommu_group *group);
>   static void __iommu_detach_group(struct iommu_domain *domain,
>   				 struct iommu_group *group);
> +static ssize_t iommu_group_store_type(struct iommu_group *group,
> +				      const char *buf, size_t count);
>   
>   static int __init iommu_set_def_domain_type(char *str)
>   {
> @@ -448,7 +450,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
>   static IOMMU_GROUP_ATTR(reserved_regions, 0444,
>   			iommu_group_show_resv_regions, NULL);
>   
> -static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
> +static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
> +			iommu_group_store_type);
>   
>   static void iommu_group_release(struct kobject *kobj)
>   {
> @@ -2654,3 +2657,225 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
>   	return ops->sva_get_pasid(handle);
>   }
>   EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
> +
> +/*
> + * Changes the default domain of a group
> + *
> + * @group: The group for which the default domain should be changed
> + * @prv_dom: The previous domain that is being switched from

The previous domain is still kept in group->default_domain, so it's
unnecessary to pass it as a parameter, right?

> + * @type: The type of the new default domain that gets associated with the group
> + *
> + * Returns 0 on success and error code on failure
> + */
> +static int iommu_group_change_def_domain(struct iommu_group *group,
> +					 struct iommu_domain *prv_dom,
> +					 int type)
> +{
> +	struct group_device *grp_dev, *temp;
> +	struct iommu_domain *new_domain;
> +	int ret;
> +
> +	/*
> +	 * iommu_domain_alloc() takes "struct bus_type" as an argument which is
> +	 * a member in "struct device". Changing a group's default domain type
> +	 * deals at iommu_group level rather than device level and hence there
> +	 * is no straight forward way to get "bus_type" of an iommu_group that
> +	 * could be passed to iommu_domain_alloc(). So, instead of directly
> +	 * calling iommu_domain_alloc(), use iommu_ops from previous default
> +	 * domain.
> +	 */
> +	if (!prv_dom || !prv_dom->ops || !prv_dom->ops->domain_alloc || !type)
> +		return -EINVAL;
> +
> +	/* Allocate a new domain of requested type */
> +	new_domain = prv_dom->ops->domain_alloc(type);
> +	if (!new_domain) {
> +		pr_err("Unable to allocate memory for the new domain\n");
> +		return -ENOMEM;
> +	}
> +
> +	new_domain->type = type;
> +	new_domain->ops = prv_dom->ops;
> +	new_domain->pgsize_bitmap = prv_dom->pgsize_bitmap;
> +
> +	/*
> +	 * Set these upfront here because iommu_group_add_device() and
> +	 * iommu_group_create_direct_mappings() needs these to be set
> +	 */
> +	mutex_lock(&group->mutex);
> +	group->default_domain = new_domain;
> +	group->domain = new_domain;
> +	mutex_unlock(&group->mutex);
> +
> +	iommu_group_ref_get(group);
> +
> +	list_for_each_entry_safe(grp_dev, temp, &group->devices, list) {
> +		struct device *dev;
> +
> +		dev = grp_dev->dev;
> +		iommu_release_device(dev);
> +
> +		ret = iommu_group_add_device(group, dev);
> +		if (ret)
> +			dev_err(dev, "Failed to add to iommu group %d: %d\n",
> +				group->id, ret);

Need to handle this error case.

> +
> +		ret = prv_dom->ops->add_device(dev);
> +		if (ret)
> +			dev_err(dev, "Error adding to iommu: %d\n", ret);

Ditto.

> +	}
> +
> +	iommu_group_put(group);
> +	iommu_domain_free(prv_dom);
> +	return 0;
> +}
> +
> +static int is_driver_bound(struct device *dev, void *not_used)
> +{
> +	int ret = 0;
> +
> +	device_lock(dev);
> +	if (device_is_bound(dev))
> +		ret = 1;
> +	device_unlock(dev);
> +	return ret;
> +}
> +
> +static ssize_t iommu_group_store_type(struct iommu_group *group,
> +				      const char *buf, size_t count)
> +{
> +	int ret = 0, req_type = 0, req_auto = 0;
> +	struct iommu_domain *prv_dom;
> +	struct group_device *grp_dev;
> +	const struct iommu_ops *ops;
> +	struct device *dev;
> +
> +	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
> +		return -EACCES;
> +
> +	if (WARN_ON(!group))
> +		return -EINVAL;
> +
> +	if (sysfs_streq(buf, "identity"))
> +		req_type = IOMMU_DOMAIN_IDENTITY;
> +	else if (sysfs_streq(buf, "DMA"))
> +		req_type = IOMMU_DOMAIN_DMA;
> +	else if (sysfs_streq(buf, "auto"))
> +		req_auto = 1;
> +	else
> +		return -EINVAL;
> +
> +	/*
> +	 * Check if any device in the group still has a driver binded to it.
> +	 * This might race with device driver probing code and unfortunately
> +	 * there is no clean way out of that either, locking all devices in the
> +	 * group and then do the re-attach will introduce a lock-inversion with
> +	 * group->mutex - Joerg.

Do you mean that we can't do below?

mutex_lock(&group->mutex);
for_each_group_device()
	device_lock(dev);

/* Default domain switch */
for_each_group_device()
	device_unlock()
mutex_unlock(&group->mutex)

> +	 */
> +	if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
> +		pr_err("Active drivers exist for devices in the group\n");
> +		return -EBUSY;
> +	}
> +
> +	mutex_lock(&group->mutex);
> +	prv_dom = group->default_domain;
> +	if (!prv_dom || !prv_dom->ops || !prv_dom->ops->dev_def_domain_type) {
> +		pr_err("'dev_def_domain_type' call back isn't registered\n");
> +		ret = -EPERM;
> +		goto out;
> +	}
> +
> +	/*
> +	 * Check if any user level driver (that doesn't use kernel driver like
> +	 * VFIO) is directly using the group. VFIO use case is detected by
> +	 * is_driver_bound() check above
> +	 */
> +	if (prv_dom != group->domain) {
> +		pr_err("Group assigned to user level for direct access\n");
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
> +	/*
> +	 * If user has requested "auto", kernel has to decide the default domain
> +	 * type of a group. Hence, find out individual preferences of a device.
> +	 */
> +	ops = prv_dom->ops;
> +	if (req_auto) {
> +		int dma_devs = 0, idt_devs = 0, dev_def_dom;
> +
> +		list_for_each_entry(grp_dev, &group->devices, list) {
> +			dev = grp_dev->dev;
> +			dev_def_dom = ops->dev_def_domain_type(dev);
> +			if (dev_def_dom == IOMMU_DOMAIN_IDENTITY)
> +				idt_devs++;
> +			if (dev_def_dom == IOMMU_DOMAIN_DMA)
> +				dma_devs++;
> +		}
> +
> +		/*
> +		 * Default domain type of a group is undefined if some devices
> +		 * in the group should be in dma domain while other devices
> +		 * should be in identity domain
> +		 */
> +		if (idt_devs && dma_devs) {
> +			pr_err("Some devices need identity domain while other need dma domain\n");
> +			ret = -EPERM;
> +			goto out;
> +		}
> +
> +		/*
> +		 * Default domain type of a group is identity if
> +		 * 1. All the devices in the group need to be in identity domain
> +		 * 2. Some devices should be in identity domain while other
> +		 *    devices could be in either of dma or identity domain
> +		 */
> +		if (idt_devs && !dma_devs)
> +			req_type = IOMMU_DOMAIN_IDENTITY;
> +
> +		/*
> +		 * Default domain type of a group is dma if
> +		 * 1. All the devices in the group need to be in dma domain
> +		 * 2. Some devices should be in dma domain while other devices
> +		 *    could be in either of dma or identity domain
> +		 * 3. All the devices could be in either of the domains (namely
> +		 *    dma and identity)
> +		 */
> +		if (!idt_devs)
> +			req_type = IOMMU_DOMAIN_DMA;

Actually, There are four combinations:

			idt_devs==0		idt_devs!=0
dma_devs == 0		iommu_def_domain_type	identity		
dma_devs != 0		DMA			unsupported	

> +	}
> +
> +	/*
> +	 * Switch to a new domain only if the requested domain type is different
> +	 * from the existing default domain type
> +	 */
> +	if (prv_dom->type == req_type)
> +		goto out;
> +
> +	/*
> +	 * Every device may not support both the domain types (namely DMA and
> +	 * identity), so check if it's ok to change domain type of every device
> +	 * in the group to the requested domain. This check is not required if
> +	 * user has requested "auto" because it's already done above implicitly.
> +	 */
> +	if (!req_auto) {
> +		list_for_each_entry(grp_dev, &group->devices, list) {
> +			int allowed_types;
> +
> +			dev = grp_dev->dev;
> +			allowed_types = ops->dev_def_domain_type(dev);
> +			if (allowed_types && allowed_types != req_type) {
> +				dev_err(dev, "Cannot be in %s domain\n", buf);
> +				ret = -EPERM;
> +				goto out;
> +			}
> +		}
> +	}
> +
> +	mutex_unlock(&group->mutex);
> +	ret = iommu_group_change_def_domain(group, prv_dom, req_type);

Why do you want to put group->mutex before do the real domain switching?

> +	return ret ?: count;
> +out:
> +	mutex_unlock(&group->mutex);
> +	return ret ?: count;
> +}
> 

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 5/5] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file
  2020-02-16 21:57 ` [PATCH V2 5/5] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file Sai Praneeth Prakhya
@ 2020-02-23  1:38   ` Lu Baolu
  2020-02-24  2:18     ` Prakhya, Sai Praneeth
  0 siblings, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-02-23  1:38 UTC (permalink / raw)
  To: Sai Praneeth Prakhya, iommu
  Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Hi,

On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
> The default domain type of an iommu group can be changed using file
> "/sys/kernel/iommu_groups/<grp_id>/type". Hence, document it's usage and
> more importantly spell out it's limitations.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: Sohil Mehta <sohil.mehta@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
> ---
>   .../ABI/testing/sysfs-kernel-iommu_groups          | 29 ++++++++++++++++++++++
>   1 file changed, 29 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> index 017f5bc3920c..808a9507b281 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> +++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> @@ -33,3 +33,32 @@ Description:    In case an RMRR is used only by graphics or USB devices
>   		it is now exposed as "direct-relaxable" instead of "direct".
>   		In device assignment use case, for instance, those RMRR
>   		are considered to be relaxable and safe.
> +
> +What:		/sys/kernel/iommu_groups/<grp_id>/type
> +Date:		February 2020
> +KernelVersion:	v5.6
> +Contact:	Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
> +Description:	Lets the user know the type of default domain in use by iommu
> +		for this group. A privileged user could request kernel to change

Let the users know the default domain type of the IOMMU group by
reading this file. A privileged user could request to change it by
writing to this file.

> +		the group type by writing to this file. Presently, only three
> +		types are supported
> +		1. DMA: All the DMA transactions from the devices in this group
> +			are translated by the iommu.
> +		2. identity: All the DMA transactions from the devices in this
> +			     group are *not* translated by the iommu.
> +		3. auto: Kernel decides one among DMA/identity mode and hence
> +			 when the user reads the file he would never see "auto".

Just out of curiosity, when could a user reads "auto" from this file?

> +			 This is just a write only value.
> +		Note:
> +		-----
> +		A group type could be modified only when *all* the devices in

group's default domain type (not group type).

> +		the group are not binded to any device driver. So, the user has

bound

> +		to first unbind the appropriate drivers and then change the
> +		default domain type.
> +		Caution:
> +		--------
> +		Unbinding a device driver will take away the drivers control
> +		over the device and if done on devices that host root file
> +		system could lead to catastrophic effects (the user might
> +		need to reboot the machine to get it to normal state). So, it's
> +		expected that the user understands what he is doing.
> 

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type()
  2020-02-22 23:59     ` Prakhya, Sai Praneeth
@ 2020-02-23  1:50       ` Lu Baolu
  2020-02-24  3:23         ` Prakhya, Sai Praneeth
  0 siblings, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-02-23  1:50 UTC (permalink / raw)
  To: Prakhya, Sai Praneeth, iommu
  Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

Hi,

On 2020/2/23 7:59, Prakhya, Sai Praneeth wrote:
>>
>> On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
>>> The functionality needed for iommu_ops->dev_def_domain_type() is
>>> already provided by device_def_domain_type() in intel_iommu.c. But,
>>> every call back function in intel_iommu_ops starts with intel_iommu
>>> prefix, hence rename
>>> device_def_domain_type() to intel_iommu_dev_def_domain_type() so that
>>> it follows the same semantics.
>>
>> How about keep device_def_domain_type() and call it in the new
>> intel_iommu_dev_def_domain_type()?
> 
> Sure! I could but could you please explain the advantages we might get by doing so?
> Less number of changes is what I could think of.. any other reasons?
> 

device_def_domain_type() is a quirk list for devices that must use a
specified domain type. intel_iommu_dev_def_domain_type() tells the upper
layer whether the device could switch to another type of domain. Put
them in separated functions will make it easier for maintenance.

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH V2 5/5] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file
  2020-02-23  1:38   ` Lu Baolu
@ 2020-02-24  2:18     ` Prakhya, Sai Praneeth
  0 siblings, 0 replies; 28+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-02-24  2:18 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig


[-- Attachment #1.1: Type: text/plain, Size: 2255 bytes --]

> > +What:                         /sys/kernel/iommu_groups/<grp_id>/type

> > +Date:                          February 2020

> > +KernelVersion:          v5.6

> > +Contact:     Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com<mailto:sai.praneeth.prakhya@intel.com>>

> > +Description:              Lets the user know the type of default domain in use by iommu

> > +                     for this group. A privileged user could request kernel to change

>

> Let the users know the default domain type of the IOMMU group by reading this

> file. A privileged user could request to change it by writing to this file.



Sure! Will fix it.



> > +                     the group type by writing to this file. Presently, only three

> > +                     types are supported

> > +                     1. DMA: All the DMA transactions from the devices in this group

> > +                                    are translated by the iommu.

> > +                     2. identity: All the DMA transactions from the devices in this

> > +                                         group are *not* translated by the iommu.

> > +                     3. auto: Kernel decides one among DMA/identity mode and

> hence

> > +                                    when the user reads the file he would never see "auto".

>

> Just out of curiosity, when could a user reads "auto" from this file?



Never, because when a user requests kernel to change to "auto" mode, kernel will select one among identity or DMA. So, default domain type cannot be auto, it's always either identity or DMA.

"auto" mode is just a write value where user is requesting kernel to select one among identity or DMA.

> > +                                    This is just a write only value.

> > +                     Note:

> > +                     -----

> > +                     A group type could be modified only when *all* the devices in

>

> group's default domain type (not group type).



Sure! Makes sense.. will fix it 😊



>

> > +                     the group are not binded to any device driver. So, the user has

>

> bound



Makes sense.. will fix this too.



Regards,

Sai

[-- Attachment #1.2: Type: text/html, Size: 8560 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-23  1:20   ` Lu Baolu
@ 2020-02-24  3:20     ` Prakhya, Sai Praneeth
  2020-02-24  5:46       ` Lu Baolu
  0 siblings, 1 reply; 28+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-02-24  3:20 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

> On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
> > Presently, the default domain of an iommu_group is allocated during
> > boot time (i.e. when a device is being added to a group) and it cannot
> > be changed later. So, the device would typically be either in identity
> > (also known as pass_through) mode (controlled by "iommu=pt" kernel
> > command line
> 
> The default domain type is initialized according to the kernel build option, and
> could be overrided by several kernel commands like iommu=pt and
> iommu.passthrough.

Yes, that's right. Will change this text.

> > argument) or the device would be in DMA mode as long as the machine is
> > up and running. There is no way to change the default domain type
> > dynamically i.e. after booting, a device cannot switch between
> > identity mode and DMA mode.
> >
> > But, assume a use case wherein the user trusts the device and also
> > believes that the OS is secure enough and hence wants *only* this
> > device to bypass IOMMU (so that it could be high performing) whereas
> > all the other devices to go through IOMMU (so that the system is
> > protected). Presently, this use case is not supported. Hence it will
> > be helpful if there is some way to change the default domain of a
> > B:D.F dynamically. Since, linux iommu subsystem prefers to deal at
> > iommu_group level instead of B:D.F level, it might be helpful if there
> > is some way to change the default domain of a
> > *iommu_group* dynamically. Hence, add such support.

[snipped]

> > +/*
> > + * Changes the default domain of a group
> > + *
> > + * @group: The group for which the default domain should be changed
> > + * @prv_dom: The previous domain that is being switched from
> 
> The previous domain is still kept in group->default_domain, so it's unnecessary
> to pass it as a parameter, right?

Yes, you are right. I passed it as an argument thinking that I could make one assignment less in this function. The caller of this function iommu_group_store_type() already has prv_dom, so just passed it. But, will change this so that instead of every caller getting and passing the argument, it's easier to do it in this function.

> > + * @type: The type of the new default domain that gets associated
> > +with the group
> > + *
> > + * Returns 0 on success and error code on failure  */ static int
> > +iommu_group_change_def_domain(struct iommu_group *group,
> > +					 struct iommu_domain *prv_dom,
> > +					 int type)
> > +{
> > +	struct group_device *grp_dev, *temp;
> > +	struct iommu_domain *new_domain;
> > +	int ret;
> > +
> > +	/*
> > +	 * iommu_domain_alloc() takes "struct bus_type" as an argument which
> is
> > +	 * a member in "struct device". Changing a group's default domain type
> > +	 * deals at iommu_group level rather than device level and hence there
> > +	 * is no straight forward way to get "bus_type" of an iommu_group that
> > +	 * could be passed to iommu_domain_alloc(). So, instead of directly
> > +	 * calling iommu_domain_alloc(), use iommu_ops from previous default
> > +	 * domain.
> > +	 */
> > +	if (!prv_dom || !prv_dom->ops || !prv_dom->ops->domain_alloc ||
> !type)
> > +		return -EINVAL;
> > +
> > +	/* Allocate a new domain of requested type */
> > +	new_domain = prv_dom->ops->domain_alloc(type);
> > +	if (!new_domain) {
> > +		pr_err("Unable to allocate memory for the new domain\n");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	new_domain->type = type;
> > +	new_domain->ops = prv_dom->ops;
> > +	new_domain->pgsize_bitmap = prv_dom->pgsize_bitmap;
> > +
> > +	/*
> > +	 * Set these upfront here because iommu_group_add_device() and
> > +	 * iommu_group_create_direct_mappings() needs these to be set
> > +	 */
> > +	mutex_lock(&group->mutex);
> > +	group->default_domain = new_domain;
> > +	group->domain = new_domain;
> > +	mutex_unlock(&group->mutex);
> > +
> > +	iommu_group_ref_get(group);
> > +
> > +	list_for_each_entry_safe(grp_dev, temp, &group->devices, list) {
> > +		struct device *dev;
> > +
> > +		dev = grp_dev->dev;
> > +		iommu_release_device(dev);
> > +
> > +		ret = iommu_group_add_device(group, dev);
> > +		if (ret)
> > +			dev_err(dev, "Failed to add to iommu group %d: %d\n",
> > +				group->id, ret);
> 
> Need to handle this error case.

I wasn't sure on how to handle the error ☹
i.e. group's domain/default_domain are already updated to new domain and assume there are 'n' devices in the group and this failed for 'k'th device, I wasn't sure how I could roll back the changes made for k-1 devices. So, I thought probably just alert the user that there was an error while changing default domain type and try updating for other devices in the group (hopefully other devices might succeed). Also, *generally* we shouldn't see any errors here because all these devices were already in the same group earlier (we aren’t adding/removing new devices to the group). We are just changing default domain type and we already made sure that device could be in the requested default domain type.

> > +
> > +		ret = prv_dom->ops->add_device(dev);
> > +		if (ret)
> > +			dev_err(dev, "Error adding to iommu: %d\n", ret);
> 
> Ditto.
> 
> > +	}
> > +
> > +	iommu_group_put(group);
> > +	iommu_domain_free(prv_dom);
> > +	return 0;
> > +}
> > +
> > +static int is_driver_bound(struct device *dev, void *not_used) {
> > +	int ret = 0;
> > +
> > +	device_lock(dev);
> > +	if (device_is_bound(dev))
> > +		ret = 1;
> > +	device_unlock(dev);
> > +	return ret;
> > +}
> > +
> > +static ssize_t iommu_group_store_type(struct iommu_group *group,
> > +				      const char *buf, size_t count) {
> > +	int ret = 0, req_type = 0, req_auto = 0;
> > +	struct iommu_domain *prv_dom;
> > +	struct group_device *grp_dev;
> > +	const struct iommu_ops *ops;
> > +	struct device *dev;
> > +
> > +	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
> > +		return -EACCES;
> > +
> > +	if (WARN_ON(!group))
> > +		return -EINVAL;
> > +
> > +	if (sysfs_streq(buf, "identity"))
> > +		req_type = IOMMU_DOMAIN_IDENTITY;
> > +	else if (sysfs_streq(buf, "DMA"))
> > +		req_type = IOMMU_DOMAIN_DMA;
> > +	else if (sysfs_streq(buf, "auto"))
> > +		req_auto = 1;
> > +	else
> > +		return -EINVAL;
> > +
> > +	/*
> > +	 * Check if any device in the group still has a driver binded to it.
> > +	 * This might race with device driver probing code and unfortunately
> > +	 * there is no clean way out of that either, locking all devices in the
> > +	 * group and then do the re-attach will introduce a lock-inversion with
> > +	 * group->mutex - Joerg.
> 
> Do you mean that we can't do below?
> 
> mutex_lock(&group->mutex);
> for_each_group_device()
> 	device_lock(dev);
> 
> /* Default domain switch */
> for_each_group_device()
> 	device_unlock()
> mutex_unlock(&group->mutex)

I think, Joerg talks about two issues here
1. is_driver_bound() takes/releases device_lock() which could race with device driver probing code i.e. in an other terminal user could try to unload/load the module.
2. We cannot do as you suggested above because the functions (one is mentioned below) that switch default domain need to take iommu_group_mutex lock. So, taking the mutex and then calling iommu functions that switch default domain will dead lock.

> > +	 */
> > +	if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
> > +		pr_err("Active drivers exist for devices in the group\n");
> > +		return -EBUSY;
> > +	}
> > +
> > +	mutex_lock(&group->mutex);
> > +	prv_dom = group->default_domain;
> > +	if (!prv_dom || !prv_dom->ops || !prv_dom->ops-
> >dev_def_domain_type) {
> > +		pr_err("'dev_def_domain_type' call back isn't registered\n");
> > +		ret = -EPERM;
> > +		goto out;
> > +	}
> > +
> > +	/*
> > +	 * Check if any user level driver (that doesn't use kernel driver like
> > +	 * VFIO) is directly using the group. VFIO use case is detected by
> > +	 * is_driver_bound() check above
> > +	 */
> > +	if (prv_dom != group->domain) {
> > +		pr_err("Group assigned to user level for direct access\n");
> > +		ret = -EBUSY;
> > +		goto out;
> > +	}
> > +
> > +	/*
> > +	 * If user has requested "auto", kernel has to decide the default domain
> > +	 * type of a group. Hence, find out individual preferences of a device.
> > +	 */
> > +	ops = prv_dom->ops;
> > +	if (req_auto) {
> > +		int dma_devs = 0, idt_devs = 0, dev_def_dom;
> > +
> > +		list_for_each_entry(grp_dev, &group->devices, list) {
> > +			dev = grp_dev->dev;
> > +			dev_def_dom = ops->dev_def_domain_type(dev);
> > +			if (dev_def_dom == IOMMU_DOMAIN_IDENTITY)
> > +				idt_devs++;
> > +			if (dev_def_dom == IOMMU_DOMAIN_DMA)
> > +				dma_devs++;
> > +		}
> > +
> > +		/*
> > +		 * Default domain type of a group is undefined if some devices
> > +		 * in the group should be in dma domain while other devices
> > +		 * should be in identity domain
> > +		 */
> > +		if (idt_devs && dma_devs) {
> > +			pr_err("Some devices need identity domain while other
> need dma domain\n");
> > +			ret = -EPERM;
> > +			goto out;
> > +		}
> > +
> > +		/*
> > +		 * Default domain type of a group is identity if
> > +		 * 1. All the devices in the group need to be in identity domain
> > +		 * 2. Some devices should be in identity domain while other
> > +		 *    devices could be in either of dma or identity domain
> > +		 */
> > +		if (idt_devs && !dma_devs)
> > +			req_type = IOMMU_DOMAIN_IDENTITY;
> > +
> > +		/*
> > +		 * Default domain type of a group is dma if
> > +		 * 1. All the devices in the group need to be in dma domain
> > +		 * 2. Some devices should be in dma domain while other devices
> > +		 *    could be in either of dma or identity domain
> > +		 * 3. All the devices could be in either of the domains (namely
> > +		 *    dma and identity)
> > +		 */
> > +		if (!idt_devs)
> > +			req_type = IOMMU_DOMAIN_DMA;
> 
> Actually, There are four combinations:
> 
> 			idt_devs==0		idt_devs!=0
> dma_devs == 0		iommu_def_domain_type	identity
> dma_devs != 0		DMA			unsupported

Yes.. you are right. All these four combinations are presently handled in the code.
The comment above talks about combinations that lead us to select DMA domain.

The cases you mentioned above are handled as:
1. dma_devs == 0 && idt_devs == 0 -> In code, req_type = IOMMU_DOMAIN_DMA; because this means *all* the devices in the group can be in either of identity domain or DMA domain, hence select DMA domain over identity domain.
2. dma_devs != 0 && idt_devs == 0 -> In code, req_type = IOMMU_DOMAIN_DMA; because this means some devices in the group *should* be in DMA domain and other devices in the group could be in either of identity domain or DMA domain.
3. dma_devs == 0 && idt_devs != 0 -> In code, req_type = IOMMU_DOMAIN_IDENTITY; because this means some devices in the group *should* be in identity domain and other devices in the group could be in either of identity domain or DMA domain.
4. dma_devs != 0 && idt_devs != 0 -> In code, return error, because this means some devices in the group *should* be in identity domain and other devices in the group *should* be in DMA domain. This combination is not expected. This situation shouldn't arise because this would have failed during boot itself because these kind of devices aren't grouped together in the first place.

> > +	}
> > +
> > +	/*
> > +	 * Switch to a new domain only if the requested domain type is different
> > +	 * from the existing default domain type
> > +	 */
> > +	if (prv_dom->type == req_type)
> > +		goto out;
> > +
> > +	/*
> > +	 * Every device may not support both the domain types (namely DMA
> and
> > +	 * identity), so check if it's ok to change domain type of every device
> > +	 * in the group to the requested domain. This check is not required if
> > +	 * user has requested "auto" because it's already done above implicitly.
> > +	 */
> > +	if (!req_auto) {
> > +		list_for_each_entry(grp_dev, &group->devices, list) {
> > +			int allowed_types;
> > +
> > +			dev = grp_dev->dev;
> > +			allowed_types = ops->dev_def_domain_type(dev);
> > +			if (allowed_types && allowed_types != req_type) {
> > +				dev_err(dev, "Cannot be in %s domain\n", buf);
> > +				ret = -EPERM;
> > +				goto out;
> > +			}
> > +		}
> > +	}
> > +
> > +	mutex_unlock(&group->mutex);
> > +	ret = iommu_group_change_def_domain(group, prv_dom, req_type);
> 
> Why do you want to put group->mutex before do the real domain switching?

Because the below call stack takes iommu_group_mutex. I drop the lock here to facilitate it, otherwise the kernel hangs trying to get the same lock which it is already holding on to 

mutex_lock(&group->mutex);
iommu_group_remove_device()
intel_iommu_remove_device()
iommu_release_device()
iommu_group_change_def_domain()

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type()
  2020-02-23  1:50       ` Lu Baolu
@ 2020-02-24  3:23         ` Prakhya, Sai Praneeth
  0 siblings, 0 replies; 28+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-02-24  3:23 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

> >> On 2020/2/17 5:57, Sai Praneeth Prakhya wrote:
> >>> The functionality needed for iommu_ops->dev_def_domain_type() is
> >>> already provided by device_def_domain_type() in intel_iommu.c. But,
> >>> every call back function in intel_iommu_ops starts with intel_iommu
> >>> prefix, hence rename
> >>> device_def_domain_type() to intel_iommu_dev_def_domain_type() so
> >>> that it follows the same semantics.
> >>
> >> How about keep device_def_domain_type() and call it in the new
> >> intel_iommu_dev_def_domain_type()?
> >
> > Sure! I could but could you please explain the advantages we might get by
> doing so?
> > Less number of changes is what I could think of.. any other reasons?
> >
> 
> device_def_domain_type() is a quirk list for devices that must use a specified
> domain type. intel_iommu_dev_def_domain_type() tells the upper layer whether
> the device could switch to another type of domain. Put them in separated
> functions will make it easier for maintenance.

Ok. Will fix this.

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-24  3:20     ` Prakhya, Sai Praneeth
@ 2020-02-24  5:46       ` Lu Baolu
  2020-02-24  7:03         ` Prakhya, Sai Praneeth
  0 siblings, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-02-24  5:46 UTC (permalink / raw)
  To: Prakhya, Sai Praneeth, iommu
  Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

Hi Sai,

On 2020/2/24 11:20, Prakhya, Sai Praneeth wrote:
>>> +	list_for_each_entry_safe(grp_dev, temp, &group->devices, list) {
>>> +		struct device *dev;
>>> +
>>> +		dev = grp_dev->dev;
>>> +		iommu_release_device(dev);
>>> +
>>> +		ret = iommu_group_add_device(group, dev);
>>> +		if (ret)
>>> +			dev_err(dev, "Failed to add to iommu group %d: %d\n",
>>> +				group->id, ret);
>> Need to handle this error case.
> I wasn't sure on how to handle the error ☹

Just roll back to the state before calling this function and return an
appropriate error value.

The likely behavior is detaching the new domains from all devices (if it
has already attached), attaching the old domains to all devices in the
group, cleaning up all new resources allocated in this function, putting
a error message to tell the user why it fails and returning an error
code.

> i.e. group's domain/default_domain are already updated to new domain and assume there are 'n' devices in the group and this failed for 'k'th device, I wasn't sure how I could roll back the changes made for k-1 devices.

A successful attach could be checked by (group->domain ==
group->default_domain).

> So, I thought probably just alert the user that there was an error while changing default domain type and try updating for other devices in the group (hopefully other devices might succeed). Also,*generally*  we shouldn't see any errors here because all these devices were already in the same group earlier (we aren’t adding/removing new devices to the group). We are just changing default domain type and we already made sure that device could be in the requested default domain type.
> 
>>> +
>>> +		ret = prv_dom->ops->add_device(dev);
>>> +		if (ret)
>>> +			dev_err(dev, "Error adding to iommu: %d\n", ret);
>> Ditto.
>>
>>> +	}
>>> +
>>> +	iommu_group_put(group);
>>> +	iommu_domain_free(prv_dom);
>>> +	return 0;
>>> +}
>>> +
>>> +static int is_driver_bound(struct device *dev, void *not_used) {
>>> +	int ret = 0;
>>> +
>>> +	device_lock(dev);
>>> +	if (device_is_bound(dev))
>>> +		ret = 1;
>>> +	device_unlock(dev);
>>> +	return ret;
>>> +}
>>> +
>>> +static ssize_t iommu_group_store_type(struct iommu_group *group,
>>> +				      const char *buf, size_t count) {
>>> +	int ret = 0, req_type = 0, req_auto = 0;
>>> +	struct iommu_domain *prv_dom;
>>> +	struct group_device *grp_dev;
>>> +	const struct iommu_ops *ops;
>>> +	struct device *dev;
>>> +
>>> +	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
>>> +		return -EACCES;
>>> +
>>> +	if (WARN_ON(!group))
>>> +		return -EINVAL;
>>> +
>>> +	if (sysfs_streq(buf, "identity"))
>>> +		req_type = IOMMU_DOMAIN_IDENTITY;
>>> +	else if (sysfs_streq(buf, "DMA"))
>>> +		req_type = IOMMU_DOMAIN_DMA;
>>> +	else if (sysfs_streq(buf, "auto"))
>>> +		req_auto = 1;
>>> +	else
>>> +		return -EINVAL;
>>> +
>>> +	/*
>>> +	 * Check if any device in the group still has a driver binded to it.
>>> +	 * This might race with device driver probing code and unfortunately
>>> +	 * there is no clean way out of that either, locking all devices in the
>>> +	 * group and then do the re-attach will introduce a lock-inversion with
>>> +	 * group->mutex - Joerg.
>> Do you mean that we can't do below?
>>
>> mutex_lock(&group->mutex);
>> for_each_group_device()
>> 	device_lock(dev);
>>
>> /* Default domain switch */
>> for_each_group_device()
>> 	device_unlock()
>> mutex_unlock(&group->mutex)
> I think, Joerg talks about two issues here
> 1. is_driver_bound() takes/releases device_lock() which could race with device driver probing code i.e. in an other terminal user could try to unload/load the module.

So you need to device_lock() all devices during the whole process.
(I assume that load/unload device driver requiring this lock as well.
Didn't check it in code. Please correct me if I misunderstood it.)

> 2. We cannot do as you suggested above because the functions (one is mentioned below) that switch default domain need to take iommu_group_mutex lock. So, taking the mutex and then calling iommu functions that switch default domain will dead lock.

You probably need to move the mutex_lock() in that function out to the
caller? And only assert the lock has been held there.

> 
>>> +	 */
>>> +	if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
>>> +		pr_err("Active drivers exist for devices in the group\n");
>>> +		return -EBUSY;
>>> +	}
>>> +
>>> +	mutex_lock(&group->mutex);
>>> +	prv_dom = group->default_domain;
>>> +	if (!prv_dom || !prv_dom->ops || !prv_dom->ops-
>>> dev_def_domain_type) {
>>> +		pr_err("'dev_def_domain_type' call back isn't registered\n");
>>> +		ret = -EPERM;
>>> +		goto out;
>>> +	}
>>> +
>>> +	/*
>>> +	 * Check if any user level driver (that doesn't use kernel driver like
>>> +	 * VFIO) is directly using the group. VFIO use case is detected by
>>> +	 * is_driver_bound() check above
>>> +	 */
>>> +	if (prv_dom != group->domain) {
>>> +		pr_err("Group assigned to user level for direct access\n");
>>> +		ret = -EBUSY;
>>> +		goto out;
>>> +	}
>>> +
>>> +	/*
>>> +	 * If user has requested "auto", kernel has to decide the default domain
>>> +	 * type of a group. Hence, find out individual preferences of a device.
>>> +	 */
>>> +	ops = prv_dom->ops;
>>> +	if (req_auto) {
>>> +		int dma_devs = 0, idt_devs = 0, dev_def_dom;
>>> +
>>> +		list_for_each_entry(grp_dev, &group->devices, list) {
>>> +			dev = grp_dev->dev;
>>> +			dev_def_dom = ops->dev_def_domain_type(dev);
>>> +			if (dev_def_dom == IOMMU_DOMAIN_IDENTITY)
>>> +				idt_devs++;
>>> +			if (dev_def_dom == IOMMU_DOMAIN_DMA)
>>> +				dma_devs++;
>>> +		}
>>> +
>>> +		/*
>>> +		 * Default domain type of a group is undefined if some devices
>>> +		 * in the group should be in dma domain while other devices
>>> +		 * should be in identity domain
>>> +		 */
>>> +		if (idt_devs && dma_devs) {
>>> +			pr_err("Some devices need identity domain while other
>> need dma domain\n");
>>> +			ret = -EPERM;
>>> +			goto out;
>>> +		}
>>> +
>>> +		/*
>>> +		 * Default domain type of a group is identity if
>>> +		 * 1. All the devices in the group need to be in identity domain
>>> +		 * 2. Some devices should be in identity domain while other
>>> +		 *    devices could be in either of dma or identity domain
>>> +		 */
>>> +		if (idt_devs && !dma_devs)
>>> +			req_type = IOMMU_DOMAIN_IDENTITY;
>>> +
>>> +		/*
>>> +		 * Default domain type of a group is dma if
>>> +		 * 1. All the devices in the group need to be in dma domain
>>> +		 * 2. Some devices should be in dma domain while other devices
>>> +		 *    could be in either of dma or identity domain
>>> +		 * 3. All the devices could be in either of the domains (namely
>>> +		 *    dma and identity)
>>> +		 */
>>> +		if (!idt_devs)
>>> +			req_type = IOMMU_DOMAIN_DMA;
>> Actually, There are four combinations:
>>
>> 			idt_devs==0		idt_devs!=0
>> dma_devs == 0		iommu_def_domain_type	identity
>> dma_devs != 0		DMA			unsupported
> Yes.. you are right. All these four combinations are presently handled in the code.

At least I didn't see the iommu_def_domain_type is used if both idt_devs
and dma_devs are both equal to 0. :-)

> The comment above talks about combinations that lead us to select DMA domain.
> 
> The cases you mentioned above are handled as:
> 1. dma_devs == 0 && idt_devs == 0 -> In code, req_type = IOMMU_DOMAIN_DMA; because this means*all*  the devices in the group can be in either of identity domain or DMA domain, hence select DMA domain over identity domain.
> 2. dma_devs != 0 && idt_devs == 0 -> In code, req_type = IOMMU_DOMAIN_DMA; because this means some devices in the group*should*  be in DMA domain and other devices in the group could be in either of identity domain or DMA domain.
> 3. dma_devs == 0 && idt_devs != 0 -> In code, req_type = IOMMU_DOMAIN_IDENTITY; because this means some devices in the group*should*  be in identity domain and other devices in the group could be in either of identity domain or DMA domain.
> 4. dma_devs != 0 && idt_devs != 0 -> In code, return error, because this means some devices in the group*should*  be in identity domain and other devices in the group*should*  be in DMA domain. This combination is not expected. This situation shouldn't arise because this would have failed during boot itself because these kind of devices aren't grouped together in the first place.
> 
>>> +	}
>>> +
>>> +	/*
>>> +	 * Switch to a new domain only if the requested domain type is different
>>> +	 * from the existing default domain type
>>> +	 */
>>> +	if (prv_dom->type == req_type)
>>> +		goto out;
>>> +
>>> +	/*
>>> +	 * Every device may not support both the domain types (namely DMA
>> and
>>> +	 * identity), so check if it's ok to change domain type of every device
>>> +	 * in the group to the requested domain. This check is not required if
>>> +	 * user has requested "auto" because it's already done above implicitly.
>>> +	 */
>>> +	if (!req_auto) {
>>> +		list_for_each_entry(grp_dev, &group->devices, list) {
>>> +			int allowed_types;
>>> +
>>> +			dev = grp_dev->dev;
>>> +			allowed_types = ops->dev_def_domain_type(dev);
>>> +			if (allowed_types && allowed_types != req_type) {
>>> +				dev_err(dev, "Cannot be in %s domain\n", buf);
>>> +				ret = -EPERM;
>>> +				goto out;
>>> +			}
>>> +		}
>>> +	}
>>> +
>>> +	mutex_unlock(&group->mutex);
>>> +	ret = iommu_group_change_def_domain(group, prv_dom, req_type);
>> Why do you want to put group->mutex before do the real domain switching?
> Because the below call stack takes iommu_group_mutex. I drop the lock here to facilitate it, otherwise the kernel hangs trying to get the same lock which it is already holding on to
> 
> mutex_lock(&group->mutex);
> iommu_group_remove_device()
> intel_iommu_remove_device()
> iommu_release_device()
> iommu_group_change_def_domain()

Refer to above comment.

> 
> Regards,
> Sai

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-24  5:46       ` Lu Baolu
@ 2020-02-24  7:03         ` Prakhya, Sai Praneeth
  2020-02-24  7:39           ` Lu Baolu
  0 siblings, 1 reply; 28+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-02-24  7:03 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

> On 2020/2/24 11:20, Prakhya, Sai Praneeth wrote:
> >>> +	list_for_each_entry_safe(grp_dev, temp, &group->devices, list) {
> >>> +		struct device *dev;
> >>> +
> >>> +		dev = grp_dev->dev;
> >>> +		iommu_release_device(dev);
> >>> +
> >>> +		ret = iommu_group_add_device(group, dev);
> >>> +		if (ret)
> >>> +			dev_err(dev, "Failed to add to iommu group %d: %d\n",
> >>> +				group->id, ret);
> >> Need to handle this error case.
> > I wasn't sure on how to handle the error ☹
> 
> Just roll back to the state before calling this function and return an appropriate
> error value.
> 
> The likely behavior is detaching the new domains from all devices (if it has
> already attached), attaching the old domains to all devices in the group,

And while handling this error case, there is a possibility that attaching to old domain could fail.. so, we might need to handle that error case as well. If we plan to handle error case, since we have removed devices from group above, adding them back could fail too.. that would lead into handling error case for an error case.
So, given the probability of these functions failing here are very low, I think, why not bite the bullet and say, add code to handle these error cases if we see that these functions are failing frequently? Otherwise the error handling code is just a dead code.

> cleaning
> up all new resources allocated in this function, putting a error message to tell
> the user why it fails and returning an error code.
> 
> > i.e. group's domain/default_domain are already updated to new domain and
> assume there are 'n' devices in the group and this failed for 'k'th device, I wasn't
> sure how I could roll back the changes made for k-1 devices.
> 
> A successful attach could be checked by (group->domain ==
> group->default_domain).

No.. because I have manually set group->domain == group->default_domain = new_domain (did this because iommu_group_add_device() and iommu_group_create_direct_mappings() needs them)
So, probably we might need some other way to check successful attach.

> > So, I thought probably just alert the user that there was an error while
> changing default domain type and try updating for other devices in the group
> (hopefully other devices might succeed). Also,*generally*  we shouldn't see any
> errors here because all these devices were already in the same group earlier (we
> aren’t adding/removing new devices to the group). We are just changing default
> domain type and we already made sure that device could be in the requested
> default domain type.
> >
> >>> +
> >>> +		ret = prv_dom->ops->add_device(dev);
> >>> +		if (ret)
> >>> +			dev_err(dev, "Error adding to iommu: %d\n", ret);
> >> Ditto.
> >>
> >>> +	}
> >>> +
> >>> +	iommu_group_put(group);
> >>> +	iommu_domain_free(prv_dom);
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static int is_driver_bound(struct device *dev, void *not_used) {
> >>> +	int ret = 0;
> >>> +
> >>> +	device_lock(dev);
> >>> +	if (device_is_bound(dev))
> >>> +		ret = 1;
> >>> +	device_unlock(dev);
> >>> +	return ret;
> >>> +}
> >>> +
> >>> +static ssize_t iommu_group_store_type(struct iommu_group *group,
> >>> +				      const char *buf, size_t count) {
> >>> +	int ret = 0, req_type = 0, req_auto = 0;
> >>> +	struct iommu_domain *prv_dom;
> >>> +	struct group_device *grp_dev;
> >>> +	const struct iommu_ops *ops;
> >>> +	struct device *dev;
> >>> +
> >>> +	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
> >>> +		return -EACCES;
> >>> +
> >>> +	if (WARN_ON(!group))
> >>> +		return -EINVAL;
> >>> +
> >>> +	if (sysfs_streq(buf, "identity"))
> >>> +		req_type = IOMMU_DOMAIN_IDENTITY;
> >>> +	else if (sysfs_streq(buf, "DMA"))
> >>> +		req_type = IOMMU_DOMAIN_DMA;
> >>> +	else if (sysfs_streq(buf, "auto"))
> >>> +		req_auto = 1;
> >>> +	else
> >>> +		return -EINVAL;
> >>> +
> >>> +	/*
> >>> +	 * Check if any device in the group still has a driver binded to it.
> >>> +	 * This might race with device driver probing code and unfortunately
> >>> +	 * there is no clean way out of that either, locking all devices in the
> >>> +	 * group and then do the re-attach will introduce a lock-inversion with
> >>> +	 * group->mutex - Joerg.
> >> Do you mean that we can't do below?
> >>
> >> mutex_lock(&group->mutex);
> >> for_each_group_device()
> >> 	device_lock(dev);
> >>
> >> /* Default domain switch */
> >> for_each_group_device()
> >> 	device_unlock()
> >> mutex_unlock(&group->mutex)
> > I think, Joerg talks about two issues here 1. is_driver_bound()
> > takes/releases device_lock() which could race with device driver probing code
> i.e. in an other terminal user could try to unload/load the module.
> 
> So you need to device_lock() all devices during the whole process.
> (I assume that load/unload device driver requiring this lock as well.
> Didn't check it in code. Please correct me if I misunderstood it.)

Ok.. I will check this.

> > 2. We cannot do as you suggested above because the functions (one is
> mentioned below) that switch default domain need to take
> iommu_group_mutex lock. So, taking the mutex and then calling iommu
> functions that switch default domain will dead lock.
> 
> You probably need to move the mutex_lock() in that function out to the caller?
> And only assert the lock has been held there.

I mean, there are many functions that are taking iommu_group_mutex_lock and they are being called as part of changing default domain.
So, we would then need to change all those functions and let callers of those functions know that now it's their responsibility to take the lock and call these functions.
Some of those functions are:

iommu_group_add_device
iommu_group_remove_device
iommu_group_for_each_dev

So, rather than that, I think it's better to delegate taking/releasing lock to individual functions than giving that responsibility to callers because then some callers might forget to take/release lock.

> >
> >>> +	 */
> >>> +	if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
> >>> +		pr_err("Active drivers exist for devices in the group\n");
> >>> +		return -EBUSY;
> >>> +	}
> >>> +
> >>> +	mutex_lock(&group->mutex);
> >>> +	prv_dom = group->default_domain;
> >>> +	if (!prv_dom || !prv_dom->ops || !prv_dom->ops-
> >>> dev_def_domain_type) {
> >>> +		pr_err("'dev_def_domain_type' call back isn't registered\n");
> >>> +		ret = -EPERM;
> >>> +		goto out;
> >>> +	}
> >>> +
> >>> +	/*
> >>> +	 * Check if any user level driver (that doesn't use kernel driver like
> >>> +	 * VFIO) is directly using the group. VFIO use case is detected by
> >>> +	 * is_driver_bound() check above
> >>> +	 */
> >>> +	if (prv_dom != group->domain) {
> >>> +		pr_err("Group assigned to user level for direct access\n");
> >>> +		ret = -EBUSY;
> >>> +		goto out;
> >>> +	}
> >>> +
> >>> +	/*
> >>> +	 * If user has requested "auto", kernel has to decide the default domain
> >>> +	 * type of a group. Hence, find out individual preferences of a device.
> >>> +	 */
> >>> +	ops = prv_dom->ops;
> >>> +	if (req_auto) {
> >>> +		int dma_devs = 0, idt_devs = 0, dev_def_dom;
> >>> +
> >>> +		list_for_each_entry(grp_dev, &group->devices, list) {
> >>> +			dev = grp_dev->dev;
> >>> +			dev_def_dom = ops->dev_def_domain_type(dev);
> >>> +			if (dev_def_dom == IOMMU_DOMAIN_IDENTITY)
> >>> +				idt_devs++;
> >>> +			if (dev_def_dom == IOMMU_DOMAIN_DMA)
> >>> +				dma_devs++;
> >>> +		}
> >>> +
> >>> +		/*
> >>> +		 * Default domain type of a group is undefined if some devices
> >>> +		 * in the group should be in dma domain while other devices
> >>> +		 * should be in identity domain
> >>> +		 */
> >>> +		if (idt_devs && dma_devs) {
> >>> +			pr_err("Some devices need identity domain while other
> >> need dma domain\n");
> >>> +			ret = -EPERM;
> >>> +			goto out;
> >>> +		}
> >>> +
> >>> +		/*
> >>> +		 * Default domain type of a group is identity if
> >>> +		 * 1. All the devices in the group need to be in identity domain
> >>> +		 * 2. Some devices should be in identity domain while other
> >>> +		 *    devices could be in either of dma or identity domain
> >>> +		 */
> >>> +		if (idt_devs && !dma_devs)
> >>> +			req_type = IOMMU_DOMAIN_IDENTITY;
> >>> +
> >>> +		/*
> >>> +		 * Default domain type of a group is dma if
> >>> +		 * 1. All the devices in the group need to be in dma domain
> >>> +		 * 2. Some devices should be in dma domain while other devices
> >>> +		 *    could be in either of dma or identity domain
> >>> +		 * 3. All the devices could be in either of the domains (namely
> >>> +		 *    dma and identity)
> >>> +		 */
> >>> +		if (!idt_devs)
> >>> +			req_type = IOMMU_DOMAIN_DMA;
> >> Actually, There are four combinations:
> >>
> >> 			idt_devs==0		idt_devs!=0
> >> dma_devs == 0		iommu_def_domain_type	identity
> >> dma_devs != 0		DMA			unsupported
> > Yes.. you are right. All these four combinations are presently handled in the
> code.
> 
> At least I didn't see the iommu_def_domain_type is used if both idt_devs and
> dma_devs are both equal to 0. :-)

Sorry! I didn't get what you meant by "iommu_def_domain_type", so, could you please explain that?

When idt_devs == 0 and dma_devs == 0, it means that all the devices in the group could be in either of the domain. Hence, I default to DMA domain which is covered by this code.
		if (!idt_devs)
			req_type = IOMMU_DOMAIN_DMA;

> > The comment above talks about combinations that lead us to select DMA
> domain.
> >
> > The cases you mentioned above are handled as:
> > 1. dma_devs == 0 && idt_devs == 0 -> In code, req_type =
> IOMMU_DOMAIN_DMA; because this means*all*  the devices in the group can
> be in either of identity domain or DMA domain, hence select DMA domain over
> identity domain.
> > 2. dma_devs != 0 && idt_devs == 0 -> In code, req_type =
> IOMMU_DOMAIN_DMA; because this means some devices in the
> group*should*  be in DMA domain and other devices in the group could be in
> either of identity domain or DMA domain.
> > 3. dma_devs == 0 && idt_devs != 0 -> In code, req_type =
> IOMMU_DOMAIN_IDENTITY; because this means some devices in the
> group*should*  be in identity domain and other devices in the group could be in
> either of identity domain or DMA domain.
> > 4. dma_devs != 0 && idt_devs != 0 -> In code, return error, because this means
> some devices in the group*should*  be in identity domain and other devices in
> the group*should*  be in DMA domain. This combination is not expected. This
> situation shouldn't arise because this would have failed during boot itself
> because these kind of devices aren't grouped together in the first place.

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-24  7:03         ` Prakhya, Sai Praneeth
@ 2020-02-24  7:39           ` Lu Baolu
  2020-02-24  7:57             ` Prakhya, Sai Praneeth
  0 siblings, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-02-24  7:39 UTC (permalink / raw)
  To: Prakhya, Sai Praneeth, iommu
  Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

On 2020/2/24 15:03, Prakhya, Sai Praneeth wrote:
>> On 2020/2/24 11:20, Prakhya, Sai Praneeth wrote:
>>>>> +	list_for_each_entry_safe(grp_dev, temp, &group->devices, list) {
>>>>> +		struct device *dev;
>>>>> +
>>>>> +		dev = grp_dev->dev;
>>>>> +		iommu_release_device(dev);
>>>>> +
>>>>> +		ret = iommu_group_add_device(group, dev);
>>>>> +		if (ret)
>>>>> +			dev_err(dev, "Failed to add to iommu group %d: %d\n",
>>>>> +				group->id, ret);
>>>> Need to handle this error case.
>>> I wasn't sure on how to handle the error ☹
>>
>> Just roll back to the state before calling this function and return an appropriate
>> error value.
>>
>> The likely behavior is detaching the new domains from all devices (if it has
>> already attached), attaching the old domains to all devices in the group,
> 
> And while handling this error case, there is a possibility that attaching to old domain could fail.. so, we might need to handle that error case as well. If we plan to handle error case, since we have removed devices from group above, adding them back could fail too.. that would lead into handling error case for an error case.

We can assume that the old domain should always be attached back.
Otherwise, there must be some bugs in the vendor iommu driver.

It must be able to role back, otherwise users have to reboot the
system in order to use the devices in the group. This is not
acceptable in the production kernel.

> So, given the probability of these functions failing here are very low, I think, why not bite the bullet and say, add code to handle these error cases if we see that these functions are failing frequently? Otherwise the error handling code is just a dead code.
> 
>> cleaning
>> up all new resources allocated in this function, putting a error message to tell
>> the user why it fails and returning an error code.
>>
>>> i.e. group's domain/default_domain are already updated to new domain and
>> assume there are 'n' devices in the group and this failed for 'k'th device, I wasn't
>> sure how I could roll back the changes made for k-1 devices.
>>
>> A successful attach could be checked by (group->domain ==
>> group->default_domain).
> 
> No.. because I have manually set group->domain == group->default_domain = new_domain (did this because iommu_group_add_device() and iommu_group_create_direct_mappings() needs them)

You could set group->domain to the default domain only after it has been
attached to the device successfully, right?

> So, probably we might need some other way to check successful attach.
> 
>>> So, I thought probably just alert the user that there was an error while
>> changing default domain type and try updating for other devices in the group
>> (hopefully other devices might succeed). Also,*generally*  we shouldn't see any
>> errors here because all these devices were already in the same group earlier (we
>> aren’t adding/removing new devices to the group). We are just changing default
>> domain type and we already made sure that device could be in the requested
>> default domain type.
>>>
>>>>> +
>>>>> +		ret = prv_dom->ops->add_device(dev);
>>>>> +		if (ret)
>>>>> +			dev_err(dev, "Error adding to iommu: %d\n", ret);
>>>> Ditto.
>>>>
>>>>> +	}
>>>>> +
>>>>> +	iommu_group_put(group);
>>>>> +	iommu_domain_free(prv_dom);
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static int is_driver_bound(struct device *dev, void *not_used) {
>>>>> +	int ret = 0;
>>>>> +
>>>>> +	device_lock(dev);
>>>>> +	if (device_is_bound(dev))
>>>>> +		ret = 1;
>>>>> +	device_unlock(dev);
>>>>> +	return ret;
>>>>> +}
>>>>> +
>>>>> +static ssize_t iommu_group_store_type(struct iommu_group *group,
>>>>> +				      const char *buf, size_t count) {
>>>>> +	int ret = 0, req_type = 0, req_auto = 0;
>>>>> +	struct iommu_domain *prv_dom;
>>>>> +	struct group_device *grp_dev;
>>>>> +	const struct iommu_ops *ops;
>>>>> +	struct device *dev;
>>>>> +
>>>>> +	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
>>>>> +		return -EACCES;
>>>>> +
>>>>> +	if (WARN_ON(!group))
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	if (sysfs_streq(buf, "identity"))
>>>>> +		req_type = IOMMU_DOMAIN_IDENTITY;
>>>>> +	else if (sysfs_streq(buf, "DMA"))
>>>>> +		req_type = IOMMU_DOMAIN_DMA;
>>>>> +	else if (sysfs_streq(buf, "auto"))
>>>>> +		req_auto = 1;
>>>>> +	else
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	/*
>>>>> +	 * Check if any device in the group still has a driver binded to it.
>>>>> +	 * This might race with device driver probing code and unfortunately
>>>>> +	 * there is no clean way out of that either, locking all devices in the
>>>>> +	 * group and then do the re-attach will introduce a lock-inversion with
>>>>> +	 * group->mutex - Joerg.
>>>> Do you mean that we can't do below?
>>>>
>>>> mutex_lock(&group->mutex);
>>>> for_each_group_device()
>>>> 	device_lock(dev);
>>>>
>>>> /* Default domain switch */
>>>> for_each_group_device()
>>>> 	device_unlock()
>>>> mutex_unlock(&group->mutex)
>>> I think, Joerg talks about two issues here 1. is_driver_bound()
>>> takes/releases device_lock() which could race with device driver probing code
>> i.e. in an other terminal user could try to unload/load the module.
>>
>> So you need to device_lock() all devices during the whole process.
>> (I assume that load/unload device driver requiring this lock as well.
>> Didn't check it in code. Please correct me if I misunderstood it.)
> 
> Ok.. I will check this.
> 
>>> 2. We cannot do as you suggested above because the functions (one is
>> mentioned below) that switch default domain need to take
>> iommu_group_mutex lock. So, taking the mutex and then calling iommu
>> functions that switch default domain will dead lock.
>>
>> You probably need to move the mutex_lock() in that function out to the caller?
>> And only assert the lock has been held there.
> 
> I mean, there are many functions that are taking iommu_group_mutex_lock and they are being called as part of changing default domain.
> So, we would then need to change all those functions and let callers of those functions know that now it's their responsibility to take the lock and call these functions.
> Some of those functions are:
> 
> iommu_group_add_device
> iommu_group_remove_device
> iommu_group_for_each_dev
> 
> So, rather than that, I think it's better to delegate taking/releasing lock to individual functions than giving that responsibility to callers because then some callers might forget to take/release lock.

Okay. Fair enough.

> 
>>>
>>>>> +	 */
>>>>> +	if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
>>>>> +		pr_err("Active drivers exist for devices in the group\n");
>>>>> +		return -EBUSY;
>>>>> +	}
>>>>> +
>>>>> +	mutex_lock(&group->mutex);
>>>>> +	prv_dom = group->default_domain;
>>>>> +	if (!prv_dom || !prv_dom->ops || !prv_dom->ops-
>>>>> dev_def_domain_type) {
>>>>> +		pr_err("'dev_def_domain_type' call back isn't registered\n");
>>>>> +		ret = -EPERM;
>>>>> +		goto out;
>>>>> +	}
>>>>> +
>>>>> +	/*
>>>>> +	 * Check if any user level driver (that doesn't use kernel driver like
>>>>> +	 * VFIO) is directly using the group. VFIO use case is detected by
>>>>> +	 * is_driver_bound() check above
>>>>> +	 */
>>>>> +	if (prv_dom != group->domain) {
>>>>> +		pr_err("Group assigned to user level for direct access\n");
>>>>> +		ret = -EBUSY;
>>>>> +		goto out;
>>>>> +	}
>>>>> +
>>>>> +	/*
>>>>> +	 * If user has requested "auto", kernel has to decide the default domain
>>>>> +	 * type of a group. Hence, find out individual preferences of a device.
>>>>> +	 */
>>>>> +	ops = prv_dom->ops;
>>>>> +	if (req_auto) {
>>>>> +		int dma_devs = 0, idt_devs = 0, dev_def_dom;
>>>>> +
>>>>> +		list_for_each_entry(grp_dev, &group->devices, list) {
>>>>> +			dev = grp_dev->dev;
>>>>> +			dev_def_dom = ops->dev_def_domain_type(dev);
>>>>> +			if (dev_def_dom == IOMMU_DOMAIN_IDENTITY)
>>>>> +				idt_devs++;
>>>>> +			if (dev_def_dom == IOMMU_DOMAIN_DMA)
>>>>> +				dma_devs++;
>>>>> +		}
>>>>> +
>>>>> +		/*
>>>>> +		 * Default domain type of a group is undefined if some devices
>>>>> +		 * in the group should be in dma domain while other devices
>>>>> +		 * should be in identity domain
>>>>> +		 */
>>>>> +		if (idt_devs && dma_devs) {
>>>>> +			pr_err("Some devices need identity domain while other
>>>> need dma domain\n");
>>>>> +			ret = -EPERM;
>>>>> +			goto out;
>>>>> +		}
>>>>> +
>>>>> +		/*
>>>>> +		 * Default domain type of a group is identity if
>>>>> +		 * 1. All the devices in the group need to be in identity domain
>>>>> +		 * 2. Some devices should be in identity domain while other
>>>>> +		 *    devices could be in either of dma or identity domain
>>>>> +		 */
>>>>> +		if (idt_devs && !dma_devs)
>>>>> +			req_type = IOMMU_DOMAIN_IDENTITY;
>>>>> +
>>>>> +		/*
>>>>> +		 * Default domain type of a group is dma if
>>>>> +		 * 1. All the devices in the group need to be in dma domain
>>>>> +		 * 2. Some devices should be in dma domain while other devices
>>>>> +		 *    could be in either of dma or identity domain
>>>>> +		 * 3. All the devices could be in either of the domains (namely
>>>>> +		 *    dma and identity)
>>>>> +		 */
>>>>> +		if (!idt_devs)
>>>>> +			req_type = IOMMU_DOMAIN_DMA;
>>>> Actually, There are four combinations:
>>>>
>>>> 			idt_devs==0		idt_devs!=0
>>>> dma_devs == 0		iommu_def_domain_type	identity
>>>> dma_devs != 0		DMA			unsupported
>>> Yes.. you are right. All these four combinations are presently handled in the
>> code.
>>
>> At least I didn't see the iommu_def_domain_type is used if both idt_devs and
>> dma_devs are both equal to 0. :-)
> 
> Sorry! I didn't get what you meant by "iommu_def_domain_type", so, could you please explain that?
> 
> When idt_devs == 0 and dma_devs == 0, it means that all the devices in the group could be in either of the domain. Hence, I default to DMA domain which is covered by this code.
> 		if (!idt_devs)
> 			req_type = IOMMU_DOMAIN_DMA;

iommu_def_domain_type is the system level default domain type defined in
iommu.c. If the vendor iommu driver has no special requirement, we
should choose the default value.

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-24  7:39           ` Lu Baolu
@ 2020-02-24  7:57             ` Prakhya, Sai Praneeth
  2020-02-24  8:12               ` Lu Baolu
  0 siblings, 1 reply; 28+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-02-24  7:57 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

> On 2020/2/24 15:03, Prakhya, Sai Praneeth wrote:
> >> On 2020/2/24 11:20, Prakhya, Sai Praneeth wrote:
> >>>>> +	list_for_each_entry_safe(grp_dev, temp, &group->devices, list) {
> >>>>> +		struct device *dev;
> >>>>> +
> >>>>> +		dev = grp_dev->dev;
> >>>>> +		iommu_release_device(dev);
> >>>>> +
> >>>>> +		ret = iommu_group_add_device(group, dev);
> >>>>> +		if (ret)
> >>>>> +			dev_err(dev, "Failed to add to iommu group %d: %d\n",
> >>>>> +				group->id, ret);
> >>>> Need to handle this error case.
> >>> I wasn't sure on how to handle the error ☹
> >>
> >> Just roll back to the state before calling this function and return
> >> an appropriate error value.
> >>
> >> The likely behavior is detaching the new domains from all devices (if
> >> it has already attached), attaching the old domains to all devices in
> >> the group,
> >
> > And while handling this error case, there is a possibility that attaching to old
> domain could fail.. so, we might need to handle that error case as well. If we
> plan to handle error case, since we have removed devices from group above,
> adding them back could fail too.. that would lead into handling error case for an
> error case.
> 
> We can assume that the old domain should always be attached back.
> Otherwise, there must be some bugs in the vendor iommu driver.
> 
> It must be able to role back, otherwise users have to reboot the system in order
> to use the devices in the group. This is not acceptable in the production kernel.

I agree that we should be able to roll back but I am afraid that the error handling code could become complex than the usual code that gets to run. For example, assume there are 'n' devices in the group, 'k' of them are successfully processed (i.e. default domain type has been changed) and if k+1 fails while changing default domain type, to roll back state of k devices, we need to maintain a list of processed devices so that we now roll back state for devices in this list. The present code does not have any list because it's processing sequentially, it takes a device from the group, changes it domain and moves to other device and hence does not require a list.

All said, I could give this a try and see how complex the code could turn out to be. I hope I am wrong (i.e. turns out implementing error handling is simple).

> > So, given the probability of these functions failing here are very low, I think,
> why not bite the bullet and say, add code to handle these error cases if we see
> that these functions are failing frequently? Otherwise the error handling code is
> just a dead code.
> >
> >> cleaning
> >> up all new resources allocated in this function, putting a error
> >> message to tell the user why it fails and returning an error code.
> >>
> >>> i.e. group's domain/default_domain are already updated to new domain
> >>> and
> >> assume there are 'n' devices in the group and this failed for 'k'th
> >> device, I wasn't sure how I could roll back the changes made for k-1 devices.
> >>
> >> A successful attach could be checked by (group->domain ==
> >> group->default_domain).
> >
> > No.. because I have manually set group->domain ==
> > group->default_domain = new_domain (did this because
> > iommu_group_add_device() and iommu_group_create_direct_mappings()
> > needs them)
> 
> You could set group->domain to the default domain only after it has been
> attached to the device successfully, right?

Will reorder things and see how this could be handled.

> > So, probably we might need some other way to check successful attach.
> >
> >>> So, I thought probably just alert the user that there was an error
> >>> while
> >> changing default domain type and try updating for other devices in
> >> the group (hopefully other devices might succeed). Also,*generally*
> >> we shouldn't see any errors here because all these devices were
> >> already in the same group earlier (we aren’t adding/removing new
> >> devices to the group). We are just changing default domain type and
> >> we already made sure that device could be in the requested default domain
> type.

[snipped]

> >> At least I didn't see the iommu_def_domain_type is used if both
> >> idt_devs and dma_devs are both equal to 0. :-)
> >
> > Sorry! I didn't get what you meant by "iommu_def_domain_type", so, could
> you please explain that?
> >
> > When idt_devs == 0 and dma_devs == 0, it means that all the devices in the
> group could be in either of the domain. Hence, I default to DMA domain which is
> covered by this code.
> > 		if (!idt_devs)
> > 			req_type = IOMMU_DOMAIN_DMA;
> 
> iommu_def_domain_type is the system level default domain type defined in
> iommu.c. If the vendor iommu driver has no special requirement, we should
> choose the default value.

Ok.. got it, thanks for explaining. I will change this as suggested.

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-24  7:57             ` Prakhya, Sai Praneeth
@ 2020-02-24  8:12               ` Lu Baolu
  2020-02-24  8:39                 ` Lu Baolu
  0 siblings, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-02-24  8:12 UTC (permalink / raw)
  To: Prakhya, Sai Praneeth, iommu
  Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

On 2020/2/24 15:57, Prakhya, Sai Praneeth wrote:
>> On 2020/2/24 15:03, Prakhya, Sai Praneeth wrote:
>>>> On 2020/2/24 11:20, Prakhya, Sai Praneeth wrote:
>>>>>>> +	list_for_each_entry_safe(grp_dev, temp, &group->devices, list) {
>>>>>>> +		struct device *dev;
>>>>>>> +
>>>>>>> +		dev = grp_dev->dev;
>>>>>>> +		iommu_release_device(dev);
>>>>>>> +
>>>>>>> +		ret = iommu_group_add_device(group, dev);
>>>>>>> +		if (ret)
>>>>>>> +			dev_err(dev, "Failed to add to iommu group %d: %d\n",
>>>>>>> +				group->id, ret);
>>>>>> Need to handle this error case.
>>>>> I wasn't sure on how to handle the error ☹
>>>> Just roll back to the state before calling this function and return
>>>> an appropriate error value.
>>>>
>>>> The likely behavior is detaching the new domains from all devices (if
>>>> it has already attached), attaching the old domains to all devices in
>>>> the group,
>>> And while handling this error case, there is a possibility that attaching to old
>> domain could fail.. so, we might need to handle that error case as well. If we
>> plan to handle error case, since we have removed devices from group above,
>> adding them back could fail too.. that would lead into handling error case for an
>> error case.
>>
>> We can assume that the old domain should always be attached back.
>> Otherwise, there must be some bugs in the vendor iommu driver.
>>
>> It must be able to role back, otherwise users have to reboot the system in order
>> to use the devices in the group. This is not acceptable in the production kernel.
> I agree that we should be able to roll back but I am afraid that the error handling code could become complex than the usual code that gets to run. For example, assume there are 'n' devices in the group, 'k' of them are successfully processed (i.e. default domain type has been changed) and if k+1 fails while changing default domain type, to roll back state of k devices, we need to maintain a list of processed devices so that we now roll back state for devices in this list. The present code does not have any list because it's processing sequentially, it takes a device from the group, changes it domain and moves to other device and hence does not require a list.
> 
> All said, I could give this a try and see how complex the code could turn out to be. I hope I am wrong (i.e. turns out implementing error handling is simple).
> 

I think something like below should work.

static int iommu_group_do_attach_device(struct device *dev, void *data)
{
         struct iommu_domain *domain = data;

         return __iommu_attach_device(domain, dev);
}

static int __iommu_attach_group(struct iommu_domain *domain,
                                 struct iommu_group *group)
{
         int ret;

         if (group->default_domain && group->domain != 
group->default_domain)
                 return -EBUSY;

         ret = __iommu_group_for_each_dev(group, domain,
                                          iommu_group_do_attach_device);
         if (ret == 0)
                 group->domain = domain;

         return ret;
}

The vendor iommu driver should always deprecate the old domain if it
exists. Add a comment there.

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-24  8:12               ` Lu Baolu
@ 2020-02-24  8:39                 ` Lu Baolu
  2020-02-24  8:44                   ` Prakhya, Sai Praneeth
  0 siblings, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-02-24  8:39 UTC (permalink / raw)
  To: Prakhya, Sai Praneeth, iommu
  Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

On 2020/2/24 16:12, Lu Baolu wrote:
> On 2020/2/24 15:57, Prakhya, Sai Praneeth wrote:
>>> On 2020/2/24 15:03, Prakhya, Sai Praneeth wrote:
>>>>> On 2020/2/24 11:20, Prakhya, Sai Praneeth wrote:
>>>>>>>> +    list_for_each_entry_safe(grp_dev, temp, &group->devices, 
>>>>>>>> list) {
>>>>>>>> +        struct device *dev;
>>>>>>>> +
>>>>>>>> +        dev = grp_dev->dev;
>>>>>>>> +        iommu_release_device(dev);
>>>>>>>> +
>>>>>>>> +        ret = iommu_group_add_device(group, dev);
>>>>>>>> +        if (ret)
>>>>>>>> +            dev_err(dev, "Failed to add to iommu group %d: %d\n",
>>>>>>>> +                group->id, ret);
>>>>>>> Need to handle this error case.
>>>>>> I wasn't sure on how to handle the error ☹
>>>>> Just roll back to the state before calling this function and return
>>>>> an appropriate error value.
>>>>>
>>>>> The likely behavior is detaching the new domains from all devices (if
>>>>> it has already attached), attaching the old domains to all devices in
>>>>> the group,
>>>> And while handling this error case, there is a possibility that 
>>>> attaching to old
>>> domain could fail.. so, we might need to handle that error case as 
>>> well. If we
>>> plan to handle error case, since we have removed devices from group 
>>> above,
>>> adding them back could fail too.. that would lead into handling error 
>>> case for an
>>> error case.
>>>
>>> We can assume that the old domain should always be attached back.
>>> Otherwise, there must be some bugs in the vendor iommu driver.
>>>
>>> It must be able to role back, otherwise users have to reboot the 
>>> system in order
>>> to use the devices in the group. This is not acceptable in the 
>>> production kernel.
>> I agree that we should be able to roll back but I am afraid that the 
>> error handling code could become complex than the usual code that gets 
>> to run. For example, assume there are 'n' devices in the group, 'k' of 
>> them are successfully processed (i.e. default domain type has been 
>> changed) and if k+1 fails while changing default domain type, to roll 
>> back state of k devices, we need to maintain a list of processed 
>> devices so that we now roll back state for devices in this list. The 
>> present code does not have any list because it's processing 
>> sequentially, it takes a device from the group, changes it domain and 
>> moves to other device and hence does not require a list.
>>
>> All said, I could give this a try and see how complex the code could 
>> turn out to be. I hope I am wrong (i.e. turns out implementing error 
>> handling is simple).
>>
> 
> I think something like below should work.
> 
> static int iommu_group_do_attach_device(struct device *dev, void *data)
> {
>          struct iommu_domain *domain = data;
> 
>          return __iommu_attach_device(domain, dev);
> }
> 
> static int __iommu_attach_group(struct iommu_domain *domain,
>                                  struct iommu_group *group)
> {
>          int ret;
> 
>          if (group->default_domain && group->domain != 
> group->default_domain)
>                  return -EBUSY;
> 
>          ret = __iommu_group_for_each_dev(group, domain,
>                                           iommu_group_do_attach_device);
>          if (ret == 0)
>                  group->domain = domain;
> 
>          return ret;
> }
> 
> The vendor iommu driver should always deprecate the old domain if it
> exists. Add a comment there.
>

By the way, this is the expected behavior. Please check
__iommu_detach_group().

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-24  8:39                 ` Lu Baolu
@ 2020-02-24  8:44                   ` Prakhya, Sai Praneeth
  0 siblings, 0 replies; 28+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-02-24  8:44 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

> On 2020/2/24 16:12, Lu Baolu wrote:
> > On 2020/2/24 15:57, Prakhya, Sai Praneeth wrote:
> >>> On 2020/2/24 15:03, Prakhya, Sai Praneeth wrote:
> >>>>> On 2020/2/24 11:20, Prakhya, Sai Praneeth wrote:
> >>>>>>>> +    list_for_each_entry_safe(grp_dev, temp, &group->devices,
> >>>>>>>> list) {
> >>>>>>>> +        struct device *dev;
> >>>>>>>> +
> >>>>>>>> +        dev = grp_dev->dev;
> >>>>>>>> +        iommu_release_device(dev);
> >>>>>>>> +
> >>>>>>>> +        ret = iommu_group_add_device(group, dev);
> >>>>>>>> +        if (ret)
> >>>>>>>> +            dev_err(dev, "Failed to add to iommu group %d:
> >>>>>>>> +%d\n",
> >>>>>>>> +                group->id, ret);
> >>>>>>> Need to handle this error case.
> >>>>>> I wasn't sure on how to handle the error ☹
> >>>>> Just roll back to the state before calling this function and
> >>>>> return an appropriate error value.
> >>>>>
> >>>>> The likely behavior is detaching the new domains from all devices
> >>>>> (if it has already attached), attaching the old domains to all
> >>>>> devices in the group,
> >>>> And while handling this error case, there is a possibility that
> >>>> attaching to old
> >>> domain could fail.. so, we might need to handle that error case as
> >>> well. If we plan to handle error case, since we have removed devices
> >>> from group above, adding them back could fail too.. that would lead
> >>> into handling error case for an error case.
> >>>
> >>> We can assume that the old domain should always be attached back.
> >>> Otherwise, there must be some bugs in the vendor iommu driver.
> >>>
> >>> It must be able to role back, otherwise users have to reboot the
> >>> system in order to use the devices in the group. This is not
> >>> acceptable in the production kernel.
> >> I agree that we should be able to roll back but I am afraid that the
> >> error handling code could become complex than the usual code that
> >> gets to run. For example, assume there are 'n' devices in the group,
> >> 'k' of them are successfully processed (i.e. default domain type has
> >> been
> >> changed) and if k+1 fails while changing default domain type, to roll
> >> back state of k devices, we need to maintain a list of processed
> >> devices so that we now roll back state for devices in this list. The
> >> present code does not have any list because it's processing
> >> sequentially, it takes a device from the group, changes it domain and
> >> moves to other device and hence does not require a list.
> >>
> >> All said, I could give this a try and see how complex the code could
> >> turn out to be. I hope I am wrong (i.e. turns out implementing error
> >> handling is simple).
> >>
> >
> > I think something like below should work.
> >
> > static int iommu_group_do_attach_device(struct device *dev, void
> > *data) {
> >          struct iommu_domain *domain = data;
> >
> >          return __iommu_attach_device(domain, dev); }
> >
> > static int __iommu_attach_group(struct iommu_domain *domain,
> >                                  struct iommu_group *group) {
> >          int ret;
> >
> >          if (group->default_domain && group->domain !=
> > group->default_domain)
> >                  return -EBUSY;
> >
> >          ret = __iommu_group_for_each_dev(group, domain,
> >
> > iommu_group_do_attach_device);
> >          if (ret == 0)
> >                  group->domain = domain;
> >
> >          return ret;
> > }
> >
> > The vendor iommu driver should always deprecate the old domain if it
> > exists. Add a comment there.
> >
> 
> By the way, this is the expected behavior. Please check
> __iommu_detach_group().

Ok.. I will look into it.

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-02-16 21:57 ` [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
  2020-02-23  1:20   ` Lu Baolu
@ 2020-03-02 15:08   ` Joerg Roedel
  2020-03-03  6:47     ` Lu Baolu
  1 sibling, 1 reply; 28+ messages in thread
From: Joerg Roedel @ 2020-03-02 15:08 UTC (permalink / raw)
  To: Sai Praneeth Prakhya
  Cc: Ashok Raj, Will Deacon, iommu, Robin Murphy, Christoph Hellwig

Hello Sai, Baolu,

On Sun, Feb 16, 2020 at 01:57:26PM -0800, Sai Praneeth Prakhya wrote:
> Hence it will be helpful if there is some way to change the default
> domain of a B:D.F dynamically. Since, linux iommu subsystem prefers to
> deal at iommu_group level instead of B:D.F level, it might be helpful
> if there is some way to change the default domain of a *iommu_group*
> dynamically. Hence, add such support.

The question is how this plays together with the per-device private
domains in the Intel VT-d driver. I recently debugged an issue there and
I think there are more. The overall code for this seems to be pretty
fragile, so I had the idea to make the private default domains more
general.

IOMMU default domains don't necessarily need to stick to the iommu-group
granularity, because the default domain is used by in-kernel drivers
only, and the kernel trusts itself.

So my idea was to make the private-domain concept of the VT-d driver
more generic and move it to the iommu core code. With that we can
configure real per-device default domain types and don't have the race
condition with driver probing when changing the default domain of
multiple devices. We have to limit the ability to change default domain
types to devices with no PCI aliases, but that should not be a problem
for the intended use-case.

What do you think?

Regards,

	Joerg
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-03-02 15:08   ` Joerg Roedel
@ 2020-03-03  6:47     ` Lu Baolu
  2020-03-03 13:13       ` Joerg Roedel
  0 siblings, 1 reply; 28+ messages in thread
From: Lu Baolu @ 2020-03-03  6:47 UTC (permalink / raw)
  To: Joerg Roedel, Sai Praneeth Prakhya
  Cc: Ashok Raj, Will Deacon, iommu, Robin Murphy, Christoph Hellwig

Hi Joerg,

On 2020/3/2 23:08, Joerg Roedel wrote:
> Hello Sai, Baolu,
> 
> On Sun, Feb 16, 2020 at 01:57:26PM -0800, Sai Praneeth Prakhya wrote:
>> Hence it will be helpful if there is some way to change the default
>> domain of a B:D.F dynamically. Since, linux iommu subsystem prefers to
>> deal at iommu_group level instead of B:D.F level, it might be helpful
>> if there is some way to change the default domain of a *iommu_group*
>> dynamically. Hence, add such support.
> 
> The question is how this plays together with the per-device private
> domains in the Intel VT-d driver. I recently debugged an issue there and
> I think there are more. The overall code for this seems to be pretty
> fragile, so I had the idea to make the private default domains more
> general.
> 
> IOMMU default domains don't necessarily need to stick to the iommu-group
> granularity, because the default domain is used by in-kernel drivers
> only, and the kernel trusts itself.
> 
> So my idea was to make the private-domain concept of the VT-d driver
> more generic and move it to the iommu core code. With that we can
> configure real per-device default domain types and don't have the race
> condition with driver probing when changing the default domain of
> multiple devices. We have to limit the ability to change default domain
> types to devices with no PCI aliases, but that should not be a problem
> for the intended use-case.
> 
> What do you think?
>

Theoretically speaking, per-device default domain is impractical. PCI
aliased devices (PCI bridge and all devices beneath it, VMD devices and
various devices quirked with pci_add_dma_alias()) must use the same
domain. It's likely that we have to introduce something like a sub-group
with all PCI aliased devices staying in it. Current private-domain
implementation in the vt-d driver was introduced for compatible purpose
and I wanted to abandon it from the first day. :-)

On Intel platforms, there are only rare devices which require a specific
default domain: some graphic devices (identity), a specific model of
AZALIA (identity) and external devices connected through thunderbolt
(dma). They are not supposed to belong to a same group. Hence, if we
are able to configure per-group default domain type, we don't need to
keep private domain anymore.

Probably, we are able to configure per-group default domain type with
below two interfaces.

- (ops->)dev_def_domain_type: Return the required default domain type
   for a device. It returns
   - IOMMU_DOMAIN_DMA (device must use a DMA domain), unlikely
   - IOMMU_DOMAIN_IDENTITY (device must use an Identity domain), unlikely
   - 0 (both are okay), likely

- iommu_group_change_def_domain: Change the default domain of a group
   Works only when all devices have no driver bond.

[Sai's patch set has already included these two interfaces.]

In iommu_probe_device(),

dev_def_type = ops->dev_def_domain_type(dev)
if (dev_def_type && dev_def_type != group->default_domain->type) {
	ret = iommu_group_change_def_domain(...)
	if (ret)
		return -EINVAL;
}

This should work during boot since iommu_probe_device() always happens
before device driver binding. We need to further consider the hot-plug
cases.

- Hardware initiated device hotplug
We should always use DMA domain for devices connected through an
external port to avoid DMA attacking from malicious devices. And
such devices shouldn't share a group with internal (trusted) devices.
Hence, I can't see any problems here.

- Software initiated device hotplug
The default domain type won't change before and after device hotplug
so there's no problem as well.

This is what I have for the private domain in vt-d driver. Just for
discussion.

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-03-03  6:47     ` Lu Baolu
@ 2020-03-03 13:13       ` Joerg Roedel
  2020-03-04 12:17         ` Lu Baolu
  0 siblings, 1 reply; 28+ messages in thread
From: Joerg Roedel @ 2020-03-03 13:13 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Ashok Raj, Will Deacon, iommu, Robin Murphy, Christoph Hellwig

Hi Baolu,

On Tue, Mar 03, 2020 at 02:47:02PM +0800, Lu Baolu wrote:
> Theoretically speaking, per-device default domain is impractical. PCI
> aliased devices (PCI bridge and all devices beneath it, VMD devices and
> various devices quirked with pci_add_dma_alias()) must use the same
> domain. It's likely that we have to introduce something like a sub-group
> with all PCI aliased devices staying in it. Current private-domain
> implementation in the vt-d driver was introduced for compatible purpose
> and I wanted to abandon it from the first day. :-)

What hinders you from removing it now? I looked a bit closer into
these private default domain implementation and it is very fragile. If
it can be removed, then better do so sooner than later.

> Probably, we are able to configure per-group default domain type with
> below two interfaces.
> 
> - (ops->)dev_def_domain_type: Return the required default domain type
>   for a device. It returns
>   - IOMMU_DOMAIN_DMA (device must use a DMA domain), unlikely
>   - IOMMU_DOMAIN_IDENTITY (device must use an Identity domain), unlikely
>   - 0 (both are okay), likely

If we stay at the group level, this interface should work on the group
level too, and not on the device level.

> - iommu_group_change_def_domain: Change the default domain of a group
>   Works only when all devices have no driver bond.

Btw, I have no objections about the concept of private default domains
for devices, but the implementation should be moved to generic IOMMU
code so that the behavior is consistent accross differnet IOMMU
platforms, and of course be robust.


Regards,

	Joerg

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group
  2020-03-03 13:13       ` Joerg Roedel
@ 2020-03-04 12:17         ` Lu Baolu
  0 siblings, 0 replies; 28+ messages in thread
From: Lu Baolu @ 2020-03-04 12:17 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ashok Raj, Will Deacon, iommu, Robin Murphy, Christoph Hellwig

Hi Joerg,

On 2020/3/3 21:13, Joerg Roedel wrote:
> Hi Baolu,
> 
> On Tue, Mar 03, 2020 at 02:47:02PM +0800, Lu Baolu wrote:
>> Theoretically speaking, per-device default domain is impractical. PCI
>> aliased devices (PCI bridge and all devices beneath it, VMD devices and
>> various devices quirked with pci_add_dma_alias()) must use the same
>> domain. It's likely that we have to introduce something like a sub-group
>> with all PCI aliased devices staying in it. Current private-domain
>> implementation in the vt-d driver was introduced for compatible purpose
>> and I wanted to abandon it from the first day. :-)
> 
> What hinders you from removing it now? I looked a bit closer into
> these private default domain implementation and it is very fragile. If
> it can be removed, then better do so sooner than later.
>

I will soon send out the patches for review.

>> Probably, we are able to configure per-group default domain type with
>> below two interfaces.
>>
>> - (ops->)dev_def_domain_type: Return the required default domain type
>>    for a device. It returns
>>    - IOMMU_DOMAIN_DMA (device must use a DMA domain), unlikely
>>    - IOMMU_DOMAIN_IDENTITY (device must use an Identity domain), unlikely
>>    - 0 (both are okay), likely
> 
> If we stay at the group level, this interface should work on the group
> level too, and not on the device level.
> 
>> - iommu_group_change_def_domain: Change the default domain of a group
>>    Works only when all devices have no driver bond.
> 
> Btw, I have no objections about the concept of private default domains
> for devices, but the implementation should be moved to generic IOMMU
> code so that the behavior is consistent accross differnet IOMMU
> platforms, and of course be robust.
>

Yes. I agree with you.

> 
> Regards,
> 
> 	Joerg
> 

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-03-04 12:17 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-16 21:57 [PATCH V2 0/5] iommu: Add support to change default domain of a group Sai Praneeth Prakhya
2020-02-16 21:57 ` [PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops Sai Praneeth Prakhya
2020-02-22 23:37   ` Lu Baolu
2020-02-22 23:39     ` Prakhya, Sai Praneeth
2020-02-16 21:57 ` [PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type() Sai Praneeth Prakhya
2020-02-22 23:42   ` Lu Baolu
2020-02-22 23:59     ` Prakhya, Sai Praneeth
2020-02-23  1:50       ` Lu Baolu
2020-02-24  3:23         ` Prakhya, Sai Praneeth
2020-02-16 21:57 ` [PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
2020-02-23  1:20   ` Lu Baolu
2020-02-24  3:20     ` Prakhya, Sai Praneeth
2020-02-24  5:46       ` Lu Baolu
2020-02-24  7:03         ` Prakhya, Sai Praneeth
2020-02-24  7:39           ` Lu Baolu
2020-02-24  7:57             ` Prakhya, Sai Praneeth
2020-02-24  8:12               ` Lu Baolu
2020-02-24  8:39                 ` Lu Baolu
2020-02-24  8:44                   ` Prakhya, Sai Praneeth
2020-03-02 15:08   ` Joerg Roedel
2020-03-03  6:47     ` Lu Baolu
2020-03-03 13:13       ` Joerg Roedel
2020-03-04 12:17         ` Lu Baolu
2020-02-16 21:57 ` [PATCH V2 4/5] iommu: Take lock before reading iommu_group default domain type Sai Praneeth Prakhya
2020-02-16 21:57 ` [PATCH V2 5/5] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file Sai Praneeth Prakhya
2020-02-23  1:38   ` Lu Baolu
2020-02-24  2:18     ` Prakhya, Sai Praneeth
2020-02-22 23:40 ` [PATCH V2 0/5] iommu: Add support to change default domain of a group Prakhya, Sai Praneeth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).