iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V3 0/3] iommu: Add support to change default domain of an iommu
@ 2020-05-28 19:23 Sai Praneeth Prakhya
  2020-05-28 19:23 ` [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Sai Praneeth Prakhya @ 2020-05-28 19:23 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Presently, the default domain of a iommu group is allocated during boot time and
it cannot be changed later. So, the device would typically be either in identity
(pass_through) mode or the device would be in DMA mode as long as the system is
up and running. There is no way to change the default domain type dynamically
i.e. after booting, a device cannot switch between identity mode and DMA mode.

Assume a use case wherein the privileged user would want to use the device in
pass-through mode when the device is used for host so that it would be high
performing. Presently, this is not supported. Hence add support to change the
default domain of an iommu group dynamically.

Support this by writing to a sysfs file, namely
"/sys/kernel/iommu_groups/<grp_id>/type".

Testing:
--------
Tested by dynamically changing storage device (nvme) from
1. identity mode to DMA and making sure file transfer works
2. DMA mode to identity mode and making sure file transfer works
Tested only for intel_iommu/vt-d. Would appreciate if someone could test on AMD
and ARM based machines.

Based on iommu maintainer's 'next' branch.

Changes from V2:
----------------
1. Change the logic of updating default domain from V2 because
   ops->probe_finalize() could be used to update dma_ops.
2. Drop 1st and 2nd patch of V2 series because they are no longer needed on
   iommu maintainer's 'next' branch.
3. Limit this feature to iommu groups with only one device.
4. Hold device_lock and group mutex until the default domain is changed.

Changes from V1:
----------------
1. V1 patch set wasn't updating dma_ops for some vendors (Eg: AMD), hence,
   change the logic of updating default domain as below (because adding a device
   to iommu_group automatically updates dma_ops)
   a. Allocate a new domain
   b. For every device in the group
	i. Remove the device from the group
	ii. Add the device back to the group
   c. Free previous domain
2. Drop 1st patch of V1 (iommu/vt-d: Modify device_def_domain_type() to use at
   runtime) because "iommu=pt" has no effect on this function anymore.
3. Added a patch to take/release lock while reading iommu_group->default_domain->type
   because it can be changed any time by user.
4. Before changing default domain type of a group, check if the group is
   directly assigned for user level access. If so, abort.
5. Sanitize return path (using ternary operator) in iommu_group_store_type()
6. Split 2nd patch of V1 (iommu: Add device_def_domain_type() call back function
   to iommu_ops) into two patches such that iommu generic changes are now in 1st
   patch of V2 and vt-d specific changes are in 2nd patch of V2.
7. Rename device_def_domain_type() to dev_def_domain_type()
8. Remove example from documentation
9. Change the value written to file "/sys/kernel/iommu_groups/<grp_id>/type"
   from "dma" to "DMA".

Changes from RFC:
-----------------
1. Added support for "auto" type, so that kernel selects one among identity or
   dma mode.
2. Use "system_state" in device_def_domain_type() instead of an argument.

Sai Praneeth Prakhya (3):
  iommu: Add support to change default domain of an iommu_group
  iommu: Take lock before reading iommu_group default domain type
  iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file

 .../ABI/testing/sysfs-kernel-iommu_groups     |  30 +++
 drivers/iommu/iommu.c                         | 213 +++++++++++++++++-
 2 files changed, 242 insertions(+), 1 deletion(-)

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>

-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group
  2020-05-28 19:23 [PATCH V3 0/3] iommu: Add support to change default domain of an iommu Sai Praneeth Prakhya
@ 2020-05-28 19:23 ` Sai Praneeth Prakhya
  2020-05-29  2:43   ` Lu Baolu
  2020-05-28 19:23 ` [PATCH V3 2/3] iommu: Take lock before reading iommu_group default domain type Sai Praneeth Prakhya
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Sai Praneeth Prakhya @ 2020-05-28 19:23 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Presently, the default domain of an iommu_group is allocated during boot
time (i.e. when a device is being added to a group) and it cannot be
changed later. So, the device would typically be either in identity (also
known as pass_through) mode (controlled by "iommu=pt" kernel command line
argument) or the device would be in DMA mode as long as the machine is up
and running. There is no way to change the default domain type dynamically
i.e. after booting, a device cannot switch between identity mode and DMA
mode.

But, assume a use case wherein the user trusts the device and believes that
the OS is secure enough and hence wants *only* this device to bypass IOMMU
(so that it could be high performing) whereas all the other devices to go
through IOMMU (so that the system is protected). Presently, this use case
is not supported. It will be helpful if there is some way to change the
default domain of a B:D.F dynamically. Hence, add such support.

A privileged user could request the kernel to change the default domain
type of a iommu_group by writing to
"/sys/kernel/iommu_groups/<grp_id>/type" file. Presently, only three values
are supported
1. identity: all the DMA transactions from the device in this group are
             *not* translated by the iommu
2. DMA: all the DMA transactions from the device in this group are
        translated by the iommu
3. auto: change to the type the device was booted with

Note:
1. Default domain of an iommu group with two or more devices cannot be
   changed.
2. The device in the iommu group shouldn't be bound to any driver.
3. The device shouldn't be assigned to user for direct access.

Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more
information.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
---
 drivers/iommu/iommu.c | 211 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 210 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index a4c2f122eb8b..2b6cca799055 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -92,6 +92,8 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 static int iommu_create_device_direct_mappings(struct iommu_group *group,
 					       struct device *dev);
 static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+				      const char *buf, size_t count);
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
 struct iommu_group_attribute iommu_group_attr_##_name =		\
@@ -524,7 +526,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
 static IOMMU_GROUP_ATTR(reserved_regions, 0444,
 			iommu_group_show_resv_regions, NULL);
 
-static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
+static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
+			iommu_group_store_type);
 
 static void iommu_group_release(struct kobject *kobj)
 {
@@ -2847,3 +2850,209 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
 	return ops->sva_get_pasid(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * Changes the default domain of a device
+ *
+ * @dev: The *only* device in the group
+ * @group: The group for which the default domain should be changed
+ * @prev_domain: The previous domain that is being switched from
+ * @type: The type of the new default domain that gets associated with the group
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Note:
+ * 1. Presently, this function is called only when user requests to change the
+ *    group's default domain type through /sys/kernel/iommu_groups/<grp_id>/type
+ *    Please take a closer look if intended to use for other purposes.
+ * 2. Assumes that group->mutex is already taken and releases before returning
+ */
+static int iommu_change_dev_def_domain(struct device *dev,
+				       struct iommu_group *group,
+				       struct iommu_domain *prev_dom, int type)
+{
+	int ret = 0;
+
+	/* Sets group->default_domain to the newly allocated domain */
+	ret = iommu_group_alloc_default_domain(dev->bus, group, type);
+	if (ret)
+		goto out;
+
+	ret = __iommu_attach_device(group->default_domain, dev);
+	if (ret)
+		goto free_new_domain;
+
+	group->domain = group->default_domain;
+
+	ret = iommu_create_device_direct_mappings(group, dev);
+	if (ret)
+		goto free_new_domain;
+
+	/*
+	 * Release the mutex here because ops->probe_finalize() call-back of
+	 * some vendor IOMMU drivers calls arm_iommu_attach_device() which
+	 * in-turn might call back into IOMMU core code, where it tries to take
+	 * group->mutex, resulting in a deadlock.
+	 */
+	mutex_unlock(&group->mutex);
+
+	/* Make sure dma_ops is appropriatley set */
+	iommu_group_do_probe_finalize(dev, group->default_domain);
+	iommu_domain_free(prev_dom);
+	return 0;
+
+free_new_domain:
+	iommu_domain_free(group->default_domain);
+	group->default_domain = prev_dom;
+	group->domain = prev_dom;
+
+out:
+	mutex_unlock(&group->mutex);
+
+	return ret;
+}
+
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+				      const char *buf, size_t count)
+{
+	int ret, req_type = 0, req_auto = 0, dev_def_dom;
+	struct iommu_domain *prev_dom;
+	struct group_device *grp_dev;
+	const struct iommu_ops *ops;
+	struct device *dev;
+
+	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
+		return -EACCES;
+
+	if (WARN_ON(!group))
+		return -EINVAL;
+
+	if (sysfs_streq(buf, "identity"))
+		req_type = IOMMU_DOMAIN_IDENTITY;
+	else if (sysfs_streq(buf, "DMA"))
+		req_type = IOMMU_DOMAIN_DMA;
+	else if (sysfs_streq(buf, "auto"))
+		req_auto = 1;
+	else
+		return -EINVAL;
+
+	/* Lock the group to make sure that the device count doesn't change */
+	mutex_lock(&group->mutex);
+	if (iommu_group_device_count(group) != 1) {
+		mutex_unlock(&group->mutex);
+		pr_err("Cannot change default domain of a group with two or more devices\n");
+		return -EINVAL;
+	}
+
+	/* Since group has only one device */
+	list_for_each_entry(grp_dev, &group->devices, list)
+		dev = grp_dev->dev;
+
+	/*
+	 * Don't hold onto the group mutex because taking group mutex first and
+	 * then the device lock could potentially cause a deadlock as below.
+	 * Assume two threads T1 and T2. T1 is trying to change default domain
+	 * of an iommu group and T2 is trying to hot unplug a device or release
+	 * [1] VF of a PCIe device which is in the same iommu group. T1 takes
+	 * group mutex and before it could take device lock assume T2 has taken
+	 * device lock and is yet to take group mutex. Now, both the threads
+	 * will be waiting for the other thread to release lock.
+	 *
+	 * [1] Typical device release path
+	 * device_lock() from device/driver core code
+	 *  -> bus_notifier()
+	 *   -> iommu_bus_notifier()
+	 *    -> iommu_release_device()
+	 *     -> ops->release_device() vendor driver calls back iommu core code
+	 *      -> mutex_lock() from iommu core code
+	 */
+	mutex_unlock(&group->mutex);
+
+	/* Check if the device in the group still has a driver bound to it */
+	device_lock(dev);
+	if (device_is_bound(dev)) {
+		pr_err("Device is still bound to driver\n");
+		ret = -EBUSY;
+		goto dev_unlock;
+	}
+
+	/*
+	 * iommu group wasn't locked while acquiring device lock. So, make sure
+	 * that the device count hasn't changed while acquiring device lock.
+	 *
+	 * Changing default domain of an iommu group with two or more devices
+	 * isn't supported because there could be a potential deadlock. Consider
+	 * the following scenario. T1 is trying to acquire device locks of all
+	 * the devices in the group and before it could acquire all of them,
+	 * there could be another thread T2 (from different sub-system and use
+	 * case) that has already acquired some of the device locks and might be
+	 * waiting for T1 to release other device locks.
+	 */
+	mutex_lock(&group->mutex);
+	if (iommu_group_device_count(group) != 1) {
+		pr_err("Cannot change default domain of a group with two or more devices\n");
+		ret = -EINVAL;
+		goto group_unlock;
+	}
+
+	/*
+	 * Check if any user level driver (that doesn't use kernel driver like
+	 * VFIO) is directly using the group. VFIO use case is detected by
+	 * device_is_bound() check above
+	 */
+	if (group->default_domain != group->domain) {
+		pr_err("Group assigned to user level for direct access\n");
+		ret = -EINVAL;
+		goto group_unlock;
+	}
+
+	prev_dom = group->default_domain;
+	if (!prev_dom || !prev_dom->ops || !prev_dom->ops->def_domain_type) {
+		pr_err("'def_domain_type' call back isn't registered\n");
+		ret = -EINVAL;
+		goto group_unlock;
+	}
+
+	ops = prev_dom->ops;
+	dev_def_dom = ops->def_domain_type(dev);
+
+	/* Check if user requested domain is supported by the device or not */
+	if (!req_auto && dev_def_dom && req_type != dev_def_dom) {
+		pr_err("Device cannot be in %s domain\n", buf);
+		ret = -EINVAL;
+		goto group_unlock;
+	}
+
+	if (req_auto) {
+		/*
+		 * If user requested "auto" and the device supports both the
+		 * domains, then default to the domain the device was booted
+		 * with
+		 */
+		if (!dev_def_dom)
+			req_type = iommu_def_domain_type;
+		else
+			req_type = dev_def_dom;
+	}
+
+	/*
+	 * Switch to a new domain only if the requested domain type is different
+	 * from the existing default domain type
+	 */
+	if (prev_dom->type == req_type) {
+		ret = count;
+		goto group_unlock;
+	}
+
+	ret = iommu_change_dev_def_domain(dev, group, prev_dom, req_type);
+	device_unlock(dev);
+	return ret ?: count;
+
+group_unlock:
+	mutex_unlock(&group->mutex);
+
+dev_unlock:
+	device_unlock(dev);
+
+	return ret;
+}
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH V3 2/3] iommu: Take lock before reading iommu_group default domain type
  2020-05-28 19:23 [PATCH V3 0/3] iommu: Add support to change default domain of an iommu Sai Praneeth Prakhya
  2020-05-28 19:23 ` [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
@ 2020-05-28 19:23 ` Sai Praneeth Prakhya
  2020-05-28 19:24 ` [PATCH V3 3/3] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file Sai Praneeth Prakhya
  2020-05-29  1:51 ` [PATCH V3 0/3] iommu: Add support to change default domain of an iommu Lu Baolu
  3 siblings, 0 replies; 8+ messages in thread
From: Sai Praneeth Prakhya @ 2020-05-28 19:23 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

"/sys/kernel/iommu_groups/<grp_id>/type" file could be read to find out the
default domain type of an iommu group. The default domain of an iommu group
doesn't change after booting and hence could be read directly. But,
after addding support to dynamically change iommu group default domain, the
above assumption no longer stays valid.

iommu group default domain type could be changed at any time by writing to
"/sys/kernel/iommu_groups/<grp_id>/type". So, take group mutex before
reading iommu group default domain type so that the user wouldn't see stale
values or iommu_group_show_type() doesn't try to derefernce stale pointers.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
---
 drivers/iommu/iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2b6cca799055..fae8a4e1c7ab 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -500,6 +500,7 @@ static ssize_t iommu_group_show_type(struct iommu_group *group,
 {
 	char *type = "unknown\n";
 
+	mutex_lock(&group->mutex);
 	if (group->default_domain) {
 		switch (group->default_domain->type) {
 		case IOMMU_DOMAIN_BLOCKED:
@@ -516,6 +517,7 @@ static ssize_t iommu_group_show_type(struct iommu_group *group,
 			break;
 		}
 	}
+	mutex_unlock(&group->mutex);
 	strcpy(buf, type);
 
 	return strlen(type);
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH V3 3/3] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file
  2020-05-28 19:23 [PATCH V3 0/3] iommu: Add support to change default domain of an iommu Sai Praneeth Prakhya
  2020-05-28 19:23 ` [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
  2020-05-28 19:23 ` [PATCH V3 2/3] iommu: Take lock before reading iommu_group default domain type Sai Praneeth Prakhya
@ 2020-05-28 19:24 ` Sai Praneeth Prakhya
  2020-05-29  1:51 ` [PATCH V3 0/3] iommu: Add support to change default domain of an iommu Lu Baolu
  3 siblings, 0 replies; 8+ messages in thread
From: Sai Praneeth Prakhya @ 2020-05-28 19:24 UTC (permalink / raw)
  To: iommu; +Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

The default domain type of an iommu group can be changed by writing to
"/sys/kernel/iommu_groups/<grp_id>/type" file. Hence, document it's usage
and more importantly spell out its limitations.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
---
 .../ABI/testing/sysfs-kernel-iommu_groups     | 30 +++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..d238fca4408f 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,33 @@ Description:    In case an RMRR is used only by graphics or USB devices
 		it is now exposed as "direct-relaxable" instead of "direct".
 		In device assignment use case, for instance, those RMRR
 		are considered to be relaxable and safe.
+
+What:		/sys/kernel/iommu_groups/<grp_id>/type
+Date:		June 2020
+KernelVersion:	v5.8
+Contact:	Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
+Description:	Lets the user know the type of default domain in use by iommu
+		for this group. A privileged user could request kernel to change
+		the group type by writing to this file. Presently, only three
+		types are supported
+		1. DMA: All the DMA transactions from the device in this group
+			are translated by the iommu.
+		2. identity: All the DMA transactions from the device in this
+			     group are *not* translated by the iommu.
+		3. auto: Change to the type the device was booted with. When the
+			 user reads the file he would never see "auto". This is
+			 just a write only value.
+		Note:
+		-----
+		A group type could be modified only when
+		1. The group has *only* one device
+		2. The device in the group is not bound to any device driver.
+		   So, the user must first unbind the appropriate driver and
+		   then change the default domain type.
+		Caution:
+		--------
+		Unbinding a device driver will take away the driver's control
+		over the device and if done on devices that host root file
+		system could lead to catastrophic effects (the user might
+		need to reboot the machine to get it to normal state). So, it's
+		expected that the user understands what he is doing.
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH V3 0/3] iommu: Add support to change default domain of an iommu
  2020-05-28 19:23 [PATCH V3 0/3] iommu: Add support to change default domain of an iommu Sai Praneeth Prakhya
                   ` (2 preceding siblings ...)
  2020-05-28 19:24 ` [PATCH V3 3/3] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file Sai Praneeth Prakhya
@ 2020-05-29  1:51 ` Lu Baolu
  2020-05-29  4:46   ` Prakhya, Sai Praneeth
  3 siblings, 1 reply; 8+ messages in thread
From: Lu Baolu @ 2020-05-29  1:51 UTC (permalink / raw)
  To: Sai Praneeth Prakhya, iommu
  Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Hi Sai,

On 5/29/20 3:23 AM, Sai Praneeth Prakhya wrote:
> Presently, the default domain of a iommu group is allocated during boot time and
> it cannot be changed later. So, the device would typically be either in identity
> (pass_through) mode or the device would be in DMA mode as long as the system is
> up and running. There is no way to change the default domain type dynamically
> i.e. after booting, a device cannot switch between identity mode and DMA mode.
> 
> Assume a use case wherein the privileged user would want to use the device in
> pass-through mode when the device is used for host so that it would be high
> performing. Presently, this is not supported. Hence add support to change the
> default domain of an iommu group dynamically.
> 
> Support this by writing to a sysfs file, namely
> "/sys/kernel/iommu_groups/<grp_id>/type".

The email subject

[PATCH V3 0/3] iommu: Add support to change default domain of an iommu

probably should be changed to

[PATCH V3 0/3] iommu: Add support to change default domain of an iommu
group

Best regards,
baolu

> 
> Testing:
> --------
> Tested by dynamically changing storage device (nvme) from
> 1. identity mode to DMA and making sure file transfer works
> 2. DMA mode to identity mode and making sure file transfer works
> Tested only for intel_iommu/vt-d. Would appreciate if someone could test on AMD
> and ARM based machines.
> 
> Based on iommu maintainer's 'next' branch.
> 
> Changes from V2:
> ----------------
> 1. Change the logic of updating default domain from V2 because
>     ops->probe_finalize() could be used to update dma_ops.
> 2. Drop 1st and 2nd patch of V2 series because they are no longer needed on
>     iommu maintainer's 'next' branch.
> 3. Limit this feature to iommu groups with only one device.
> 4. Hold device_lock and group mutex until the default domain is changed.
> 
> Changes from V1:
> ----------------
> 1. V1 patch set wasn't updating dma_ops for some vendors (Eg: AMD), hence,
>     change the logic of updating default domain as below (because adding a device
>     to iommu_group automatically updates dma_ops)
>     a. Allocate a new domain
>     b. For every device in the group
> 	i. Remove the device from the group
> 	ii. Add the device back to the group
>     c. Free previous domain
> 2. Drop 1st patch of V1 (iommu/vt-d: Modify device_def_domain_type() to use at
>     runtime) because "iommu=pt" has no effect on this function anymore.
> 3. Added a patch to take/release lock while reading iommu_group->default_domain->type
>     because it can be changed any time by user.
> 4. Before changing default domain type of a group, check if the group is
>     directly assigned for user level access. If so, abort.
> 5. Sanitize return path (using ternary operator) in iommu_group_store_type()
> 6. Split 2nd patch of V1 (iommu: Add device_def_domain_type() call back function
>     to iommu_ops) into two patches such that iommu generic changes are now in 1st
>     patch of V2 and vt-d specific changes are in 2nd patch of V2.
> 7. Rename device_def_domain_type() to dev_def_domain_type()
> 8. Remove example from documentation
> 9. Change the value written to file "/sys/kernel/iommu_groups/<grp_id>/type"
>     from "dma" to "DMA".
> 
> Changes from RFC:
> -----------------
> 1. Added support for "auto" type, so that kernel selects one among identity or
>     dma mode.
> 2. Use "system_state" in device_def_domain_type() instead of an argument.
> 
> Sai Praneeth Prakhya (3):
>    iommu: Add support to change default domain of an iommu_group
>    iommu: Take lock before reading iommu_group default domain type
>    iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file
> 
>   .../ABI/testing/sysfs-kernel-iommu_groups     |  30 +++
>   drivers/iommu/iommu.c                         | 213 +++++++++++++++++-
>   2 files changed, 242 insertions(+), 1 deletion(-)
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: Sohil Mehta <sohil.mehta@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group
  2020-05-28 19:23 ` [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
@ 2020-05-29  2:43   ` Lu Baolu
  2020-05-29  8:15     ` Prakhya, Sai Praneeth
  0 siblings, 1 reply; 8+ messages in thread
From: Lu Baolu @ 2020-05-29  2:43 UTC (permalink / raw)
  To: Sai Praneeth Prakhya, iommu
  Cc: Ashok Raj, Will Deacon, Robin Murphy, Christoph Hellwig

Hi Sai,

On 5/29/20 3:23 AM, Sai Praneeth Prakhya wrote:
> Presently, the default domain of an iommu_group is allocated during boot
> time (i.e. when a device is being added to a group) and it cannot be
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is inaccurate as Joerg's code has deferred default domain
allocation and attaching after group allocation. I'd suggest to remove
this.

> changed later. So, the device would typically be either in identity (also
> known as pass_through) mode (controlled by "iommu=pt" kernel command line
                                               ^^^^^^^^

There are other kernel parameters to put device in pass_through mode.
I'd suggest to remove this.

> argument) or the device would be in DMA mode as long as the machine is up
> and running. There is no way to change the default domain type dynamically
> i.e. after booting, a device cannot switch between identity mode and DMA
> mode.
> 
> But, assume a use case wherein the user trusts the device and believes that
> the OS is secure enough and hence wants *only* this device to bypass IOMMU
> (so that it could be high performing) whereas all the other devices to go
> through IOMMU (so that the system is protected). Presently, this use case
> is not supported. It will be helpful if there is some way to change the
> default domain of a B:D.F dynamically. Hence, add such support.
   ^^^^^^^^^^^^^^^^^^^^^^^^^

Currently default domain is per iommu_group, we have no per device
default domain yet. Probably, "default domain of an iommu group"?

> 
> A privileged user could request the kernel to change the default domain
> type of a iommu_group by writing to
> "/sys/kernel/iommu_groups/<grp_id>/type" file. Presently, only three values
> are supported
> 1. identity: all the DMA transactions from the device in this group are
>               *not* translated by the iommu
> 2. DMA: all the DMA transactions from the device in this group are
>          translated by the iommu
> 3. auto: change to the type the device was booted with
> 
> Note:
> 1. Default domain of an iommu group with two or more devices cannot be
>     changed.
> 2. The device in the iommu group shouldn't be bound to any driver.
> 3. The device shouldn't be assigned to user for direct access.
> 
> Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more
> information.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: Sohil Mehta <sohil.mehta@intel.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
> ---
>   drivers/iommu/iommu.c | 211 +++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 210 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index a4c2f122eb8b..2b6cca799055 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -92,6 +92,8 @@ static void __iommu_detach_group(struct iommu_domain *domain,
>   static int iommu_create_device_direct_mappings(struct iommu_group *group,
>   					       struct device *dev);
>   static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
> +static ssize_t iommu_group_store_type(struct iommu_group *group,
> +				      const char *buf, size_t count);
>   
>   #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
>   struct iommu_group_attribute iommu_group_attr_##_name =		\
> @@ -524,7 +526,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
>   static IOMMU_GROUP_ATTR(reserved_regions, 0444,
>   			iommu_group_show_resv_regions, NULL);
>   
> -static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
> +static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
> +			iommu_group_store_type);
>   
>   static void iommu_group_release(struct kobject *kobj)
>   {
> @@ -2847,3 +2850,209 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
>   	return ops->sva_get_pasid(handle);
>   }
>   EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
> +
> +/*
> + * Changes the default domain of a device
> + *
> + * @dev: The *only* device in the group

All devices in a group have been linked in group->devices, so you don't
need to pass it in as a parameter. Otherwise, you also need some sanity
check against @dev, right?

> + * @group: The group for which the default domain should be changed
> + * @prev_domain: The previous domain that is being switched from

The same as @dev, the default domain in use is saved in
group->default_domain. No need a separated parameter to save space and
sanity check.

> + * @type: The type of the new default domain that gets associated with the group
> + *
> + * Returns 0 on success and error code on failure
> + *
> + * Note:
> + * 1. Presently, this function is called only when user requests to change the
> + *    group's default domain type through /sys/kernel/iommu_groups/<grp_id>/type
> + *    Please take a closer look if intended to use for other purposes.
> + * 2. Assumes that group->mutex is already taken and releases before returning

You could assume that the caller should hold the group->mutex and 
shouldn't be released in this helper. A reference design could be found
here.

https://lkml.org/lkml/2020/3/13/1164

> + */
> +static int iommu_change_dev_def_domain(struct device *dev,
> +				       struct iommu_group *group,
> +				       struct iommu_domain *prev_dom, int type)
> +{
> +	int ret = 0;
> +
> +	/* Sets group->default_domain to the newly allocated domain */
> +	ret = iommu_group_alloc_default_domain(dev->bus, group, type);
> +	if (ret)
> +		goto out;
> +
> +	ret = __iommu_attach_device(group->default_domain, dev);
> +	if (ret)
> +		goto free_new_domain;
> +
> +	group->domain = group->default_domain;
> +
> +	ret = iommu_create_device_direct_mappings(group, dev);
> +	if (ret)
> +		goto free_new_domain;
> +
> +	/*
> +	 * Release the mutex here because ops->probe_finalize() call-back of
> +	 * some vendor IOMMU drivers calls arm_iommu_attach_device() which
> +	 * in-turn might call back into IOMMU core code, where it tries to take
> +	 * group->mutex, resulting in a deadlock.
> +	 */
> +	mutex_unlock(&group->mutex);
> +
> +	/* Make sure dma_ops is appropriatley set */
> +	iommu_group_do_probe_finalize(dev, group->default_domain);
> +	iommu_domain_free(prev_dom);
> +	return 0;
> +
> +free_new_domain:
> +	iommu_domain_free(group->default_domain);
> +	group->default_domain = prev_dom;
> +	group->domain = prev_dom;
> +
> +out:
> +	mutex_unlock(&group->mutex);
> +
> +	return ret;
> +}
> +
> +static ssize_t iommu_group_store_type(struct iommu_group *group,
> +				      const char *buf, size_t count)
> +{
> +	int ret, req_type = 0, req_auto = 0, dev_def_dom;
> +	struct iommu_domain *prev_dom;
> +	struct group_device *grp_dev;
> +	const struct iommu_ops *ops;
> +	struct device *dev;
> +
> +	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
> +		return -EACCES;
> +
> +	if (WARN_ON(!group))
> +		return -EINVAL;
> +
> +	if (sysfs_streq(buf, "identity"))
> +		req_type = IOMMU_DOMAIN_IDENTITY;
> +	else if (sysfs_streq(buf, "DMA"))
> +		req_type = IOMMU_DOMAIN_DMA;
> +	else if (sysfs_streq(buf, "auto"))
> +		req_auto = 1;

How about let req_type = 0 for auto case. This keeps it consistent with
the return value of iommu_ops->def_domain_type(). With

if (!req_type)
	req_type = ops->def_domain_type(dev);

you don't need req_auto and dev_def_dom any more.

> +	else
> +		return -EINVAL;
> +
> +	/* Lock the group to make sure that the device count doesn't change */
> +	mutex_lock(&group->mutex);
> +	if (iommu_group_device_count(group) != 1) {
> +		mutex_unlock(&group->mutex);
> +		pr_err("Cannot change default domain of a group with two or more devices\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Since group has only one device */
> +	list_for_each_entry(grp_dev, &group->devices, list)
> +		dev = grp_dev->dev;
> +
> +	/*
> +	 * Don't hold onto the group mutex because taking group mutex first and
> +	 * then the device lock could potentially cause a deadlock as below.
> +	 * Assume two threads T1 and T2. T1 is trying to change default domain
> +	 * of an iommu group and T2 is trying to hot unplug a device or release
> +	 * [1] VF of a PCIe device which is in the same iommu group. T1 takes
> +	 * group mutex and before it could take device lock assume T2 has taken
> +	 * device lock and is yet to take group mutex. Now, both the threads
> +	 * will be waiting for the other thread to release lock.
> +	 *
> +	 * [1] Typical device release path
> +	 * device_lock() from device/driver core code
> +	 *  -> bus_notifier()
> +	 *   -> iommu_bus_notifier()
> +	 *    -> iommu_release_device()
> +	 *     -> ops->release_device() vendor driver calls back iommu core code
> +	 *      -> mutex_lock() from iommu core code
> +	 */
> +	mutex_unlock(&group->mutex);

Since you only support single device group, how about

-> lock_device
     -> lock group
         -> change default domain
     -> unlock group
-> unlock device
?

> +
> +	/* Check if the device in the group still has a driver bound to it */
> +	device_lock(dev);
> +	if (device_is_bound(dev)) {
> +		pr_err("Device is still bound to driver\n");
> +		ret = -EBUSY;
> +		goto dev_unlock;
> +	}
> +
> +	/*
> +	 * iommu group wasn't locked while acquiring device lock. So, make sure
> +	 * that the device count hasn't changed while acquiring device lock.
> +	 *
> +	 * Changing default domain of an iommu group with two or more devices
> +	 * isn't supported because there could be a potential deadlock. Consider
> +	 * the following scenario. T1 is trying to acquire device locks of all
> +	 * the devices in the group and before it could acquire all of them,
> +	 * there could be another thread T2 (from different sub-system and use
> +	 * case) that has already acquired some of the device locks and might be
> +	 * waiting for T1 to release other device locks.
> +	 */
> +	mutex_lock(&group->mutex);
> +	if (iommu_group_device_count(group) != 1) {
> +		pr_err("Cannot change default domain of a group with two or more devices\n");
> +		ret = -EINVAL;
> +		goto group_unlock;
> +	}
> +
> +	/*
> +	 * Check if any user level driver (that doesn't use kernel driver like
> +	 * VFIO) is directly using the group. VFIO use case is detected by
> +	 * device_is_bound() check above
> +	 */
> +	if (group->default_domain != group->domain) {
> +		pr_err("Group assigned to user level for direct access\n");
> +		ret = -EINVAL;

-EBUSY?

> +		goto group_unlock;
> +	}
> +
> +	prev_dom = group->default_domain;
> +	if (!prev_dom || !prev_dom->ops || !prev_dom->ops->def_domain_type) {
> +		pr_err("'def_domain_type' call back isn't registered\n");
> +		ret = -EINVAL;
> +		goto group_unlock;
> +	}
> +

--- start ---
> +	ops = prev_dom->ops;
> +	dev_def_dom = ops->def_domain_type(dev);
> +
> +	/* Check if user requested domain is supported by the device or not */
> +	if (!req_auto && dev_def_dom && req_type != dev_def_dom) {
> +		pr_err("Device cannot be in %s domain\n", buf);
> +		ret = -EINVAL;
> +		goto group_unlock;
> +	}
> +
> +	if (req_auto) {
> +		/*
> +		 * If user requested "auto" and the device supports both the
> +		 * domains, then default to the domain the device was booted
> +		 * with
> +		 */
> +		if (!dev_def_dom)
> +			req_type = iommu_def_domain_type;
> +		else
> +			req_type = dev_def_dom;
> +	}
--- end ---

Isn't this could could be simply replaced with:

if (!req_type)
	req_type = ops->default_domain_type(dev) ? : iommu_def_domain_type;

> +
> +	/*
> +	 * Switch to a new domain only if the requested domain type is different
> +	 * from the existing default domain type
> +	 */
> +	if (prev_dom->type == req_type) {
> +		ret = count;
> +		goto group_unlock;
> +	} > +
> +	ret = iommu_change_dev_def_domain(dev, group, prev_dom, req_type);
> +	device_unlock(dev);
> +	return ret ?: count;
> +
> +group_unlock:
> +	mutex_unlock(&group->mutex);
> +
> +dev_unlock:
> +	device_unlock(dev);
> +
> +	return ret;
> +}
> 

Best regards,
baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH V3 0/3] iommu: Add support to change default domain of an iommu
  2020-05-29  1:51 ` [PATCH V3 0/3] iommu: Add support to change default domain of an iommu Lu Baolu
@ 2020-05-29  4:46   ` Prakhya, Sai Praneeth
  0 siblings, 0 replies; 8+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-05-29  4:46 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

Hi Baolu,

> -----Original Message-----
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Thursday, May 28, 2020 6:52 PM
> To: Prakhya, Sai Praneeth <sai.praneeth.prakhya@intel.com>;
> iommu@lists.linux-foundation.org
> Cc: baolu.lu@linux.intel.com; Christoph Hellwig <hch@lst.de>; Joerg Roedel
> <joro@8bytes.org>; Raj, Ashok <ashok.raj@intel.com>; Will Deacon
> <will.deacon@arm.com>; Mehta, Sohil <sohil.mehta@intel.com>; Robin
> Murphy <robin.murphy@arm.com>; Jacob Pan <jacob.jun.pan@linux.intel.com>
> Subject: Re: [PATCH V3 0/3] iommu: Add support to change default domain of
> an iommu
> 
> Hi Sai,
> 
> On 5/29/20 3:23 AM, Sai Praneeth Prakhya wrote:
> > Presently, the default domain of a iommu group is allocated during
> > boot time and it cannot be changed later. So, the device would
> > typically be either in identity
> > (pass_through) mode or the device would be in DMA mode as long as the
> > system is up and running. There is no way to change the default domain
> > type dynamically i.e. after booting, a device cannot switch between identity
> mode and DMA mode.
> >
> > Assume a use case wherein the privileged user would want to use the
> > device in pass-through mode when the device is used for host so that
> > it would be high performing. Presently, this is not supported. Hence
> > add support to change the default domain of an iommu group dynamically.
> >
> > Support this by writing to a sysfs file, namely
> > "/sys/kernel/iommu_groups/<grp_id>/type".
> 
> The email subject
> 
> [PATCH V3 0/3] iommu: Add support to change default domain of an iommu
> 
> probably should be changed to
> 
> [PATCH V3 0/3] iommu: Add support to change default domain of an iommu
> group

Oops.. my bad. I have it in the patches that I have on my dev machine. I added "group" to a new line because the subject line crossed 80 character limit. Probably, I will just put it in the same line for the next version. Thanks for bringing this up.

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group
  2020-05-29  2:43   ` Lu Baolu
@ 2020-05-29  8:15     ` Prakhya, Sai Praneeth
  0 siblings, 0 replies; 8+ messages in thread
From: Prakhya, Sai Praneeth @ 2020-05-29  8:15 UTC (permalink / raw)
  To: Lu Baolu, iommu; +Cc: Raj, Ashok, Will Deacon, Robin Murphy, Christoph Hellwig

Hi Baolu,

> -----Original Message-----
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Thursday, May 28, 2020 7:43 PM
> To: Prakhya, Sai Praneeth <sai.praneeth.prakhya@intel.com>;
> iommu@lists.linux-foundation.org
> Cc: baolu.lu@linux.intel.com; Christoph Hellwig <hch@lst.de>; Joerg Roedel
> <joro@8bytes.org>; Raj, Ashok <ashok.raj@intel.com>; Will Deacon
> <will.deacon@arm.com>; Mehta, Sohil <sohil.mehta@intel.com>; Robin
> Murphy <robin.murphy@arm.com>; Jacob Pan <jacob.jun.pan@linux.intel.com>
> Subject: Re: [PATCH V3 1/3] iommu: Add support to change default domain of
> an iommu_group
> 
> Hi Sai,
> 
> On 5/29/20 3:23 AM, Sai Praneeth Prakhya wrote:
> > Presently, the default domain of an iommu_group is allocated during
> > boot time (i.e. when a device is being added to a group) and it cannot
> > be
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> This is inaccurate as Joerg's code has deferred default domain allocation and
> attaching after group allocation. I'd suggest to remove this.

Ok.. makes sense. I will remove it. I think it should have been like below to accurately describe Joerg's changes, is that correct?

"Presently, the default domain of an iommu_group is allocated during
boot time (i.e. during device probe and after the device is added to a group) and it cannot
be"
 
> > changed later. So, the device would typically be either in identity
> > (also known as pass_through) mode (controlled by "iommu=pt" kernel
> > command line
>                                                ^^^^^^^^
> 
> There are other kernel parameters to put device in pass_through mode.
> I'd suggest to remove this.

Ok.. makes sense. I will remove it.
I think, you were talking about ARM parameters (iommu.passthrough).. am I right?

> > argument) or the device would be in DMA mode as long as the machine is
> > up and running. There is no way to change the default domain type
> > dynamically i.e. after booting, a device cannot switch between
> > identity mode and DMA mode.
> >
> > But, assume a use case wherein the user trusts the device and believes
> > that the OS is secure enough and hence wants *only* this device to
> > bypass IOMMU (so that it could be high performing) whereas all the
> > other devices to go through IOMMU (so that the system is protected).
> > Presently, this use case is not supported. It will be helpful if there
> > is some way to change the default domain of a B:D.F dynamically. Hence, add
> such support.
>    ^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Currently default domain is per iommu_group, we have no per device default
> domain yet. Probably, "default domain of an iommu group"?

Makes sense. I will change it.

> > A privileged user could request the kernel to change the default
> > domain type of a iommu_group by writing to
> > "/sys/kernel/iommu_groups/<grp_id>/type" file. Presently, only three
> > values are supported 1. identity: all the DMA transactions from the
> > device in this group are
> >               *not* translated by the iommu 2. DMA: all the DMA
> > transactions from the device in this group are
> >          translated by the iommu
> > 3. auto: change to the type the device was booted with
> >
> > Note:
> > 1. Default domain of an iommu group with two or more devices cannot be
> >     changed.
> > 2. The device in the iommu group shouldn't be bound to any driver.
> > 3. The device shouldn't be assigned to user for direct access.
> >
> > Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for
> > more information.
> >
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Cc: Ashok Raj <ashok.raj@intel.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: Lu Baolu <baolu.lu@linux.intel.com>
> > Cc: Sohil Mehta <sohil.mehta@intel.com>
> > Cc: Robin Murphy <robin.murphy@arm.com>
> > Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
> > ---
> >   drivers/iommu/iommu.c | 211
> +++++++++++++++++++++++++++++++++++++++++-
> >   1 file changed, 210 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index
> > a4c2f122eb8b..2b6cca799055 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -92,6 +92,8 @@ static void __iommu_detach_group(struct
> iommu_domain *domain,
> >   static int iommu_create_device_direct_mappings(struct iommu_group
> *group,
> >   					       struct device *dev);
> >   static struct iommu_group *iommu_group_get_for_dev(struct device
> > *dev);
> > +static ssize_t iommu_group_store_type(struct iommu_group *group,
> > +				      const char *buf, size_t count);
> >
> >   #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)
> 	\
> >   struct iommu_group_attribute iommu_group_attr_##_name =		\
> > @@ -524,7 +526,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO,
> iommu_group_show_name, NULL);
> >   static IOMMU_GROUP_ATTR(reserved_regions, 0444,
> >   			iommu_group_show_resv_regions, NULL);
> >
> > -static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
> > +static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
> > +			iommu_group_store_type);
> >
> >   static void iommu_group_release(struct kobject *kobj)
> >   {
> > @@ -2847,3 +2850,209 @@ int iommu_sva_get_pasid(struct iommu_sva
> *handle)
> >   	return ops->sva_get_pasid(handle);
> >   }
> >   EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
> > +
> > +/*
> > + * Changes the default domain of a device
> > + *
> > + * @dev: The *only* device in the group
> 
> All devices in a group have been linked in group->devices, so you don't need to
> pass it in as a parameter. Otherwise, you also need some sanity check against
> @dev, right?

Yes, I agree that we could get "dev" from group->devices. But, I passed it as a parameter because it's already done by iommu_group_store_type() (as below) and I thought that I could save from duplicating code by passing it as a parameter. Also, iommu_change_dev_def_domain() is presently just a helper function (and not an API) for iommu_group_store_type() and hence thought that error checking for "dev" (i.e. if (!dev)) is not needed in iommu_change_dev_def_domain() because iommu_group_store_type()  may never pass a null value.

/* Since group has only one device */
list_for_each_entry(grp_dev, &group->devices, list)
	dev = grp_dev->dev;

Just wanted to put out the rationale behind why I decided to go this way rather than getting "dev" from group->devices. Please let me know if you think otherwise, I am happy to change it.

> > + * @group: The group for which the default domain should be changed
> > + * @prev_domain: The previous domain that is being switched from
> 
> The same as @dev, the default domain in use is saved in
> group->default_domain. No need a separated parameter to save space and
> sanity check.

Same as above. Just wanted to avoid the below check as it's already done by iommu_group_store_type()
if (group->default_domain)
	prev_dom = group->default_domain;

I am happy to change it if you prefer to see less parameters.

> > + * @type: The type of the new default domain that gets associated
> > + with the group
> > + *
> > + * Returns 0 on success and error code on failure
> > + *
> > + * Note:
> > + * 1. Presently, this function is called only when user requests to change the
> > + *    group's default domain type through
> /sys/kernel/iommu_groups/<grp_id>/type
> > + *    Please take a closer look if intended to use for other purposes.
> > + * 2. Assumes that group->mutex is already taken and releases before
> > + returning
> 
> You could assume that the caller should hold the group->mutex and shouldn't be
> released in this helper. A reference design could be found here.
> 
> https://lkml.org/lkml/2020/3/13/1164

Ok. I have looked at it.. and it does hold group mutex.
Just wondering if you would like me to change anything here?

> > + */
> > +static int iommu_change_dev_def_domain(struct device *dev,
> > +				       struct iommu_group *group,
> > +				       struct iommu_domain *prev_dom, int type)
> {
> > +	int ret = 0;
> > +
> > +	/* Sets group->default_domain to the newly allocated domain */
> > +	ret = iommu_group_alloc_default_domain(dev->bus, group, type);
> > +	if (ret)
> > +		goto out;
> > +
> > +	ret = __iommu_attach_device(group->default_domain, dev);
> > +	if (ret)
> > +		goto free_new_domain;
> > +
> > +	group->domain = group->default_domain;
> > +
> > +	ret = iommu_create_device_direct_mappings(group, dev);
> > +	if (ret)
> > +		goto free_new_domain;
> > +
> > +	/*
> > +	 * Release the mutex here because ops->probe_finalize() call-back of
> > +	 * some vendor IOMMU drivers calls arm_iommu_attach_device() which
> > +	 * in-turn might call back into IOMMU core code, where it tries to take
> > +	 * group->mutex, resulting in a deadlock.
> > +	 */
> > +	mutex_unlock(&group->mutex);
> > +
> > +	/* Make sure dma_ops is appropriatley set */
> > +	iommu_group_do_probe_finalize(dev, group->default_domain);
> > +	iommu_domain_free(prev_dom);
> > +	return 0;
> > +
> > +free_new_domain:
> > +	iommu_domain_free(group->default_domain);
> > +	group->default_domain = prev_dom;
> > +	group->domain = prev_dom;
> > +
> > +out:
> > +	mutex_unlock(&group->mutex);
> > +
> > +	return ret;
> > +}
> > +
> > +static ssize_t iommu_group_store_type(struct iommu_group *group,
> > +				      const char *buf, size_t count) {
> > +	int ret, req_type = 0, req_auto = 0, dev_def_dom;
> > +	struct iommu_domain *prev_dom;
> > +	struct group_device *grp_dev;
> > +	const struct iommu_ops *ops;
> > +	struct device *dev;
> > +
> > +	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
> > +		return -EACCES;
> > +
> > +	if (WARN_ON(!group))
> > +		return -EINVAL;
> > +
> > +	if (sysfs_streq(buf, "identity"))
> > +		req_type = IOMMU_DOMAIN_IDENTITY;
> > +	else if (sysfs_streq(buf, "DMA"))
> > +		req_type = IOMMU_DOMAIN_DMA;
> > +	else if (sysfs_streq(buf, "auto"))
> > +		req_auto = 1;
> 
> How about let req_type = 0 for auto case. This keeps it consistent with the
> return value of iommu_ops->def_domain_type(). With
> 
> if (!req_type)
> 	req_type = ops->def_domain_type(dev);
> 
> you don't need req_auto and dev_def_dom any more.

Please see below where you had comments for changing req_type.

> > +	else
> > +		return -EINVAL;
> > +
> > +	/* Lock the group to make sure that the device count doesn't change */
> > +	mutex_lock(&group->mutex);
> > +	if (iommu_group_device_count(group) != 1) {
> > +		mutex_unlock(&group->mutex);
> > +		pr_err("Cannot change default domain of a group with two or
> more devices\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	/* Since group has only one device */
> > +	list_for_each_entry(grp_dev, &group->devices, list)
> > +		dev = grp_dev->dev;
> > +
> > +	/*
> > +	 * Don't hold onto the group mutex because taking group mutex first
> and
> > +	 * then the device lock could potentially cause a deadlock as below.
> > +	 * Assume two threads T1 and T2. T1 is trying to change default domain
> > +	 * of an iommu group and T2 is trying to hot unplug a device or release
> > +	 * [1] VF of a PCIe device which is in the same iommu group. T1 takes
> > +	 * group mutex and before it could take device lock assume T2 has taken
> > +	 * device lock and is yet to take group mutex. Now, both the threads
> > +	 * will be waiting for the other thread to release lock.
> > +	 *
> > +	 * [1] Typical device release path
> > +	 * device_lock() from device/driver core code
> > +	 *  -> bus_notifier()
> > +	 *   -> iommu_bus_notifier()
> > +	 *    -> iommu_release_device()
> > +	 *     -> ops->release_device() vendor driver calls back iommu core code
> > +	 *      -> mutex_lock() from iommu core code
> > +	 */
> > +	mutex_unlock(&group->mutex);
> 
> Since you only support single device group, how about
> 
> -> lock_device
>      -> lock group
>          -> change default domain
>      -> unlock group
> -> unlock device

Sure! I did the same thing below i.e. first take device_lock() and then take group mutex.
I acquired/released group mutex above to
1.  Verify that iommu group has only one device
2. Get "struct dev *" of that device.
Just wondering if you would like me to change anything here?

> > +
> > +	/* Check if the device in the group still has a driver bound to it */
> > +	device_lock(dev);
> > +	if (device_is_bound(dev)) {
> > +		pr_err("Device is still bound to driver\n");
> > +		ret = -EBUSY;
> > +		goto dev_unlock;
> > +	}
> > +
> > +	/*
> > +	 * iommu group wasn't locked while acquiring device lock. So, make
> sure
> > +	 * that the device count hasn't changed while acquiring device lock.
> > +	 *
> > +	 * Changing default domain of an iommu group with two or more
> devices
> > +	 * isn't supported because there could be a potential deadlock. Consider
> > +	 * the following scenario. T1 is trying to acquire device locks of all
> > +	 * the devices in the group and before it could acquire all of them,
> > +	 * there could be another thread T2 (from different sub-system and use
> > +	 * case) that has already acquired some of the device locks and might be
> > +	 * waiting for T1 to release other device locks.
> > +	 */
> > +	mutex_lock(&group->mutex);
> > +	if (iommu_group_device_count(group) != 1) {
> > +		pr_err("Cannot change default domain of a group with two or
> more devices\n");
> > +		ret = -EINVAL;
> > +		goto group_unlock;
> > +	}
> > +
> > +	/*
> > +	 * Check if any user level driver (that doesn't use kernel driver like
> > +	 * VFIO) is directly using the group. VFIO use case is detected by
> > +	 * device_is_bound() check above
> > +	 */
> > +	if (group->default_domain != group->domain) {
> > +		pr_err("Group assigned to user level for direct access\n");
> > +		ret = -EINVAL;
> 
> -EBUSY?

Ok.. I will change it.

> > +		goto group_unlock;
> > +	}
> > +
> > +	prev_dom = group->default_domain;
> > +	if (!prev_dom || !prev_dom->ops || !prev_dom->ops-
> >def_domain_type) {
> > +		pr_err("'def_domain_type' call back isn't registered\n");
> > +		ret = -EINVAL;
> > +		goto group_unlock;
> > +	}
> > +
> 
> --- start ---
> > +	ops = prev_dom->ops;
> > +	dev_def_dom = ops->def_domain_type(dev);
> > +
> > +	/* Check if user requested domain is supported by the device or not */
> > +	if (!req_auto && dev_def_dom && req_type != dev_def_dom) {
> > +		pr_err("Device cannot be in %s domain\n", buf);
> > +		ret = -EINVAL;
> > +		goto group_unlock;
> > +	}
> > +
> > +	if (req_auto) {
> > +		/*
> > +		 * If user requested "auto" and the device supports both the
> > +		 * domains, then default to the domain the device was booted
> > +		 * with
> > +		 */
> > +		if (!dev_def_dom)
> > +			req_type = iommu_def_domain_type;
> > +		else
> > +			req_type = dev_def_dom;
> > +	}
> --- end ---
> 
> Isn't this could could be simply replaced with:
> 
> if (!req_type)
> 	req_type = ops->default_domain_type(dev) ? :
> iommu_def_domain_type;

I think, replacing this has two issues
1. We might miss validating user request i.e. we need to make sure that when user requests for a specific type of domain (E.g: DMA), we need to validate that the device could actually be in that domain. This is done above by

/* Check if user requested domain is supported by the device or not */
if (!req_auto && dev_def_dom && req_type != dev_def_dom) {
	pr_err("Device cannot be in %s domain\n", buf);
	ret = -EINVAL;
	goto group_unlock;
}

2. req_type could end up having 0 if "ops->default_domain_type(dev)" returns 0.
But, after this point, req_type should _only_ be either DMA type or identity type and cannot be 0 because req_type is passed to iommu_ops->domain_alloc() to allocate domain of requested type and it cannot be 0.

Regards,
Sai
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-05-29  8:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-28 19:23 [PATCH V3 0/3] iommu: Add support to change default domain of an iommu Sai Praneeth Prakhya
2020-05-28 19:23 ` [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group Sai Praneeth Prakhya
2020-05-29  2:43   ` Lu Baolu
2020-05-29  8:15     ` Prakhya, Sai Praneeth
2020-05-28 19:23 ` [PATCH V3 2/3] iommu: Take lock before reading iommu_group default domain type Sai Praneeth Prakhya
2020-05-28 19:24 ` [PATCH V3 3/3] iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file Sai Praneeth Prakhya
2020-05-29  1:51 ` [PATCH V3 0/3] iommu: Add support to change default domain of an iommu Lu Baolu
2020-05-29  4:46   ` Prakhya, Sai Praneeth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).