* [PATCH RESEND v5 0/5] iommu: Allow IOVA rcache range be configured
@ 2022-04-04 11:27 ` John Garry via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2022-04-04 11:27 UTC (permalink / raw)
  To: joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	thunder.leizhen, jean-philippe, linuxarm, John Garry

For streaming DMA mappings involving an IOMMU, performance can be reduced
when the IOVA length regularly exceeds the IOVA rcache upper limit
(meaning that those IOVAs are never cached).

This may be much more pronounced from commit 4e89dce72521 ("iommu/iova:
Retry from last rb tree node if iova search fails"), as discussed at [0].

IOVAs which cannot be cached are highly involved in the IOVA ageing issue,
as discussed at [1].

This series allows the IOVA rcache range to be configured, so that we may
cache all IOVAs per domain, thus improving performance.

A new IOMMU group sysfs file is added - max_opt_dma_size - which is used
indirectly to configure the IOVA rcache range:
/sys/kernel/iommu_groups/X/max_opt_dma_size

This file is updated in the same way as the IOMMU group default domain
type, i.e. the only device in the group must first be unbound from its driver.

The inspiration here comes from the block layer request queue sysfs
"optimal_io_size" file, in /sys/block/sdX/queue/optimal_io_size

Some old figures* for a storage scenario (when increasing the IOVA rcache
range to cover all DMA mapping sizes from the LLD):
v5.13-rc1 baseline:			1200K IOPS
With series:				1800K IOPS

All above are for IOMMU strict mode. Non-strict mode gives ~1800K IOPS in
all scenarios.

Based on v5.18-rc1
* I lost my high-throughput test setup

Differences to v4:
https://lore.kernel.org/linux-iommu/1626259003-201303-1-git-send-email-john.garry@huawei.com/
- Major rebase
- Change the "Refactor iommu_group_store_type()" patch to use an op type
  enum instead of a callback
  - I didn't pick up Will's Ack as it has changed so much
- Use a domain feature flag to keep the same default group type
- Add wrapper for default IOVA rcache range
- Combine last 2x patches

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
[1] https://lore.kernel.org/linux-iommu/1607538189-237944-1-git-send-email-john.garry@huawei.com/

John Garry (5):
  iommu: Refactor iommu_group_store_type()
  iova: Allow rcache range upper limit to be flexible
  iommu: Allow iommu_change_dev_def_domain() realloc same default domain
    type
  iommu: Allow max opt DMA len be set for a group via sysfs
  iova: Add iova_len argument to iova_domain_init_rcaches()

 .../ABI/testing/sysfs-kernel-iommu_groups     |  16 ++
 drivers/iommu/dma-iommu.c                     |  15 +-
 drivers/iommu/iommu.c                         | 202 +++++++++++++-----
 drivers/iommu/iova.c                          |  37 ++--
 drivers/vdpa/vdpa_user/iova_domain.c          |   4 +-
 include/linux/iommu.h                         |   7 +
 include/linux/iova.h                          |   6 +-
 7 files changed, 212 insertions(+), 75 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 38+ messages in thread


* [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()
  2022-04-04 11:27 ` John Garry via iommu
@ 2022-04-04 11:27   ` John Garry via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2022-04-04 11:27 UTC (permalink / raw)
  To: joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	thunder.leizhen, jean-philippe, linuxarm, John Garry

Function iommu_group_store_type() supports changing the default domain
of an IOMMU group.

Many conditions need to be satisfied and steps taken for this action to be
successful.

Satisfying these conditions and performing these steps will also be required
when setting other IOMMU group attributes, so factor the code into a common
part and a part specific to updating the given IOMMU group attribute.

No functional change intended.

Some code comments are also tidied up.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/iommu/iommu.c | 96 ++++++++++++++++++++++++++++---------------
 1 file changed, 62 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2c45b85b9fc..0dd766030baf 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3000,21 +3000,57 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 	return ret;
 }
 
+enum iommu_group_op {
+	CHANGE_GROUP_TYPE,
+};
+
+static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
+				    struct device *dev)
+{
+	int type;
+
+	if (sysfs_streq(buf, "identity"))
+		type = IOMMU_DOMAIN_IDENTITY;
+	else if (sysfs_streq(buf, "DMA"))
+		type = IOMMU_DOMAIN_DMA;
+	else if (sysfs_streq(buf, "DMA-FQ"))
+		type = IOMMU_DOMAIN_DMA_FQ;
+	else if (sysfs_streq(buf, "auto"))
+		type = 0;
+	else
+		return -EINVAL;
+
+	/*
+	 * Check if the only device in the group still has a driver bound or
+	 * we're transitioning from DMA -> DMA-FQ
+	 */
+	if (device_is_bound(dev) && !(type == IOMMU_DOMAIN_DMA_FQ &&
+	    group->default_domain->type == IOMMU_DOMAIN_DMA)) {
+		pr_err_ratelimited("Device is still bound to driver\n");
+		return -EINVAL;
+	}
+
+	return iommu_change_dev_def_domain(group, dev, type);
+}
+
 /*
  * Changing the default domain through sysfs requires the users to unbind the
  * drivers from the devices in the iommu group, except for a DMA -> DMA-FQ
- * transition. Return failure if this isn't met.
+ * transition. Changing any other IOMMU group attribute still requires the
+ * user to unbind the drivers from the devices in the iommu group. Return
+ * failure if these conditions are not met.
  *
  * We need to consider the race between this and the device release path.
  * device_lock(dev) is used here to guarantee that the device release path
  * will not be entered at the same time.
  */
-static ssize_t iommu_group_store_type(struct iommu_group *group,
-				      const char *buf, size_t count)
+static ssize_t iommu_group_store_common(struct iommu_group *group,
+					enum iommu_group_op op,
+					const char *buf, size_t count)
 {
 	struct group_device *grp_dev;
 	struct device *dev;
-	int ret, req_type;
+	int ret;
 
 	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
 		return -EACCES;
@@ -3022,27 +3058,16 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 	if (WARN_ON(!group))
 		return -EINVAL;
 
-	if (sysfs_streq(buf, "identity"))
-		req_type = IOMMU_DOMAIN_IDENTITY;
-	else if (sysfs_streq(buf, "DMA"))
-		req_type = IOMMU_DOMAIN_DMA;
-	else if (sysfs_streq(buf, "DMA-FQ"))
-		req_type = IOMMU_DOMAIN_DMA_FQ;
-	else if (sysfs_streq(buf, "auto"))
-		req_type = 0;
-	else
-		return -EINVAL;
-
 	/*
 	 * Lock/Unlock the group mutex here before device lock to
-	 * 1. Make sure that the iommu group has only one device (this is a
+	 * 1. Make sure that the IOMMU group has only one device (this is a
 	 *    prerequisite for step 2)
 	 * 2. Get struct *dev which is needed to lock device
 	 */
 	mutex_lock(&group->mutex);
 	if (iommu_group_device_count(group) != 1) {
 		mutex_unlock(&group->mutex);
-		pr_err_ratelimited("Cannot change default domain: Group has more than one device\n");
+		pr_err_ratelimited("Cannot change IOMMU group default domain attribute: Group has more than one device\n");
 		return -EINVAL;
 	}
 
@@ -3054,16 +3079,16 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 	/*
 	 * Don't hold the group mutex because taking group mutex first and then
 	 * the device lock could potentially cause a deadlock as below. Assume
-	 * two threads T1 and T2. T1 is trying to change default domain of an
-	 * iommu group and T2 is trying to hot unplug a device or release [1] VF
-	 * of a PCIe device which is in the same iommu group. T1 takes group
-	 * mutex and before it could take device lock assume T2 has taken device
-	 * lock and is yet to take group mutex. Now, both the threads will be
-	 * waiting for the other thread to release lock. Below, lock order was
-	 * suggested.
+	 * two threads, T1 and T2. T1 is trying to change default domain
+	 * attribute of an IOMMU group and T2 is trying to hot unplug a device
+	 * or release [1] VF of a PCIe device which is in the same IOMMU group.
+	 * T1 takes the group mutex and before it could take device lock T2 may
+	 * have taken device lock and is yet to take group mutex. Now, both the
+	 * threads will be waiting for the other thread to release lock. Below,
+	 * lock order was suggested.
 	 * device_lock(dev);
 	 *	mutex_lock(&group->mutex);
-	 *		iommu_change_dev_def_domain();
+	 *		cb->iommu_change_dev_def_domain(); [example cb]
 	 *	mutex_unlock(&group->mutex);
 	 * device_unlock(dev);
 	 *
@@ -3077,21 +3102,24 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 	 */
 	mutex_unlock(&group->mutex);
 
-	/* Check if the device in the group still has a driver bound to it */
 	device_lock(dev);
-	if (device_is_bound(dev) && !(req_type == IOMMU_DOMAIN_DMA_FQ &&
-	    group->default_domain->type == IOMMU_DOMAIN_DMA)) {
-		pr_err_ratelimited("Device is still bound to driver\n");
-		ret = -EBUSY;
-		goto out;
+	switch (op) {
+	case CHANGE_GROUP_TYPE:
+		ret = __iommu_group_store_type(buf, group, dev);
+		break;
+	default:
+		ret = -EINVAL;
 	}
-
-	ret = iommu_change_dev_def_domain(group, dev, req_type);
 	ret = ret ?: count;
 
-out:
 	device_unlock(dev);
 	put_device(dev);
 
 	return ret;
 }
+
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+				      const char *buf, size_t count)
+{
+	return iommu_group_store_common(group, CHANGE_GROUP_TYPE, buf, count);
+}
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread


* [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible
  2022-04-04 11:27 ` John Garry via iommu
@ 2022-04-04 11:27   ` John Garry via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2022-04-04 11:27 UTC (permalink / raw)
  To: joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	thunder.leizhen, jean-philippe, linuxarm, John Garry

Some low-level drivers may request DMA mappings whose IOVA length exceeds
that of the current rcache upper limit.

This means that allocations for those IOVAs will never be cached, and must
always be allocated and freed from the RB tree for every DMA mapping cycle.
This has a significant effect on performance, more so since commit
4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
fails"), as discussed at [0].

As a first step towards allowing the rcache range upper limit to be
configured, hold this value in the IOVA rcache structure, and allocate
the rcaches separately.

Delete the IOVA_RANGE_CACHE_MAX_SIZE macro so that it cannot be reused by
mistake.

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/iommu/iova.c | 20 ++++++++++----------
 include/linux/iova.h |  3 +++
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..5c22b9187b79 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -15,8 +15,6 @@
 /* The anchor node sits above the top of the usable address space */
 #define IOVA_ANCHOR	~0UL
 
-#define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
-
 static bool iova_rcache_insert(struct iova_domain *iovad,
 			       unsigned long pfn,
 			       unsigned long size);
@@ -443,7 +441,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 	 * rounding up anything cacheable to make sure that can't happen. The
 	 * order of the unadjusted size will still match upon freeing.
 	 */
-	if (size < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
+	if (size < (1 << (iovad->rcache_max_size - 1)))
 		size = roundup_pow_of_two(size);
 
 	iova_pfn = iova_rcache_get(iovad, size, limit_pfn + 1);
@@ -713,13 +711,15 @@ int iova_domain_init_rcaches(struct iova_domain *iovad)
 	unsigned int cpu;
 	int i, ret;
 
-	iovad->rcaches = kcalloc(IOVA_RANGE_CACHE_MAX_SIZE,
+	iovad->rcache_max_size = 6; /* Arbitrarily high default */
+
+	iovad->rcaches = kcalloc(iovad->rcache_max_size,
 				 sizeof(struct iova_rcache),
 				 GFP_KERNEL);
 	if (!iovad->rcaches)
 		return -ENOMEM;
 
-	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+	for (i = 0; i < iovad->rcache_max_size; ++i) {
 		struct iova_cpu_rcache *cpu_rcache;
 		struct iova_rcache *rcache;
 
@@ -816,7 +816,7 @@ static bool iova_rcache_insert(struct iova_domain *iovad, unsigned long pfn,
 {
 	unsigned int log_size = order_base_2(size);
 
-	if (log_size >= IOVA_RANGE_CACHE_MAX_SIZE)
+	if (log_size >= iovad->rcache_max_size)
 		return false;
 
 	return __iova_rcache_insert(iovad, &iovad->rcaches[log_size], pfn);
@@ -872,7 +872,7 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
 {
 	unsigned int log_size = order_base_2(size);
 
-	if (log_size >= IOVA_RANGE_CACHE_MAX_SIZE || !iovad->rcaches)
+	if (log_size >= iovad->rcache_max_size || !iovad->rcaches)
 		return 0;
 
 	return __iova_rcache_get(&iovad->rcaches[log_size], limit_pfn - size);
@@ -888,7 +888,7 @@ static void free_iova_rcaches(struct iova_domain *iovad)
 	unsigned int cpu;
 	int i, j;
 
-	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+	for (i = 0; i < iovad->rcache_max_size; ++i) {
 		rcache = &iovad->rcaches[i];
 		if (!rcache->cpu_rcaches)
 			break;
@@ -916,7 +916,7 @@ static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
 	unsigned long flags;
 	int i;
 
-	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+	for (i = 0; i < iovad->rcache_max_size; ++i) {
 		rcache = &iovad->rcaches[i];
 		cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
 		spin_lock_irqsave(&cpu_rcache->lock, flags);
@@ -935,7 +935,7 @@ static void free_global_cached_iovas(struct iova_domain *iovad)
 	unsigned long flags;
 	int i, j;
 
-	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+	for (i = 0; i < iovad->rcache_max_size; ++i) {
 		rcache = &iovad->rcaches[i];
 		spin_lock_irqsave(&rcache->lock, flags);
 		for (j = 0; j < rcache->depot_size; ++j) {
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 320a70e40233..02f7222fa85a 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -38,6 +38,9 @@ struct iova_domain {
 
 	struct iova_rcache	*rcaches;
 	struct hlist_node	cpuhp_dead;
+
+	/* log of max cached IOVA range size (in pages) */
+	unsigned long	rcache_max_size;
 };
 
 static inline unsigned long iova_size(struct iova *iova)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread


* [PATCH RESEND v5 3/5] iommu: Allow iommu_change_dev_def_domain() realloc same default domain type
  2022-04-04 11:27 ` John Garry via iommu
@ 2022-04-04 11:27   ` John Garry via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2022-04-04 11:27 UTC (permalink / raw)
  To: joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	thunder.leizhen, jean-philippe, linuxarm, John Garry

Allow iommu_change_dev_def_domain() to create a new default domain, keeping
the same type as the current default domain.

Also remove the comment about the function's purpose, which would otherwise
become stale.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/iommu/iommu.c | 49 ++++++++++++++++++++++---------------------
 include/linux/iommu.h |  1 +
 2 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0dd766030baf..10bb10c2a210 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2863,6 +2863,7 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle)
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
 
+
 /*
  * Changes the default domain of an iommu group that has *only* one device
  *
@@ -2873,10 +2874,6 @@ EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
  *
  * Returns 0 on success and error code on failure
  *
- * Note:
- * 1. Presently, this function is called only when user requests to change the
- *    group's default domain type through /sys/kernel/iommu_groups/<grp_id>/type
- *    Please take a closer look if intended to use for other purposes.
  */
 static int iommu_change_dev_def_domain(struct iommu_group *group,
 				       struct device *prev_dev, int type)
@@ -2929,28 +2926,32 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 		goto out;
 	}
 
-	dev_def_dom = iommu_get_def_domain_type(dev);
-	if (!type) {
+	if (type == __IOMMU_DOMAIN_SAME) {
+		type = prev_dom->type;
+	} else {
+		dev_def_dom = iommu_get_def_domain_type(dev);
+		if (!type) {
+			/*
+			 * If the user hasn't requested any specific type of domain and
+			 * if the device supports both the domains, then default to the
+			 * domain the device was booted with
+			 */
+			type = dev_def_dom ? : iommu_def_domain_type;
+		} else if (dev_def_dom && type != dev_def_dom) {
+			dev_err_ratelimited(prev_dev, "Device cannot be in %s domain\n",
+					    iommu_domain_type_str(type));
+			ret = -EINVAL;
+			goto out;
+		}
+
 		/*
-		 * If the user hasn't requested any specific type of domain and
-		 * if the device supports both the domains, then default to the
-		 * domain the device was booted with
+		 * Switch to a new domain only if the requested domain type is different
+		 * from the existing default domain type
 		 */
-		type = dev_def_dom ? : iommu_def_domain_type;
-	} else if (dev_def_dom && type != dev_def_dom) {
-		dev_err_ratelimited(prev_dev, "Device cannot be in %s domain\n",
-				    iommu_domain_type_str(type));
-		ret = -EINVAL;
-		goto out;
-	}
-
-	/*
-	 * Switch to a new domain only if the requested domain type is different
-	 * from the existing default domain type
-	 */
-	if (prev_dom->type == type) {
-		ret = 0;
-		goto out;
+		if (prev_dom->type == type) {
+			ret = 0;
+			goto out;
+		}
 	}
 
 	/* We can bring up a flush queue without tearing down the domain */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9208eca4b0d1..b141cf71c7af 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -63,6 +63,7 @@ struct iommu_domain_geometry {
 					      implementation              */
 #define __IOMMU_DOMAIN_PT	(1U << 2)  /* Domain is identity mapped   */
 #define __IOMMU_DOMAIN_DMA_FQ	(1U << 3)  /* DMA-API uses flush queue    */
+#define __IOMMU_DOMAIN_SAME	(1U << 4)  /* Keep same type (internal)   */
 
 /*
  * This are the possible domain-types
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread
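The reworked type selection in iommu_change_dev_def_domain() can be summarised as a small stand-alone C sketch. This is a user-space model: the domain-type constants and the pick_type() helper are illustrative stand-ins, not the kernel's definitions; only DOMAIN_SAME's value mirrors __IOMMU_DOMAIN_SAME (1U << 4) from the patch.

```c
/* User-space model of the type selection in the reworked
 * iommu_change_dev_def_domain(). The constants are hypothetical
 * stand-ins for the kernel's domain types, except DOMAIN_SAME,
 * which mirrors __IOMMU_DOMAIN_SAME (1U << 4) from the patch. */
enum {
	DOMAIN_DMA      = 1,		/* stand-in for a DMA-API domain type */
	DOMAIN_IDENTITY = 2,		/* stand-in for an identity domain type */
	DOMAIN_SAME     = 1U << 4,	/* keep the current default domain type */
};

#define GLOBAL_DEF_TYPE DOMAIN_DMA	/* iommu_def_domain_type stand-in */

/*
 * Returns the domain type that would be (re)allocated, or -1 for the
 * dev_err_ratelimited() + -EINVAL path in the patch:
 * - DOMAIN_SAME keeps the previous default domain's type;
 * - 0 falls back to the device's preferred type, else the global default;
 * - a request conflicting with the device's preferred type fails.
 */
static int pick_type(int requested, int prev_type, int dev_def_type)
{
	if (requested == DOMAIN_SAME)
		return prev_type;
	if (!requested)
		return dev_def_type ? dev_def_type : GLOBAL_DEF_TYPE;
	if (dev_def_type && requested != dev_def_type)
		return -1;
	return requested;
}
```

Note that in the DOMAIN_SAME case the device's preferred type is never consulted, matching the patch, which moves the iommu_get_def_domain_type() lookup inside the else branch.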


* [PATCH RESEND v5 4/5] iommu: Allow max opt DMA len be set for a group via sysfs
  2022-04-04 11:27 ` John Garry via iommu
@ 2022-04-04 11:27   ` John Garry via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2022-04-04 11:27 UTC (permalink / raw)
  To: joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	thunder.leizhen, jean-philippe, linuxarm, John Garry

Add support to allow the maximum optimised DMA length to be set for an
IOMMU group via sysfs.

This follows much the same method as changing the default domain type for
a group.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 .../ABI/testing/sysfs-kernel-iommu_groups     | 16 +++++
 drivers/iommu/iommu.c                         | 59 ++++++++++++++++++-
 include/linux/iommu.h                         |  6 ++
 3 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index b15af6a5bc08..ed6f72794f6c 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -63,3 +63,19 @@ Description:	/sys/kernel/iommu_groups/<grp_id>/type shows the type of default
 		system could lead to catastrophic effects (the users might
 		need to reboot the machine to get it to normal state). So, it's
 		expected that the users understand what they're doing.
+
+What:		/sys/kernel/iommu_groups/<grp_id>/max_opt_dma_size
+Date:		Feb 2022
+KernelVersion:	v5.18
+Contact:	iommu@lists.linux-foundation.org
+Description:	/sys/kernel/iommu_groups/<grp_id>/max_opt_dma_size shows the
+		max optimised DMA size for the default IOMMU domain associated
+		with the group.
+		Each IOMMU domain has an IOVA domain. The IOVA domain caches
+		IOVAs up to a certain size as a performance optimisation.
+		This sysfs file allows the range of the IOVA domain caching to
+		be set, such that larger than default IOVAs may be cached.
+		A value of 0 means that the default caching range is chosen.
+		A privileged user may request that the kernel change the range
+		by writing to this file. For this to happen, the same rules
+		and procedure apply as for changing the default domain type.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 10bb10c2a210..7c7258f19bed 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -48,6 +48,7 @@ struct iommu_group {
 	struct iommu_domain *default_domain;
 	struct iommu_domain *domain;
 	struct list_head entry;
+	size_t max_opt_dma_size;
 };
 
 struct group_device {
@@ -89,6 +90,9 @@ static int iommu_create_device_direct_mappings(struct iommu_group *group,
 static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
 static ssize_t iommu_group_store_type(struct iommu_group *group,
 				      const char *buf, size_t count);
+static ssize_t iommu_group_store_max_opt_dma_size(struct iommu_group *group,
+						  const char *buf,
+						  size_t count);
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
 struct iommu_group_attribute iommu_group_attr_##_name =		\
@@ -571,6 +575,12 @@ static ssize_t iommu_group_show_type(struct iommu_group *group,
 	return strlen(type);
 }
 
+static ssize_t iommu_group_show_max_opt_dma_size(struct iommu_group *group,
+				     char *buf)
+{
+	return sprintf(buf, "%zu\n", group->max_opt_dma_size);
+}
+
 static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
 
 static IOMMU_GROUP_ATTR(reserved_regions, 0444,
@@ -579,6 +589,9 @@ static IOMMU_GROUP_ATTR(reserved_regions, 0444,
 static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
 			iommu_group_store_type);
 
+static IOMMU_GROUP_ATTR(max_opt_dma_size, 0644, iommu_group_show_max_opt_dma_size,
+			iommu_group_store_max_opt_dma_size);
+
 static void iommu_group_release(struct kobject *kobj)
 {
 	struct iommu_group *group = to_iommu_group(kobj);
@@ -665,6 +678,10 @@ struct iommu_group *iommu_group_alloc(void)
 	if (ret)
 		return ERR_PTR(ret);
 
+	ret = iommu_group_create_file(group, &iommu_group_attr_max_opt_dma_size);
+	if (ret)
+		return ERR_PTR(ret);
+
 	pr_debug("Allocated group %d\n", group->id);
 
 	return group;
@@ -2087,6 +2104,11 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
 	return dev->iommu_group->default_domain;
 }
 
+size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group)
+{
+	return group->max_opt_dma_size;
+}
+
 /*
  * IOMMU groups are really the natural working unit of the IOMMU, but
  * the IOMMU API works on domains and devices.  Bridge that gap by
@@ -2871,12 +2893,14 @@ EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
  * @prev_dev: The device in the group (this is used to make sure that the device
  *	 hasn't changed after the caller has called this function)
  * @type: The type of the new default domain that gets associated with the group
+ * @max_opt_dma_size: Set the IOMMU group max_opt_dma_size if non-zero
  *
  * Returns 0 on success and error code on failure
  *
  */
 static int iommu_change_dev_def_domain(struct iommu_group *group,
-				       struct device *prev_dev, int type)
+				       struct device *prev_dev, int type,
+				       unsigned long max_opt_dma_size)
 {
 	struct iommu_domain *prev_dom;
 	struct group_device *grp_dev;
@@ -2977,6 +3001,9 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 
 	group->domain = group->default_domain;
 
+	if (max_opt_dma_size)
+		group->max_opt_dma_size = max_opt_dma_size;
+
 	/*
 	 * Release the mutex here because ops->probe_finalize() call-back of
 	 * some vendor IOMMU drivers calls arm_iommu_attach_device() which
@@ -3003,6 +3030,7 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
 
 enum iommu_group_op {
 	CHANGE_GROUP_TYPE,
+	CHANGE_DMA_OPT_SIZE,
 };
 
 static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
@@ -3031,7 +3059,24 @@ static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
 		return -EINVAL;
 	}
 
-	return iommu_change_dev_def_domain(group, dev, type);
+	return iommu_change_dev_def_domain(group, dev, type, 0);
+}
+
+static int __iommu_group_store_max_opt_dma_size(const char *buf,
+						struct iommu_group *group,
+						struct device *dev)
+{
+	unsigned long val;
+
+	if (kstrtoul(buf, 0, &val) || !val)
+		return -EINVAL;
+
+	if (device_is_bound(dev)) {
+		pr_err_ratelimited("Device is still bound to driver\n");
+		return -EINVAL;
+	}
+
+	return iommu_change_dev_def_domain(group, dev, __IOMMU_DOMAIN_SAME, val);
 }
 
 /*
@@ -3108,6 +3153,9 @@ static ssize_t iommu_group_store_common(struct iommu_group *group,
 	case CHANGE_GROUP_TYPE:
 		ret = __iommu_group_store_type(buf, group, dev);
 		break;
+	case CHANGE_DMA_OPT_SIZE:
+		ret = __iommu_group_store_max_opt_dma_size(buf, group, dev);
+		break;
 	default:
 		ret = -EINVAL;
 	}
@@ -3124,3 +3172,10 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 {
 	return iommu_group_store_common(group, CHANGE_GROUP_TYPE, buf, count);
 }
+
+static ssize_t iommu_group_store_max_opt_dma_size(struct iommu_group *group,
+						  const char *buf,
+						  size_t count)
+{
+	return iommu_group_store_common(group, CHANGE_DMA_OPT_SIZE, buf, count);
+}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b141cf71c7af..6915e68c40b7 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -430,6 +430,7 @@ extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
 				   struct device *dev, ioasid_t pasid);
 extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
 extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
+extern size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group);
 extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
 		     phys_addr_t paddr, size_t size, int prot);
 extern int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
@@ -725,6 +726,11 @@ static inline struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
 	return NULL;
 }
 
+static inline size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group)
+{
+	return 0;
+}
+
 static inline int iommu_map(struct iommu_domain *domain, unsigned long iova,
 			    phys_addr_t paddr, size_t size, int prot)
 {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread
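The sysfs store path added by this patch can be modelled in user space. The sketch below mirrors the validation in __iommu_group_store_max_opt_dma_size() (the kernel's `kstrtoul(buf, 0, &val) || !val` check); struct group_model is a hypothetical stand-in for the relevant iommu_group field, and the unbound-device check and domain rebuild are deliberately omitted.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the max_opt_dma_size field added to
 * struct iommu_group; everything else about the group is omitted. */
struct group_model {
	size_t max_opt_dma_size;
};

/* User-space model of __iommu_group_store_max_opt_dma_size(): the
 * written string must parse (base 0, so 0x-prefixed values work) to a
 * non-zero value, mirroring kstrtoul(buf, 0, &val) || !val. The real
 * code additionally requires the device to be unbound from its driver
 * and applies the value during the default-domain rebuild. */
static int store_max_opt_dma_size(struct group_model *g, const char *buf)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(buf, &end, 0);
	if (errno || end == buf || (*end && *end != '\n') || !val)
		return -EINVAL;

	g->max_opt_dma_size = val;
	return 0;
}
```

So writing 0 to the file is rejected with -EINVAL; the default range is expressed by never writing the file at all, since the field starts at 0 and iommu_change_dev_def_domain() only overwrites it when passed a non-zero size.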


* [PATCH RESEND v5 5/5] iova: Add iova_len argument to iova_domain_init_rcaches()
  2022-04-04 11:27 ` John Garry via iommu
@ 2022-04-04 11:27   ` John Garry via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2022-04-04 11:27 UTC (permalink / raw)
  To: joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	thunder.leizhen, jean-philippe, linuxarm, John Garry

Add a max opt iova_len argument to iova_domain_init_rcaches(), and use it
to set the rcache range.

Also fix up all users to pass this value (as 0, meaning use the default),
including a wrapper for that case, iova_domain_init_rcaches_default().

For dma-iommu.c, derive the iova_len argument from the IOMMU group's max
opt DMA size.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/iommu/dma-iommu.c            | 15 ++++++++++++++-
 drivers/iommu/iova.c                 | 19 ++++++++++++++++---
 drivers/vdpa/vdpa_user/iova_domain.c |  4 ++--
 include/linux/iova.h                 |  3 ++-
 4 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 42ca42ff1b5d..19f35624611c 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -525,6 +525,8 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	unsigned long order, base_pfn;
 	struct iova_domain *iovad;
+	size_t max_opt_dma_size;
+	unsigned long iova_len = 0;
 	int ret;
 
 	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
@@ -560,7 +562,18 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 	}
 
 	init_iova_domain(iovad, 1UL << order, base_pfn);
-	ret = iova_domain_init_rcaches(iovad);
+
+	max_opt_dma_size = iommu_group_get_max_opt_dma_size(dev->iommu_group);
+	if (max_opt_dma_size) {
+		unsigned long shift = __ffs(1UL << order);
+
+		iova_len = roundup_pow_of_two(max_opt_dma_size);
+		iova_len >>= shift;
+		if (!iova_len)
+			iova_len = 1;
+	}
+
+	ret = iova_domain_init_rcaches(iovad, iova_len);
 	if (ret)
 		return ret;
 
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 5c22b9187b79..d65e79e132ee 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -706,12 +706,20 @@ static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
 	mag->pfns[mag->size++] = pfn;
 }
 
-int iova_domain_init_rcaches(struct iova_domain *iovad)
+static unsigned long iova_len_to_rcache_max(unsigned long iova_len)
+{
+	return order_base_2(iova_len) + 1;
+}
+
+int iova_domain_init_rcaches(struct iova_domain *iovad, unsigned long iova_len)
 {
 	unsigned int cpu;
 	int i, ret;
 
-	iovad->rcache_max_size = 6; /* Arbitrarily high default */
+	if (iova_len)
+		iovad->rcache_max_size = iova_len_to_rcache_max(iova_len);
+	else
+		iovad->rcache_max_size = 6; /* Arbitrarily high default */
 
 	iovad->rcaches = kcalloc(iovad->rcache_max_size,
 				 sizeof(struct iova_rcache),
@@ -755,7 +763,12 @@ int iova_domain_init_rcaches(struct iova_domain *iovad)
 	free_iova_rcaches(iovad);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(iova_domain_init_rcaches);
+
+int iova_domain_init_rcaches_default(struct iova_domain *iovad)
+{
+	return iova_domain_init_rcaches(iovad, 0);
+}
+EXPORT_SYMBOL_GPL(iova_domain_init_rcaches_default);
 
 /*
  * Try inserting IOVA range starting with 'iova_pfn' into 'rcache', and
diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
index 6daa3978d290..3a2acef98a4a 100644
--- a/drivers/vdpa/vdpa_user/iova_domain.c
+++ b/drivers/vdpa/vdpa_user/iova_domain.c
@@ -514,12 +514,12 @@ vduse_domain_create(unsigned long iova_limit, size_t bounce_size)
 	spin_lock_init(&domain->iotlb_lock);
 	init_iova_domain(&domain->stream_iovad,
 			PAGE_SIZE, IOVA_START_PFN);
-	ret = iova_domain_init_rcaches(&domain->stream_iovad);
+	ret = iova_domain_init_rcaches_default(&domain->stream_iovad);
 	if (ret)
 		goto err_iovad_stream;
 	init_iova_domain(&domain->consistent_iovad,
 			PAGE_SIZE, bounce_pfns);
-	ret = iova_domain_init_rcaches(&domain->consistent_iovad);
+	ret = iova_domain_init_rcaches_default(&domain->consistent_iovad);
 	if (ret)
 		goto err_iovad_consistent;
 
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 02f7222fa85a..56281434ce0c 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -95,7 +95,8 @@ struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
 	unsigned long pfn_hi);
 void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 	unsigned long start_pfn);
-int iova_domain_init_rcaches(struct iova_domain *iovad);
+int iova_domain_init_rcaches(struct iova_domain *iovad, unsigned long iova_len);
+int iova_domain_init_rcaches_default(struct iova_domain *iovad);
 struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
 void put_iova_domain(struct iova_domain *iovad);
 #else
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread
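The size-to-rcache arithmetic in this patch (the iova_len derivation in the dma-iommu.c hunk plus iova_len_to_rcache_max() in the iova.c hunk) can be modelled in user space as below. The roundup_pow_of_two() and order_base_2() helpers here are simplified stand-ins for the kernel macros of the same name.

```c
/* Simplified stand-in for the kernel's roundup_pow_of_two(). */
static unsigned long roundup_pow_of_two(unsigned long n)
{
	unsigned long v = 1;

	while (v < n)
		v <<= 1;
	return v;
}

/* Simplified stand-in for the kernel's order_base_2():
 * smallest order such that (1UL << order) >= n. */
static unsigned int order_base_2(unsigned long n)
{
	unsigned int order = 0;

	while ((1UL << order) < n)
		order++;
	return order;
}

/* Models the iommu_dma_init_domain() hunk: convert the group's max opt
 * DMA size in bytes into an IOVA length in granule units, where shift
 * is the IOVA granule order (e.g. 12 for a 4 KiB granule). */
static unsigned long max_opt_size_to_iova_len(unsigned long max_opt_dma_size,
					      unsigned int shift)
{
	unsigned long iova_len = roundup_pow_of_two(max_opt_dma_size) >> shift;

	return iova_len ? iova_len : 1;
}

/* Models iova_len_to_rcache_max(): number of rcaches needed so that
 * allocations up to iova_len granules hit the cache. */
static unsigned long iova_len_to_rcache_max(unsigned long iova_len)
{
	return order_base_2(iova_len) + 1;
}
```

With a 4 KiB granule (shift = 12) and max_opt_dma_size = 1 MiB, iova_len comes out at 256 granules and rcache_max_size becomes 9, versus the default of 6, which only caches allocations up to 32 granules (128 KiB at 4 KiB pages).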


* Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()
  2022-04-04 11:27   ` John Garry via iommu
@ 2022-04-07  7:40     ` Leizhen (ThunderTown) via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-07  7:40 UTC (permalink / raw)
  To: John Garry, joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	jean-philippe, linuxarm

Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>

On 2022/4/4 19:27, John Garry wrote:
> Function iommu_group_store_type() supports changing the default domain
> of an IOMMU group.
> 
> Many conditions need to be satisfied and steps taken for this action to be
> successful.
> 
> Satisfying these conditions and steps will be required for setting other
> IOMMU group attributes, so factor into a common part and a part specific
> to update the IOMMU group attribute.
> 
> No functional change intended.
> 
> Some code comments are tidied up also.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  drivers/iommu/iommu.c | 96 ++++++++++++++++++++++++++++---------------
>  1 file changed, 62 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index f2c45b85b9fc..0dd766030baf 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -3000,21 +3000,57 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
>  	return ret;
>  }
>  
> +enum iommu_group_op {
> +	CHANGE_GROUP_TYPE,
> +};
> +
> +static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
> +				    struct device *dev)
> +{
> +	int type;
> +
> +	if (sysfs_streq(buf, "identity"))
> +		type = IOMMU_DOMAIN_IDENTITY;
> +	else if (sysfs_streq(buf, "DMA"))
> +		type = IOMMU_DOMAIN_DMA;
> +	else if (sysfs_streq(buf, "DMA-FQ"))
> +		type = IOMMU_DOMAIN_DMA_FQ;
> +	else if (sysfs_streq(buf, "auto"))
> +		type = 0;
> +	else
> +		return -EINVAL;
> +
> +	/*
> +	 * Check if the only device in the group still has a driver bound or
> +	 * we're transitioning from DMA -> DMA-FQ
> +	 */
> +	if (device_is_bound(dev) && !(type == IOMMU_DOMAIN_DMA_FQ &&
> +	    group->default_domain->type == IOMMU_DOMAIN_DMA)) {
> +		pr_err_ratelimited("Device is still bound to driver\n");
> +		return -EINVAL;
> +	}
> +
> +	return iommu_change_dev_def_domain(group, dev, type);
> +}
> +
>  /*
>   * Changing the default domain through sysfs requires the users to unbind the
>   * drivers from the devices in the iommu group, except for a DMA -> DMA-FQ
> - * transition. Return failure if this isn't met.
> + * transition. Changing any other IOMMU group attribute still requires the
> + * user to unbind the drivers from the devices in the iommu group. Return
> + * failure if these conditions are not met.
>   *
>   * We need to consider the race between this and the device release path.
>   * device_lock(dev) is used here to guarantee that the device release path
>   * will not be entered at the same time.
>   */
> -static ssize_t iommu_group_store_type(struct iommu_group *group,
> -				      const char *buf, size_t count)
> +static ssize_t iommu_group_store_common(struct iommu_group *group,
> +					enum iommu_group_op op,
> +					const char *buf, size_t count)
>  {
>  	struct group_device *grp_dev;
>  	struct device *dev;
> -	int ret, req_type;
> +	int ret;
>  
>  	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
>  		return -EACCES;
> @@ -3022,27 +3058,16 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
>  	if (WARN_ON(!group))
>  		return -EINVAL;
>  
> -	if (sysfs_streq(buf, "identity"))
> -		req_type = IOMMU_DOMAIN_IDENTITY;
> -	else if (sysfs_streq(buf, "DMA"))
> -		req_type = IOMMU_DOMAIN_DMA;
> -	else if (sysfs_streq(buf, "DMA-FQ"))
> -		req_type = IOMMU_DOMAIN_DMA_FQ;
> -	else if (sysfs_streq(buf, "auto"))
> -		req_type = 0;
> -	else
> -		return -EINVAL;
> -
>  	/*
>  	 * Lock/Unlock the group mutex here before device lock to
> -	 * 1. Make sure that the iommu group has only one device (this is a
> +	 * 1. Make sure that the IOMMU group has only one device (this is a
>  	 *    prerequisite for step 2)
>  	 * 2. Get struct *dev which is needed to lock device
>  	 */
>  	mutex_lock(&group->mutex);
>  	if (iommu_group_device_count(group) != 1) {
>  		mutex_unlock(&group->mutex);
> -		pr_err_ratelimited("Cannot change default domain: Group has more than one device\n");
> +		pr_err_ratelimited("Cannot change IOMMU group default domain attribute: Group has more than one device\n");
>  		return -EINVAL;
>  	}
>  
> @@ -3054,16 +3079,16 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
>  	/*
>  	 * Don't hold the group mutex because taking group mutex first and then
>  	 * the device lock could potentially cause a deadlock as below. Assume
> -	 * two threads T1 and T2. T1 is trying to change default domain of an
> -	 * iommu group and T2 is trying to hot unplug a device or release [1] VF
> -	 * of a PCIe device which is in the same iommu group. T1 takes group
> -	 * mutex and before it could take device lock assume T2 has taken device
> -	 * lock and is yet to take group mutex. Now, both the threads will be
> -	 * waiting for the other thread to release lock. Below, lock order was
> -	 * suggested.
> +	 * two threads, T1 and T2. T1 is trying to change default domain
> +	 * attribute of an IOMMU group and T2 is trying to hot unplug a device
> +	 * or release [1] VF of a PCIe device which is in the same IOMMU group.
> +	 * T1 takes the group mutex and before it could take device lock T2 may
> +	 * have taken device lock and is yet to take group mutex. Now, both the
> +	 * threads will be waiting for the other thread to release lock. Below,
> +	 * lock order was suggested.
>  	 * device_lock(dev);
>  	 *	mutex_lock(&group->mutex);
> -	 *		iommu_change_dev_def_domain();
> +	 *		cb->iommu_change_dev_def_domain(); [example cb]
>  	 *	mutex_unlock(&group->mutex);
>  	 * device_unlock(dev);
>  	 *
> @@ -3077,21 +3102,24 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
>  	 */
>  	mutex_unlock(&group->mutex);
>  
> -	/* Check if the device in the group still has a driver bound to it */
>  	device_lock(dev);
> -	if (device_is_bound(dev) && !(req_type == IOMMU_DOMAIN_DMA_FQ &&
> -	    group->default_domain->type == IOMMU_DOMAIN_DMA)) {
> -		pr_err_ratelimited("Device is still bound to driver\n");
> -		ret = -EBUSY;
> -		goto out;
> +	switch (op) {
> +	case CHANGE_GROUP_TYPE:
> +		ret = __iommu_group_store_type(buf, group, dev);
> +		break;
> +	default:
> +		ret = -EINVAL;
>  	}
> -
> -	ret = iommu_change_dev_def_domain(group, dev, req_type);
>  	ret = ret ?: count;
>  
> -out:
>  	device_unlock(dev);
>  	put_device(dev);
>  
>  	return ret;
>  }
> +
> +static ssize_t iommu_group_store_type(struct iommu_group *group,
> +				      const char *buf, size_t count)
> +{
> +	return iommu_group_store_common(group, CHANGE_GROUP_TYPE, buf, count);
> +}
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible
  2022-04-04 11:27   ` John Garry via iommu
@ 2022-04-07  7:52     ` Leizhen (ThunderTown) via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-07  7:52 UTC (permalink / raw)
  To: John Garry, joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	jean-philippe, linuxarm



On 2022/4/4 19:27, John Garry wrote:
> Some low-level drivers may request DMA mappings whose IOVA length exceeds
> that of the current rcache upper limit.
> 
> This means that allocations for those IOVAs will never be cached, and
> always must be allocated and freed from the RB tree per DMA mapping cycle.
> This has a significant effect on performance, more so since commit
> 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
> fails"), as discussed at [0].
> 
> As a first step towards allowing the rcache range upper limit be
> configured, hold this value in the IOVA rcache structure, and allocate
> the rcaches separately.
> 
> Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake.
> 
> [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  drivers/iommu/iova.c | 20 ++++++++++----------
>  include/linux/iova.h |  3 +++
>  2 files changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index db77aa675145..5c22b9187b79 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -15,8 +15,6 @@
>  /* The anchor node sits above the top of the usable address space */
>  #define IOVA_ANCHOR	~0UL
>  
> -#define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
> -
>  static bool iova_rcache_insert(struct iova_domain *iovad,
>  			       unsigned long pfn,
>  			       unsigned long size);
> @@ -443,7 +441,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
>  	 * rounding up anything cacheable to make sure that can't happen. The
>  	 * order of the unadjusted size will still match upon freeing.
>  	 */
> -	if (size < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
> +	if (size < (1 << (iovad->rcache_max_size - 1)))
>  		size = roundup_pow_of_two(size);
>  
>  	iova_pfn = iova_rcache_get(iovad, size, limit_pfn + 1);
> @@ -713,13 +711,15 @@ int iova_domain_init_rcaches(struct iova_domain *iovad)
>  	unsigned int cpu;
>  	int i, ret;
>  
> -	iovad->rcaches = kcalloc(IOVA_RANGE_CACHE_MAX_SIZE,
> +	iovad->rcache_max_size = 6; /* Arbitrarily high default */

It would be better to assign this constant value to iovad->rcache_max_size in init_iova_domain().
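Something like this minimal userspace sketch of the suggestion (names, the default value's home, and the len-to-order helper are illustrative assumptions, not the actual kernel code):

```c
/* Sketch only: set the constant default in init_iova_domain(), so the
 * rcache initialiser only overrides it when a length is requested. */
#include <assert.h>

#define IOVA_RANGE_CACHE_DEFAULT 6

struct iova_domain {
	unsigned long rcache_max_size; /* log of max cached IOVA range size */
};

static void init_iova_domain(struct iova_domain *iovad)
{
	/* the constant default lives here, assigned exactly once */
	iovad->rcache_max_size = IOVA_RANGE_CACHE_DEFAULT;
}

static unsigned long iova_len_to_rcache_max(unsigned long iova_len)
{
	/* assumed behaviour: smallest order covering iova_len, plus one
	 * so allocations of that length are still cacheable */
	unsigned long order = 0;

	while ((1UL << order) < iova_len)
		order++;
	return order + 1;
}

static void iova_domain_init_rcaches_sketch(struct iova_domain *iovad,
					    unsigned long iova_len)
{
	/* only override the default when a caller passes a limit */
	if (iova_len)
		iovad->rcache_max_size = iova_len_to_rcache_max(iova_len);
}
```

That keeps the "arbitrarily high default" comment and value out of the rcache init path entirely.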

> +
> +	iovad->rcaches = kcalloc(iovad->rcache_max_size,
>  				 sizeof(struct iova_rcache),
>  				 GFP_KERNEL);
>  	if (!iovad->rcaches)
>  		return -ENOMEM;
>  
> -	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
> +	for (i = 0; i < iovad->rcache_max_size; ++i) {
>  		struct iova_cpu_rcache *cpu_rcache;
>  		struct iova_rcache *rcache;
>  
> @@ -816,7 +816,7 @@ static bool iova_rcache_insert(struct iova_domain *iovad, unsigned long pfn,
>  {
>  	unsigned int log_size = order_base_2(size);
>  
> -	if (log_size >= IOVA_RANGE_CACHE_MAX_SIZE)
> +	if (log_size >= iovad->rcache_max_size)
>  		return false;
>  
>  	return __iova_rcache_insert(iovad, &iovad->rcaches[log_size], pfn);
> @@ -872,7 +872,7 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
>  {
>  	unsigned int log_size = order_base_2(size);
>  
> -	if (log_size >= IOVA_RANGE_CACHE_MAX_SIZE || !iovad->rcaches)
> +	if (log_size >= iovad->rcache_max_size || !iovad->rcaches)
>  		return 0;
>  
>  	return __iova_rcache_get(&iovad->rcaches[log_size], limit_pfn - size);
> @@ -888,7 +888,7 @@ static void free_iova_rcaches(struct iova_domain *iovad)
>  	unsigned int cpu;
>  	int i, j;
>  
> -	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
> +	for (i = 0; i < iovad->rcache_max_size; ++i) {
>  		rcache = &iovad->rcaches[i];
>  		if (!rcache->cpu_rcaches)
>  			break;
> @@ -916,7 +916,7 @@ static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
>  	unsigned long flags;
>  	int i;
>  
> -	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
> +	for (i = 0; i < iovad->rcache_max_size; ++i) {
>  		rcache = &iovad->rcaches[i];
>  		cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
>  		spin_lock_irqsave(&cpu_rcache->lock, flags);
> @@ -935,7 +935,7 @@ static void free_global_cached_iovas(struct iova_domain *iovad)
>  	unsigned long flags;
>  	int i, j;
>  
> -	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
> +	for (i = 0; i < iovad->rcache_max_size; ++i) {
>  		rcache = &iovad->rcaches[i];
>  		spin_lock_irqsave(&rcache->lock, flags);
>  		for (j = 0; j < rcache->depot_size; ++j) {
> diff --git a/include/linux/iova.h b/include/linux/iova.h
> index 320a70e40233..02f7222fa85a 100644
> --- a/include/linux/iova.h
> +++ b/include/linux/iova.h
> @@ -38,6 +38,9 @@ struct iova_domain {
>  
>  	struct iova_rcache	*rcaches;
>  	struct hlist_node	cpuhp_dead;
> +
> +	/* log of max cached IOVA range size (in pages) */
> +	unsigned long	rcache_max_size;
>  };
>  
>  static inline unsigned long iova_size(struct iova *iova)
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RESEND v5 4/5] iommu: Allow max opt DMA len be set for a group via sysfs
  2022-04-04 11:27   ` John Garry via iommu
@ 2022-04-07  8:21     ` Leizhen (ThunderTown) via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-07  8:21 UTC (permalink / raw)
  To: John Garry, joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	jean-philippe, linuxarm



On 2022/4/4 19:27, John Garry wrote:
> Add support to allow the maximum optimised DMA len be set for an IOMMU
> group via sysfs.
> 
> This is much the same with the method to change the default domain type
> for a group.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  .../ABI/testing/sysfs-kernel-iommu_groups     | 16 +++++
>  drivers/iommu/iommu.c                         | 59 ++++++++++++++++++-
>  include/linux/iommu.h                         |  6 ++
>  3 files changed, 79 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> index b15af6a5bc08..ed6f72794f6c 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> +++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> @@ -63,3 +63,19 @@ Description:	/sys/kernel/iommu_groups/<grp_id>/type shows the type of default
>  		system could lead to catastrophic effects (the users might
>  		need to reboot the machine to get it to normal state). So, it's
>  		expected that the users understand what they're doing.
> +
> +What:		/sys/kernel/iommu_groups/<grp_id>/max_opt_dma_size
> +Date:		Feb 2022
> +KernelVersion:	v5.18
> +Contact:	iommu@lists.linux-foundation.org
> +Description:	/sys/kernel/iommu_groups/<grp_id>/max_opt_dma_size shows the
> +		max optimised DMA size for the default IOMMU domain associated
> +		with the group.
> +		Each IOMMU domain has an IOVA domain. The IOVA domain caches
> +		IOVAs up to a certain size as a performance optimisation.
> +		This sysfs file allows the range of the IOVA domain caching to
> +		be set, such that larger-than-default IOVAs may be cached.
> +		A value of 0 means that the default caching range is chosen.
> +		A privileged user could request the kernel to change the range
> +		by writing to this file. For this to happen, the same rules
> +		and procedure apply as in changing the default domain type.
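For illustration, the documented procedure might be exercised like this (the group id, driver name and device address are placeholders, not taken from this series):

```shell
# Sketch only: group "6", driver "foo_driver" and device "0000:74:02.0"
# are placeholder names.

# 1. Unbind the only device in the IOMMU group from its driver
echo 0000:74:02.0 > /sys/bus/pci/drivers/foo_driver/unbind

# 2. Request a larger optimised DMA size (in bytes) for the group
echo 1048576 > /sys/kernel/iommu_groups/6/max_opt_dma_size

# 3. Re-bind so the default domain is rebuilt with the new rcache range
echo 0000:74:02.0 > /sys/bus/pci/drivers/foo_driver/bind

# 4. Read back the configured value
cat /sys/kernel/iommu_groups/6/max_opt_dma_size
```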
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 10bb10c2a210..7c7258f19bed 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -48,6 +48,7 @@ struct iommu_group {
>  	struct iommu_domain *default_domain;
>  	struct iommu_domain *domain;
>  	struct list_head entry;
> +	size_t max_opt_dma_size;
>  };
>  
>  struct group_device {
> @@ -89,6 +90,9 @@ static int iommu_create_device_direct_mappings(struct iommu_group *group,
>  static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
>  static ssize_t iommu_group_store_type(struct iommu_group *group,
>  				      const char *buf, size_t count);
> +static ssize_t iommu_group_store_max_opt_dma_size(struct iommu_group *group,
> +						  const char *buf,
> +						  size_t count);
>  
>  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
>  struct iommu_group_attribute iommu_group_attr_##_name =		\
> @@ -571,6 +575,12 @@ static ssize_t iommu_group_show_type(struct iommu_group *group,
>  	return strlen(type);
>  }
>  
> +static ssize_t iommu_group_show_max_opt_dma_size(struct iommu_group *group,
> +				     char *buf)
> +{
> +	return sprintf(buf, "%zu\n", group->max_opt_dma_size);
> +}
> +
>  static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
>  
>  static IOMMU_GROUP_ATTR(reserved_regions, 0444,
> @@ -579,6 +589,9 @@ static IOMMU_GROUP_ATTR(reserved_regions, 0444,
>  static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
>  			iommu_group_store_type);
>  
> +static IOMMU_GROUP_ATTR(max_opt_dma_size, 0644, iommu_group_show_max_opt_dma_size,
> +			iommu_group_store_max_opt_dma_size);
> +
>  static void iommu_group_release(struct kobject *kobj)
>  {
>  	struct iommu_group *group = to_iommu_group(kobj);
> @@ -665,6 +678,10 @@ struct iommu_group *iommu_group_alloc(void)
>  	if (ret)
>  		return ERR_PTR(ret);
>  
> +	ret = iommu_group_create_file(group, &iommu_group_attr_max_opt_dma_size);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
>  	pr_debug("Allocated group %d\n", group->id);
>  
>  	return group;
> @@ -2087,6 +2104,11 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
>  	return dev->iommu_group->default_domain;
>  }
>  
> +size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group)
> +{
> +	return group->max_opt_dma_size;
> +}
> +
>  /*
>   * IOMMU groups are really the natural working unit of the IOMMU, but
>   * the IOMMU API works on domains and devices.  Bridge that gap by
> @@ -2871,12 +2893,14 @@ EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
>   * @prev_dev: The device in the group (this is used to make sure that the device
>   *	 hasn't changed after the caller has called this function)
>   * @type: The type of the new default domain that gets associated with the group
> + * @max_opt_dma_size: Set the IOMMU group max_opt_dma_size if non-zero
>   *
>   * Returns 0 on success and error code on failure
>   *
>   */
>  static int iommu_change_dev_def_domain(struct iommu_group *group,
> -				       struct device *prev_dev, int type)
> +				       struct device *prev_dev, int type,
> +				       unsigned long max_opt_dma_size)
>  {
>  	struct iommu_domain *prev_dom;
>  	struct group_device *grp_dev;
> @@ -2977,6 +3001,9 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
>  
>  	group->domain = group->default_domain;
>  
> +	if (max_opt_dma_size)
> +		group->max_opt_dma_size = max_opt_dma_size;

Why not add a new function that just does this? That way, we would not need to modify
iommu_change_dev_def_domain() and patch 3/5 can be dropped.
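The suggested helper might look something like this minimal userspace sketch (the struct, name and error value are illustrative, not the actual kernel API):

```c
/* Sketch only: a dedicated setter that updates max_opt_dma_size and
 * leaves iommu_change_dev_def_domain() untouched. It would be called
 * under the same single-device/unbound-driver procedure as the type
 * store, but never rebuilds the default domain itself. */
#include <assert.h>

struct iommu_group_sketch {
	unsigned long max_opt_dma_size;
};

static int iommu_group_set_max_opt_dma_size(struct iommu_group_sketch *group,
					    unsigned long size)
{
	if (!size)
		return -22; /* stand-in for -EINVAL */
	group->max_opt_dma_size = size;
	return 0;
}
```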

> +
>  	/*
>  	 * Release the mutex here because ops->probe_finalize() call-back of
>  	 * some vendor IOMMU drivers calls arm_iommu_attach_device() which
> @@ -3003,6 +3030,7 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
>  
>  enum iommu_group_op {
>  	CHANGE_GROUP_TYPE,
> +	CHANGE_DMA_OPT_SIZE,
>  };
>  
>  static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
> @@ -3031,7 +3059,24 @@ static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
>  		return -EINVAL;
>  	}
>  
> -	return iommu_change_dev_def_domain(group, dev, type);
> +	return iommu_change_dev_def_domain(group, dev, type, 0);
> +}
> +
> +static int __iommu_group_store_max_opt_dma_size(const char *buf,
> +						struct iommu_group *group,
> +						struct device *dev)
> +{
> +	unsigned long val;
> +
> +	if (kstrtoul(buf, 0, &val) || !val)
> +		return -EINVAL;
> +
> +	if (device_is_bound(dev)) {
> +		pr_err_ratelimited("Device is still bound to driver\n");
> +		return -EINVAL;
> +	}
> +
> +	return iommu_change_dev_def_domain(group, dev, __IOMMU_DOMAIN_SAME, val);
>  }
>  
>  /*
> @@ -3108,6 +3153,9 @@ static ssize_t iommu_group_store_common(struct iommu_group *group,
>  	case CHANGE_GROUP_TYPE:
>  		ret = __iommu_group_store_type(buf, group, dev);
>  		break;
> +	case CHANGE_DMA_OPT_SIZE:
> +		ret = __iommu_group_store_max_opt_dma_size(buf, group, dev);
> +		break;
>  	default:
>  		ret = -EINVAL;
>  	}
> @@ -3124,3 +3172,10 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
>  {
>  	return iommu_group_store_common(group, CHANGE_GROUP_TYPE, buf, count);
>  }
> +
> +static ssize_t iommu_group_store_max_opt_dma_size(struct iommu_group *group,
> +						  const char *buf,
> +						  size_t count)
> +{
> +	return iommu_group_store_common(group, CHANGE_DMA_OPT_SIZE, buf, count);
> +}
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index b141cf71c7af..6915e68c40b7 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -430,6 +430,7 @@ extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
>  				   struct device *dev, ioasid_t pasid);
>  extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
>  extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
> +extern size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group);
>  extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
>  		     phys_addr_t paddr, size_t size, int prot);
>  extern int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
> @@ -725,6 +726,11 @@ static inline struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
>  	return NULL;
>  }
>  
> +static inline size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group)
> +{
> +	return 0;
> +}
> +
>  static inline int iommu_map(struct iommu_domain *domain, unsigned long iova,
>  			    phys_addr_t paddr, size_t size, int prot)
>  {
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RESEND v5 4/5] iommu: Allow max opt DMA len be set for a group via sysfs
@ 2022-04-07  8:21     ` Leizhen (ThunderTown) via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Leizhen (ThunderTown) via iommu @ 2022-04-07  8:21 UTC (permalink / raw)
  To: John Garry, joro, will, robin.murphy
  Cc: jean-philippe, jasowang, mst, linuxarm, linux-kernel,
	virtualization, iommu



On 2022/4/4 19:27, John Garry wrote:
> Add support to allow the maximum optimised DMA len to be set for an
> IOMMU group via sysfs.
> 
> This is much the same as the method used to change the default domain
> type for a group.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  .../ABI/testing/sysfs-kernel-iommu_groups     | 16 +++++
>  drivers/iommu/iommu.c                         | 59 ++++++++++++++++++-
>  include/linux/iommu.h                         |  6 ++
>  3 files changed, 79 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> index b15af6a5bc08..ed6f72794f6c 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> +++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
> @@ -63,3 +63,19 @@ Description:	/sys/kernel/iommu_groups/<grp_id>/type shows the type of default
>  		system could lead to catastrophic effects (the users might
>  		need to reboot the machine to get it to normal state). So, it's
>  		expected that the users understand what they're doing.
> +
> +What:		/sys/kernel/iommu_groups/<grp_id>/max_opt_dma_size
> +Date:		Feb 2022
> +KernelVersion:	v5.18
> +Contact:	iommu@lists.linux-foundation.org
> +Description:	/sys/kernel/iommu_groups/<grp_id>/max_opt_dma_size shows the
> +		max optimised DMA size for the default IOMMU domain associated
> +		with the group.
> +		Each IOMMU domain has an IOVA domain. The IOVA domain caches
> +		IOVAs up to a certain size as a performance optimisation.
> +		This sysfs file allows the range of the IOVA domain caching to
> +		be set, such that larger-than-default IOVAs may be cached.
> +		A value of 0 means that the default caching range is chosen.
> +		A privileged user can request that the kernel change the range
> +		by writing to this file. For this to happen, the same rules
> +		and procedure apply as for changing the default domain type.
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 10bb10c2a210..7c7258f19bed 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -48,6 +48,7 @@ struct iommu_group {
>  	struct iommu_domain *default_domain;
>  	struct iommu_domain *domain;
>  	struct list_head entry;
> +	size_t max_opt_dma_size;
>  };
>  
>  struct group_device {
> @@ -89,6 +90,9 @@ static int iommu_create_device_direct_mappings(struct iommu_group *group,
>  static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
>  static ssize_t iommu_group_store_type(struct iommu_group *group,
>  				      const char *buf, size_t count);
> +static ssize_t iommu_group_store_max_opt_dma_size(struct iommu_group *group,
> +						  const char *buf,
> +						  size_t count);
>  
>  #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
>  struct iommu_group_attribute iommu_group_attr_##_name =		\
> @@ -571,6 +575,12 @@ static ssize_t iommu_group_show_type(struct iommu_group *group,
>  	return strlen(type);
>  }
>  
> +static ssize_t iommu_group_show_max_opt_dma_size(struct iommu_group *group,
> +				     char *buf)
> +{
> +	return sprintf(buf, "%zu\n", group->max_opt_dma_size);
> +}
> +
>  static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
>  
>  static IOMMU_GROUP_ATTR(reserved_regions, 0444,
> @@ -579,6 +589,9 @@ static IOMMU_GROUP_ATTR(reserved_regions, 0444,
>  static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
>  			iommu_group_store_type);
>  
> +static IOMMU_GROUP_ATTR(max_opt_dma_size, 0644, iommu_group_show_max_opt_dma_size,
> +			iommu_group_store_max_opt_dma_size);
> +
>  static void iommu_group_release(struct kobject *kobj)
>  {
>  	struct iommu_group *group = to_iommu_group(kobj);
> @@ -665,6 +678,10 @@ struct iommu_group *iommu_group_alloc(void)
>  	if (ret)
>  		return ERR_PTR(ret);
>  
> +	ret = iommu_group_create_file(group, &iommu_group_attr_max_opt_dma_size);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
>  	pr_debug("Allocated group %d\n", group->id);
>  
>  	return group;
> @@ -2087,6 +2104,11 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
>  	return dev->iommu_group->default_domain;
>  }
>  
> +size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group)
> +{
> +	return group->max_opt_dma_size;
> +}
> +
>  /*
>   * IOMMU groups are really the natural working unit of the IOMMU, but
>   * the IOMMU API works on domains and devices.  Bridge that gap by
> @@ -2871,12 +2893,14 @@ EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
>   * @prev_dev: The device in the group (this is used to make sure that the device
>   *	 hasn't changed after the caller has called this function)
>   * @type: The type of the new default domain that gets associated with the group
> + * @max_opt_dma_size: Set the IOMMU group max_opt_dma_size if non-zero
>   *
>   * Returns 0 on success and error code on failure
>   *
>   */
>  static int iommu_change_dev_def_domain(struct iommu_group *group,
> -				       struct device *prev_dev, int type)
> +				       struct device *prev_dev, int type,
> +				       unsigned long max_opt_dma_size)
>  {
>  	struct iommu_domain *prev_dom;
>  	struct group_device *grp_dev;
> @@ -2977,6 +3001,9 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
>  
>  	group->domain = group->default_domain;
>  
> +	if (max_opt_dma_size)
> +		group->max_opt_dma_size = max_opt_dma_size;

Why not add a new function that just does this? That way, we would not need to
modify iommu_change_dev_def_domain(), and patch 3/5 could be dropped.

> +
>  	/*
>  	 * Release the mutex here because ops->probe_finalize() call-back of
>  	 * some vendor IOMMU drivers calls arm_iommu_attach_device() which
> @@ -3003,6 +3030,7 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
>  
>  enum iommu_group_op {
>  	CHANGE_GROUP_TYPE,
> +	CHANGE_DMA_OPT_SIZE,
>  };
>  
>  static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
> @@ -3031,7 +3059,24 @@ static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
>  		return -EINVAL;
>  	}
>  
> -	return iommu_change_dev_def_domain(group, dev, type);
> +	return iommu_change_dev_def_domain(group, dev, type, 0);
> +}
> +
> +static int __iommu_group_store_max_opt_dma_size(const char *buf,
> +						struct iommu_group *group,
> +						struct device *dev)
> +{
> +	unsigned long val;
> +
> +	if (kstrtoul(buf, 0, &val) || !val)
> +		return -EINVAL;
> +
> +	if (device_is_bound(dev)) {
> +		pr_err_ratelimited("Device is still bound to driver\n");
> +		return -EINVAL;
> +	}
> +
> +	return iommu_change_dev_def_domain(group, dev, __IOMMU_DOMAIN_SAME, val);
>  }
>  
>  /*
> @@ -3108,6 +3153,9 @@ static ssize_t iommu_group_store_common(struct iommu_group *group,
>  	case CHANGE_GROUP_TYPE:
>  		ret = __iommu_group_store_type(buf, group, dev);
>  		break;
> +	case CHANGE_DMA_OPT_SIZE:
> +		ret = __iommu_group_store_max_opt_dma_size(buf, group, dev);
> +		break;
>  	default:
>  		ret = -EINVAL;
>  	}
> @@ -3124,3 +3172,10 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
>  {
>  	return iommu_group_store_common(group, CHANGE_GROUP_TYPE, buf, count);
>  }
> +
> +static ssize_t iommu_group_store_max_opt_dma_size(struct iommu_group *group,
> +						  const char *buf,
> +						  size_t count)
> +{
> +	return iommu_group_store_common(group, CHANGE_DMA_OPT_SIZE, buf, count);
> +}
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index b141cf71c7af..6915e68c40b7 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -430,6 +430,7 @@ extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
>  				   struct device *dev, ioasid_t pasid);
>  extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
>  extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
> +extern size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group);
>  extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
>  		     phys_addr_t paddr, size_t size, int prot);
>  extern int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
> @@ -725,6 +726,11 @@ static inline struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
>  	return NULL;
>  }
>  
> +static inline size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group)
> +{
> +	return 0;
> +}
> +
>  static inline int iommu_map(struct iommu_domain *domain, unsigned long iova,
>  			    phys_addr_t paddr, size_t size, int prot)
>  {
> 

-- 
Regards,
  Zhen Lei
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RESEND v5 5/5] iova: Add iova_len argument to iova_domain_init_rcaches()
  2022-04-04 11:27   ` John Garry via iommu
@ 2022-04-07  8:27     ` Leizhen (ThunderTown) via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-07  8:27 UTC (permalink / raw)
  To: John Garry, joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	jean-philippe, linuxarm



On 2022/4/4 19:27, John Garry wrote:
> Add a max opt len argument to iova_domain_init_rcaches(), and use it to
> set the rcache range.
> 
> Also fix up all users to set this value (to 0, meaning use the default),
> including a wrapper for that, iova_domain_init_rcaches_default().
> 
> For dma-iommu.c we derive the iova_len argument from the IOMMU group
> max opt DMA size.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  drivers/iommu/dma-iommu.c            | 15 ++++++++++++++-
>  drivers/iommu/iova.c                 | 19 ++++++++++++++++---
>  drivers/vdpa/vdpa_user/iova_domain.c |  4 ++--
>  include/linux/iova.h                 |  3 ++-
>  4 files changed, 34 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 42ca42ff1b5d..19f35624611c 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -525,6 +525,8 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>  	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>  	unsigned long order, base_pfn;
>  	struct iova_domain *iovad;
> +	size_t max_opt_dma_size;
> +	unsigned long iova_len = 0;
>  	int ret;
>  
>  	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
> @@ -560,7 +562,18 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>  	}
>  
>  	init_iova_domain(iovad, 1UL << order, base_pfn);
> -	ret = iova_domain_init_rcaches(iovad);
> +
> +	max_opt_dma_size = iommu_group_get_max_opt_dma_size(dev->iommu_group);
> +	if (max_opt_dma_size) {
> +		unsigned long shift = __ffs(1UL << order);
> +
> +		iova_len = roundup_pow_of_two(max_opt_dma_size);
> +		iova_len >>= shift;
> +		if (!iova_len)
> +			iova_len = 1;

How about moving "iovad->rcache_max_size = iova_len_to_rcache_max(iova_len);" here?
That way, iova_domain_init_rcaches() can remain the same, and
iova_domain_init_rcaches_default() does not need to be added.

> +	}
> +
> +	ret = iova_domain_init_rcaches(iovad, iova_len);
>  	if (ret)
>  		return ret;
>  
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 5c22b9187b79..d65e79e132ee 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -706,12 +706,20 @@ static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
>  	mag->pfns[mag->size++] = pfn;
>  }
>  
> -int iova_domain_init_rcaches(struct iova_domain *iovad)
> +static unsigned long iova_len_to_rcache_max(unsigned long iova_len)
> +{
> +	return order_base_2(iova_len) + 1;
> +}
> +
> +int iova_domain_init_rcaches(struct iova_domain *iovad, unsigned long iova_len)
>  {
>  	unsigned int cpu;
>  	int i, ret;
>  
> -	iovad->rcache_max_size = 6; /* Arbitrarily high default */
> +	if (iova_len)
> +		iovad->rcache_max_size = iova_len_to_rcache_max(iova_len);
> +	else
> +		iovad->rcache_max_size = 6; /* Arbitrarily high default */
>  
>  	iovad->rcaches = kcalloc(iovad->rcache_max_size,
>  				 sizeof(struct iova_rcache),
> @@ -755,7 +763,12 @@ int iova_domain_init_rcaches(struct iova_domain *iovad)
>  	free_iova_rcaches(iovad);
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(iova_domain_init_rcaches);
> +
> +int iova_domain_init_rcaches_default(struct iova_domain *iovad)
> +{
> +	return iova_domain_init_rcaches(iovad, 0);
> +}
> +EXPORT_SYMBOL_GPL(iova_domain_init_rcaches_default);
>  
>  /*
>   * Try inserting IOVA range starting with 'iova_pfn' into 'rcache', and
> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
> index 6daa3978d290..3a2acef98a4a 100644
> --- a/drivers/vdpa/vdpa_user/iova_domain.c
> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
> @@ -514,12 +514,12 @@ vduse_domain_create(unsigned long iova_limit, size_t bounce_size)
>  	spin_lock_init(&domain->iotlb_lock);
>  	init_iova_domain(&domain->stream_iovad,
>  			PAGE_SIZE, IOVA_START_PFN);
> -	ret = iova_domain_init_rcaches(&domain->stream_iovad);
> +	ret = iova_domain_init_rcaches_default(&domain->stream_iovad);
>  	if (ret)
>  		goto err_iovad_stream;
>  	init_iova_domain(&domain->consistent_iovad,
>  			PAGE_SIZE, bounce_pfns);
> -	ret = iova_domain_init_rcaches(&domain->consistent_iovad);
> +	ret = iova_domain_init_rcaches_default(&domain->consistent_iovad);
>  	if (ret)
>  		goto err_iovad_consistent;
>  
> diff --git a/include/linux/iova.h b/include/linux/iova.h
> index 02f7222fa85a..56281434ce0c 100644
> --- a/include/linux/iova.h
> +++ b/include/linux/iova.h
> @@ -95,7 +95,8 @@ struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
>  	unsigned long pfn_hi);
>  void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>  	unsigned long start_pfn);
> -int iova_domain_init_rcaches(struct iova_domain *iovad);
> +int iova_domain_init_rcaches(struct iova_domain *iovad, unsigned long iova_len);
> +int iova_domain_init_rcaches_default(struct iova_domain *iovad);
>  struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
>  void put_iova_domain(struct iova_domain *iovad);
>  #else
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RESEND v5 5/5] iova: Add iova_len argument to iova_domain_init_rcaches()
  2022-04-07  8:27     ` Leizhen (ThunderTown) via iommu
@ 2022-04-07 13:52       ` John Garry via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2022-04-07 13:52 UTC (permalink / raw)
  To: Leizhen (ThunderTown), joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	jean-philippe, linuxarm

On 07/04/2022 09:27, Leizhen (ThunderTown) wrote:
> 

Thanks for having a look....

> 
> On 2022/4/4 19:27, John Garry wrote:
>> Add max opt argument to iova_domain_init_rcaches(), and use it to set the
>> rcaches range.
>>
>> Also fix up all users to set this value (at 0, meaning use default),
>> including a wrapper for that, iova_domain_init_rcaches_default().
>>
>> For dma-iommu.c we derive the iova_len argument from the IOMMU group
>> max opt DMA size.
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>>   drivers/iommu/dma-iommu.c            | 15 ++++++++++++++-
>>   drivers/iommu/iova.c                 | 19 ++++++++++++++++---
>>   drivers/vdpa/vdpa_user/iova_domain.c |  4 ++--
>>   include/linux/iova.h                 |  3 ++-
>>   4 files changed, 34 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 42ca42ff1b5d..19f35624611c 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -525,6 +525,8 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>>   	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>>   	unsigned long order, base_pfn;
>>   	struct iova_domain *iovad;
>> +	size_t max_opt_dma_size;
>> +	unsigned long iova_len = 0;
>>   	int ret;
>>   
>>   	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
>> @@ -560,7 +562,18 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>>   	}
>>   
>>   	init_iova_domain(iovad, 1UL << order, base_pfn);
>> -	ret = iova_domain_init_rcaches(iovad);
>> +
>> +	max_opt_dma_size = iommu_group_get_max_opt_dma_size(dev->iommu_group);
>> +	if (max_opt_dma_size) {
>> +		unsigned long shift = __ffs(1UL << order);
>> +
>> +		iova_len = roundup_pow_of_two(max_opt_dma_size);
>> +		iova_len >>= shift;
>> +		if (!iova_len)
>> +			iova_len = 1;
> 
> How about move "iovad->rcache_max_size = iova_len_to_rcache_max(iova_len);" here? So that,
> iova_domain_init_rcaches() can remain the same. And iova_domain_init_rcaches_default() does
> not need to be added.
> 

I see your idea. I will say that I would rather not add 
iova_domain_init_rcaches_default(). But personally I think it's better 
to set up all the rcache state only once, and inside 
iova_domain_init_rcaches(), as it is today.

In addition, it doesn't look reasonable to expose iova_len_to_rcache_max().

But maybe it's ok. Other opinions would be welcome...

Thanks,
John

>> +	}
>> +
>> +	ret = iova_domain_init_rcaches(iovad, iova_len);
>>   	if (ret)
>>   		return ret;
>>   
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index 5c22b9187b79..d65e79e132ee 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -706,12 +706,20 @@ static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
>>   	mag->pfns[mag->size++] = pfn;
>>   }
>>   
>> -int iova_domain_init_rcaches(struct iova_domain *iovad)
>> +static unsigned long iova_len_to_rcache_max(unsigned long iova_len)
>> +{
>> +	return order_base_2(iova_len) + 1;
>> +}
>> +
>> +int iova_domain_init_rcaches(struct iova_domain *iovad, unsigned long iova_len)
>>   {
>>   	unsigned int cpu;
>>   	int i, ret;
>>   
>> -	iovad->rcache_max_size = 6; /* Arbitrarily high default */
>> +	if (iova_len)
>> +		iovad->rcache_max_size = iova_len_to_rcache_max(iova_len);
>> +	else
>> +		iovad->rcache_max_size = 6; /* Arbitrarily high default */
>>   
>>   	iovad->rcaches = kcalloc(iovad->rcache_max_size,
>>   				 sizeof(struct iova_rcache),
>> @@ -755,7 +763,12 @@ int iova_domain_init_rcaches(struct iova_domain *iovad)
>>   	free_iova_rcaches(iovad);
>>   	return ret;
>>   }
>> -EXPORT_SYMBOL_GPL(iova_domain_init_rcaches);
>> +
>> +int iova_domain_init_rcaches_default(struct iova_domain *iovad)
>> +{
>> +	return iova_domain_init_rcaches(iovad, 0);
>> +}
>> +EXPORT_SYMBOL_GPL(iova_domain_init_rcaches_default);
>>   
>>   /*
>>    * Try inserting IOVA range starting with 'iova_pfn' into 'rcache', and
>> diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
>> index 6daa3978d290..3a2acef98a4a 100644
>> --- a/drivers/vdpa/vdpa_user/iova_domain.c
>> +++ b/drivers/vdpa/vdpa_user/iova_domain.c
>> @@ -514,12 +514,12 @@ vduse_domain_create(unsigned long iova_limit, size_t bounce_size)
>>   	spin_lock_init(&domain->iotlb_lock);
>>   	init_iova_domain(&domain->stream_iovad,
>>   			PAGE_SIZE, IOVA_START_PFN);
>> -	ret = iova_domain_init_rcaches(&domain->stream_iovad);
>> +	ret = iova_domain_init_rcaches_default(&domain->stream_iovad);
>>   	if (ret)
>>   		goto err_iovad_stream;
>>   	init_iova_domain(&domain->consistent_iovad,
>>   			PAGE_SIZE, bounce_pfns);
>> -	ret = iova_domain_init_rcaches(&domain->consistent_iovad);
>> +	ret = iova_domain_init_rcaches_default(&domain->consistent_iovad);
>>   	if (ret)
>>   		goto err_iovad_consistent;
>>   
>> diff --git a/include/linux/iova.h b/include/linux/iova.h
>> index 02f7222fa85a..56281434ce0c 100644
>> --- a/include/linux/iova.h
>> +++ b/include/linux/iova.h
>> @@ -95,7 +95,8 @@ struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
>>   	unsigned long pfn_hi);
>>   void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>   	unsigned long start_pfn);
>> -int iova_domain_init_rcaches(struct iova_domain *iovad);
>> +int iova_domain_init_rcaches(struct iova_domain *iovad, unsigned long iova_len);
>> +int iova_domain_init_rcaches_default(struct iova_domain *iovad);
>>   struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
>>   void put_iova_domain(struct iova_domain *iovad);
>>   #else
>>
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

>>   	if (ret)
>>   		goto err_iovad_stream;
>>   	init_iova_domain(&domain->consistent_iovad,
>>   			PAGE_SIZE, bounce_pfns);
>> -	ret = iova_domain_init_rcaches(&domain->consistent_iovad);
>> +	ret = iova_domain_init_rcaches_default(&domain->consistent_iovad);
>>   	if (ret)
>>   		goto err_iovad_consistent;
>>   
>> diff --git a/include/linux/iova.h b/include/linux/iova.h
>> index 02f7222fa85a..56281434ce0c 100644
>> --- a/include/linux/iova.h
>> +++ b/include/linux/iova.h
>> @@ -95,7 +95,8 @@ struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
>>   	unsigned long pfn_hi);
>>   void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>   	unsigned long start_pfn);
>> -int iova_domain_init_rcaches(struct iova_domain *iovad);
>> +int iova_domain_init_rcaches(struct iova_domain *iovad, unsigned long iova_len);
>> +int iova_domain_init_rcaches_default(struct iova_domain *iovad);
>>   struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
>>   void put_iova_domain(struct iova_domain *iovad);
>>   #else
>>
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RESEND v5 4/5] iommu: Allow max opt DMA len be set for a group via sysfs
  2022-04-07  8:21     ` Leizhen (ThunderTown) via iommu
@ 2022-04-07 14:00       ` John Garry via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2022-04-07 14:00 UTC (permalink / raw)
  To: Leizhen (ThunderTown), joro, will, robin.murphy
  Cc: mst, jasowang, iommu, linux-kernel, virtualization, chenxiang66,
	jean-philippe, linuxarm

On 07/04/2022 09:21, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/4/4 19:27, John Garry wrote:
>> Add support to allow the maximum optimised DMA len be set for an IOMMU
>> group via sysfs.
>>
>> This is much the same as the method used to change the default domain
>> for a group.
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>>   .../ABI/testing/sysfs-kernel-iommu_groups     | 16 +++++
>>   drivers/iommu/iommu.c                         | 59 ++++++++++++++++++-
>>   include/linux/iommu.h                         |  6 ++
>>   3 files changed, 79 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
>> index b15af6a5bc08..ed6f72794f6c 100644
>> --- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
>> +++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
>> @@ -63,3 +63,19 @@ Description:	/sys/kernel/iommu_groups/<grp_id>/type shows the type of default
>>   		system could lead to catastrophic effects (the users might
>>   		need to reboot the machine to get it to normal state). So, it's
>>   		expected that the users understand what they're doing.
>> +
>> +What:		/sys/kernel/iommu_groups/<grp_id>/max_opt_dma_size
>> +Date:		Feb 2022
>> +KernelVersion:	v5.18
>> +Contact:	iommu@lists.linux-foundation.org
>> +Description:	/sys/kernel/iommu_groups/<grp_id>/max_opt_dma_size shows the
>> +		max optimised DMA size for the default IOMMU domain associated
>> +		with the group.
>> +		Each IOMMU domain has an IOVA domain. The IOVA domain caches
>> +		IOVAs up to a certain size as a performance optimisation.
>> +		This sysfs file allows the range of the IOVA domain caching to
>> +		be set, such that larger-than-default IOVAs may be cached.
>> +		A value of 0 means that the default caching range is chosen.
>> +		A privileged user may request that the kernel change the range
>> +		by writing to this file. For this to happen, the same rules
>> +		and procedure apply as for changing the default domain type.
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 10bb10c2a210..7c7258f19bed 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -48,6 +48,7 @@ struct iommu_group {
>>   	struct iommu_domain *default_domain;
>>   	struct iommu_domain *domain;
>>   	struct list_head entry;
>> +	size_t max_opt_dma_size;
>>   };
>>   
>>   struct group_device {
>> @@ -89,6 +90,9 @@ static int iommu_create_device_direct_mappings(struct iommu_group *group,
>>   static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
>>   static ssize_t iommu_group_store_type(struct iommu_group *group,
>>   				      const char *buf, size_t count);
>> +static ssize_t iommu_group_store_max_opt_dma_size(struct iommu_group *group,
>> +						  const char *buf,
>> +						  size_t count);
>>   
>>   #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
>>   struct iommu_group_attribute iommu_group_attr_##_name =		\
>> @@ -571,6 +575,12 @@ static ssize_t iommu_group_show_type(struct iommu_group *group,
>>   	return strlen(type);
>>   }
>>   
>> +static ssize_t iommu_group_show_max_opt_dma_size(struct iommu_group *group,
>> +				     char *buf)
>> +{
>> +	return sprintf(buf, "%zu\n", group->max_opt_dma_size);
>> +}
>> +
>>   static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
>>   
>>   static IOMMU_GROUP_ATTR(reserved_regions, 0444,
>> @@ -579,6 +589,9 @@ static IOMMU_GROUP_ATTR(reserved_regions, 0444,
>>   static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
>>   			iommu_group_store_type);
>>   
>> +static IOMMU_GROUP_ATTR(max_opt_dma_size, 0644, iommu_group_show_max_opt_dma_size,
>> +			iommu_group_store_max_opt_dma_size);
>> +
>>   static void iommu_group_release(struct kobject *kobj)
>>   {
>>   	struct iommu_group *group = to_iommu_group(kobj);
>> @@ -665,6 +678,10 @@ struct iommu_group *iommu_group_alloc(void)
>>   	if (ret)
>>   		return ERR_PTR(ret);
>>   
>> +	ret = iommu_group_create_file(group, &iommu_group_attr_max_opt_dma_size);
>> +	if (ret)
>> +		return ERR_PTR(ret);
>> +
>>   	pr_debug("Allocated group %d\n", group->id);
>>   
>>   	return group;
>> @@ -2087,6 +2104,11 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
>>   	return dev->iommu_group->default_domain;
>>   }
>>   
>> +size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group)
>> +{
>> +	return group->max_opt_dma_size;
>> +}
>> +
>>   /*
>>    * IOMMU groups are really the natural working unit of the IOMMU, but
>>    * the IOMMU API works on domains and devices.  Bridge that gap by
>> @@ -2871,12 +2893,14 @@ EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
>>    * @prev_dev: The device in the group (this is used to make sure that the device
>>    *	 hasn't changed after the caller has called this function)
>>    * @type: The type of the new default domain that gets associated with the group
>> + * @max_opt_dma_size: Set the IOMMU group max_opt_dma_size if non-zero
>>    *
>>    * Returns 0 on success and error code on failure
>>    *
>>    */
>>   static int iommu_change_dev_def_domain(struct iommu_group *group,
>> -				       struct device *prev_dev, int type)
>> +				       struct device *prev_dev, int type,
>> +				       unsigned long max_opt_dma_size)
>>   {
>>   	struct iommu_domain *prev_dom;
>>   	struct group_device *grp_dev;
>> @@ -2977,6 +3001,9 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
>>   
>>   	group->domain = group->default_domain;
>>   
>> +	if (max_opt_dma_size)
>> +		group->max_opt_dma_size = max_opt_dma_size;
> 

Hi Leizhen,

> Why not add a new function just do this?

But how would the function look?

> In this way, we do not need to modify
> iommu_change_dev_def_domain() and patch 3/5 can be dropped.

A change to the group max_opt_dma_size is only applied when the default 
domain is reallocated, i.e. in this function, and we need patch 3/5 to 
support that.

thanks,
John
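
For reference, the user-visible flow described in the ABI file above mirrors
changing the default domain type: unbind the only device in the group, write
the size, then rebind so that the default domain (and its IOVA rcaches) is
reallocated. The device, driver and group paths below are hypothetical
placeholders, for illustration only:

```shell
# Sketch of driving the new sysfs file; all device/driver names are
# placeholders, not taken from a real system. Requires root.
dev=0000:81:00.0                               # the only device in the group
drv=/sys/bus/pci/drivers/example_driver        # its driver (placeholder name)
grp=$(readlink -f "/sys/bus/pci/devices/$dev/iommu_group")

echo "$dev" > "$drv/unbind"                    # no driver may be bound
echo $((1 << 20)) > "$grp/max_opt_dma_size"    # cache IOVAs up to 1 MiB
echo "$dev" > "$drv/bind"                      # default domain reallocated

cat "$grp/max_opt_dma_size"                    # verify the new value
```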

> 
>> +
>>   	/*
>>   	 * Release the mutex here because ops->probe_finalize() call-back of
>>   	 * some vendor IOMMU drivers calls arm_iommu_attach_device() which
>> @@ -3003,6 +3030,7 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
>>   
>>   enum iommu_group_op {
>>   	CHANGE_GROUP_TYPE,
>> +	CHANGE_DMA_OPT_SIZE,
>>   };
>>   
>>   static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
>> @@ -3031,7 +3059,24 @@ static int __iommu_group_store_type(const char *buf, struct iommu_group *group,
>>   		return -EINVAL;
>>   	}
>>   
>> -	return iommu_change_dev_def_domain(group, dev, type);
>> +	return iommu_change_dev_def_domain(group, dev, type, 0);
>> +}
>> +
>> +static int __iommu_group_store_max_opt_dma_size(const char *buf,
>> +						struct iommu_group *group,
>> +						struct device *dev)
>> +{
>> +	unsigned long val;
>> +
>> +	if (kstrtoul(buf, 0, &val) || !val)
>> +		return -EINVAL;
>> +
>> +	if (device_is_bound(dev)) {
>> +		pr_err_ratelimited("Device is still bound to driver\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	return iommu_change_dev_def_domain(group, dev, __IOMMU_DOMAIN_SAME, val);
>>   }
>>   
>>   /*
>> @@ -3108,6 +3153,9 @@ static ssize_t iommu_group_store_common(struct iommu_group *group,
>>   	case CHANGE_GROUP_TYPE:
>>   		ret = __iommu_group_store_type(buf, group, dev);
>>   		break;
>> +	case CHANGE_DMA_OPT_SIZE:
>> +		ret = __iommu_group_store_max_opt_dma_size(buf, group, dev);
>> +		break;
>>   	default:
>>   		ret = -EINVAL;
>>   	}
>> @@ -3124,3 +3172,10 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
>>   {
>>   	return iommu_group_store_common(group, CHANGE_GROUP_TYPE, buf, count);
>>   }
>> +
>> +static ssize_t iommu_group_store_max_opt_dma_size(struct iommu_group *group,
>> +						  const char *buf,
>> +						  size_t count)
>> +{
>> +	return iommu_group_store_common(group, CHANGE_DMA_OPT_SIZE, buf, count);
>> +}
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index b141cf71c7af..6915e68c40b7 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -430,6 +430,7 @@ extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
>>   				   struct device *dev, ioasid_t pasid);
>>   extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
>>   extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
>> +extern size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group);
>>   extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
>>   		     phys_addr_t paddr, size_t size, int prot);
>>   extern int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
>> @@ -725,6 +726,11 @@ static inline struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
>>   	return NULL;
>>   }
>>   
>> +static inline size_t iommu_group_get_max_opt_dma_size(struct iommu_group *group)
>> +{
>> +	return 0;
>> +}
>> +
>>   static inline int iommu_map(struct iommu_domain *domain, unsigned long iova,
>>   			    phys_addr_t paddr, size_t size, int prot)
>>   {
>>
> 




* Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()
  2022-04-04 11:27   ` John Garry via iommu
  (?)
@ 2022-07-06 12:00     ` Will Deacon
  -1 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2022-07-06 12:00 UTC (permalink / raw)
  To: John Garry
  Cc: joro, robin.murphy, mst, jasowang, iommu, linux-kernel,
	virtualization, chenxiang66, thunder.leizhen, jean-philippe,
	linuxarm

On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:
> Function iommu_group_store_type() supports changing the default domain
> of an IOMMU group.
> 
> Many conditions need to be satisfied and steps taken for this action to be
> successful.
> 
> Satisfying these conditions and steps will be required for setting other
> IOMMU group attributes, so factor the code into a common part and a part
> specific to updating the IOMMU group attribute.
> 
> No functional change intended.
> 
> Some code comments are tidied up also.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  drivers/iommu/iommu.c | 96 ++++++++++++++++++++++++++++---------------
>  1 file changed, 62 insertions(+), 34 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

Will



* Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()
  2022-07-06 12:00     ` Will Deacon
@ 2022-07-06 12:03       ` John Garry via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: John Garry @ 2022-07-06 12:03 UTC (permalink / raw)
  To: Will Deacon
  Cc: joro, robin.murphy, mst, jasowang, iommu, linux-kernel,
	virtualization, chenxiang66, thunder.leizhen, jean-philippe,
	linuxarm

On 06/07/2022 13:00, Will Deacon wrote:
> On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:
>> Function iommu_group_store_type() supports changing the default domain
>> of an IOMMU group.
>>
>> Many conditions need to be satisfied and steps taken for this action to be
>> successful.
>>
>> Satisfying these conditions and steps will be required for setting other
>> IOMMU group attributes, so factor the code into a common part and a part
>> specific to updating the IOMMU group attribute.
>>
>> No functional change intended.
>>
>> Some code comments are tidied up also.
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>>   drivers/iommu/iommu.c | 96 ++++++++++++++++++++++++++++---------------
>>   1 file changed, 62 insertions(+), 34 deletions(-)
> Acked-by: Will Deacon <will@kernel.org>
> 

Thanks, but currently I have no plans to progress this series, in favour 
of this 
https://lore.kernel.org/linux-iommu/1656590892-42307-1-git-send-email-john.garry@huawei.com/T/#me0e806913050c95f6e6ba2c7f7d96d51ce191204

cheers




* Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible
  2022-04-04 11:27   ` John Garry via iommu
  (?)
@ 2022-07-06 12:10     ` Will Deacon
  -1 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2022-07-06 12:10 UTC (permalink / raw)
  To: John Garry
  Cc: joro, robin.murphy, mst, jasowang, iommu, linux-kernel,
	virtualization, chenxiang66, thunder.leizhen, jean-philippe,
	linuxarm

On Mon, Apr 04, 2022 at 07:27:11PM +0800, John Garry wrote:
> Some low-level drivers may request DMA mappings whose IOVA length exceeds
> that of the current rcache upper limit.
> 
> This means that allocations for those IOVAs will never be cached, and
> always must be allocated and freed from the RB tree per DMA mapping cycle.
> This has a significant effect on performance, more so since commit
> 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
> fails"), as discussed at [0].
> 
> As a first step towards allowing the rcache range upper limit be
> configured, hold this value in the IOVA rcache structure, and allocate
> the rcaches separately.
> 
> Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake.
> 
> [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  drivers/iommu/iova.c | 20 ++++++++++----------
>  include/linux/iova.h |  3 +++
>  2 files changed, 13 insertions(+), 10 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

Will


* Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible
@ 2022-07-06 12:10     ` Will Deacon
  0 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2022-07-06 12:10 UTC (permalink / raw)
  To: John Garry
  Cc: jean-philippe, mst, joro, chenxiang66, linuxarm, linux-kernel,
	virtualization, iommu, thunder.leizhen, robin.murphy

On Mon, Apr 04, 2022 at 07:27:11PM +0800, John Garry wrote:
> Some low-level drivers may request DMA mappings whose IOVA length exceeds
> that of the current rcache upper limit.
> 
> This means that allocations for those IOVAs will never be cached, and
> always must be allocated and freed from the RB tree per DMA mapping cycle.
> This has a significant effect on performance, more so since commit
> 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
> fails"), as discussed at [0].
> 
> As a first step towards allowing the rcache range upper limit be
> configured, hold this value in the IOVA rcache structure, and allocate
> the rcaches separately.
> 
> Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake.
> 
> [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  drivers/iommu/iova.c | 20 ++++++++++----------
>  include/linux/iova.h |  3 +++
>  2 files changed, 13 insertions(+), 10 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

Will
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible
  2022-04-07  7:52     ` Leizhen (ThunderTown) via iommu
@ 2022-07-06 12:11       ` Will Deacon
  -1 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2022-07-06 12:11 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: mst, linuxarm, linux-kernel, virtualization, iommu,
	jean-philippe, robin.murphy, jasowang

On Thu, Apr 07, 2022 at 03:52:53PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/4 19:27, John Garry wrote:
> > Some low-level drivers may request DMA mappings whose IOVA length exceeds
> > that of the current rcache upper limit.
> > 
> > This means that allocations for those IOVAs are never cached, and must
> > always be allocated and freed via the RB tree on every DMA mapping cycle.
> > This has a significant effect on performance, more so since commit
> > 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
> > fails"), as discussed at [0].
> > 
> > As a first step towards allowing the rcache range upper limit to be
> > configured, hold this value in the IOVA rcache structure, and allocate
> > the rcaches separately.
> > 
> > Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake.
> > 
> > [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/
> > 
> > Signed-off-by: John Garry <john.garry@huawei.com>
> > ---
> >  drivers/iommu/iova.c | 20 ++++++++++----------
> >  include/linux/iova.h |  3 +++
> >  2 files changed, 13 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> > index db77aa675145..5c22b9187b79 100644
> > --- a/drivers/iommu/iova.c
> > +++ b/drivers/iommu/iova.c
> > @@ -15,8 +15,6 @@
> >  /* The anchor node sits above the top of the usable address space */
> >  #define IOVA_ANCHOR	~0UL
> >  
> > -#define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
> > -
> >  static bool iova_rcache_insert(struct iova_domain *iovad,
> >  			       unsigned long pfn,
> >  			       unsigned long size);
> > @@ -443,7 +441,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
> >  	 * rounding up anything cacheable to make sure that can't happen. The
> >  	 * order of the unadjusted size will still match upon freeing.
> >  	 */
> > -	if (size < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
> > +	if (size < (1 << (iovad->rcache_max_size - 1)))
> >  		size = roundup_pow_of_two(size);
> >  
> >  	iova_pfn = iova_rcache_get(iovad, size, limit_pfn + 1);
> > @@ -713,13 +711,15 @@ int iova_domain_init_rcaches(struct iova_domain *iovad)
> >  	unsigned int cpu;
> >  	int i, ret;
> >  
> > -	iovad->rcaches = kcalloc(IOVA_RANGE_CACHE_MAX_SIZE,
> > +	iovad->rcache_max_size = 6; /* Arbitrarily high default */
> 
> It would be better to assign this constant value to iovad->rcache_max_size
> in init_iova_domain().

I think it's fine where it is as it's a meaningless number outside of the
rcache code.

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()
  2022-07-06 12:03       ` John Garry via iommu
@ 2022-07-06 12:12         ` Will Deacon
  -1 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2022-07-06 12:12 UTC (permalink / raw)
  To: John Garry
  Cc: joro, robin.murphy, mst, jasowang, iommu, linux-kernel,
	virtualization, chenxiang66, thunder.leizhen, jean-philippe,
	linuxarm

On Wed, Jul 06, 2022 at 01:03:44PM +0100, John Garry wrote:
> On 06/07/2022 13:00, Will Deacon wrote:
> > On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:
> > > Function iommu_group_store_type() supports changing the default domain
> > > of an IOMMU group.
> > > 
> > > Many conditions need to be satisfied and steps taken for this action to be
> > > successful.
> > > 
> > > Satisfying these conditions and taking these steps will also be required
> > > when setting other IOMMU group attributes, so factor the code into a
> > > common part and a part specific to updating the IOMMU group attribute.
> > > 
> > > No functional change intended.
> > > 
> > > Some code comments are tidied up also.
> > > 
> > > Signed-off-by: John Garry <john.garry@huawei.com>
> > > ---
> > >   drivers/iommu/iommu.c | 96 ++++++++++++++++++++++++++++---------------
> > >   1 file changed, 62 insertions(+), 34 deletions(-)
> > Acked-by: Will Deacon <will@kernel.org>
> > 
> 
> Thanks, but currently I have no plans to progress this series, in favour of
> this https://lore.kernel.org/linux-iommu/1656590892-42307-1-git-send-email-john.garry@huawei.com/T/#me0e806913050c95f6e6ba2c7f7d96d51ce191204

heh, then I'll stop reviewing it then :) Shame, I quite liked it so far!

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2022-07-06 12:13 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-04 11:27 [PATCH RESEND v5 0/5] iommu: Allow IOVA rcache range be configured John Garry
2022-04-04 11:27 ` John Garry via iommu
2022-04-04 11:27 ` [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type() John Garry
2022-04-04 11:27   ` John Garry via iommu
2022-04-07  7:40   ` Leizhen (ThunderTown)
2022-04-07  7:40     ` Leizhen (ThunderTown) via iommu
2022-07-06 12:00   ` Will Deacon
2022-07-06 12:00     ` Will Deacon
2022-07-06 12:00     ` Will Deacon
2022-07-06 12:03     ` John Garry
2022-07-06 12:03       ` John Garry via iommu
2022-07-06 12:12       ` Will Deacon
2022-07-06 12:12         ` Will Deacon
2022-07-06 12:12         ` Will Deacon
2022-04-04 11:27 ` [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible John Garry
2022-04-04 11:27   ` John Garry via iommu
2022-04-07  7:52   ` Leizhen (ThunderTown)
2022-04-07  7:52     ` Leizhen (ThunderTown) via iommu
2022-07-06 12:11     ` Will Deacon
2022-07-06 12:11       ` Will Deacon
2022-07-06 12:11       ` Will Deacon
2022-07-06 12:10   ` Will Deacon
2022-07-06 12:10     ` Will Deacon
2022-07-06 12:10     ` Will Deacon
2022-04-04 11:27 ` [PATCH RESEND v5 3/5] iommu: Allow iommu_change_dev_def_domain() realloc same default domain type John Garry
2022-04-04 11:27   ` John Garry via iommu
2022-04-04 11:27 ` [PATCH RESEND v5 4/5] iommu: Allow max opt DMA len be set for a group via sysfs John Garry
2022-04-04 11:27   ` John Garry via iommu
2022-04-07  8:21   ` Leizhen (ThunderTown)
2022-04-07  8:21     ` Leizhen (ThunderTown) via iommu
2022-04-07 14:00     ` John Garry
2022-04-07 14:00       ` John Garry via iommu
2022-04-04 11:27 ` [PATCH RESEND v5 5/5] iova: Add iova_len argument to iova_domain_init_rcaches() John Garry
2022-04-04 11:27   ` John Garry via iommu
2022-04-07  8:27   ` Leizhen (ThunderTown)
2022-04-07  8:27     ` Leizhen (ThunderTown) via iommu
2022-04-07 13:52     ` John Garry
2022-04-07 13:52       ` John Garry via iommu
