* [RFC PATCH v4 00/13] iommu/smmuv3: Implement hardware dirty log tracking
From: Keqian Zhu @ 2021-05-07 10:21 UTC
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

Hi Robin, Will and everyone,

I think this series is relatively mature now. Please share your suggestions,
thanks!


This patch series is split from the series [1] that contained both the IOMMU part and
the VFIO part. The VFIO part will be sent out in another series.

[1] https://lore.kernel.org/linux-iommu/20210310090614.26668-1-zhukeqian1@huawei.com/

changelog:

v4:
 - Modify the framework as suggested by Baolu, thanks!
 - Add trace for iommu ops.
 - Extract io-pgtable part.

v3:
 - Merge start_dirty_log and stop_dirty_log into switch_dirty_log. (Yi Sun)
 - Maintain the dirty log status in iommu_domain.
 - Update commit message to make patch easier to review.

v2:
 - Address all comments on the RFC version, thanks to all of you ;-)
 - Add a bugfix that starts dirty logging for newly added DMA ranges and domains.



Hi everyone,

This patch series introduces a framework for IOMMU dirty log tracking, and SMMUv3
implements this framework. The new feature can be used by VFIO DMA dirty tracking.

Intention:

Some types of IOMMU are capable of tracking DMA dirty log, such as
ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
dirty log tracking framework in the IOMMU base layer.

Four new essential interfaces are added, and we maintain the status
of dirty log tracking in iommu_domain.
1. iommu_support_dirty_log: Check whether domain supports dirty log tracking
2. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
3. iommu_sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
4. iommu_clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
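
The intended calling order of these interfaces can be sketched in userspace C.
This is only a mock under stated assumptions: struct mock_domain and every
mock_* function are hypothetical stand-ins, not kernel code. The state checks
and error codes (-EBUSY when tracking is already on, -EINVAL when it is off)
follow what patch 01 of this series does in iommu.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the dirty-log state kept in struct iommu_domain. */
struct mock_domain {
	bool supports_dirty_log;	/* would come from the driver's support op */
	bool dirty_log_tracking;	/* same name as the field added by patch 01 */
};

static bool mock_support_dirty_log(const struct mock_domain *d)
{
	return d->supports_dirty_log;
}

/* Start or stop tracking, with the same state checks as patch 01. */
static int mock_switch_dirty_log(struct mock_domain *d, bool enable)
{
	if (!d->supports_dirty_log)
		return -ENODEV;
	if (enable && d->dirty_log_tracking)
		return -EBUSY;		/* already tracking */
	if (!enable && !d->dirty_log_tracking)
		return -EINVAL;		/* nothing to stop */
	d->dirty_log_tracking = enable;
	return 0;
}

/* Sync and clear are only valid while tracking is enabled. */
static int mock_sync_dirty_log(const struct mock_domain *d)
{
	return d->dirty_log_tracking ? 0 : -EINVAL;
}

static int mock_clear_dirty_log(const struct mock_domain *d)
{
	return d->dirty_log_tracking ? 0 : -EINVAL;
}

/* One tracking round: check support, start, sync, clear, stop. */
static void mock_demo(void)
{
	struct mock_domain d = { .supports_dirty_log = true };
	int ret;

	assert(mock_support_dirty_log(&d));
	assert(mock_sync_dirty_log(&d) == -EINVAL);	/* not tracking yet */

	ret = mock_switch_dirty_log(&d, true);		/* start */
	assert(ret == 0);
	ret = mock_switch_dirty_log(&d, true);		/* second start is refused */
	assert(ret == -EBUSY);

	assert(mock_sync_dirty_log(&d) == 0);
	assert(mock_clear_dirty_log(&d) == 0);

	ret = mock_switch_dirty_log(&d, false);		/* stop */
	assert(ret == 0);
	ret = mock_switch_dirty_log(&d, false);		/* second stop is refused */
	assert(ret == -EINVAL);
}
```

The mock leaves out the IOVA range, the bitmap arguments and the locking, so it
shows the state machine only, not the page table work.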

About SMMU HTTU:

HTTU (Hardware Translation Table Update) is a feature of ARM SMMUv3 that lets hardware
update the access flag and/or the dirty state of the TTD (Translation Table Descriptor).
With HTTU, a stage 1 TTD is classified into 3 types:
                        DBM bit             AP[2](readonly bit)
1. writable_clean         1                       1
2. writable_dirty         1                       0
3. readonly               0                       1

If HTTU_HD (manage dirty state) is enabled, the SMMU can change a TTD from writable_clean to
writable_dirty. Software can then scan the TTDs to sync the dirty state into a dirty bitmap.
With this feature, we can track the dirty log of DMA continuously and precisely.
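
The classification in the table above can be written as a small userspace C
sketch. The bit positions are an assumption taken from the Armv8 long-descriptor
format (DBM at bit 51, AP[2] at bit 7); this cover letter does not state them.
All ttd_* names are hypothetical, and the fourth enum value covers the
DBM=0/AP[2]=0 combination that the table does not list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (Armv8 long-descriptor format), not from this thread. */
#define TTD_AP2_RDONLY	(UINT64_C(1) << 7)	/* AP[2]: 1 = read-only */
#define TTD_DBM		(UINT64_C(1) << 51)	/* Dirty Bit Modifier */

enum ttd_state {
	TTD_WRITABLE_CLEAN,	/* DBM=1, AP[2]=1 */
	TTD_WRITABLE_DIRTY,	/* DBM=1, AP[2]=0 */
	TTD_READONLY,		/* DBM=0, AP[2]=1 */
	TTD_WRITABLE_UNTRACKED,	/* DBM=0, AP[2]=0: not in the table above */
};

static enum ttd_state ttd_classify(uint64_t ttd)
{
	bool dbm = (ttd & TTD_DBM) != 0;
	bool rdonly = (ttd & TTD_AP2_RDONLY) != 0;

	if (dbm)
		return rdonly ? TTD_WRITABLE_CLEAN : TTD_WRITABLE_DIRTY;
	return rdonly ? TTD_READONLY : TTD_WRITABLE_UNTRACKED;
}

/* Models the hardware update on a DMA write when HTTU_HD is enabled. */
static uint64_t ttd_hw_mark_dirty(uint64_t ttd)
{
	if (ttd_classify(ttd) == TTD_WRITABLE_CLEAN)
		ttd &= ~TTD_AP2_RDONLY;
	return ttd;
}

/* Models software clearing the dirty state after it has been synced. */
static uint64_t ttd_sw_clear_dirty(uint64_t ttd)
{
	if (ttd_classify(ttd) == TTD_WRITABLE_DIRTY)
		ttd |= TTD_AP2_RDONLY;
	return ttd;
}

static void ttd_demo(void)
{
	uint64_t ttd = TTD_DBM | TTD_AP2_RDONLY;

	assert(ttd_classify(ttd) == TTD_WRITABLE_CLEAN);
	ttd = ttd_hw_mark_dirty(ttd);
	assert(ttd_classify(ttd) == TTD_WRITABLE_DIRTY);
	ttd = ttd_sw_clear_dirty(ttd);
	assert(ttd_classify(ttd) == TTD_WRITABLE_CLEAN);

	/* A read-only TTD (DBM=0) is never turned dirty by hardware. */
	assert(ttd_hw_mark_dirty(TTD_AP2_RDONLY) == TTD_AP2_RDONLY);
}
```

A real implementation would also need atomic updates and TLB invalidation,
which this sketch ignores.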

About this series:

Patch 1-3: Introduce the dirty log tracking framework in the IOMMU base layer, and two
           common interfaces that can be used by many types of IOMMU.

Patch 4-6: Add feature detection for SMMU HTTU and enable HTTU for SMMU stage 1 mappings.
           Also add feature detection for SMMU BBML. We need to split block mappings when
           dirty log tracking starts and merge page mappings when it stops, which
           requires the break-before-make procedure. That can cause problems while the
           TTD is live, because the I/O streams might not tolerate translation faults, so
           BBML should be used.

Patch 7-12: Implement these interfaces for ARM SMMUv3.

Thanks,
Keqian

Jean-Philippe Brucker (1):
  iommu/arm-smmu-v3: Add support for Hardware Translation Table Update

Keqian Zhu (1):
  iommu: Introduce dirty log tracking framework

Kunkun Jiang (11):
  iommu/io-pgtable-arm: Add quirk ARM_HD and ARM_BBMLx
  iommu/io-pgtable-arm: Add and realize split_block ops
  iommu/io-pgtable-arm: Add and realize merge_page ops
  iommu/io-pgtable-arm: Add and realize sync_dirty_log ops
  iommu/io-pgtable-arm: Add and realize clear_dirty_log ops
  iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
  iommu/arm-smmu-v3: Add feature detection for BBML
  iommu/arm-smmu-v3: Realize switch_dirty_log iommu ops
  iommu/arm-smmu-v3: Realize sync_dirty_log iommu ops
  iommu/arm-smmu-v3: Realize clear_dirty_log iommu ops
  iommu/arm-smmu-v3: Realize support_dirty_log iommu ops

 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |   2 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 268 +++++++++++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  14 +
 drivers/iommu/io-pgtable-arm.c                | 389 +++++++++++++++++-
 drivers/iommu/iommu.c                         | 206 +++++++++-
 include/linux/io-pgtable.h                    |  23 ++
 include/linux/iommu.h                         |  65 +++
 include/trace/events/iommu.h                  |  63 +++
 8 files changed, 1026 insertions(+), 4 deletions(-)

-- 
2.19.1


* [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
From: Keqian Zhu @ 2021-05-07 10:21 UTC
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

Some types of IOMMU are capable of tracking DMA dirty log, such as
ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
dirty log tracking framework in the IOMMU base layer.

Four new essential interfaces are added, and we maintain the status
of dirty log tracking in iommu_domain.
1. iommu_support_dirty_log: Check whether domain supports dirty log tracking
2. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
3. iommu_sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
4. iommu_clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap

Note: Don't concurrently call these interfaces with other ops that
access underlying page table.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/iommu.c        | 201 +++++++++++++++++++++++++++++++++++
 include/linux/iommu.h        |  63 +++++++++++
 include/trace/events/iommu.h |  63 +++++++++++
 3 files changed, 327 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 808ab70d5df5..0d15620d1e90 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1940,6 +1940,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 	domain->type = type;
 	/* Assume all sizes by default; the driver may override this later */
 	domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
+	mutex_init(&domain->switch_log_lock);
 
 	return domain;
 }
@@ -2703,6 +2704,206 @@ int iommu_set_pgtable_quirks(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_set_pgtable_quirks);
 
+bool iommu_support_dirty_log(struct iommu_domain *domain)
+{
+	const struct iommu_ops *ops = domain->ops;
+
+	return ops->support_dirty_log && ops->support_dirty_log(domain);
+}
+EXPORT_SYMBOL_GPL(iommu_support_dirty_log);
+
+int iommu_switch_dirty_log(struct iommu_domain *domain, bool enable,
+			   unsigned long iova, size_t size, int prot)
+{
+	const struct iommu_ops *ops = domain->ops;
+	unsigned long orig_iova = iova;
+	unsigned int min_pagesz;
+	size_t orig_size = size;
+	bool flush = false;
+	int ret = 0;
+
+	if (unlikely(!ops->switch_dirty_log))
+		return -ENODEV;
+
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+	if (!IS_ALIGNED(iova | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
+		       iova, size, min_pagesz);
+		return -EINVAL;
+	}
+
+	mutex_lock(&domain->switch_log_lock);
+	if (enable && domain->dirty_log_tracking) {
+		ret = -EBUSY;
+		goto out;
+	} else if (!enable && !domain->dirty_log_tracking) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	pr_debug("switch_dirty_log %s for: iova 0x%lx size 0x%zx\n",
+		 enable ? "enable" : "disable", iova, size);
+
+	while (size) {
+		size_t pgsize = iommu_pgsize(domain, iova, size);
+
+		flush = true;
+		ret = ops->switch_dirty_log(domain, enable, iova, pgsize, prot);
+		if (ret)
+			break;
+
+		pr_debug("switch_dirty_log handled: iova 0x%lx size 0x%zx\n",
+			 iova, pgsize);
+
+		iova += pgsize;
+		size -= pgsize;
+	}
+
+	if (flush)
+		iommu_flush_iotlb_all(domain);
+
+	if (!ret) {
+		domain->dirty_log_tracking = enable;
+		trace_switch_dirty_log(orig_iova, orig_size, enable);
+	}
+out:
+	mutex_unlock(&domain->switch_log_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_switch_dirty_log);
+
+int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova,
+			 size_t size, unsigned long *bitmap,
+			 unsigned long base_iova, unsigned long bitmap_pgshift)
+{
+	const struct iommu_ops *ops = domain->ops;
+	unsigned long orig_iova = iova;
+	unsigned int min_pagesz;
+	size_t orig_size = size;
+	int ret = 0;
+
+	if (unlikely(!ops->sync_dirty_log))
+		return -ENODEV;
+
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+	if (!IS_ALIGNED(iova | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
+		       iova, size, min_pagesz);
+		return -EINVAL;
+	}
+
+	mutex_lock(&domain->switch_log_lock);
+	if (!domain->dirty_log_tracking) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	pr_debug("sync_dirty_log for: iova 0x%lx size 0x%zx\n", iova, size);
+
+	while (size) {
+		size_t pgsize = iommu_pgsize(domain, iova, size);
+
+		ret = ops->sync_dirty_log(domain, iova, pgsize,
+					  bitmap, base_iova, bitmap_pgshift);
+		if (ret)
+			break;
+
+		pr_debug("sync_dirty_log handled: iova 0x%lx size 0x%zx\n",
+			 iova, pgsize);
+
+		iova += pgsize;
+		size -= pgsize;
+	}
+
+	if (!ret)
+		trace_sync_dirty_log(orig_iova, orig_size);
+out:
+	mutex_unlock(&domain->switch_log_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_sync_dirty_log);
+
+static int __iommu_clear_dirty_log(struct iommu_domain *domain,
+				   unsigned long iova, size_t size,
+				   unsigned long *bitmap,
+				   unsigned long base_iova,
+				   unsigned long bitmap_pgshift)
+{
+	const struct iommu_ops *ops = domain->ops;
+	unsigned long orig_iova = iova;
+	size_t orig_size = size;
+	int ret = 0;
+
+	if (unlikely(!ops->clear_dirty_log))
+		return -ENODEV;
+
+	pr_debug("clear_dirty_log for: iova 0x%lx size 0x%zx\n", iova, size);
+
+	while (size) {
+		size_t pgsize = iommu_pgsize(domain, iova, size);
+
+		ret = ops->clear_dirty_log(domain, iova, pgsize, bitmap,
+					   base_iova, bitmap_pgshift);
+		if (ret)
+			break;
+
+		pr_debug("clear_dirty_log handled: iova 0x%lx size 0x%zx\n",
+			 iova, pgsize);
+
+		iova += pgsize;
+		size -= pgsize;
+	}
+
+	if (!ret)
+		trace_clear_dirty_log(orig_iova, orig_size);
+
+	return ret;
+}
+
+int iommu_clear_dirty_log(struct iommu_domain *domain,
+			  unsigned long iova, size_t size,
+			  unsigned long *bitmap, unsigned long base_iova,
+			  unsigned long bitmap_pgshift)
+{
+	unsigned long riova, rsize;
+	unsigned int min_pagesz;
+	bool flush = false;
+	int rs, re, start, end;
+	int ret = 0;
+
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+	if (!IS_ALIGNED(iova | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx min_pagesz 0x%x\n",
+		       iova, min_pagesz);
+		return -EINVAL;
+	}
+
+	mutex_lock(&domain->switch_log_lock);
+	if (!domain->dirty_log_tracking) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	start = (iova - base_iova) >> bitmap_pgshift;
+	end = start + (size >> bitmap_pgshift);
+	bitmap_for_each_set_region(bitmap, rs, re, start, end) {
+		flush = true;
+		riova = base_iova + (rs << bitmap_pgshift);
+		rsize = (re - rs) << bitmap_pgshift;
+		ret = __iommu_clear_dirty_log(domain, riova, rsize, bitmap,
+					      base_iova, bitmap_pgshift);
+		if (ret)
+			break;
+	}
+
+	if (flush)
+		iommu_flush_iotlb_all(domain);
+out:
+	mutex_unlock(&domain->switch_log_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_clear_dirty_log);
+
 void iommu_get_resv_regions(struct device *dev, struct list_head *list)
 {
 	const struct iommu_ops *ops = dev->bus->iommu_ops;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 32d448050bf7..e0e40dda974d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -87,6 +87,8 @@ struct iommu_domain {
 	void *handler_token;
 	struct iommu_domain_geometry geometry;
 	void *iova_cookie;
+	bool dirty_log_tracking;
+	struct mutex switch_log_lock;
 };
 
 enum iommu_cap {
@@ -193,6 +195,10 @@ struct iommu_iotlb_gather {
  * @device_group: find iommu group for a particular device
  * @enable_nesting: Enable nesting
  * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
+ * @support_dirty_log: Check whether domain supports dirty log tracking
+ * @switch_dirty_log: Perform actions to start|stop dirty log tracking
+ * @sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
+ * @clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
  * @get_resv_regions: Request list of reserved regions for a device
  * @put_resv_regions: Free list of reserved regions for a device
  * @apply_resv_region: Temporary helper call-back for iova reserved ranges
@@ -245,6 +251,22 @@ struct iommu_ops {
 	int (*set_pgtable_quirks)(struct iommu_domain *domain,
 				  unsigned long quirks);
 
+	/*
+	 * Track dirty log. Note: Don't concurrently call these interfaces with
+	 * other ops that access underlying page table.
+	 */
+	bool (*support_dirty_log)(struct iommu_domain *domain);
+	int (*switch_dirty_log)(struct iommu_domain *domain, bool enable,
+				unsigned long iova, size_t size, int prot);
+	int (*sync_dirty_log)(struct iommu_domain *domain,
+			      unsigned long iova, size_t size,
+			      unsigned long *bitmap, unsigned long base_iova,
+			      unsigned long bitmap_pgshift);
+	int (*clear_dirty_log)(struct iommu_domain *domain,
+			       unsigned long iova, size_t size,
+			       unsigned long *bitmap, unsigned long base_iova,
+			       unsigned long bitmap_pgshift);
+
 	/* Request/Free a list of reserved regions for a device */
 	void (*get_resv_regions)(struct device *dev, struct list_head *list);
 	void (*put_resv_regions)(struct device *dev, struct list_head *list);
@@ -475,6 +497,17 @@ extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
 int iommu_enable_nesting(struct iommu_domain *domain);
 int iommu_set_pgtable_quirks(struct iommu_domain *domain,
 		unsigned long quirks);
+extern bool iommu_support_dirty_log(struct iommu_domain *domain);
+extern int iommu_switch_dirty_log(struct iommu_domain *domain, bool enable,
+				  unsigned long iova, size_t size, int prot);
+extern int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova,
+				size_t size, unsigned long *bitmap,
+				unsigned long base_iova,
+				unsigned long bitmap_pgshift);
+extern int iommu_clear_dirty_log(struct iommu_domain *domain, unsigned long iova,
+				 size_t dma_size, unsigned long *bitmap,
+				 unsigned long base_iova,
+				 unsigned long bitmap_pgshift);
 
 void iommu_set_dma_strict(bool val);
 bool iommu_get_dma_strict(struct iommu_domain *domain);
@@ -848,6 +881,36 @@ static inline int iommu_set_pgtable_quirks(struct iommu_domain *domain,
 	return 0;
 }
 
+static inline bool iommu_support_dirty_log(struct iommu_domain *domain)
+{
+	return false;
+}
+
+static inline int iommu_switch_dirty_log(struct iommu_domain *domain,
+					 bool enable, unsigned long iova,
+					 size_t size, int prot)
+{
+	return -EINVAL;
+}
+
+static inline int iommu_sync_dirty_log(struct iommu_domain *domain,
+				       unsigned long iova, size_t size,
+				       unsigned long *bitmap,
+				       unsigned long base_iova,
+				       unsigned long pgshift)
+{
+	return -EINVAL;
+}
+
+static inline int iommu_clear_dirty_log(struct iommu_domain *domain,
+					unsigned long iova, size_t size,
+					unsigned long *bitmap,
+					unsigned long base_iova,
+					unsigned long pgshift)
+{
+	return -EINVAL;
+}
+
 static inline int iommu_device_register(struct iommu_device *iommu,
 					const struct iommu_ops *ops,
 					struct device *hwdev)
diff --git a/include/trace/events/iommu.h b/include/trace/events/iommu.h
index 72b4582322ff..6436d693d357 100644
--- a/include/trace/events/iommu.h
+++ b/include/trace/events/iommu.h
@@ -129,6 +129,69 @@ TRACE_EVENT(unmap,
 	)
 );
 
+TRACE_EVENT(switch_dirty_log,
+
+	TP_PROTO(unsigned long iova, size_t size, bool enable),
+
+	TP_ARGS(iova, size, enable),
+
+	TP_STRUCT__entry(
+		__field(u64, iova)
+		__field(size_t, size)
+		__field(bool, enable)
+	),
+
+	TP_fast_assign(
+		__entry->iova = iova;
+		__entry->size = size;
+		__entry->enable = enable;
+	),
+
+	TP_printk("IOMMU: iova=0x%016llx size=%zu enable=%u",
+			__entry->iova, __entry->size, __entry->enable
+	)
+);
+
+TRACE_EVENT(sync_dirty_log,
+
+	TP_PROTO(unsigned long iova, size_t size),
+
+	TP_ARGS(iova, size),
+
+	TP_STRUCT__entry(
+		__field(u64, iova)
+		__field(size_t, size)
+	),
+
+	TP_fast_assign(
+		__entry->iova = iova;
+		__entry->size = size;
+	),
+
+	TP_printk("IOMMU: iova=0x%016llx size=%zu", __entry->iova,
+			__entry->size)
+);
+
+TRACE_EVENT(clear_dirty_log,
+
+	TP_PROTO(unsigned long iova, size_t size),
+
+	TP_ARGS(iova, size),
+
+	TP_STRUCT__entry(
+		__field(u64, iova)
+		__field(size_t, size)
+	),
+
+	TP_fast_assign(
+		__entry->iova = iova;
+		__entry->size = size;
+	),
+
+	TP_printk("IOMMU: iova=0x%016llx size=%zu", __entry->iova,
+			__entry->size)
+);
+
 DECLARE_EVENT_CLASS(iommu_error,
 
 	TP_PROTO(struct device *dev, unsigned long iova, int flags),
-- 
2.19.1
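
Two pieces of arithmetic in the patch above may be easier to follow in
isolation: the alignment check against the smallest supported page size, and
the conversion of a run of set bits in the mask bitmap into an IOVA range, as
iommu_clear_dirty_log() does. The following is a userspace sketch, not kernel
code. The helper names and struct region are hypothetical, and it uses a plain
uint64_t array where the kernel uses its bitmap API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest supported page size: the lowest set bit of pgsize_bitmap. */
static uint64_t min_pagesz(uint64_t pgsize_bitmap)
{
	return pgsize_bitmap & (~pgsize_bitmap + 1);
}

/* Same idea as IS_ALIGNED(iova | size, min_pagesz) in the patch. */
static bool range_is_aligned(uint64_t iova, uint64_t size,
			     uint64_t pgsize_bitmap)
{
	return ((iova | size) & (min_pagesz(pgsize_bitmap) - 1)) == 0;
}

struct region {
	uint64_t iova;
	uint64_t size;
};

static bool bit_is_set(const uint64_t *bitmap, size_t bit)
{
	return (bitmap[bit / 64] >> (bit % 64)) & 1;
}

/*
 * Find the next run of set bits in [*pos, end) and convert it to an IOVA
 * range. Returns false when no set bit is left. Bit n of the bitmap stands
 * for the page at base_iova + (n << pgshift).
 */
static bool next_set_region(const uint64_t *bitmap, size_t *pos, size_t end,
			    uint64_t base_iova, unsigned int pgshift,
			    struct region *out)
{
	size_t rs = *pos, re;

	while (rs < end && !bit_is_set(bitmap, rs))
		rs++;
	if (rs >= end)
		return false;

	re = rs;
	while (re < end && bit_is_set(bitmap, re))
		re++;

	/* Widen before shifting so that large bit indexes are not truncated. */
	out->iova = base_iova + ((uint64_t)rs << pgshift);
	out->size = (uint64_t)(re - rs) << pgshift;
	*pos = re;
	return true;
}

static void region_demo(void)
{
	const uint64_t bitmap[1] = { 0x6c };	/* bits 2, 3, 5 and 6 set */
	struct region r;
	size_t pos = 0;
	bool found;

	found = next_set_region(bitmap, &pos, 8, 0x10000, 12, &r);
	assert(found && r.iova == 0x12000 && r.size == 0x2000);
	found = next_set_region(bitmap, &pos, 8, 0x10000, 12, &r);
	assert(found && r.iova == 0x15000 && r.size == 0x2000);
	found = next_set_region(bitmap, &pos, 8, 0x10000, 12, &r);
	assert(!found);
}
```

Each region found this way corresponds to one call into the driver for a
contiguous dirty range, followed by a single IOTLB flush once the loop ends.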



* [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
@ 2021-05-07 10:21   ` Keqian Zhu
  0 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:21 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: jiangkunkun, Cornelia Huck, Kirti Wankhede, lushenming,
	Alex Williamson, wanghaibin.wang

Some types of IOMMU are capable of tracking DMA dirty log, such as
ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
dirty log tracking framework in the IOMMU base layer.

Four new essential interfaces are added, and we maintain the status
of dirty log tracking in iommu_domain.
1. iommu_support_dirty_log: Check whether domain supports dirty log tracking
2. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
3. iommu_sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
4. iommu_clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap

Note: Don't concurrently call these interfaces with other ops that
access underlying page table.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/iommu.c        | 201 +++++++++++++++++++++++++++++++++++
 include/linux/iommu.h        |  63 +++++++++++
 include/trace/events/iommu.h |  63 +++++++++++
 3 files changed, 327 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 808ab70d5df5..0d15620d1e90 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1940,6 +1940,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 	domain->type = type;
 	/* Assume all sizes by default; the driver may override this later */
 	domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
+	mutex_init(&domain->switch_log_lock);
 
 	return domain;
 }
@@ -2703,6 +2704,206 @@ int iommu_set_pgtable_quirks(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_set_pgtable_quirks);
 
+bool iommu_support_dirty_log(struct iommu_domain *domain)
+{
+	const struct iommu_ops *ops = domain->ops;
+
+	return ops->support_dirty_log && ops->support_dirty_log(domain);
+}
+EXPORT_SYMBOL_GPL(iommu_support_dirty_log);
+
+int iommu_switch_dirty_log(struct iommu_domain *domain, bool enable,
+			   unsigned long iova, size_t size, int prot)
+{
+	const struct iommu_ops *ops = domain->ops;
+	unsigned long orig_iova = iova;
+	unsigned int min_pagesz;
+	size_t orig_size = size;
+	bool flush = false;
+	int ret = 0;
+
+	if (unlikely(!ops->switch_dirty_log))
+		return -ENODEV;
+
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+	if (!IS_ALIGNED(iova | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
+		       iova, size, min_pagesz);
+		return -EINVAL;
+	}
+
+	mutex_lock(&domain->switch_log_lock);
+	if (enable && domain->dirty_log_tracking) {
+		ret = -EBUSY;
+		goto out;
+	} else if (!enable && !domain->dirty_log_tracking) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	pr_debug("switch_dirty_log %s for: iova 0x%lx size 0x%zx\n",
+		 enable ? "enable" : "disable", iova, size);
+
+	while (size) {
+		size_t pgsize = iommu_pgsize(domain, iova, size);
+
+		flush = true;
+		ret = ops->switch_dirty_log(domain, enable, iova, pgsize, prot);
+		if (ret)
+			break;
+
+		pr_debug("switch_dirty_log handled: iova 0x%lx size 0x%zx\n",
+			 iova, pgsize);
+
+		iova += pgsize;
+		size -= pgsize;
+	}
+
+	if (flush)
+		iommu_flush_iotlb_all(domain);
+
+	if (!ret) {
+		domain->dirty_log_tracking = enable;
+		trace_switch_dirty_log(orig_iova, orig_size, enable);
+	}
+out:
+	mutex_unlock(&domain->switch_log_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_switch_dirty_log);
+
+int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova,
+			 size_t size, unsigned long *bitmap,
+			 unsigned long base_iova, unsigned long bitmap_pgshift)
+{
+	const struct iommu_ops *ops = domain->ops;
+	unsigned long orig_iova = iova;
+	unsigned int min_pagesz;
+	size_t orig_size = size;
+	int ret = 0;
+
+	if (unlikely(!ops->sync_dirty_log))
+		return -ENODEV;
+
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+	if (!IS_ALIGNED(iova | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
+		       iova, size, min_pagesz);
+		return -EINVAL;
+	}
+
+	mutex_lock(&domain->switch_log_lock);
+	if (!domain->dirty_log_tracking) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	pr_debug("sync_dirty_log for: iova 0x%lx size 0x%zx\n", iova, size);
+
+	while (size) {
+		size_t pgsize = iommu_pgsize(domain, iova, size);
+
+		ret = ops->sync_dirty_log(domain, iova, pgsize,
+					  bitmap, base_iova, bitmap_pgshift);
+		if (ret)
+			break;
+
+		pr_debug("sync_dirty_log handled: iova 0x%lx size 0x%zx\n",
+			 iova, pgsize);
+
+		iova += pgsize;
+		size -= pgsize;
+	}
+
+	if (!ret)
+		trace_sync_dirty_log(orig_iova, orig_size);
+out:
+	mutex_unlock(&domain->switch_log_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_sync_dirty_log);
+
+static int __iommu_clear_dirty_log(struct iommu_domain *domain,
+				   unsigned long iova, size_t size,
+				   unsigned long *bitmap,
+				   unsigned long base_iova,
+				   unsigned long bitmap_pgshift)
+{
+	const struct iommu_ops *ops = domain->ops;
+	unsigned long orig_iova = iova;
+	size_t orig_size = size;
+	int ret = 0;
+
+	if (unlikely(!ops->clear_dirty_log))
+		return -ENODEV;
+
+	pr_debug("clear_dirty_log for: iova 0x%lx size 0x%zx\n", iova, size);
+
+	while (size) {
+		size_t pgsize = iommu_pgsize(domain, iova, size);
+
+		ret = ops->clear_dirty_log(domain, iova, pgsize, bitmap,
+					   base_iova, bitmap_pgshift);
+		if (ret)
+			break;
+
+		pr_debug("clear_dirty_log handled: iova 0x%lx size 0x%zx\n",
+			 iova, pgsize);
+
+		iova += pgsize;
+		size -= pgsize;
+	}
+
+	if (!ret)
+		trace_clear_dirty_log(orig_iova, orig_size);
+
+	return ret;
+}
+
+int iommu_clear_dirty_log(struct iommu_domain *domain,
+			  unsigned long iova, size_t size,
+			  unsigned long *bitmap, unsigned long base_iova,
+			  unsigned long bitmap_pgshift)
+{
+	unsigned long riova, rsize;
+	unsigned int min_pagesz;
+	bool flush = false;
+	int rs, re, start, end;
+	int ret = 0;
+
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+	if (!IS_ALIGNED(iova | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
+		       iova, size, min_pagesz);
+		return -EINVAL;
+	}
+
+	mutex_lock(&domain->switch_log_lock);
+	if (!domain->dirty_log_tracking) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	start = (iova - base_iova) >> bitmap_pgshift;
+	end = start + (size >> bitmap_pgshift);
+	bitmap_for_each_set_region(bitmap, rs, re, start, end) {
+		flush = true;
+		riova = base_iova + ((unsigned long)rs << bitmap_pgshift);
+		rsize = (unsigned long)(re - rs) << bitmap_pgshift;
+		ret = __iommu_clear_dirty_log(domain, riova, rsize, bitmap,
+					      base_iova, bitmap_pgshift);
+		if (ret)
+			break;
+	}
+
+	if (flush)
+		iommu_flush_iotlb_all(domain);
+out:
+	mutex_unlock(&domain->switch_log_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_clear_dirty_log);
+
 void iommu_get_resv_regions(struct device *dev, struct list_head *list)
 {
 	const struct iommu_ops *ops = dev->bus->iommu_ops;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 32d448050bf7..e0e40dda974d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -87,6 +87,8 @@ struct iommu_domain {
 	void *handler_token;
 	struct iommu_domain_geometry geometry;
 	void *iova_cookie;
+	bool dirty_log_tracking;
+	struct mutex switch_log_lock;
 };
 
 enum iommu_cap {
@@ -193,6 +195,10 @@ struct iommu_iotlb_gather {
  * @device_group: find iommu group for a particular device
  * @enable_nesting: Enable nesting
  * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
+ * @support_dirty_log: Check whether domain supports dirty log tracking
+ * @switch_dirty_log: Perform actions to start|stop dirty log tracking
+ * @sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
+ * @clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
  * @get_resv_regions: Request list of reserved regions for a device
  * @put_resv_regions: Free list of reserved regions for a device
  * @apply_resv_region: Temporary helper call-back for iova reserved ranges
@@ -245,6 +251,22 @@ struct iommu_ops {
 	int (*set_pgtable_quirks)(struct iommu_domain *domain,
 				  unsigned long quirks);
 
+	/*
+	 * Track dirty log. Note: Don't concurrently call these interfaces with
+	 * other ops that access underlying page table.
+	 */
+	bool (*support_dirty_log)(struct iommu_domain *domain);
+	int (*switch_dirty_log)(struct iommu_domain *domain, bool enable,
+				unsigned long iova, size_t size, int prot);
+	int (*sync_dirty_log)(struct iommu_domain *domain,
+			      unsigned long iova, size_t size,
+			      unsigned long *bitmap, unsigned long base_iova,
+			      unsigned long bitmap_pgshift);
+	int (*clear_dirty_log)(struct iommu_domain *domain,
+			       unsigned long iova, size_t size,
+			       unsigned long *bitmap, unsigned long base_iova,
+			       unsigned long bitmap_pgshift);
+
 	/* Request/Free a list of reserved regions for a device */
 	void (*get_resv_regions)(struct device *dev, struct list_head *list);
 	void (*put_resv_regions)(struct device *dev, struct list_head *list);
@@ -475,6 +497,17 @@ extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
 int iommu_enable_nesting(struct iommu_domain *domain);
 int iommu_set_pgtable_quirks(struct iommu_domain *domain,
 		unsigned long quirks);
+extern bool iommu_support_dirty_log(struct iommu_domain *domain);
+extern int iommu_switch_dirty_log(struct iommu_domain *domain, bool enable,
+				  unsigned long iova, size_t size, int prot);
+extern int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova,
+				size_t size, unsigned long *bitmap,
+				unsigned long base_iova,
+				unsigned long bitmap_pgshift);
+extern int iommu_clear_dirty_log(struct iommu_domain *domain, unsigned long iova,
+				 size_t size, unsigned long *bitmap,
+				 unsigned long base_iova,
+				 unsigned long bitmap_pgshift);
 
 void iommu_set_dma_strict(bool val);
 bool iommu_get_dma_strict(struct iommu_domain *domain);
@@ -848,6 +881,36 @@ static inline int iommu_set_pgtable_quirks(struct iommu_domain *domain,
 	return 0;
 }
 
+static inline bool iommu_support_dirty_log(struct iommu_domain *domain)
+{
+	return false;
+}
+
+static inline int iommu_switch_dirty_log(struct iommu_domain *domain,
+					 bool enable, unsigned long iova,
+					 size_t size, int prot)
+{
+	return -EINVAL;
+}
+
+static inline int iommu_sync_dirty_log(struct iommu_domain *domain,
+				       unsigned long iova, size_t size,
+				       unsigned long *bitmap,
+				       unsigned long base_iova,
+				       unsigned long pgshift)
+{
+	return -EINVAL;
+}
+
+static inline int iommu_clear_dirty_log(struct iommu_domain *domain,
+					unsigned long iova, size_t size,
+					unsigned long *bitmap,
+					unsigned long base_iova,
+					unsigned long pgshift)
+{
+	return -EINVAL;
+}
+
 static inline int iommu_device_register(struct iommu_device *iommu,
 					const struct iommu_ops *ops,
 					struct device *hwdev)
diff --git a/include/trace/events/iommu.h b/include/trace/events/iommu.h
index 72b4582322ff..6436d693d357 100644
--- a/include/trace/events/iommu.h
+++ b/include/trace/events/iommu.h
@@ -129,6 +129,69 @@ TRACE_EVENT(unmap,
 	)
 );
 
+TRACE_EVENT(switch_dirty_log,
+
+	TP_PROTO(unsigned long iova, size_t size, bool enable),
+
+	TP_ARGS(iova, size, enable),
+
+	TP_STRUCT__entry(
+		__field(u64, iova)
+		__field(size_t, size)
+		__field(bool, enable)
+	),
+
+	TP_fast_assign(
+		__entry->iova = iova;
+		__entry->size = size;
+		__entry->enable = enable;
+	),
+
+	TP_printk("IOMMU: iova=0x%016llx size=%zu enable=%u",
+			__entry->iova, __entry->size, __entry->enable
+	)
+);
+
+TRACE_EVENT(sync_dirty_log,
+
+	TP_PROTO(unsigned long iova, size_t size),
+
+	TP_ARGS(iova, size),
+
+	TP_STRUCT__entry(
+		__field(u64, iova)
+		__field(size_t, size)
+	),
+
+	TP_fast_assign(
+		__entry->iova = iova;
+		__entry->size = size;
+	),
+
+	TP_printk("IOMMU: iova=0x%016llx size=%zu", __entry->iova,
+			__entry->size)
+);
+
+TRACE_EVENT(clear_dirty_log,
+
+	TP_PROTO(unsigned long iova, size_t size),
+
+	TP_ARGS(iova, size),
+
+	TP_STRUCT__entry(
+		__field(u64, iova)
+		__field(size_t, size)
+	),
+
+	TP_fast_assign(
+		__entry->iova = iova;
+		__entry->size = size;
+	),
+
+	TP_printk("IOMMU: iova=0x%016llx size=%zu", __entry->iova,
+			__entry->size)
+);
+
 DECLARE_EVENT_CLASS(iommu_error,
 
 	TP_PROTO(struct device *dev, unsigned long iova, int flags),
-- 
2.19.1
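
For readers following the alignment check at the top of
iommu_switch_dirty_log(), iommu_sync_dirty_log() and
iommu_clear_dirty_log() in the patch above, here is a minimal userspace
sketch of the same arithmetic. It is not part of the patch: the helper
names min_pagesz_of() and range_is_aligned() are made up for
illustration, and __builtin_ctzl() (GCC/Clang) stands in for the
kernel's __ffs().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Smallest page size advertised in a pgsize bitmap, i.e. its lowest
 * set bit. Hypothetical helper; the bitmap must be non-zero.
 */
static unsigned long min_pagesz_of(unsigned long pgsize_bitmap)
{
	return 1UL << __builtin_ctzl(pgsize_bitmap);
}

/* True if both iova and size are multiples of the smallest page size. */
static bool range_is_aligned(unsigned long iova, size_t size,
			     unsigned long pgsize_bitmap)
{
	unsigned long min_pagesz = min_pagesz_of(pgsize_bitmap);

	/* OR-ing the two values lets one mask test cover both. */
	return ((iova | size) & (min_pagesz - 1)) == 0;
}
```

With a bitmap advertising 4K, 2M and 1G pages (0x40201000), the
smallest page size is 0x1000, so a range is accepted only when both its
start and its length are 4K multiples.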

^ permalink raw reply	[flat|nested] 81+ messages in thread
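
iommu_clear_dirty_log() in the patch above walks each run of set bits
in the caller's mask bitmap and converts it back into an IOVA range
before calling the driver. The sketch below shows that conversion in
plain userspace C. It is only an illustration under stated assumptions:
struct iova_region, test_bit_ul() and next_dirty_region() are
hypothetical names, and the simple bit-by-bit scan stands in for the
kernel's bitmap_for_each_set_region().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical result type: one contiguous dirty IOVA range. */
struct iova_region {
	unsigned long iova;
	unsigned long size;
};

static bool test_bit_ul(const unsigned long *bitmap, unsigned long nr)
{
	return (bitmap[nr / BITS_PER_ULONG] >> (nr % BITS_PER_ULONG)) & 1UL;
}

/*
 * Find the next run of set bits in [*pos, end). On success, fill *out
 * with the matching IOVA range, advance *pos past the run and return
 * true. Return false when no set bit is left.
 */
static bool next_dirty_region(const unsigned long *bitmap,
			      unsigned long *pos, unsigned long end,
			      unsigned long base_iova, unsigned int pgshift,
			      struct iova_region *out)
{
	unsigned long rs = *pos, re;

	while (rs < end && !test_bit_ul(bitmap, rs))
		rs++;
	if (rs >= end)
		return false;

	re = rs;
	while (re < end && test_bit_ul(bitmap, re))
		re++;

	/* Bit index n covers [base_iova + (n << pgshift), +page size). */
	out->iova = base_iova + (rs << pgshift);
	out->size = (re - rs) << pgshift;
	*pos = re;
	return true;
}
```

The indices are kept as unsigned long here so that the shift cannot
overflow when the bitmap describes a range of 2G or more.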

* [RFC PATCH v4 02/13] iommu/io-pgtable-arm: Add quirk ARM_HD and ARM_BBMLx
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

These features are essential to support dirty log tracking for
SMMU with io-pgtable mapping.

The dirty state information is encoded using the access permission
bits AP[2] (stage 1) or S2AP[1] (stage 2) in conjunction with the
DBM (Dirty Bit Modifier) bit, where DBM means writable and AP[2]/
S2AP[1] means dirty.

When the ARM_HD quirk is set, we set the DBM bit for writable stage 1
mappings. As SMMU nested mode is not upstream yet, we only aim to
support dirty log tracking for stage 1 with io-pgtable mapping (that
is, SVA is not supported).

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/io-pgtable-arm.c |  7 ++++++-
 include/linux/io-pgtable.h     | 11 +++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58e79b5..94d790b8ed27 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -72,6 +72,7 @@
 
 #define ARM_LPAE_PTE_NSTABLE		(((arm_lpae_iopte)1) << 63)
 #define ARM_LPAE_PTE_XN			(((arm_lpae_iopte)3) << 53)
+#define ARM_LPAE_PTE_DBM		(((arm_lpae_iopte)1) << 51)
 #define ARM_LPAE_PTE_AF			(((arm_lpae_iopte)1) << 10)
 #define ARM_LPAE_PTE_SH_NS		(((arm_lpae_iopte)0) << 8)
 #define ARM_LPAE_PTE_SH_OS		(((arm_lpae_iopte)2) << 8)
@@ -81,7 +82,7 @@
 
 #define ARM_LPAE_PTE_ATTR_LO_MASK	(((arm_lpae_iopte)0x3ff) << 2)
 /* Ignore the contiguous bit for block splitting */
-#define ARM_LPAE_PTE_ATTR_HI_MASK	(((arm_lpae_iopte)6) << 52)
+#define ARM_LPAE_PTE_ATTR_HI_MASK	(((arm_lpae_iopte)13) << 51)
 #define ARM_LPAE_PTE_ATTR_MASK		(ARM_LPAE_PTE_ATTR_LO_MASK |	\
 					 ARM_LPAE_PTE_ATTR_HI_MASK)
 /* Software bit for solving coherency races */
@@ -379,6 +380,7 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
 static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 					   int prot)
 {
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 	arm_lpae_iopte pte;
 
 	if (data->iop.fmt == ARM_64_LPAE_S1 ||
@@ -386,6 +388,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 		pte = ARM_LPAE_PTE_nG;
 		if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
 			pte |= ARM_LPAE_PTE_AP_RDONLY;
+		else if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_HD)
+			pte |= ARM_LPAE_PTE_DBM;
+
 		if (!(prot & IOMMU_PRIV))
 			pte |= ARM_LPAE_PTE_AP_UNPRIV;
 	} else {
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 4d40dfa75b55..92274705b772 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -82,6 +82,14 @@ struct io_pgtable_cfg {
 	 *
 	 * IO_PGTABLE_QUIRK_ARM_OUTER_WBWA: Override the outer-cacheability
 	 *	attributes set in the TCR for a non-coherent page-table walker.
+	 *
+	 * IO_PGTABLE_QUIRK_ARM_HD: Support hardware management of dirty status.
+	 *
+	 * IO_PGTABLE_QUIRK_ARM_BBML1: ARM SMMU supports BBM Level 1 behavior
+	 *	when changing block size.
+	 *
+	 * IO_PGTABLE_QUIRK_ARM_BBML2: ARM SMMU supports BBM Level 2 behavior
+	 *	when changing block size.
 	 */
 	#define IO_PGTABLE_QUIRK_ARM_NS		BIT(0)
 	#define IO_PGTABLE_QUIRK_NO_PERMS	BIT(1)
@@ -89,6 +97,9 @@ struct io_pgtable_cfg {
 	#define IO_PGTABLE_QUIRK_NON_STRICT	BIT(4)
 	#define IO_PGTABLE_QUIRK_ARM_TTBR1	BIT(5)
 	#define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA	BIT(6)
+	#define IO_PGTABLE_QUIRK_ARM_HD		BIT(7)
+	#define IO_PGTABLE_QUIRK_ARM_BBML1	BIT(8)
+	#define IO_PGTABLE_QUIRK_ARM_BBML2	BIT(9)
 	unsigned long			quirks;
 	unsigned long			pgsize_bitmap;
 	unsigned int			ias;
-- 
2.19.1


^ permalink raw reply	[flat|nested] 81+ messages in thread
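
The change of ARM_LPAE_PTE_ATTR_HI_MASK from 6 << 52 to 13 << 51 in the
patch above is easier to check when written out bit by bit. 13 is
0b1101, so the new mask covers bit 51 (DBM) plus bits 53 and 54 (the XN
pair the old mask already covered), and still skips bit 52, the
contiguous bit. The macro and function names below are illustrative
only.

```c
#include <assert.h>
#include <stdint.h>

#define BIT64(n)		(UINT64_C(1) << (n))

#define OLD_ATTR_HI_MASK	(UINT64_C(6) << 52)	/* before the patch */
#define NEW_ATTR_HI_MASK	(UINT64_C(13) << 51)	/* after the patch */

/* Bits that the patch adds to (or removes from) the high attribute mask. */
static uint64_t attr_hi_mask_delta(void)
{
	return OLD_ATTR_HI_MASK ^ NEW_ATTR_HI_MASK;
}
```

The only difference between the two masks is bit 51, which is what lets
the DBM bit survive when a block entry is split.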

* [RFC PATCH v4 02/13] iommu/io-pgtable-arm: Add quirk ARM_HD and ARM_BBMLx
@ 2021-05-07 10:22   ` Keqian Zhu
  0 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

These features are essential to support dirty log tracking for
SMMU with io-pgtable mapping.

The dirty state information is encoded using the access permission
bits AP[2] (stage 1) or S2AP[1] (stage 2) in conjunction with the
DBM (Dirty Bit Modifier) bit, where DBM means writable and AP[2]/
S2AP[1] means dirty.

When the ARM_HD quirk is set, we set the DBM bit for writable stage 1
mappings. As SMMU nested mode is not upstream yet, we only aim to
support dirty log tracking for stage 1 with io-pgtable mapping (that
is, SVA is not supported).

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/io-pgtable-arm.c |  7 ++++++-
 include/linux/io-pgtable.h     | 11 +++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58e79b5..94d790b8ed27 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -72,6 +72,7 @@
 
 #define ARM_LPAE_PTE_NSTABLE		(((arm_lpae_iopte)1) << 63)
 #define ARM_LPAE_PTE_XN			(((arm_lpae_iopte)3) << 53)
+#define ARM_LPAE_PTE_DBM		(((arm_lpae_iopte)1) << 51)
 #define ARM_LPAE_PTE_AF			(((arm_lpae_iopte)1) << 10)
 #define ARM_LPAE_PTE_SH_NS		(((arm_lpae_iopte)0) << 8)
 #define ARM_LPAE_PTE_SH_OS		(((arm_lpae_iopte)2) << 8)
@@ -81,7 +82,7 @@
 
 #define ARM_LPAE_PTE_ATTR_LO_MASK	(((arm_lpae_iopte)0x3ff) << 2)
 /* Ignore the contiguous bit for block splitting */
-#define ARM_LPAE_PTE_ATTR_HI_MASK	(((arm_lpae_iopte)6) << 52)
+#define ARM_LPAE_PTE_ATTR_HI_MASK	(((arm_lpae_iopte)13) << 51)
 #define ARM_LPAE_PTE_ATTR_MASK		(ARM_LPAE_PTE_ATTR_LO_MASK |	\
 					 ARM_LPAE_PTE_ATTR_HI_MASK)
 /* Software bit for solving coherency races */
@@ -379,6 +380,7 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
 static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 					   int prot)
 {
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 	arm_lpae_iopte pte;
 
 	if (data->iop.fmt == ARM_64_LPAE_S1 ||
@@ -386,6 +388,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 		pte = ARM_LPAE_PTE_nG;
 		if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
 			pte |= ARM_LPAE_PTE_AP_RDONLY;
+		else if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_HD)
+			pte |= ARM_LPAE_PTE_DBM;
+
 		if (!(prot & IOMMU_PRIV))
 			pte |= ARM_LPAE_PTE_AP_UNPRIV;
 	} else {
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 4d40dfa75b55..92274705b772 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -82,6 +82,14 @@ struct io_pgtable_cfg {
 	 *
 	 * IO_PGTABLE_QUIRK_ARM_OUTER_WBWA: Override the outer-cacheability
 	 *	attributes set in the TCR for a non-coherent page-table walker.
+	 *
+	 * IO_PGTABLE_QUIRK_ARM_HD: Support hardware management of dirty status.
+	 *
+	 * IO_PGTABLE_QUIRK_ARM_BBML1: ARM SMMU supports BBM Level 1 behavior
+	 *	when changing block size.
+	 *
+	 * IO_PGTABLE_QUIRK_ARM_BBML2: ARM SMMU supports BBM Level 2 behavior
+	 *	when changing block size.
 	 */
 	#define IO_PGTABLE_QUIRK_ARM_NS		BIT(0)
 	#define IO_PGTABLE_QUIRK_NO_PERMS	BIT(1)
@@ -89,6 +97,9 @@ struct io_pgtable_cfg {
 	#define IO_PGTABLE_QUIRK_NON_STRICT	BIT(4)
 	#define IO_PGTABLE_QUIRK_ARM_TTBR1	BIT(5)
 	#define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA	BIT(6)
+	#define IO_PGTABLE_QUIRK_ARM_HD		BIT(7)
+	#define IO_PGTABLE_QUIRK_ARM_BBML1	BIT(8)
+	#define IO_PGTABLE_QUIRK_ARM_BBML2	BIT(9)
 	unsigned long			quirks;
 	unsigned long			pgsize_bitmap;
 	unsigned int			ias;
-- 
2.19.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 03/13] iommu/io-pgtable-arm: Add and realize split_block ops
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

A block (large page) mapping is not a suitable granule for dirty log
tracking. As an extreme example, if DMA writes one byte under a 1G
mapping, the dirty amount reported is 1G, whereas under a 4K mapping it
is just 4K.
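
The arithmetic behind this example can be sketched as follows. The function
name dirty_bytes_reported is hypothetical and is not part of the patch. It
only assumes that dirty state is kept per mapping and that the mapping size
is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes that have to be reported dirty when a DMA write touches
 * [addr, addr + len) and dirty state is kept per mapping of size map_sz.
 */
uint64_t dirty_bytes_reported(uint64_t addr, uint64_t len, uint64_t map_sz)
{
	uint64_t first, last;

	assert(len > 0);
	assert(map_sz && !(map_sz & (map_sz - 1)));	/* power of two */

	first = addr & ~(map_sz - 1);			/* mapping holding the first byte */
	last = (addr + len - 1) & ~(map_sz - 1);	/* mapping holding the last byte */
	return last - first + map_sz;
}
```

For a one-byte write, the result is always exactly one mapping size, which is
why the same write costs 1G of reported dirty memory under a 1G block and 4K
under a 4K page.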

This splits a block descriptor into a span of page descriptors. The
BBML1 or BBML2 feature is required.
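
A simplified userspace model of one split step is sketched below. It assumes
a 4K granule (512 entries per table) and a 48-bit output address, and it
ignores the descriptor type bits that the driver's __arm_lpae_init_pte()
sets. The names iopte_t, ADDR_MASK and split_block_model are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t iopte_t;	/* stand-in for arm_lpae_iopte */

#define GRANULE_SZ	4096u					/* assumed 4K granule */
#define PTES_PER_TABLE	(GRANULE_SZ / sizeof(iopte_t))		/* 512 entries */
#define ADDR_MASK	0x0000fffffffff000ULL			/* output address bits [47:12] */

/*
 * Fill `table` with child entries that together cover the same physical
 * range and carry the same attribute bits as the block entry `blk`.
 * `child_sz` is the size mapped by one child entry, for example 4K when
 * a 2M block is split.
 */
void split_block_model(iopte_t blk, uint64_t child_sz,
		       iopte_t table[PTES_PER_TABLE])
{
	uint64_t paddr = blk & ADDR_MASK;
	iopte_t attrs = blk & ~ADDR_MASK;
	size_t i;

	for (i = 0; i < PTES_PER_TABLE; i++, paddr += child_sz)
		table[i] = paddr | attrs;
}
```

In the patch the new table is fully populated before it is installed in the
parent entry, so a walker sees either the old block or the complete table.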

Block splitting is designed to be used only by dirty log tracking, which
does not run concurrently with other pgtable ops that access the
underlying page table, so no race condition exists.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/io-pgtable-arm.c | 122 +++++++++++++++++++++++++++++++++
 include/linux/io-pgtable.h     |   2 +
 2 files changed, 124 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 94d790b8ed27..664a9548b199 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -79,6 +79,8 @@
 #define ARM_LPAE_PTE_SH_IS		(((arm_lpae_iopte)3) << 8)
 #define ARM_LPAE_PTE_NS			(((arm_lpae_iopte)1) << 5)
 #define ARM_LPAE_PTE_VALID		(((arm_lpae_iopte)1) << 0)
+/* Block descriptor bits */
+#define ARM_LPAE_PTE_NT			(((arm_lpae_iopte)1) << 16)
 
 #define ARM_LPAE_PTE_ATTR_LO_MASK	(((arm_lpae_iopte)0x3ff) << 2)
 /* Ignore the contiguous bit for block splitting */
@@ -679,6 +681,125 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
 	return iopte_to_paddr(pte, data) | iova;
 }
 
+static size_t __arm_lpae_split_block(struct arm_lpae_io_pgtable *data,
+				     unsigned long iova, size_t size, int lvl,
+				     arm_lpae_iopte *ptep);
+
+static size_t arm_lpae_do_split_blk(struct arm_lpae_io_pgtable *data,
+				    unsigned long iova, size_t size,
+				    arm_lpae_iopte blk_pte, int lvl,
+				    arm_lpae_iopte *ptep)
+{
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+	arm_lpae_iopte pte, *tablep;
+	phys_addr_t blk_paddr;
+	size_t tablesz = ARM_LPAE_GRANULE(data);
+	size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
+	int i;
+
+	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
+		return 0;
+
+	tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg);
+	if (!tablep)
+		return 0;
+
+	blk_paddr = iopte_to_paddr(blk_pte, data);
+	pte = iopte_prot(blk_pte);
+	for (i = 0; i < tablesz / sizeof(pte); i++, blk_paddr += split_sz)
+		__arm_lpae_init_pte(data, blk_paddr, pte, lvl, &tablep[i]);
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_BBML1) {
+		/* Race does not exist */
+		blk_pte |= ARM_LPAE_PTE_NT;
+		__arm_lpae_set_pte(ptep, blk_pte, cfg);
+		io_pgtable_tlb_flush_walk(&data->iop, iova, size, size);
+	}
+	/* Race does not exist */
+	pte = arm_lpae_install_table(tablep, ptep, blk_pte, cfg);
+
+	/* Has it been split into pages? */
+	if (lvl == (ARM_LPAE_MAX_LEVELS - 1))
+		return size;
+
+	/* Go back to lvl - 1 */
+	ptep -= ARM_LPAE_LVL_IDX(iova, lvl - 1, data);
+	return __arm_lpae_split_block(data, iova, size, lvl - 1, ptep);
+}
+
+static size_t __arm_lpae_split_block(struct arm_lpae_io_pgtable *data,
+				     unsigned long iova, size_t size, int lvl,
+				     arm_lpae_iopte *ptep)
+{
+	arm_lpae_iopte pte;
+	struct io_pgtable *iop = &data->iop;
+	size_t base, next_size, total_size;
+
+	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
+		return 0;
+
+	ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
+	pte = READ_ONCE(*ptep);
+	if (WARN_ON(!pte))
+		return 0;
+
+	if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) {
+		if (iopte_leaf(pte, lvl, iop->fmt)) {
+			if (lvl == (ARM_LPAE_MAX_LEVELS - 1) ||
+			    (pte & ARM_LPAE_PTE_AP_RDONLY))
+				return size;
+
+			/* We find a writable block, split it. */
+			return arm_lpae_do_split_blk(data, iova, size, pte,
+					lvl + 1, ptep);
+		} else {
+			/* If it is the last table level, then nothing to do */
+			if (lvl == (ARM_LPAE_MAX_LEVELS - 2))
+				return size;
+
+			total_size = 0;
+			next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data);
+			ptep = iopte_deref(pte, data);
+			for (base = 0; base < size; base += next_size)
+				total_size += __arm_lpae_split_block(data,
+						iova + base, next_size, lvl + 1,
+						ptep);
+			return total_size;
+		}
+	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
+		WARN(1, "Can't split behind a block.\n");
+		return 0;
+	}
+
+	/* Keep on walkin */
+	ptep = iopte_deref(pte, data);
+	return __arm_lpae_split_block(data, iova, size, lvl + 1, ptep);
+}
+
+static size_t arm_lpae_split_block(struct io_pgtable_ops *ops,
+				   unsigned long iova, size_t size)
+{
+	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+	arm_lpae_iopte *ptep = data->pgd;
+	int lvl = data->start_level;
+	long iaext = (s64)iova >> cfg->ias;
+
+	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
+		return 0;
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext))
+		return 0;
+
+	/* If it is smallest granule, then nothing to do */
+	if (size == ARM_LPAE_BLOCK_SIZE(ARM_LPAE_MAX_LEVELS - 1, data))
+		return size;
+
+	return __arm_lpae_split_block(data, iova, size, lvl, ptep);
+}
+
 static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 {
 	unsigned long granule, page_sizes;
@@ -757,6 +878,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 		.map		= arm_lpae_map,
 		.unmap		= arm_lpae_unmap,
 		.iova_to_phys	= arm_lpae_iova_to_phys,
+		.split_block	= arm_lpae_split_block,
 	};
 
 	return data;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 92274705b772..eba6c6ccbe49 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -167,6 +167,8 @@ struct io_pgtable_ops {
 			size_t size, struct iommu_iotlb_gather *gather);
 	phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,
 				    unsigned long iova);
+	size_t (*split_block)(struct io_pgtable_ops *ops, unsigned long iova,
+			      size_t size);
 };
 
 /**
-- 
2.19.1


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 04/13] iommu/io-pgtable-arm: Add and realize merge_page ops
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

If block (large page) mappings are split when dirty log tracking starts,
we need to recover them when it stops, for better DMA performance.

This recovers block mappings and unmaps the span of page mappings. The
BBML1 or BBML2 feature is required.
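
The order of writes to the parent entry during a merge, as done in the diff
that follows, can be modelled like this. It is a userspace sketch only:
merge_write_sequence and iopte_t are hypothetical names, and the TLB
invalidation and the freeing of the old table are represented by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t iopte_t;		/* stand-in for arm_lpae_iopte */

#define PTE_NT	((iopte_t)1 << 16)	/* same bit as ARM_LPAE_PTE_NT */

/*
 * Record the values written to the parent entry when a table is replaced
 * by the block descriptor `blk`, and return the number of writes.
 * BBML1: write blk with nT set, invalidate the TLB, write blk with nT clear.
 * BBML2: write blk once.
 */
size_t merge_write_sequence(bool bbml1, iopte_t blk, iopte_t writes[2])
{
	size_t n = 0;

	blk &= ~PTE_NT;
	if (bbml1) {
		writes[n++] = blk | PTE_NT;
		/* the driver calls io_pgtable_tlb_flush_walk() here */
	}
	writes[n++] = blk;
	/* the driver then frees the page table that the block replaced */
	return n;
}
```

The nT step keeps the block entry from being cached in the TLB while stale
page entries for the same range may still be present.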

Page merging is designed to be used only by dirty log tracking, which
does not run concurrently with other pgtable ops that access the
underlying page table, so no race condition exists.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/io-pgtable-arm.c | 78 ++++++++++++++++++++++++++++++++++
 include/linux/io-pgtable.h     |  2 +
 2 files changed, 80 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 664a9548b199..b9f6e3370032 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -800,6 +800,83 @@ static size_t arm_lpae_split_block(struct io_pgtable_ops *ops,
 	return __arm_lpae_split_block(data, iova, size, lvl, ptep);
 }
 
+static size_t __arm_lpae_merge_page(struct arm_lpae_io_pgtable *data,
+				    unsigned long iova, phys_addr_t paddr,
+				    size_t size, int lvl, arm_lpae_iopte *ptep,
+				    arm_lpae_iopte prot)
+{
+	arm_lpae_iopte pte, *tablep;
+	struct io_pgtable *iop = &data->iop;
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+
+	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
+		return 0;
+
+	ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
+	pte = READ_ONCE(*ptep);
+	if (WARN_ON(!pte))
+		return 0;
+
+	if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) {
+		if (iopte_leaf(pte, lvl, iop->fmt))
+			return size;
+
+		/* Race does not exist */
+		if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_BBML1) {
+			prot |= ARM_LPAE_PTE_NT;
+			__arm_lpae_init_pte(data, paddr, prot, lvl, ptep);
+			io_pgtable_tlb_flush_walk(iop, iova, size,
+						  ARM_LPAE_GRANULE(data));
+
+			prot &= ~(ARM_LPAE_PTE_NT);
+			__arm_lpae_init_pte(data, paddr, prot, lvl, ptep);
+		} else {
+			__arm_lpae_init_pte(data, paddr, prot, lvl, ptep);
+		}
+
+		tablep = iopte_deref(pte, data);
+		__arm_lpae_free_pgtable(data, lvl + 1, tablep);
+		return size;
+	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
+		/* The size is too small, already merged */
+		return size;
+	}
+
+	/* Keep on walkin */
+	ptep = iopte_deref(pte, data);
+	return __arm_lpae_merge_page(data, iova, paddr, size, lvl + 1, ptep, prot);
+}
+
+static size_t arm_lpae_merge_page(struct io_pgtable_ops *ops, unsigned long iova,
+				  phys_addr_t paddr, size_t size, int iommu_prot)
+{
+	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+	arm_lpae_iopte *ptep = data->pgd;
+	int lvl = data->start_level;
+	arm_lpae_iopte prot;
+	long iaext = (s64)iova >> cfg->ias;
+
+	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
+		return 0;
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext || paddr >> cfg->oas))
+		return 0;
+
+	/* If no access, then nothing to do */
+	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+		return size;
+
+	/* If it is smallest granule, then nothing to do */
+	if (size == ARM_LPAE_BLOCK_SIZE(ARM_LPAE_MAX_LEVELS - 1, data))
+		return size;
+
+	prot = arm_lpae_prot_to_pte(data, iommu_prot);
+	return __arm_lpae_merge_page(data, iova, paddr, size, lvl, ptep, prot);
+}
+
 static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 {
 	unsigned long granule, page_sizes;
@@ -879,6 +956,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 		.unmap		= arm_lpae_unmap,
 		.iova_to_phys	= arm_lpae_iova_to_phys,
 		.split_block	= arm_lpae_split_block,
+		.merge_page	= arm_lpae_merge_page,
 	};
 
 	return data;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index eba6c6ccbe49..e77576d946a2 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -169,6 +169,8 @@ struct io_pgtable_ops {
 				    unsigned long iova);
 	size_t (*split_block)(struct io_pgtable_ops *ops, unsigned long iova,
 			      size_t size);
+	size_t (*merge_page)(struct io_pgtable_ops *ops, unsigned long iova,
+			     phys_addr_t phys, size_t size, int prot);
 };
 
 /**
-- 
2.19.1


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 04/13] iommu/io-pgtable-arm: Add and realize merge_page ops
@ 2021-05-07 10:22   ` Keqian Zhu
  0 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

If block (large page) mappings were split when dirty log tracking
started, they need to be recovered when dirty log tracking stops, for
better DMA performance.

This recovers the block mappings and unmaps the span of page mappings
they replace. The BBML1 or BBML2 feature is required.

Page merging is designed to be used only by dirty log tracking, which
does not run concurrently with other pgtable ops that access the
underlying page table, so no race condition exists.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/io-pgtable-arm.c | 78 ++++++++++++++++++++++++++++++++++
 include/linux/io-pgtable.h     |  2 +
 2 files changed, 80 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 664a9548b199..b9f6e3370032 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -800,6 +800,83 @@ static size_t arm_lpae_split_block(struct io_pgtable_ops *ops,
 	return __arm_lpae_split_block(data, iova, size, lvl, ptep);
 }
 
+static size_t __arm_lpae_merge_page(struct arm_lpae_io_pgtable *data,
+				    unsigned long iova, phys_addr_t paddr,
+				    size_t size, int lvl, arm_lpae_iopte *ptep,
+				    arm_lpae_iopte prot)
+{
+	arm_lpae_iopte pte, *tablep;
+	struct io_pgtable *iop = &data->iop;
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+
+	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
+		return 0;
+
+	ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
+	pte = READ_ONCE(*ptep);
+	if (WARN_ON(!pte))
+		return 0;
+
+	if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) {
+		if (iopte_leaf(pte, lvl, iop->fmt))
+			return size;
+
+		/* Race does not exist */
+		if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_BBML1) {
+			prot |= ARM_LPAE_PTE_NT;
+			__arm_lpae_init_pte(data, paddr, prot, lvl, ptep);
+			io_pgtable_tlb_flush_walk(iop, iova, size,
+						  ARM_LPAE_GRANULE(data));
+
+			prot &= ~(ARM_LPAE_PTE_NT);
+			__arm_lpae_init_pte(data, paddr, prot, lvl, ptep);
+		} else {
+			__arm_lpae_init_pte(data, paddr, prot, lvl, ptep);
+		}
+
+		tablep = iopte_deref(pte, data);
+		__arm_lpae_free_pgtable(data, lvl + 1, tablep);
+		return size;
+	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
+		/* The size is too small, already merged */
+		return size;
+	}
+
+	/* Keep on walkin */
+	ptep = iopte_deref(pte, data);
+	return __arm_lpae_merge_page(data, iova, paddr, size, lvl + 1, ptep, prot);
+}
+
+static size_t arm_lpae_merge_page(struct io_pgtable_ops *ops, unsigned long iova,
+				  phys_addr_t paddr, size_t size, int iommu_prot)
+{
+	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+	arm_lpae_iopte *ptep = data->pgd;
+	int lvl = data->start_level;
+	arm_lpae_iopte prot;
+	long iaext = (s64)iova >> cfg->ias;
+
+	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
+		return 0;
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext || paddr >> cfg->oas))
+		return 0;
+
+	/* If no access, then nothing to do */
+	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+		return size;
+
+	/* If it is smallest granule, then nothing to do */
+	if (size == ARM_LPAE_BLOCK_SIZE(ARM_LPAE_MAX_LEVELS - 1, data))
+		return size;
+
+	prot = arm_lpae_prot_to_pte(data, iommu_prot);
+	return __arm_lpae_merge_page(data, iova, paddr, size, lvl, ptep, prot);
+}
+
 static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 {
 	unsigned long granule, page_sizes;
@@ -879,6 +956,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 		.unmap		= arm_lpae_unmap,
 		.iova_to_phys	= arm_lpae_iova_to_phys,
 		.split_block	= arm_lpae_split_block,
+		.merge_page	= arm_lpae_merge_page,
 	};
 
 	return data;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index eba6c6ccbe49..e77576d946a2 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -169,6 +169,8 @@ struct io_pgtable_ops {
 				    unsigned long iova);
 	size_t (*split_block)(struct io_pgtable_ops *ops, unsigned long iova,
 			      size_t size);
+	size_t (*merge_page)(struct io_pgtable_ops *ops, unsigned long iova,
+			     phys_addr_t phys, size_t size, int prot);
 };
 
 /**
-- 
2.19.1
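
For readers following the BBML1/BBML2 branches in the patch above, here
is a minimal user-space sketch of the entry replacement sequence. It is
not kernel code: struct toy_iommu, toy_tlb_flush(), toy_merge() and the
TOY_PTE_NT bit position are hypothetical stand-ins chosen for
illustration, and freeing the old next-level table is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the nT bit of a block descriptor. */
#define TOY_PTE_NT (UINT64_C(1) << 16)

/* Hypothetical model of the IOMMU state that the sketch needs. */
struct toy_iommu {
	bool bbml1;           /* true: BBML1 sequence, false: BBML2 */
	unsigned int flushes; /* number of TLB invalidations issued */
	bool saw_nt;          /* whether an nT entry was ever installed */
};

static void toy_tlb_flush(struct toy_iommu *mmu)
{
	mmu->flushes++;
}

/*
 * Replace the table entry at *ptep with the block entry 'block'.
 * Returns the old table entry so the caller can free the table it
 * pointed to. The pointer is volatile so both stores are performed.
 */
static uint64_t toy_merge(struct toy_iommu *mmu, volatile uint64_t *ptep,
			  uint64_t block)
{
	uint64_t old_table = *ptep;

	assert(!(block & TOY_PTE_NT));

	if (mmu->bbml1) {
		/* Install the block with nT set, invalidate, then clear nT. */
		*ptep = block | TOY_PTE_NT;
		mmu->saw_nt = true;
		toy_tlb_flush(mmu);
		*ptep = block;
	} else {
		/* BBML2 branch: write the block entry directly. */
		*ptep = block;
	}

	return old_table;
}
```

In this model the BBML1 path issues exactly one invalidation between the
two stores, while the BBML2 path issues none, which mirrors the two
branches of __arm_lpae_merge_page().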


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 05/13] iommu/io-pgtable-arm: Add and realize sync_dirty_log ops
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

During dirty log tracking, the user will try to retrieve the dirty log
from the IOMMU if it supports hardware dirty logging. Scan each leaf TTD
and treat it as dirty if it is writable. As we only set the DBM bit for
stage 1 mappings, check whether AP[2] is clear.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/io-pgtable-arm.c | 89 ++++++++++++++++++++++++++++++++++
 include/linux/io-pgtable.h     |  4 ++
 2 files changed, 93 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index b9f6e3370032..155d440099ab 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -877,6 +877,94 @@ static size_t arm_lpae_merge_page(struct io_pgtable_ops *ops, unsigned long iova
 	return __arm_lpae_merge_page(data, iova, paddr, size, lvl, ptep, prot);
 }
 
+static int __arm_lpae_sync_dirty_log(struct arm_lpae_io_pgtable *data,
+				     unsigned long iova, size_t size,
+				     int lvl, arm_lpae_iopte *ptep,
+				     unsigned long *bitmap,
+				     unsigned long base_iova,
+				     unsigned long bitmap_pgshift)
+{
+	arm_lpae_iopte pte;
+	struct io_pgtable *iop = &data->iop;
+	size_t base, next_size;
+	unsigned long offset;
+	int nbits, ret;
+
+	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
+		return -EINVAL;
+
+	ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
+	pte = READ_ONCE(*ptep);
+	if (WARN_ON(!pte))
+		return -EINVAL;
+
+	if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) {
+		if (iopte_leaf(pte, lvl, iop->fmt)) {
+			if (pte & ARM_LPAE_PTE_AP_RDONLY)
+				return 0;
+
+			/* It is writable, set the bitmap */
+			nbits = size >> bitmap_pgshift;
+			offset = (iova - base_iova) >> bitmap_pgshift;
+			bitmap_set(bitmap, offset, nbits);
+			return 0;
+		}
+		/* Current level is table, traverse next level */
+		next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data);
+		ptep = iopte_deref(pte, data);
+		for (base = 0; base < size; base += next_size) {
+			ret = __arm_lpae_sync_dirty_log(data, iova + base,
+					next_size, lvl + 1, ptep, bitmap,
+					base_iova, bitmap_pgshift);
+			if (ret)
+				return ret;
+		}
+		return 0;
+	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
+		if (pte & ARM_LPAE_PTE_AP_RDONLY)
+			return 0;
+
+		/* Though the size is too small, also set bitmap */
+		nbits = size >> bitmap_pgshift;
+		offset = (iova - base_iova) >> bitmap_pgshift;
+		bitmap_set(bitmap, offset, nbits);
+		return 0;
+	}
+
+	/* Keep on walkin */
+	ptep = iopte_deref(pte, data);
+	return __arm_lpae_sync_dirty_log(data, iova, size, lvl + 1, ptep,
+			bitmap, base_iova, bitmap_pgshift);
+}
+
+static int arm_lpae_sync_dirty_log(struct io_pgtable_ops *ops,
+				   unsigned long iova, size_t size,
+				   unsigned long *bitmap,
+				   unsigned long base_iova,
+				   unsigned long bitmap_pgshift)
+{
+	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+	arm_lpae_iopte *ptep = data->pgd;
+	int lvl = data->start_level;
+	long iaext = (s64)iova >> cfg->ias;
+
+	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
+		return -EINVAL;
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext))
+		return -EINVAL;
+
+	if (data->iop.fmt != ARM_64_LPAE_S1 &&
+	    data->iop.fmt != ARM_32_LPAE_S1)
+		return -EINVAL;
+
+	return __arm_lpae_sync_dirty_log(data, iova, size, lvl, ptep,
+					 bitmap, base_iova, bitmap_pgshift);
+}
+
 static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 {
 	unsigned long granule, page_sizes;
@@ -957,6 +1045,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 		.iova_to_phys	= arm_lpae_iova_to_phys,
 		.split_block	= arm_lpae_split_block,
 		.merge_page	= arm_lpae_merge_page,
+		.sync_dirty_log	= arm_lpae_sync_dirty_log,
 	};
 
 	return data;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index e77576d946a2..329fa99d9d96 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -171,6 +171,10 @@ struct io_pgtable_ops {
 			      size_t size);
 	size_t (*merge_page)(struct io_pgtable_ops *ops, unsigned long iova,
 			     phys_addr_t phys, size_t size, int prot);
+	int (*sync_dirty_log)(struct io_pgtable_ops *ops,
+			      unsigned long iova, size_t size,
+			      unsigned long *bitmap, unsigned long base_iova,
+			      unsigned long bitmap_pgshift);
 };
 
 /**
-- 
2.19.1
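
The bitmap arithmetic used for each leaf entry in the patch above can be
tried out in user space. The following is a minimal sketch, assuming a
toy bitmap helper in place of the kernel's bitmap_set(); toy_bitmap_set(),
toy_sync_leaf() and the TOY_PTE_AP_RDONLY bit position are hypothetical
names and values used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for AP[2] (read-only) in a leaf descriptor. */
#define TOY_PTE_AP_RDONLY (UINT64_C(1) << 7)
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Set 'nbits' bits starting at bit 'start' (stand-in for bitmap_set()). */
static void toy_bitmap_set(unsigned long *bitmap, size_t start, size_t nbits)
{
	for (size_t i = start; i < start + nbits; i++)
		bitmap[i / TOY_BITS_PER_LONG] |= 1UL << (i % TOY_BITS_PER_LONG);
}

/*
 * Record one leaf entry covering [iova, iova + size) in the bitmap.
 * Returns the number of bits set, or 0 if the leaf is still read-only
 * and therefore clean.
 */
static size_t toy_sync_leaf(uint64_t pte, unsigned long iova, size_t size,
			    unsigned long *bitmap, unsigned long base_iova,
			    unsigned int bitmap_pgshift)
{
	size_t nbits, offset;

	assert(iova >= base_iova);

	if (pte & TOY_PTE_AP_RDONLY)
		return 0;

	/* One bit per bitmap page; the offset is relative to base_iova. */
	nbits = size >> bitmap_pgshift;
	offset = (iova - base_iova) >> bitmap_pgshift;
	toy_bitmap_set(bitmap, offset, nbits);
	return nbits;
}
```

For example, with a 4 KiB bitmap page size (bitmap_pgshift of 12), a
writable 16 KiB leaf at base_iova + 0x2000 sets bits 2 to 5.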


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 06/13] iommu/io-pgtable-arm: Add and realize clear_dirty_log ops
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

After the dirty log is retrieved, the user should clear it to re-enable
dirty log tracking for the dirtied pages. This clears the dirty state of
the leaf TTDs specified by the user-provided bitmap. As we only set the
DBM bit for stage 1 mappings, clearing the dirty state means setting the
AP[2] bit.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/io-pgtable-arm.c | 93 ++++++++++++++++++++++++++++++++++
 include/linux/io-pgtable.h     |  4 ++
 2 files changed, 97 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 155d440099ab..2b41b9d0faa3 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -965,6 +965,98 @@ static int arm_lpae_sync_dirty_log(struct io_pgtable_ops *ops,
 					 bitmap, base_iova, bitmap_pgshift);
 }
 
+static int __arm_lpae_clear_dirty_log(struct arm_lpae_io_pgtable *data,
+				      unsigned long iova, size_t size,
+				      int lvl, arm_lpae_iopte *ptep,
+				      unsigned long *bitmap,
+				      unsigned long base_iova,
+				      unsigned long bitmap_pgshift)
+{
+	arm_lpae_iopte pte;
+	struct io_pgtable *iop = &data->iop;
+	unsigned long offset;
+	size_t base, next_size;
+	int nbits, ret, i;
+
+	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
+		return -EINVAL;
+
+	ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
+	pte = READ_ONCE(*ptep);
+	if (WARN_ON(!pte))
+		return -EINVAL;
+
+	if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) {
+		if (iopte_leaf(pte, lvl, iop->fmt)) {
+			if (pte & ARM_LPAE_PTE_AP_RDONLY)
+				return 0;
+
+			/* Ensure all corresponding bits are set */
+			nbits = size >> bitmap_pgshift;
+			offset = (iova - base_iova) >> bitmap_pgshift;
+			for (i = offset; i < offset + nbits; i++) {
+				if (!test_bit(i, bitmap))
+					return 0;
+			}
+
+			/* Race does not exist */
+			pte |= ARM_LPAE_PTE_AP_RDONLY;
+			__arm_lpae_set_pte(ptep, pte, &iop->cfg);
+			return 0;
+		}
+		/* Current level is table, traverse next level */
+		next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data);
+		ptep = iopte_deref(pte, data);
+		for (base = 0; base < size; base += next_size) {
+			ret = __arm_lpae_clear_dirty_log(data, iova + base,
+					next_size, lvl + 1, ptep, bitmap,
+					base_iova, bitmap_pgshift);
+			if (ret)
+				return ret;
+		}
+		return 0;
+	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
+		/* Though the size is too small, it is already clean */
+		if (pte & ARM_LPAE_PTE_AP_RDONLY)
+			return 0;
+
+		return -EINVAL;
+	}
+
+	/* Keep on walkin */
+	ptep = iopte_deref(pte, data);
+	return __arm_lpae_clear_dirty_log(data, iova, size, lvl + 1, ptep,
+			bitmap, base_iova, bitmap_pgshift);
+}
+
+static int arm_lpae_clear_dirty_log(struct io_pgtable_ops *ops,
+				    unsigned long iova, size_t size,
+				    unsigned long *bitmap,
+				    unsigned long base_iova,
+				    unsigned long bitmap_pgshift)
+{
+	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+	arm_lpae_iopte *ptep = data->pgd;
+	int lvl = data->start_level;
+	long iaext = (s64)iova >> cfg->ias;
+
+	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
+		return -EINVAL;
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext))
+		return -EINVAL;
+
+	if (data->iop.fmt != ARM_64_LPAE_S1 &&
+	    data->iop.fmt != ARM_32_LPAE_S1)
+		return -EINVAL;
+
+	return __arm_lpae_clear_dirty_log(data, iova, size, lvl, ptep,
+			bitmap, base_iova, bitmap_pgshift);
+}
+
 static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 {
 	unsigned long granule, page_sizes;
@@ -1046,6 +1138,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 		.split_block	= arm_lpae_split_block,
 		.merge_page	= arm_lpae_merge_page,
 		.sync_dirty_log	= arm_lpae_sync_dirty_log,
+		.clear_dirty_log = arm_lpae_clear_dirty_log,
 	};
 
 	return data;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 329fa99d9d96..4781407d5e2d 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -175,6 +175,10 @@ struct io_pgtable_ops {
 			      unsigned long iova, size_t size,
 			      unsigned long *bitmap, unsigned long base_iova,
 			      unsigned long bitmap_pgshift);
+	int (*clear_dirty_log)(struct io_pgtable_ops *ops,
+			       unsigned long iova, size_t size,
+			       unsigned long *bitmap, unsigned long base_iova,
+			       unsigned long bitmap_pgshift);
 };
 
 /**
-- 
2.19.1
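
The rule applied to a block leaf in the patch above (write-protect it
again only when every bit it covers is set in the user-provided bitmap)
can be sketched in user space as follows. This is an illustration under
stated assumptions, not the kernel implementation: toy_test_bit(),
toy_clear_leaf() and the TOY_PTE_AP_RDONLY bit position are hypothetical,
and TLB maintenance is not modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for AP[2] (read-only) in a leaf descriptor. */
#define TOY_PTE_AP_RDONLY (UINT64_C(1) << 7)
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Stand-in for the kernel's test_bit(). */
static bool toy_test_bit(const unsigned long *bitmap, size_t bit)
{
	return (bitmap[bit / TOY_BITS_PER_LONG] >> (bit % TOY_BITS_PER_LONG)) & 1UL;
}

/*
 * Make one leaf entry read-only again if the bitmap selects all of it.
 * Returns true if the entry was changed.
 */
static bool toy_clear_leaf(uint64_t *ptep, unsigned long iova, size_t size,
			   const unsigned long *bitmap, unsigned long base_iova,
			   unsigned int bitmap_pgshift)
{
	size_t nbits = size >> bitmap_pgshift;
	size_t offset;

	assert(iova >= base_iova);
	offset = (iova - base_iova) >> bitmap_pgshift;

	/* Already clean: nothing to do. */
	if (*ptep & TOY_PTE_AP_RDONLY)
		return false;

	/* A partially selected leaf is left dirty. */
	for (size_t i = offset; i < offset + nbits; i++)
		if (!toy_test_bit(bitmap, i))
			return false;

	*ptep |= TOY_PTE_AP_RDONLY;
	return true;
}
```

Leaving a partially selected block dirty is the conservative choice: the
block is reported again on the next sync rather than losing track of the
pages the user did not ask to clear.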


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 06/13] iommu/io-pgtable-arm: Add and realize clear_dirty_log ops
@ 2021-05-07 10:22   ` Keqian Zhu
  0 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

After the dirty log is retrieved, the user should clear it to re-enable
dirty log tracking for the dirtied pages. This clears the dirty state of
the leaf TTDs specified by the user-provided bitmap. As we only set the
DBM bit for stage 1 mappings, clearing the dirty state means setting the
AP[2] bit.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/io-pgtable-arm.c | 93 ++++++++++++++++++++++++++++++++++
 include/linux/io-pgtable.h     |  4 ++
 2 files changed, 97 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 155d440099ab..2b41b9d0faa3 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -965,6 +965,98 @@ static int arm_lpae_sync_dirty_log(struct io_pgtable_ops *ops,
 					 bitmap, base_iova, bitmap_pgshift);
 }
 
+static int __arm_lpae_clear_dirty_log(struct arm_lpae_io_pgtable *data,
+				      unsigned long iova, size_t size,
+				      int lvl, arm_lpae_iopte *ptep,
+				      unsigned long *bitmap,
+				      unsigned long base_iova,
+				      unsigned long bitmap_pgshift)
+{
+	arm_lpae_iopte pte;
+	struct io_pgtable *iop = &data->iop;
+	unsigned long offset;
+	size_t base, next_size;
+	int nbits, ret, i;
+
+	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
+		return -EINVAL;
+
+	ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
+	pte = READ_ONCE(*ptep);
+	if (WARN_ON(!pte))
+		return -EINVAL;
+
+	if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) {
+		if (iopte_leaf(pte, lvl, iop->fmt)) {
+			if (pte & ARM_LPAE_PTE_AP_RDONLY)
+				return 0;
+
+			/* Ensure all corresponding bits are set */
+			nbits = size >> bitmap_pgshift;
+			offset = (iova - base_iova) >> bitmap_pgshift;
+			for (i = offset; i < offset + nbits; i++) {
+				if (!test_bit(i, bitmap))
+					return 0;
+			}
+
+			/* Race does not exist */
+			pte |= ARM_LPAE_PTE_AP_RDONLY;
+			__arm_lpae_set_pte(ptep, pte, &iop->cfg);
+			return 0;
+		}
+		/* Current level is table, traverse next level */
+		next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data);
+		ptep = iopte_deref(pte, data);
+		for (base = 0; base < size; base += next_size) {
+			ret = __arm_lpae_clear_dirty_log(data, iova + base,
+					next_size, lvl + 1, ptep, bitmap,
+					base_iova, bitmap_pgshift);
+			if (ret)
+				return ret;
+		}
+		return 0;
+	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
+		/* Though the size is too small, it is already clean */
+		if (pte & ARM_LPAE_PTE_AP_RDONLY)
+			return 0;
+
+		return -EINVAL;
+	}
+
+	/* Keep on walkin */
+	ptep = iopte_deref(pte, data);
+	return __arm_lpae_clear_dirty_log(data, iova, size, lvl + 1, ptep,
+			bitmap, base_iova, bitmap_pgshift);
+}
+
+static int arm_lpae_clear_dirty_log(struct io_pgtable_ops *ops,
+				    unsigned long iova, size_t size,
+				    unsigned long *bitmap,
+				    unsigned long base_iova,
+				    unsigned long bitmap_pgshift)
+{
+	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+	arm_lpae_iopte *ptep = data->pgd;
+	int lvl = data->start_level;
+	long iaext = (s64)iova >> cfg->ias;
+
+	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
+		return -EINVAL;
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext))
+		return -EINVAL;
+
+	if (data->iop.fmt != ARM_64_LPAE_S1 &&
+	    data->iop.fmt != ARM_32_LPAE_S1)
+		return -EINVAL;
+
+	return __arm_lpae_clear_dirty_log(data, iova, size, lvl, ptep,
+			bitmap, base_iova, bitmap_pgshift);
+}
+
 static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 {
 	unsigned long granule, page_sizes;
@@ -1046,6 +1138,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 		.split_block	= arm_lpae_split_block,
 		.merge_page	= arm_lpae_merge_page,
 		.sync_dirty_log	= arm_lpae_sync_dirty_log,
+		.clear_dirty_log = arm_lpae_clear_dirty_log,
 	};
 
 	return data;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 329fa99d9d96..4781407d5e2d 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -175,6 +175,10 @@ struct io_pgtable_ops {
 			      unsigned long iova, size_t size,
 			      unsigned long *bitmap, unsigned long base_iova,
 			      unsigned long bitmap_pgshift);
+	int (*clear_dirty_log)(struct io_pgtable_ops *ops,
+			       unsigned long iova, size_t size,
+			       unsigned long *bitmap, unsigned long base_iova,
+			       unsigned long bitmap_pgshift);
 };
 
 /**
-- 
2.19.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 07/13] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

If the SMMU supports it and the kernel was built with HTTU support,
enable hardware update of access and dirty flags. This is essential for
shared page tables, to reduce the number of access faults on the fault
queue. Normal DMA with io-pgtables doesn't currently use the access or
dirty flags.

We can enable HTTU even if CPUs don't support it, because the kernel
always checks for HW dirty bit and updates the PTE flags atomically.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
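Reviewer note: a stand-alone sketch of how the two-bit IDR0.HTTU field
maps to the HA/HD features, assuming the field layout from this patch
(bits 7:6, 1 = access flag, 2 = access flag and dirty state). The
sketch_/SKETCH_ names are hypothetical and the firmware (ACPI IORT)
override path is not modelled.

```c
#include <stdint.h>

/* Field layout as in this patch; names are local to the sketch. */
#define SKETCH_IDR0_HTTU_SHIFT		6
#define SKETCH_IDR0_HTTU_MASK		UINT32_C(0x3)
#define SKETCH_IDR0_HTTU_ACCESS		1
#define SKETCH_IDR0_HTTU_ACCESS_DIRTY	2

#define SKETCH_FEAT_HA			(UINT32_C(1) << 19)
#define SKETCH_FEAT_HD			(UINT32_C(1) << 20)

/* Decode IDR0.HTTU into feature bits; dirty-state support implies HA. */
static uint32_t sketch_httu_features(uint32_t idr0)
{
	uint32_t features = 0;

	switch ((idr0 >> SKETCH_IDR0_HTTU_SHIFT) & SKETCH_IDR0_HTTU_MASK) {
	case SKETCH_IDR0_HTTU_ACCESS_DIRTY:
		features |= SKETCH_FEAT_HD;
		/* fall through */
	case SKETCH_IDR0_HTTU_ACCESS:
		features |= SKETCH_FEAT_HA;
		break;
	default:
		/* 0 means no HTTU; 3 is treated as unsupported here */
		break;
	}

	return features;
}
```

The fall-through mirrors the switch in arm_smmu_get_httu(): hardware that
can update the dirty state can always update the access flag as well.
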
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  2 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 41 ++++++++++++++++++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  8 ++++
 3 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index bb251cab61f3..ae075e675892 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -121,10 +121,12 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
 	if (err)
 		goto out_free_asid;
 
+	/* HA and HD will be filtered out later if not supported by the SMMU */
 	tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - vabits_actual) |
 	      FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) |
 	      FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) |
 	      FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS) |
+	      CTXDESC_CD_0_TCR_HA | CTXDESC_CD_0_TCR_HD |
 	      CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
 
 	switch (PAGE_SIZE) {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 54b2f27b81d4..4ac59a89bc76 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1010,10 +1010,17 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
 		 * this substream's traffic
 		 */
 	} else { /* (1) and (2) */
+		u64 tcr = cd->tcr;
+
 		cdptr[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK);
 		cdptr[2] = 0;
 		cdptr[3] = cpu_to_le64(cd->mair);
 
+		if (!(smmu->features & ARM_SMMU_FEAT_HD))
+			tcr &= ~CTXDESC_CD_0_TCR_HD;
+		if (!(smmu->features & ARM_SMMU_FEAT_HA))
+			tcr &= ~CTXDESC_CD_0_TCR_HA;
+
 		/*
 		 * STE is live, and the SMMU might read dwords of this CD in any
 		 * order. Ensure that it observes valid values before reading
@@ -1021,7 +1028,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
 		 */
 		arm_smmu_sync_cd(smmu_domain, ssid, true);
 
-		val = cd->tcr |
+		val = tcr |
 #ifdef __BIG_ENDIAN
 			CTXDESC_CD_0_ENDI |
 #endif
@@ -3242,6 +3249,28 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
 	return 0;
 }
 
+static void arm_smmu_get_httu(struct arm_smmu_device *smmu, u32 reg)
+{
+	u32 fw_features = smmu->features & (ARM_SMMU_FEAT_HA | ARM_SMMU_FEAT_HD);
+	u32 features = 0;
+
+	switch (FIELD_GET(IDR0_HTTU, reg)) {
+	case IDR0_HTTU_ACCESS_DIRTY:
+		features |= ARM_SMMU_FEAT_HD;
+		fallthrough;
+	case IDR0_HTTU_ACCESS:
+		features |= ARM_SMMU_FEAT_HA;
+	}
+
+	if (smmu->dev->of_node)
+		smmu->features |= features;
+	else if (features != fw_features)
+		/* ACPI IORT sets the HTTU bits */
+		dev_warn(smmu->dev,
+			 "IDR0.HTTU overridden by FW configuration (0x%x)\n",
+			 fw_features);
+}
+
 static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 {
 	u32 reg;
@@ -3302,6 +3331,8 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 			smmu->features |= ARM_SMMU_FEAT_E2H;
 	}
 
+	arm_smmu_get_httu(smmu, reg);
+
 	/*
 	 * The coherency feature as set by FW is used in preference to the ID
 	 * register, but warn on mismatch.
@@ -3487,6 +3518,14 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
 	if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE)
 		smmu->features |= ARM_SMMU_FEAT_COHERENCY;
 
+	switch (FIELD_GET(ACPI_IORT_SMMU_V3_HTTU_OVERRIDE, iort_smmu->flags)) {
+	case IDR0_HTTU_ACCESS_DIRTY:
+		smmu->features |= ARM_SMMU_FEAT_HD;
+		fallthrough;
+	case IDR0_HTTU_ACCESS:
+		smmu->features |= ARM_SMMU_FEAT_HA;
+	}
+
 	return 0;
 }
 #else
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 46e8c49214a8..3edcd31b046e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -33,6 +33,9 @@
 #define IDR0_ASID16			(1 << 12)
 #define IDR0_ATS			(1 << 10)
 #define IDR0_HYP			(1 << 9)
+#define IDR0_HTTU			GENMASK(7, 6)
+#define IDR0_HTTU_ACCESS		1
+#define IDR0_HTTU_ACCESS_DIRTY		2
 #define IDR0_COHACC			(1 << 4)
 #define IDR0_TTF			GENMASK(3, 2)
 #define IDR0_TTF_AARCH64		2
@@ -285,6 +288,9 @@
 #define CTXDESC_CD_0_TCR_IPS		GENMASK_ULL(34, 32)
 #define CTXDESC_CD_0_TCR_TBI0		(1ULL << 38)
 
+#define CTXDESC_CD_0_TCR_HA		(1UL << 43)
+#define CTXDESC_CD_0_TCR_HD		(1UL << 42)
+
 #define CTXDESC_CD_0_AA64		(1UL << 41)
 #define CTXDESC_CD_0_S			(1UL << 44)
 #define CTXDESC_CD_0_R			(1UL << 45)
@@ -605,6 +611,8 @@ struct arm_smmu_device {
 #define ARM_SMMU_FEAT_BTM		(1 << 16)
 #define ARM_SMMU_FEAT_SVA		(1 << 17)
 #define ARM_SMMU_FEAT_E2H		(1 << 18)
+#define ARM_SMMU_FEAT_HA		(1 << 19)
+#define ARM_SMMU_FEAT_HD		(1 << 20)
 	u32				features;
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH	(1 << 0)
-- 
2.19.1


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 08/13] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

As nested mode is not upstream yet, we only aim to support dirty log
tracking for stage 1 with io-pgtable mappings (that is, SVA mappings
are not supported). If HTTU is supported, enable the HA/HD bits in the
SMMU CD and pass the ARM_HD quirk to io-pgtable.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
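Reviewer note: this patch sets HA/HD unconditionally when building the
stage 1 TCR and relies on the filtering that patch 07 adds to
arm_smmu_write_ctx_desc(). A minimal sketch of that filtering step,
assuming the bit positions defined in patch 07 (HD at bit 42, HA at
bit 43); the sketch_/SKETCH_ names are hypothetical.

```c
#include <stdint.h>

/* Bit positions as defined in patch 07; names are local to the sketch. */
#define SKETCH_CD_TCR_HD	(UINT64_C(1) << 42)
#define SKETCH_CD_TCR_HA	(UINT64_C(1) << 43)

#define SKETCH_FEAT_HA		(UINT32_C(1) << 19)
#define SKETCH_FEAT_HD		(UINT32_C(1) << 20)

/* Drop the HA/HD bits that the probed SMMU features do not back. */
static uint64_t sketch_filter_tcr(uint64_t tcr, uint32_t features)
{
	if (!(features & SKETCH_FEAT_HD))
		tcr &= ~SKETCH_CD_TCR_HD;
	if (!(features & SKETCH_FEAT_HA))
		tcr &= ~SKETCH_CD_TCR_HA;

	return tcr;
}
```

Keeping the filter at CD write time means callers that build a TCR
(here and in the SVA path) do not need to check the features themselves.
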
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4ac59a89bc76..c42e59655fd0 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1942,6 +1942,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 			  FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) |
 			  FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) |
 			  FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) |
+			  CTXDESC_CD_0_TCR_HA | CTXDESC_CD_0_TCR_HD |
 			  CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
 	cfg->cd.mair	= pgtbl_cfg->arm_lpae_s1_cfg.mair;
 
@@ -2047,6 +2048,8 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain,
 
 	if (!iommu_get_dma_strict(domain))
 		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
+	if (smmu->features & ARM_SMMU_FEAT_HD)
+		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
 
 	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
 	if (!pgtbl_ops)
-- 
2.19.1


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 09/13] iommu/arm-smmu-v3: Add feature detection for BBML
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

Detect the BBML feature and, if the SMMU supports it, pass the
corresponding BBMLx quirk to io-pgtable.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
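Reviewer note: a stand-alone sketch of the IDR3.BBML decode, assuming
the field layout from this patch (bits 12:11, values 0 to 2 valid). The
sketch_/SKETCH_ names are hypothetical; the driver itself returns -ENXIO
for the remaining encoding, which is modelled here as -1.

```c
#include <stdint.h>

/* Field layout as in this patch; names are local to the sketch. */
#define SKETCH_IDR3_BBML_SHIFT	11
#define SKETCH_IDR3_BBML_MASK	UINT32_C(0x3)

enum sketch_bbml {
	SKETCH_BBML0 = 0,
	SKETCH_BBML1 = 1,
	SKETCH_BBML2 = 2,
};

/* Return the BBM level (0..2), or -1 for the unsupported encoding 3. */
static int sketch_bbml_level(uint32_t idr3)
{
	uint32_t field = (idr3 >> SKETCH_IDR3_BBML_SHIFT) &
			 SKETCH_IDR3_BBML_MASK;

	return field <= SKETCH_BBML2 ? (int)field : -1;
}
```

Neighbouring IDR3 bits such as RIL (bit 10) do not affect the result,
because the field is masked after the shift.
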
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +++++++++++++++++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  6 ++++++
 2 files changed, 25 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c42e59655fd0..3a2dc3177180 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2051,6 +2051,11 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain,
 	if (smmu->features & ARM_SMMU_FEAT_HD)
 		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
 
+	if (smmu->features & ARM_SMMU_FEAT_BBML1)
+		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_BBML1;
+	else if (smmu->features & ARM_SMMU_FEAT_BBML2)
+		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_BBML2;
+
 	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
 	if (!pgtbl_ops)
 		return -ENOMEM;
@@ -3419,6 +3424,20 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 
 	/* IDR3 */
 	reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
+	switch (FIELD_GET(IDR3_BBML, reg)) {
+	case IDR3_BBML0:
+		break;
+	case IDR3_BBML1:
+		smmu->features |= ARM_SMMU_FEAT_BBML1;
+		break;
+	case IDR3_BBML2:
+		smmu->features |= ARM_SMMU_FEAT_BBML2;
+		break;
+	default:
+		dev_err(smmu->dev, "unknown/unsupported BBM behavior level\n");
+		return -ENXIO;
+	}
+
 	if (FIELD_GET(IDR3_RIL, reg))
 		smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3edcd31b046e..e3b6bdd292c9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -54,6 +54,10 @@
 #define IDR1_SIDSIZE			GENMASK(5, 0)
 
 #define ARM_SMMU_IDR3			0xc
+#define IDR3_BBML			GENMASK(12, 11)
+#define IDR3_BBML0			0
+#define IDR3_BBML1			1
+#define IDR3_BBML2			2
 #define IDR3_RIL			(1 << 10)
 
 #define ARM_SMMU_IDR5			0x14
@@ -613,6 +617,8 @@ struct arm_smmu_device {
 #define ARM_SMMU_FEAT_E2H		(1 << 18)
 #define ARM_SMMU_FEAT_HA		(1 << 19)
 #define ARM_SMMU_FEAT_HD		(1 << 20)
+#define ARM_SMMU_FEAT_BBML1		(1 << 21)
+#define ARM_SMMU_FEAT_BBML2		(1 << 22)
 	u32				features;
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH	(1 << 0)
-- 
2.19.1


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 09/13] iommu/arm-smmu-v3: Add feature detection for BBML
@ 2021-05-07 10:22   ` Keqian Zhu
  0 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: jiangkunkun, Cornelia Huck, Kirti Wankhede, lushenming,
	Alex Williamson, wanghaibin.wang

From: Kunkun Jiang <jiangkunkun@huawei.com>

Detect the BBML feature and, if the SMMU supports it, pass the
corresponding BBMLx quirk to io-pgtable.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +++++++++++++++++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  6 ++++++
 2 files changed, 25 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c42e59655fd0..3a2dc3177180 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2051,6 +2051,11 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain,
 	if (smmu->features & ARM_SMMU_FEAT_HD)
 		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
 
+	if (smmu->features & ARM_SMMU_FEAT_BBML1)
+		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_BBML1;
+	else if (smmu->features & ARM_SMMU_FEAT_BBML2)
+		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_BBML2;
+
 	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
 	if (!pgtbl_ops)
 		return -ENOMEM;
@@ -3419,6 +3424,20 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 
 	/* IDR3 */
 	reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3);
+	switch (FIELD_GET(IDR3_BBML, reg)) {
+	case IDR3_BBML0:
+		break;
+	case IDR3_BBML1:
+		smmu->features |= ARM_SMMU_FEAT_BBML1;
+		break;
+	case IDR3_BBML2:
+		smmu->features |= ARM_SMMU_FEAT_BBML2;
+		break;
+	default:
+		dev_err(smmu->dev, "unknown/unsupported BBM behavior level\n");
+		return -ENXIO;
+	}
+
 	if (FIELD_GET(IDR3_RIL, reg))
 		smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3edcd31b046e..e3b6bdd292c9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -54,6 +54,10 @@
 #define IDR1_SIDSIZE			GENMASK(5, 0)
 
 #define ARM_SMMU_IDR3			0xc
+#define IDR3_BBML			GENMASK(12, 11)
+#define IDR3_BBML0			0
+#define IDR3_BBML1			1
+#define IDR3_BBML2			2
 #define IDR3_RIL			(1 << 10)
 
 #define ARM_SMMU_IDR5			0x14
@@ -613,6 +617,8 @@ struct arm_smmu_device {
 #define ARM_SMMU_FEAT_E2H		(1 << 18)
 #define ARM_SMMU_FEAT_HA		(1 << 19)
 #define ARM_SMMU_FEAT_HD		(1 << 20)
+#define ARM_SMMU_FEAT_BBML1		(1 << 21)
+#define ARM_SMMU_FEAT_BBML2		(1 << 22)
 	u32				features;
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH	(1 << 0)
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 81+ messages in thread


* [RFC PATCH v4 10/13] iommu/arm-smmu-v3: Realize switch_dirty_log iommu ops
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

This implements switch_dirty_log. To track dirty pages at a finer
granule, it invokes arm_smmu_split_block() when dirty logging
starts, and invokes arm_smmu_merge_page() to restore block
mappings when dirty logging stops.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
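Note: the run detection in arm_smmu_merge_page() below can be tried out in
user space. The following is a minimal sketch under stated assumptions, not
the driver code: struct sketch_table, SKETCH_PAGE_SIZE and the sketch_*
functions are hypothetical, a flat array stands in for ops->iova_to_phys(),
and iova and size are assumed to be page-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/* Toy table: phys[n] backs IOVA n * SKETCH_PAGE_SIZE (hypothetical). */
struct sketch_table {
	const uint64_t *phys;
	size_t npages;
};

/* Stand-in for ops->iova_to_phys(). */
uint64_t sketch_iova_to_phys(const struct sketch_table *t, uint64_t iova)
{
	size_t idx = (size_t)(iova / SKETCH_PAGE_SIZE);

	assert(idx < t->npages);
	return t->phys[idx];
}

/*
 * Length in bytes of the physically contiguous run starting at iova,
 * capped at size. Mirrors the inner while loop of arm_smmu_merge_page().
 */
size_t sketch_contig_run(const struct sketch_table *t, uint64_t iova,
			 size_t size)
{
	uint64_t next_phys = sketch_iova_to_phys(t, iova) + SKETCH_PAGE_SIZE;
	uint64_t next_iova = iova + SKETCH_PAGE_SIZE;
	size_t run = SKETCH_PAGE_SIZE;

	while (run < size && next_phys == sketch_iova_to_phys(t, next_iova)) {
		next_phys += SKETCH_PAGE_SIZE;
		next_iova += SKETCH_PAGE_SIZE;
		run += SKETCH_PAGE_SIZE;
	}
	return run;
}

/* Count runs longer than one page, i.e. the ranges worth merging. */
size_t sketch_count_merge_candidates(const struct sketch_table *t,
				     uint64_t iova, size_t size)
{
	size_t candidates = 0;

	assert(size % SKETCH_PAGE_SIZE == 0);
	while (size) {
		size_t run = sketch_contig_run(t, iova, size);

		if (run != SKETCH_PAGE_SIZE)
			candidates++;
		iova += run;
		size -= run;
	}
	return candidates;
}
```

With six pages backed by 0x10000, 0x11000, 0x12000, 0x40000, 0x80000 and
0x81000, the sketch finds a three-page run, a single page and a two-page run,
so two ranges would be handed to the merge helper.
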
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 142 ++++++++++++++++++++
 drivers/iommu/iommu.c                       |   5 +-
 include/linux/iommu.h                       |   2 +
 3 files changed, 147 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 3a2dc3177180..6de81d6ab652 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2580,6 +2580,147 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
 	return ret;
 }
 
+static int arm_smmu_split_block(struct iommu_domain *domain,
+				unsigned long iova, size_t size)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	size_t handled_size;
+
+	if (!(smmu->features & (ARM_SMMU_FEAT_BBML1 | ARM_SMMU_FEAT_BBML2))) {
+		dev_err(smmu->dev, "BBML1/2 not supported, can't split block\n");
+		return -ENODEV;
+	}
+	if (!ops || !ops->split_block) {
+		pr_err("io-pgtable doesn't implement split_block\n");
+		return -ENODEV;
+	}
+
+	handled_size = ops->split_block(ops, iova, size);
+	if (handled_size != size) {
+		pr_err("split block failed\n");
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int __arm_smmu_merge_page(struct iommu_domain *domain,
+				 unsigned long iova, phys_addr_t paddr,
+				 size_t size, int prot)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	size_t handled_size;
+
+	if (!ops || !ops->merge_page) {
+		pr_err("io-pgtable doesn't implement merge_page\n");
+		return -ENODEV;
+	}
+
+	while (size) {
+		size_t pgsize = iommu_pgsize(domain, iova | paddr, size);
+
+		handled_size = ops->merge_page(ops, iova, paddr, pgsize, prot);
+		if (handled_size != pgsize) {
+			pr_err("merge page failed\n");
+			return -EFAULT;
+		}
+
+		pr_debug("merge handled: iova 0x%lx pa %pa size 0x%zx\n",
+			 iova, &paddr, pgsize);
+
+		iova += pgsize;
+		paddr += pgsize;
+		size -= pgsize;
+	}
+
+	return 0;
+}
+
+static int arm_smmu_merge_page(struct iommu_domain *domain, unsigned long iova,
+			       size_t size, int prot)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	phys_addr_t phys;
+	dma_addr_t p, i;
+	size_t cont_size;
+	int ret = 0;
+
+	if (!(smmu->features & (ARM_SMMU_FEAT_BBML1 | ARM_SMMU_FEAT_BBML2))) {
+		dev_err(smmu->dev, "BBML1/2 not supported, can't merge page\n");
+		return -ENODEV;
+	}
+
+	if (!ops || !ops->iova_to_phys)
+		return -ENODEV;
+
+	while (size) {
+		phys = ops->iova_to_phys(ops, iova);
+		cont_size = PAGE_SIZE;
+		p = phys + cont_size;
+		i = iova + cont_size;
+
+		while (cont_size < size && p == ops->iova_to_phys(ops, i)) {
+			p += PAGE_SIZE;
+			i += PAGE_SIZE;
+			cont_size += PAGE_SIZE;
+		}
+
+		if (cont_size != PAGE_SIZE) {
+			ret = __arm_smmu_merge_page(domain, iova, phys,
+						    cont_size, prot);
+			if (ret)
+				break;
+		}
+
+		iova += cont_size;
+		size -= cont_size;
+	}
+
+	return ret;
+}
+
+static int arm_smmu_switch_dirty_log(struct iommu_domain *domain, bool enable,
+				     unsigned long iova, size_t size, int prot)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+	if (!(smmu->features & ARM_SMMU_FEAT_HD))
+		return -ENODEV;
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+		return -EINVAL;
+
+	if (enable) {
+		/*
+		 * For SMMU, hardware dirty management is always enabled if the
+		 * hardware supports HTTU HD. The action to start dirty logging
+		 * is splitting block mappings.
+		 *
+		 * We don't return an error even if the split operation fails,
+		 * as we can still track dirty state at block granule, which is
+		 * a much better choice than the full dirty policy.
+		 */
+		arm_smmu_split_block(domain, iova, size);
+	} else {
+		/*
+		 * For SMMU, hardware dirty management is always enabled if the
+		 * hardware supports HTTU HD. The action to stop dirty logging
+		 * is merging page mappings.
+		 *
+		 * We don't return an error even if the merge operation fails,
+		 * as it only affects the performance of DMA transactions.
+		 */
+		arm_smmu_merge_page(domain, iova, size, prot);
+	}
+
+	return 0;
+}
+
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
 	return iommu_fwspec_add_ids(dev, args->args, 1);
@@ -2678,6 +2819,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.release_device		= arm_smmu_release_device,
 	.device_group		= arm_smmu_device_group,
 	.enable_nesting		= arm_smmu_enable_nesting,
+	.switch_dirty_log	= arm_smmu_switch_dirty_log,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0d15620d1e90..bb19df2317ed 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2375,8 +2375,8 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 }
 EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
 
-static size_t iommu_pgsize(struct iommu_domain *domain,
-			   unsigned long addr_merge, size_t size)
+size_t iommu_pgsize(struct iommu_domain *domain,
+		    unsigned long addr_merge, size_t size)
 {
 	unsigned int pgsize_idx;
 	size_t pgsize;
@@ -2406,6 +2406,7 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
 
 	return pgsize;
 }
+EXPORT_SYMBOL_GPL(iommu_pgsize);
 
 static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
 		       phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e0e40dda974d..0a77db4f397f 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -427,6 +427,8 @@ extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
 				   struct device *dev, ioasid_t pasid);
 extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
 extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
+extern size_t iommu_pgsize(struct iommu_domain *domain,
+			   unsigned long addr_merge, size_t size);
 extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
 		     phys_addr_t paddr, size_t size, int prot);
 extern int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
-- 
2.19.1
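
The patch above exports iommu_pgsize() so that __arm_smmu_merge_page() can
walk a contiguous range in the largest steps the domain supports. The hunks
show only the signature and the final return, so the following user-space
sketch is an assumption about what such a helper computes, not a copy of the
kernel function: sketch_pgsize(), sketch_fls64() and sketch_ffs64() are
hypothetical names, and the page-size bitmap is passed in directly instead of
being read from the domain.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit; x must be non-zero. */
unsigned sketch_fls64(uint64_t x)
{
	unsigned i = 0;

	while (x >>= 1)
		i++;
	return i;
}

/* Index of the lowest set bit; x must be non-zero. */
unsigned sketch_ffs64(uint64_t x)
{
	unsigned i = 0;

	while (!(x & 1)) {
		x >>= 1;
		i++;
	}
	return i;
}

/*
 * Pick the largest supported page size that fits into size and respects the
 * alignment of addr_merge (iova | paddr). Each set bit n in pgsize_bitmap
 * means that a page size of 2^n bytes is supported.
 */
uint64_t sketch_pgsize(uint64_t pgsize_bitmap, uint64_t addr_merge,
		       uint64_t size)
{
	unsigned idx;
	uint64_t candidates;

	assert(size != 0);
	idx = sketch_fls64(size);		/* largest power of two <= size */
	if (addr_merge) {
		unsigned align = sketch_ffs64(addr_merge);

		if (align < idx)
			idx = align;		/* alignment limits the size */
	}

	/* Every size up to and including 2^idx, filtered by hardware support. */
	candidates = (idx >= 63) ? ~0ull : ((1ull << (idx + 1)) - 1);
	candidates &= pgsize_bitmap;
	assert(candidates != 0);

	return 1ull << sketch_fls64(candidates);
}
```

With a bitmap advertising 4K, 2M and 1G pages, a 4M range at a 2M-aligned
address is covered in 2M steps, while the same range at an address that is
only 4K-aligned falls back to 4K steps.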


^ permalink raw reply	[flat|nested] 81+ messages in thread



* [RFC PATCH v4 11/13] iommu/arm-smmu-v3: Realize sync_dirty_log iommu ops
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

This implements the sync_dirty_log iommu op on top of the
sync_dirty_log io-pgtable op.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
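Note: arm_smmu_sync_dirty_log() passes base_iova and bitmap_pgshift through
to the io-pgtable op, which is not part of this patch. The following
user-space sketch shows one way a dirty range could map onto bitmap bits
under those two parameters. It is an assumption for illustration only:
sketch_mark_dirty(), sketch_test_dirty() and SKETCH_BITS_PER_LONG are
hypothetical and do not come from the series.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Set one bit per 2^bitmap_pgshift bytes of the dirty range
 * [iova, iova + size). Bit 0 of the bitmap corresponds to base_iova.
 * A range smaller than one bitmap page sets no bit in this sketch.
 */
void sketch_mark_dirty(unsigned long *bitmap, unsigned long base_iova,
		       unsigned long bitmap_pgshift,
		       unsigned long iova, size_t size)
{
	unsigned long first, nbits, bit;

	assert(iova >= base_iova);
	first = (iova - base_iova) >> bitmap_pgshift;
	nbits = size >> bitmap_pgshift;

	for (bit = first; bit < first + nbits; bit++)
		bitmap[bit / SKETCH_BITS_PER_LONG] |=
			1UL << (bit % SKETCH_BITS_PER_LONG);
}

/* Return 1 if the given bit is set in the bitmap, 0 otherwise. */
int sketch_test_dirty(const unsigned long *bitmap, unsigned long bit)
{
	return (bitmap[bit / SKETCH_BITS_PER_LONG] >>
		(bit % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

For base_iova 0x100000 and a shift of 12, a dirty 12K range at IOVA 0x102000
sets bits 2, 3 and 4. A mapping that is still a 2M block would set 512 bits
at once, which is why switch_dirty_log splits blocks first.
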
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 30 +++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6de81d6ab652..3d3c0f8e2446 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2721,6 +2721,35 @@ static int arm_smmu_switch_dirty_log(struct iommu_domain *domain, bool enable,
 	return 0;
 }
 
+static int arm_smmu_sync_dirty_log(struct iommu_domain *domain,
+				   unsigned long iova, size_t size,
+				   unsigned long *bitmap,
+				   unsigned long base_iova,
+				   unsigned long bitmap_pgshift)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+	if (!(smmu->features & ARM_SMMU_FEAT_HD))
+		return -ENODEV;
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+		return -EINVAL;
+
+	if (!ops || !ops->sync_dirty_log) {
+		pr_err("io-pgtable doesn't implement sync_dirty_log\n");
+		return -ENODEV;
+	}
+
+	/*
+	 * Flush iotlb to ensure all inflight transactions are completed.
+	 * See doc IHI0070Da 3.13.4 "HTTU behavior summary".
+	 */
+	arm_smmu_flush_iotlb_all(domain);
+	return ops->sync_dirty_log(ops, iova, size, bitmap, base_iova,
+				   bitmap_pgshift);
+}
+
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
 	return iommu_fwspec_add_ids(dev, args->args, 1);
@@ -2820,6 +2849,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.device_group		= arm_smmu_device_group,
 	.enable_nesting		= arm_smmu_enable_nesting,
 	.switch_dirty_log	= arm_smmu_switch_dirty_log,
+	.sync_dirty_log		= arm_smmu_sync_dirty_log,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
-- 
2.19.1


^ permalink raw reply	[flat|nested] 81+ messages in thread


* [RFC PATCH v4 11/13] iommu/arm-smmu-v3: Realize sync_dirty_log iommu ops
@ 2021-05-07 10:22   ` Keqian Zhu
  0 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

This realizes sync_dirty_log iommu ops based on sync_dirty_log
io-pgtable ops.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 30 +++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6de81d6ab652..3d3c0f8e2446 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2721,6 +2721,35 @@ static int arm_smmu_switch_dirty_log(struct iommu_domain *domain, bool enable,
 	return 0;
 }
 
+static int arm_smmu_sync_dirty_log(struct iommu_domain *domain,
+				   unsigned long iova, size_t size,
+				   unsigned long *bitmap,
+				   unsigned long base_iova,
+				   unsigned long bitmap_pgshift)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+	if (!(smmu->features & ARM_SMMU_FEAT_HD))
+		return -ENODEV;
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+		return -EINVAL;
+
+	if (!ops || !ops->sync_dirty_log) {
+		pr_err("io-pgtable don't realize sync dirty log\n");
+		return -ENODEV;
+	}
+
+	/*
+	 * Flush iotlb to ensure all inflight transactions are completed.
+	 * See doc IHI0070Da 3.13.4 "HTTU behavior summary".
+	 */
+	arm_smmu_flush_iotlb_all(domain);
+	return ops->sync_dirty_log(ops, iova, size, bitmap, base_iova,
+				   bitmap_pgshift);
+}
+
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
 	return iommu_fwspec_add_ids(dev, args->args, 1);
@@ -2820,6 +2849,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.device_group		= arm_smmu_device_group,
 	.enable_nesting		= arm_smmu_enable_nesting,
 	.switch_dirty_log	= arm_smmu_switch_dirty_log,
+	.sync_dirty_log		= arm_smmu_sync_dirty_log,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
-- 
2.19.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 12/13] iommu/arm-smmu-v3: Realize clear_dirty_log iommu ops
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

Implement the clear_dirty_log iommu op on top of the clear_dirty_log
io-pgtable op.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 25 +++++++++++++++++++++
 1 file changed, 25 insertions(+)
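
For readers following the bitmap arguments: the core
iommu_clear_dirty_log() in patch 01/13 turns (iova, size, base_iova,
bitmap_pgshift) into a bit range and then handles each run of set bits
as one IOVA range. The following is only a userspace sketch of that index
arithmetic, not kernel code. test_bit_ul(), struct iova_range and
collect_set_regions() are hypothetical names made up for illustration,
and the bit indexes are size_t here rather than int.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: test one bit of a bitmap stored as unsigned longs. */
static int test_bit_ul(const unsigned long *bitmap, size_t nr)
{
	return (int)((bitmap[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL);
}

/* Hypothetical record of one contiguous IOVA range whose bits are set. */
struct iova_range {
	unsigned long iova;
	size_t size;
};

/*
 * Walk the bits that cover [iova, iova + size) and report every run of
 * set bits as one IOVA range. Bit 0 of the bitmap corresponds to
 * base_iova and each bit covers (1 << bitmap_pgshift) bytes.
 * Returns the number of runs found; at most max_out are stored in out.
 */
static size_t collect_set_regions(const unsigned long *bitmap,
				  unsigned long iova, size_t size,
				  unsigned long base_iova,
				  unsigned long bitmap_pgshift,
				  struct iova_range *out, size_t max_out)
{
	size_t start, end, bit, n = 0;

	assert(iova >= base_iova);
	start = (iova - base_iova) >> bitmap_pgshift;
	end = start + (size >> bitmap_pgshift);

	bit = start;
	while (bit < end) {
		size_t rs, re;

		/* Skip clear bits up to the next set bit. */
		while (bit < end && !test_bit_ul(bitmap, bit))
			bit++;
		if (bit == end)
			break;
		rs = bit;

		/* Extend the run over consecutive set bits. */
		while (bit < end && test_bit_ul(bitmap, bit))
			bit++;
		re = bit;

		if (n < max_out) {
			out[n].iova = base_iova +
				      ((unsigned long)rs << bitmap_pgshift);
			out[n].size = (re - rs) << bitmap_pgshift;
		}
		n++;
	}

	return n;
}
```

With base_iova = 0x100000, bitmap_pgshift = 12 and a bitmap word of 0x67,
a 16-page query starting at base_iova gives two ranges: 3 pages at
0x100000 and 2 pages at 0x105000.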

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 3d3c0f8e2446..9b4739247dbb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2750,6 +2750,30 @@ static int arm_smmu_sync_dirty_log(struct iommu_domain *domain,
 				   bitmap_pgshift);
 }
 
+static int arm_smmu_clear_dirty_log(struct iommu_domain *domain,
+				    unsigned long iova, size_t size,
+				    unsigned long *bitmap,
+				    unsigned long base_iova,
+				    unsigned long bitmap_pgshift)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+	if (!(smmu->features & ARM_SMMU_FEAT_HD))
+		return -ENODEV;
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+		return -EINVAL;
+
+	if (!ops || !ops->clear_dirty_log) {
+		pr_err("io-pgtable doesn't implement clear dirty log\n");
+		return -ENODEV;
+	}
+
+	return ops->clear_dirty_log(ops, iova, size, bitmap, base_iova,
+				    bitmap_pgshift);
+}
+
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
 	return iommu_fwspec_add_ids(dev, args->args, 1);
@@ -2850,6 +2874,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.enable_nesting		= arm_smmu_enable_nesting,
 	.switch_dirty_log	= arm_smmu_switch_dirty_log,
 	.sync_dirty_log		= arm_smmu_sync_dirty_log,
+	.clear_dirty_log	= arm_smmu_clear_dirty_log,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
-- 
2.19.1


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [RFC PATCH v4 13/13] iommu/arm-smmu-v3: Realize support_dirty_log iommu ops
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-07 10:22   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-07 10:22 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

From: Kunkun Jiang <jiangkunkun@huawei.com>

We have implemented the interfaces required to support IOMMU dirty log
tracking. The last step is to report this feature to upper-layer users,
so that they can build higher-level policy on top of it. For ARM SMMUv3,
the feature is equivalent to ARM_SMMU_FEAT_HD.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9b4739247dbb..59d11f084199 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2684,6 +2684,13 @@ static int arm_smmu_merge_page(struct iommu_domain *domain, unsigned long iova,
 	return ret;
 }
 
+static bool arm_smmu_support_dirty_log(struct iommu_domain *domain)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	return !!(smmu_domain->smmu->features & ARM_SMMU_FEAT_HD);
+}
+
 static int arm_smmu_switch_dirty_log(struct iommu_domain *domain, bool enable,
 				     unsigned long iova, size_t size, int prot)
 {
@@ -2872,6 +2879,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.release_device		= arm_smmu_release_device,
 	.device_group		= arm_smmu_device_group,
 	.enable_nesting		= arm_smmu_enable_nesting,
+	.support_dirty_log	= arm_smmu_support_dirty_log,
 	.switch_dirty_log	= arm_smmu_switch_dirty_log,
 	.sync_dirty_log		= arm_smmu_sync_dirty_log,
 	.clear_dirty_log	= arm_smmu_clear_dirty_log,
-- 
2.19.1


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-07 10:21   ` Keqian Zhu
  (?)
@ 2021-05-08  3:46     ` Lu Baolu
  -1 siblings, 0 replies; 81+ messages in thread
From: Lu Baolu @ 2021-05-08  3:46 UTC (permalink / raw)
  To: Keqian Zhu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: baolu.lu, Alex Williamson, Kirti Wankhede, Cornelia Huck,
	Jonathan Cameron, wanghaibin.wang, jiangkunkun, yuzenghui,
	lushenming

Hi Keqian,

On 5/7/21 6:21 PM, Keqian Zhu wrote:
> Some types of IOMMU are capable of tracking DMA dirty log, such as
> ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
> dirty log tracking framework in the IOMMU base layer.
> 
> Four new essential interfaces are added, and we maintain the status
> of dirty log tracking in iommu_domain.
> 1. iommu_support_dirty_log: Check whether domain supports dirty log tracking
> 2. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
> 3. iommu_sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
> 4. iommu_clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
> 
> Note: Don't concurrently call these interfaces with other ops that
> access underlying page table.
> 
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> ---
>   drivers/iommu/iommu.c        | 201 +++++++++++++++++++++++++++++++++++
>   include/linux/iommu.h        |  63 +++++++++++
>   include/trace/events/iommu.h |  63 +++++++++++
>   3 files changed, 327 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 808ab70d5df5..0d15620d1e90 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1940,6 +1940,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>   	domain->type = type;
>   	/* Assume all sizes by default; the driver may override this later */
>   	domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
> +	mutex_init(&domain->switch_log_lock);
>   
>   	return domain;
>   }
> @@ -2703,6 +2704,206 @@ int iommu_set_pgtable_quirks(struct iommu_domain *domain,
>   }
>   EXPORT_SYMBOL_GPL(iommu_set_pgtable_quirks);
>   
> +bool iommu_support_dirty_log(struct iommu_domain *domain)
> +{
> +	const struct iommu_ops *ops = domain->ops;
> +
> +	return ops->support_dirty_log && ops->support_dirty_log(domain);
> +}
> +EXPORT_SYMBOL_GPL(iommu_support_dirty_log);

I suppose this interface asks the vendor IOMMU driver to check whether
each device/IOMMU in the domain supports dirty bit tracking. But what
happens if new devices with a different tracking capability are added
afterward?

To make things simple, is it possible to support this tracking only when
all underlying IOMMUs support dirty bit tracking?

Or, a crazier idea: we don't need to check this capability at all. If
dirty bit tracking is not supported by hardware, just mark all pages
dirty?
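
A rough userspace sketch of that last idea, assuming a caller that falls
back to reporting the whole range as dirty when no hardware sync is
available. hw_sync_fn, mark_range_dirty() and sync_dirty_or_mark_all()
are hypothetical names, not part of this patch; the bitmap layout follows
the (base_iova, bitmap_pgshift) convention of iommu_sync_dirty_log().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: set one bit of a bitmap stored as unsigned longs. */
static void set_bit_ul(unsigned long *bitmap, size_t nr)
{
	bitmap[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

/* Conservatively mark every page in [iova, iova + size) as dirty. */
static void mark_range_dirty(unsigned long *bitmap,
			     unsigned long iova, size_t size,
			     unsigned long base_iova,
			     unsigned long bitmap_pgshift)
{
	size_t bit = (iova - base_iova) >> bitmap_pgshift;
	size_t end = bit + (size >> bitmap_pgshift);

	for (; bit < end; bit++)
		set_bit_ul(bitmap, bit);
}

/*
 * Hypothetical stand-in for a driver's sync op. A NULL pointer models
 * hardware that cannot track dirty pages.
 */
typedef int (*hw_sync_fn)(unsigned long *bitmap,
			  unsigned long iova, size_t size,
			  unsigned long base_iova,
			  unsigned long bitmap_pgshift);

/*
 * Use the hardware sync when there is one; otherwise report the whole
 * range as dirty, which is always safe but may cause extra copying.
 */
static int sync_dirty_or_mark_all(hw_sync_fn hw_sync, unsigned long *bitmap,
				  unsigned long iova, size_t size,
				  unsigned long base_iova,
				  unsigned long bitmap_pgshift)
{
	assert(iova >= base_iova);

	if (hw_sync)
		return hw_sync(bitmap, iova, size, base_iova, bitmap_pgshift);

	mark_range_dirty(bitmap, iova, size, base_iova, bitmap_pgshift);
	return 0;
}
```

The trade-off is that the caller no longer learns whether the bitmap is
precise, so it cannot choose a different policy for untracked domains.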

> +
> +int iommu_switch_dirty_log(struct iommu_domain *domain, bool enable,
> +			   unsigned long iova, size_t size, int prot)
> +{
> +	const struct iommu_ops *ops = domain->ops;
> +	unsigned long orig_iova = iova;
> +	unsigned int min_pagesz;
> +	size_t orig_size = size;
> +	bool flush = false;
> +	int ret = 0;
> +
> +	if (unlikely(!ops->switch_dirty_log))
> +		return -ENODEV;
> +
> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> +	if (!IS_ALIGNED(iova | size, min_pagesz)) {
> +		pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
> +		       iova, size, min_pagesz);
> +		return -EINVAL;
> +	}
> +
> +	mutex_lock(&domain->switch_log_lock);
> +	if (enable && domain->dirty_log_tracking) {
> +		ret = -EBUSY;
> +		goto out;
> +	} else if (!enable && !domain->dirty_log_tracking) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	pr_debug("switch_dirty_log %s for: iova 0x%lx size 0x%zx\n",
> +		 enable ? "enable" : "disable", iova, size);
> +
> +	while (size) {
> +		size_t pgsize = iommu_pgsize(domain, iova, size);
> +
> +		flush = true;
> +		ret = ops->switch_dirty_log(domain, enable, iova, pgsize, prot);

A callback for every minimal page is expensive. How about using (pagesize,
count), so that all pages with the same page size can be handled in a
single indirect call? I remember commenting on this during the last
review, but I don't mind doing it again.
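
To illustrate the (pagesize, count) idea, here is a userspace sketch, not
a proposed kernel interface. pick_pgsize(), for_each_pg_run() and
record_run() are hypothetical helpers; pick_pgsize() only approximates
what iommu_pgsize() does (largest supported size that fits both the
remaining length and the IOVA alignment).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Largest page size from pgsize_bitmap that fits in size and to which
 * iova is aligned. Returns 0 if no supported size fits.
 */
static size_t pick_pgsize(unsigned long pgsize_bitmap,
			  unsigned long iova, size_t size)
{
	size_t best = 0;
	unsigned int shift;

	for (shift = 0; shift < sizeof(unsigned long) * CHAR_BIT; shift++) {
		unsigned long pgsize = 1UL << shift;

		if (!(pgsize_bitmap & pgsize))
			continue;
		/* Larger sizes cannot fit or be aligned either, so stop. */
		if (pgsize > size || (iova & (pgsize - 1)))
			break;
		best = pgsize;
	}

	return best;
}

/* One batch: count pages of pgsize bytes each, starting at iova. */
struct pg_run {
	unsigned long iova;
	size_t pgsize;
	size_t count;
};

typedef int (*run_fn)(unsigned long iova, size_t pgsize, size_t count,
		      void *ctx);

/* Issue one callback per run of equal-sized pages, not one per page. */
static int for_each_pg_run(unsigned long pgsize_bitmap, unsigned long iova,
			   size_t size, run_fn fn, void *ctx)
{
	assert(fn != NULL);

	while (size) {
		size_t pgsize = pick_pgsize(pgsize_bitmap, iova, size);
		size_t count = 1;
		int ret;

		if (!pgsize)
			return -1;	/* range is not page aligned */

		/* Grow the run while the next page would use the same size. */
		while (count * pgsize < size &&
		       pick_pgsize(pgsize_bitmap, iova + count * pgsize,
				   size - count * pgsize) == pgsize)
			count++;

		ret = fn(iova, pgsize, count, ctx);
		if (ret)
			return ret;

		iova += count * pgsize;
		size -= count * pgsize;
	}

	return 0;
}

/* Example callback that just records the runs it is given. */
struct run_log {
	size_t nr;
	struct pg_run runs[8];
};

static int record_run(unsigned long iova, size_t pgsize, size_t count,
		      void *ctx)
{
	struct run_log *log = ctx;

	if (log->nr >= 8)
		return -1;
	log->runs[log->nr++] = (struct pg_run){ iova, pgsize, count };
	return 0;
}
```

With pgsize_bitmap = 4K | 2M, iova = 0x1ff000 and size = 0x403000, this
makes three calls (1 x 4K, 2 x 2M, 2 x 4K) where a per-page loop would
make five.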

Best regards,
baolu

> +		if (ret)
> +			break;
> +
> +		pr_debug("switch_dirty_log handled: iova 0x%lx size 0x%zx\n",
> +			 iova, pgsize);
> +
> +		iova += pgsize;
> +		size -= pgsize;
> +	}
> +
> +	if (flush)
> +		iommu_flush_iotlb_all(domain);
> +
> +	if (!ret) {
> +		domain->dirty_log_tracking = enable;
> +		trace_switch_dirty_log(orig_iova, orig_size, enable);
> +	}
> +out:
> +	mutex_unlock(&domain->switch_log_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_switch_dirty_log);
> +
> +int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova,
> +			 size_t size, unsigned long *bitmap,
> +			 unsigned long base_iova, unsigned long bitmap_pgshift)
> +{
> +	const struct iommu_ops *ops = domain->ops;
> +	unsigned long orig_iova = iova;
> +	unsigned int min_pagesz;
> +	size_t orig_size = size;
> +	int ret = 0;
> +
> +	if (unlikely(!ops->sync_dirty_log))
> +		return -ENODEV;
> +
> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> +	if (!IS_ALIGNED(iova | size, min_pagesz)) {
> +		pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
> +		       iova, size, min_pagesz);
> +		return -EINVAL;
> +	}
> +
> +	mutex_lock(&domain->switch_log_lock);
> +	if (!domain->dirty_log_tracking) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	pr_debug("sync_dirty_log for: iova 0x%lx size 0x%zx\n", iova, size);
> +
> +	while (size) {
> +		size_t pgsize = iommu_pgsize(domain, iova, size);
> +
> +		ret = ops->sync_dirty_log(domain, iova, pgsize,
> +					  bitmap, base_iova, bitmap_pgshift);
> +		if (ret)
> +			break;
> +
> +		pr_debug("sync_dirty_log handled: iova 0x%lx size 0x%zx\n",
> +			 iova, pgsize);
> +
> +		iova += pgsize;
> +		size -= pgsize;
> +	}
> +
> +	if (!ret)
> +		trace_sync_dirty_log(orig_iova, orig_size);
> +out:
> +	mutex_unlock(&domain->switch_log_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_sync_dirty_log);
> +
> +static int __iommu_clear_dirty_log(struct iommu_domain *domain,
> +				   unsigned long iova, size_t size,
> +				   unsigned long *bitmap,
> +				   unsigned long base_iova,
> +				   unsigned long bitmap_pgshift)
> +{
> +	const struct iommu_ops *ops = domain->ops;
> +	unsigned long orig_iova = iova;
> +	size_t orig_size = size;
> +	int ret = 0;
> +
> +	if (unlikely(!ops->clear_dirty_log))
> +		return -ENODEV;
> +
> +	pr_debug("clear_dirty_log for: iova 0x%lx size 0x%zx\n", iova, size);
> +
> +	while (size) {
> +		size_t pgsize = iommu_pgsize(domain, iova, size);
> +
> +		ret = ops->clear_dirty_log(domain, iova, pgsize, bitmap,
> +					   base_iova, bitmap_pgshift);
> +		if (ret)
> +			break;
> +
> +		pr_debug("clear_dirty_log handled: iova 0x%lx size 0x%zx\n",
> +			 iova, pgsize);
> +
> +		iova += pgsize;
> +		size -= pgsize;
> +	}
> +
> +	if (!ret)
> +		trace_clear_dirty_log(orig_iova, orig_size);
> +
> +	return ret;
> +}
> +
> +int iommu_clear_dirty_log(struct iommu_domain *domain,
> +			  unsigned long iova, size_t size,
> +			  unsigned long *bitmap, unsigned long base_iova,
> +			  unsigned long bitmap_pgshift)
> +{
> +	unsigned long riova, rsize;
> +	unsigned int min_pagesz;
> +	bool flush = false;
> +	int rs, re, start, end;
> +	int ret = 0;
> +
> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> +	if (!IS_ALIGNED(iova | size, min_pagesz)) {
> +		pr_err("unaligned: iova 0x%lx min_pagesz 0x%x\n",
> +		       iova, min_pagesz);
> +		return -EINVAL;
> +	}
> +
> +	mutex_lock(&domain->switch_log_lock);
> +	if (!domain->dirty_log_tracking) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	start = (iova - base_iova) >> bitmap_pgshift;
> +	end = start + (size >> bitmap_pgshift);
> +	bitmap_for_each_set_region(bitmap, rs, re, start, end) {
> +		flush = true;
> +		riova = base_iova + (rs << bitmap_pgshift);
> +		rsize = (re - rs) << bitmap_pgshift;
> +		ret = __iommu_clear_dirty_log(domain, riova, rsize, bitmap,
> +					      base_iova, bitmap_pgshift);
> +		if (ret)
> +			break;
> +	}
> +
> +	if (flush)
> +		iommu_flush_iotlb_all(domain);
> +out:
> +	mutex_unlock(&domain->switch_log_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_clear_dirty_log);
> +
>   void iommu_get_resv_regions(struct device *dev, struct list_head *list)
>   {
>   	const struct iommu_ops *ops = dev->bus->iommu_ops;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 32d448050bf7..e0e40dda974d 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -87,6 +87,8 @@ struct iommu_domain {
>   	void *handler_token;
>   	struct iommu_domain_geometry geometry;
>   	void *iova_cookie;
> +	bool dirty_log_tracking;
> +	struct mutex switch_log_lock;
>   };
>   
>   enum iommu_cap {
> @@ -193,6 +195,10 @@ struct iommu_iotlb_gather {
>    * @device_group: find iommu group for a particular device
>    * @enable_nesting: Enable nesting
>    * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
> + * @support_dirty_log: Check whether domain supports dirty log tracking
> + * @switch_dirty_log: Perform actions to start|stop dirty log tracking
> + * @sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
> + * @clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
>    * @get_resv_regions: Request list of reserved regions for a device
>    * @put_resv_regions: Free list of reserved regions for a device
>    * @apply_resv_region: Temporary helper call-back for iova reserved ranges
> @@ -245,6 +251,22 @@ struct iommu_ops {
>   	int (*set_pgtable_quirks)(struct iommu_domain *domain,
>   				  unsigned long quirks);
>   
> +	/*
> +	 * Track dirty log. Note: Don't concurrently call these interfaces with
> +	 * other ops that access underlying page table.
> +	 */
> +	bool (*support_dirty_log)(struct iommu_domain *domain);
> +	int (*switch_dirty_log)(struct iommu_domain *domain, bool enable,
> +				unsigned long iova, size_t size, int prot);
> +	int (*sync_dirty_log)(struct iommu_domain *domain,
> +			      unsigned long iova, size_t size,
> +			      unsigned long *bitmap, unsigned long base_iova,
> +			      unsigned long bitmap_pgshift);
> +	int (*clear_dirty_log)(struct iommu_domain *domain,
> +			       unsigned long iova, size_t size,
> +			       unsigned long *bitmap, unsigned long base_iova,
> +			       unsigned long bitmap_pgshift);
> +
>   	/* Request/Free a list of reserved regions for a device */
>   	void (*get_resv_regions)(struct device *dev, struct list_head *list);
>   	void (*put_resv_regions)(struct device *dev, struct list_head *list);
> @@ -475,6 +497,17 @@ extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
>   int iommu_enable_nesting(struct iommu_domain *domain);
>   int iommu_set_pgtable_quirks(struct iommu_domain *domain,
>   		unsigned long quirks);
> +extern bool iommu_support_dirty_log(struct iommu_domain *domain);
> +extern int iommu_switch_dirty_log(struct iommu_domain *domain, bool enable,
> +				  unsigned long iova, size_t size, int prot);
> +extern int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova,
> +				size_t size, unsigned long *bitmap,
> +				unsigned long base_iova,
> +				unsigned long bitmap_pgshift);
> +extern int iommu_clear_dirty_log(struct iommu_domain *domain, unsigned long iova,
> +				 size_t dma_size, unsigned long *bitmap,
> +				 unsigned long base_iova,
> +				 unsigned long bitmap_pgshift);
>   
>   void iommu_set_dma_strict(bool val);
>   bool iommu_get_dma_strict(struct iommu_domain *domain);
> @@ -848,6 +881,36 @@ static inline int iommu_set_pgtable_quirks(struct iommu_domain *domain,
>   	return 0;
>   }
>   
> +static inline bool iommu_support_dirty_log(struct iommu_domain *domain)
> +{
> +	return false;
> +}
> +
> +static inline int iommu_switch_dirty_log(struct iommu_domain *domain,
> +					 bool enable, unsigned long iova,
> +					 size_t size, int prot)
> +{
> +	return -EINVAL;
> +}
> +
> +static inline int iommu_sync_dirty_log(struct iommu_domain *domain,
> +				       unsigned long iova, size_t size,
> +				       unsigned long *bitmap,
> +				       unsigned long base_iova,
> +				       unsigned long pgshift)
> +{
> +	return -EINVAL;
> +}
> +
> +static inline int iommu_clear_dirty_log(struct iommu_domain *domain,
> +					unsigned long iova, size_t size,
> +					unsigned long *bitmap,
> +					unsigned long base_iova,
> +					unsigned long pgshift)
> +{
> +	return -EINVAL;
> +}
> +
>   static inline int iommu_device_register(struct iommu_device *iommu,
>   					const struct iommu_ops *ops,
>   					struct device *hwdev)
> diff --git a/include/trace/events/iommu.h b/include/trace/events/iommu.h
> index 72b4582322ff..6436d693d357 100644
> --- a/include/trace/events/iommu.h
> +++ b/include/trace/events/iommu.h
> @@ -129,6 +129,69 @@ TRACE_EVENT(unmap,
>   	)
>   );
>   
> +TRACE_EVENT(switch_dirty_log,
> +
> +	TP_PROTO(unsigned long iova, size_t size, bool enable),
> +
> +	TP_ARGS(iova, size, enable),
> +
> +	TP_STRUCT__entry(
> +		__field(u64, iova)
> +		__field(size_t, size)
> +		__field(bool, enable)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->iova = iova;
> +		__entry->size = size;
> +		__entry->enable = enable;
> +	),
> +
> +	TP_printk("IOMMU: iova=0x%016llx size=%zu enable=%u",
> +			__entry->iova, __entry->size, __entry->enable
> +	)
> +);
> +
> +TRACE_EVENT(sync_dirty_log,
> +
> +	TP_PROTO(unsigned long iova, size_t size),
> +
> +	TP_ARGS(iova, size),
> +
> +	TP_STRUCT__entry(
> +		__field(u64, iova)
> +		__field(size_t, size)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->iova = iova;
> +		__entry->size = size;
> +	),
> +
> +	TP_printk("IOMMU: iova=0x%016llx size=%zu", __entry->iova,
> +			__entry->size)
> +);
> +
> +TRACE_EVENT(clear_dirty_log,
> +
> +	TP_PROTO(unsigned long iova, size_t size),
> +
> +	TP_ARGS(iova, size),
> +
> +	TP_STRUCT__entry(
> +		__field(u64, iova)
> +		__field(size_t, size)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->iova = iova;
> +		__entry->size = size;
> +	),
> +
> +	TP_printk("IOMMU: iova=0x%016llx size=%zu", __entry->iova,
> +			__entry->size)
> +);
> +
>   DECLARE_EVENT_CLASS(iommu_error,
>   
>   	TP_PROTO(struct device *dev, unsigned long iova, int flags),
> 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
@ 2021-05-08  3:46     ` Lu Baolu
  0 siblings, 0 replies; 81+ messages in thread
From: Lu Baolu @ 2021-05-08  3:46 UTC (permalink / raw)
  To: Keqian Zhu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: jiangkunkun, Cornelia Huck, Kirti Wankhede, lushenming,
	Alex Williamson, wanghaibin.wang

Hi Keqian,

On 5/7/21 6:21 PM, Keqian Zhu wrote:
> Some types of IOMMU are capable of tracking DMA dirty log, such as
> ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
> dirty log tracking framework in the IOMMU base layer.
> 
> Four new essential interfaces are added, and we maintain the status
> of dirty log tracking in iommu_domain.
> 1. iommu_support_dirty_log: Check whether domain supports dirty log tracking
> 2. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
> 3. iommu_sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
> 4. iommu_clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
> 
> Note: Don't concurrently call these interfaces with other ops that
> access underlying page table.
> 
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> ---
>   drivers/iommu/iommu.c        | 201 +++++++++++++++++++++++++++++++++++
>   include/linux/iommu.h        |  63 +++++++++++
>   include/trace/events/iommu.h |  63 +++++++++++
>   3 files changed, 327 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 808ab70d5df5..0d15620d1e90 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1940,6 +1940,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>   	domain->type = type;
>   	/* Assume all sizes by default; the driver may override this later */
>   	domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
> +	mutex_init(&domain->switch_log_lock);
>   
>   	return domain;
>   }
> @@ -2703,6 +2704,206 @@ int iommu_set_pgtable_quirks(struct iommu_domain *domain,
>   }
>   EXPORT_SYMBOL_GPL(iommu_set_pgtable_quirks);
>   
> +bool iommu_support_dirty_log(struct iommu_domain *domain)
> +{
> +	const struct iommu_ops *ops = domain->ops;
> +
> +	return ops->support_dirty_log && ops->support_dirty_log(domain);
> +}
> +EXPORT_SYMBOL_GPL(iommu_support_dirty_log);

I suppose this interface is to ask the vendor IOMMU driver to check
whether each device/iommu in the domain supports dirty bit tracking.
But what will happen if new devices with different tracking capability
are added afterward?

To make things simple, is it possible to support this tracking only when
all underlying IOMMUs support dirty bit tracking?

Or, a crazier idea: we don't need to check this capability
at all. If the hardware doesn't support dirty bit tracking, can we just mark
all pages dirty?
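
A minimal userspace sketch of that fallback, for illustration only. The
names `struct fake_domain`, `hw_dirty_tracking` and `sync_dirty_fallback`
are hypothetical and do not appear in the patch; the bitmap arguments
mirror those of iommu_sync_dirty_log.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for an IOMMU domain; not the kernel's struct. */
struct fake_domain {
	bool hw_dirty_tracking;	/* true if the hardware can track dirty pages */
};

static void set_bit_ul(unsigned long *bitmap, size_t nr)
{
	bitmap[nr / BITS_PER_ULONG] |= 1UL << (nr % BITS_PER_ULONG);
}

/*
 * Report the dirty state of [iova, iova + size) into bitmap.
 * Returns the number of pages this call reported dirty.
 * Without hardware tracking, every page in the range is reported dirty.
 */
static size_t sync_dirty_fallback(const struct fake_domain *d,
				  unsigned long iova, size_t size,
				  unsigned long *bitmap,
				  unsigned long base_iova,
				  unsigned long pgshift)
{
	size_t first = (iova - base_iova) >> pgshift;
	size_t npages = size >> pgshift;
	size_t i;

	assert(iova >= base_iova);

	if (d->hw_dirty_tracking)
		return 0;	/* a real driver would walk its page table here */

	for (i = 0; i < npages; i++)
		set_bit_ul(bitmap, first + i);
	return npages;
}
```

The trade-off is that the caller never sees an error, but a domain without
hardware support reports the whole range dirty on every sync.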

> +
> +int iommu_switch_dirty_log(struct iommu_domain *domain, bool enable,
> +			   unsigned long iova, size_t size, int prot)
> +{
> +	const struct iommu_ops *ops = domain->ops;
> +	unsigned long orig_iova = iova;
> +	unsigned int min_pagesz;
> +	size_t orig_size = size;
> +	bool flush = false;
> +	int ret = 0;
> +
> +	if (unlikely(!ops->switch_dirty_log))
> +		return -ENODEV;
> +
> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> +	if (!IS_ALIGNED(iova | size, min_pagesz)) {
> +		pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
> +		       iova, size, min_pagesz);
> +		return -EINVAL;
> +	}
> +
> +	mutex_lock(&domain->switch_log_lock);
> +	if (enable && domain->dirty_log_tracking) {
> +		ret = -EBUSY;
> +		goto out;
> +	} else if (!enable && !domain->dirty_log_tracking) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	pr_debug("switch_dirty_log %s for: iova 0x%lx size 0x%zx\n",
> +		 enable ? "enable" : "disable", iova, size);
> +
> +	while (size) {
> +		size_t pgsize = iommu_pgsize(domain, iova, size);
> +
> +		flush = true;
> +		ret = ops->switch_dirty_log(domain, enable, iova, pgsize, prot);

A callback per minimal page is expensive. How about using (pagesize,
count), so that all pages with the same page size can be handled in a
single indirect call? I remember raising this in the last review,
but I don't mind doing it again.
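
To illustrate the (pagesize, count) idea, here is a rough userspace sketch.
`run_cb`, `pick_pgsize`, `for_each_page_run`, `struct run_log` and
`record_run` are hypothetical names, not part of the patch or the IOMMU
core; `pick_pgsize` only approximates what a page-size helper would choose.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical batched callback: one call covers pgcount pages of pgsize. */
typedef int (*run_cb)(unsigned long iova, size_t pgsize, size_t pgcount,
		      void *opaque);

/*
 * Largest page size in pgsize_bitmap that fits both the alignment of iova
 * and the remaining size. Returns 0 if not even the smallest size fits.
 */
static size_t pick_pgsize(unsigned long pgsize_bitmap, unsigned long iova,
			  size_t size)
{
	size_t best = 0;
	unsigned int bit;

	for (bit = 0; bit < sizeof(unsigned long) * CHAR_BIT; bit++) {
		unsigned long pgsz = 1UL << bit;

		if (pgsz > size || (iova & (pgsz - 1)))
			break;	/* larger sizes cannot fit either */
		if (pgsize_bitmap & pgsz)
			best = pgsz;
	}
	return best;
}

/* Split [iova, iova + size) into runs of equal page size, one call per run. */
static int for_each_page_run(unsigned long pgsize_bitmap, unsigned long iova,
			     size_t size, run_cb cb, void *opaque)
{
	while (size) {
		size_t pgsize = pick_pgsize(pgsize_bitmap, iova, size);
		size_t count = 1;
		int ret;

		if (!pgsize)
			return -1;	/* not aligned to the smallest page size */

		/* Extend the run while the next page would get the same size. */
		while (count * pgsize < size &&
		       pick_pgsize(pgsize_bitmap, iova + count * pgsize,
				   size - count * pgsize) == pgsize)
			count++;

		ret = cb(iova, pgsize, count, opaque);
		if (ret)
			return ret;

		iova += count * pgsize;
		size -= count * pgsize;
	}
	return 0;
}

/* Example callback that only counts calls and pages. */
struct run_log {
	size_t calls;	/* number of indirect calls made */
	size_t pages;	/* total pages covered by those calls */
};

static int record_run(unsigned long iova, size_t pgsize, size_t pgcount,
		      void *opaque)
{
	struct run_log *log = opaque;

	(void)iova;
	(void)pgsize;
	log->calls++;
	log->pages += pgcount;
	return 0;
}
```

With 4K and 2M page sizes, a range made of one 4K page, two 2M pages and
two 4K pages is covered by three indirect calls instead of five.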

Best regards,
baolu

> +		if (ret)
> +			break;
> +
> +		pr_debug("switch_dirty_log handled: iova 0x%lx size 0x%zx\n",
> +			 iova, pgsize);
> +
> +		iova += pgsize;
> +		size -= pgsize;
> +	}
> +
> +	if (flush)
> +		iommu_flush_iotlb_all(domain);
> +
> +	if (!ret) {
> +		domain->dirty_log_tracking = enable;
> +		trace_switch_dirty_log(orig_iova, orig_size, enable);
> +	}
> +out:
> +	mutex_unlock(&domain->switch_log_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_switch_dirty_log);
> +
> +int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova,
> +			 size_t size, unsigned long *bitmap,
> +			 unsigned long base_iova, unsigned long bitmap_pgshift)
> +{
> +	const struct iommu_ops *ops = domain->ops;
> +	unsigned long orig_iova = iova;
> +	unsigned int min_pagesz;
> +	size_t orig_size = size;
> +	int ret = 0;
> +
> +	if (unlikely(!ops->sync_dirty_log))
> +		return -ENODEV;
> +
> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> +	if (!IS_ALIGNED(iova | size, min_pagesz)) {
> +		pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
> +		       iova, size, min_pagesz);
> +		return -EINVAL;
> +	}
> +
> +	mutex_lock(&domain->switch_log_lock);
> +	if (!domain->dirty_log_tracking) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	pr_debug("sync_dirty_log for: iova 0x%lx size 0x%zx\n", iova, size);
> +
> +	while (size) {
> +		size_t pgsize = iommu_pgsize(domain, iova, size);
> +
> +		ret = ops->sync_dirty_log(domain, iova, pgsize,
> +					  bitmap, base_iova, bitmap_pgshift);
> +		if (ret)
> +			break;
> +
> +		pr_debug("sync_dirty_log handled: iova 0x%lx size 0x%zx\n",
> +			 iova, pgsize);
> +
> +		iova += pgsize;
> +		size -= pgsize;
> +	}
> +
> +	if (!ret)
> +		trace_sync_dirty_log(orig_iova, orig_size);
> +out:
> +	mutex_unlock(&domain->switch_log_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_sync_dirty_log);
> +
> +static int __iommu_clear_dirty_log(struct iommu_domain *domain,
> +				   unsigned long iova, size_t size,
> +				   unsigned long *bitmap,
> +				   unsigned long base_iova,
> +				   unsigned long bitmap_pgshift)
> +{
> +	const struct iommu_ops *ops = domain->ops;
> +	unsigned long orig_iova = iova;
> +	size_t orig_size = size;
> +	int ret = 0;
> +
> +	if (unlikely(!ops->clear_dirty_log))
> +		return -ENODEV;
> +
> +	pr_debug("clear_dirty_log for: iova 0x%lx size 0x%zx\n", iova, size);
> +
> +	while (size) {
> +		size_t pgsize = iommu_pgsize(domain, iova, size);
> +
> +		ret = ops->clear_dirty_log(domain, iova, pgsize, bitmap,
> +					   base_iova, bitmap_pgshift);
> +		if (ret)
> +			break;
> +
> +		pr_debug("clear_dirty_log handled: iova 0x%lx size 0x%zx\n",
> +			 iova, pgsize);
> +
> +		iova += pgsize;
> +		size -= pgsize;
> +	}
> +
> +	if (!ret)
> +		trace_clear_dirty_log(orig_iova, orig_size);
> +
> +	return ret;
> +}
> +
> +int iommu_clear_dirty_log(struct iommu_domain *domain,
> +			  unsigned long iova, size_t size,
> +			  unsigned long *bitmap, unsigned long base_iova,
> +			  unsigned long bitmap_pgshift)
> +{
> +	unsigned long riova, rsize;
> +	unsigned int min_pagesz;
> +	bool flush = false;
> +	int rs, re, start, end;
> +	int ret = 0;
> +
> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> +	if (!IS_ALIGNED(iova | size, min_pagesz)) {
> +		pr_err("unaligned: iova 0x%lx min_pagesz 0x%x\n",
> +		       iova, min_pagesz);
> +		return -EINVAL;
> +	}
> +
> +	mutex_lock(&domain->switch_log_lock);
> +	if (!domain->dirty_log_tracking) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	start = (iova - base_iova) >> bitmap_pgshift;
> +	end = start + (size >> bitmap_pgshift);
> +	bitmap_for_each_set_region(bitmap, rs, re, start, end) {
> +		flush = true;
> +		riova = base_iova + (rs << bitmap_pgshift);
> +		rsize = (re - rs) << bitmap_pgshift;
> +		ret = __iommu_clear_dirty_log(domain, riova, rsize, bitmap,
> +					      base_iova, bitmap_pgshift);
> +		if (ret)
> +			break;
> +	}
> +
> +	if (flush)
> +		iommu_flush_iotlb_all(domain);
> +out:
> +	mutex_unlock(&domain->switch_log_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_clear_dirty_log);
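
For readers following the index arithmetic in iommu_clear_dirty_log, the
mapping from runs of set bits back to IOVA ranges can be sketched in
userspace as below. `struct iova_range`, `test_bit_ul` and
`set_regions_to_iova` are hypothetical names, and the open-coded scan
stands in for the kernel's bitmap region iterator.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* One contiguous IOVA range derived from a run of set bits. */
struct iova_range {
	unsigned long iova;
	size_t size;
};

static bool test_bit_ul(const unsigned long *bitmap, size_t nr)
{
	return (bitmap[nr / BITS_PER_ULONG] >> (nr % BITS_PER_ULONG)) & 1UL;
}

/*
 * Bit n of bitmap describes the page at base_iova + (n << pgshift).
 * Collect every run of consecutive set bits that falls inside
 * [iova, iova + size) as an IOVA range. Returns the number of ranges
 * written to out, at most max.
 */
static size_t set_regions_to_iova(const unsigned long *bitmap,
				  unsigned long iova, size_t size,
				  unsigned long base_iova,
				  unsigned long pgshift,
				  struct iova_range *out, size_t max)
{
	size_t start = (iova - base_iova) >> pgshift;
	size_t end = start + (size >> pgshift);
	size_t rs = start;
	size_t n = 0;

	while (rs < end && n < max) {
		size_t re;

		if (!test_bit_ul(bitmap, rs)) {
			rs++;
			continue;
		}

		/* Find the first clear bit after rs, or the end of the window. */
		re = rs;
		while (re < end && test_bit_ul(bitmap, re))
			re++;

		out[n].iova = base_iova + ((unsigned long)rs << pgshift);
		out[n].size = (re - rs) << pgshift;
		n++;
		rs = re;
	}
	return n;
}
```

The sketch keeps the indices in size_t. The patch declares rs, re, start
and end as int, so `rs << bitmap_pgshift` looks like it is evaluated in int
and could overflow for large offsets; this may be worth double-checking.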
> +
>   void iommu_get_resv_regions(struct device *dev, struct list_head *list)
>   {
>   	const struct iommu_ops *ops = dev->bus->iommu_ops;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 32d448050bf7..e0e40dda974d 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -87,6 +87,8 @@ struct iommu_domain {
>   	void *handler_token;
>   	struct iommu_domain_geometry geometry;
>   	void *iova_cookie;
> +	bool dirty_log_tracking;
> +	struct mutex switch_log_lock;
>   };
>   
>   enum iommu_cap {
> @@ -193,6 +195,10 @@ struct iommu_iotlb_gather {
>    * @device_group: find iommu group for a particular device
>    * @enable_nesting: Enable nesting
>    * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
> + * @support_dirty_log: Check whether domain supports dirty log tracking
> + * @switch_dirty_log: Perform actions to start|stop dirty log tracking
> + * @sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
> + * @clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
>    * @get_resv_regions: Request list of reserved regions for a device
>    * @put_resv_regions: Free list of reserved regions for a device
>    * @apply_resv_region: Temporary helper call-back for iova reserved ranges
> @@ -245,6 +251,22 @@ struct iommu_ops {
>   	int (*set_pgtable_quirks)(struct iommu_domain *domain,
>   				  unsigned long quirks);
>   
> +	/*
> +	 * Track dirty log. Note: Don't concurrently call these interfaces with
> +	 * other ops that access underlying page table.
> +	 */
> +	bool (*support_dirty_log)(struct iommu_domain *domain);
> +	int (*switch_dirty_log)(struct iommu_domain *domain, bool enable,
> +				unsigned long iova, size_t size, int prot);
> +	int (*sync_dirty_log)(struct iommu_domain *domain,
> +			      unsigned long iova, size_t size,
> +			      unsigned long *bitmap, unsigned long base_iova,
> +			      unsigned long bitmap_pgshift);
> +	int (*clear_dirty_log)(struct iommu_domain *domain,
> +			       unsigned long iova, size_t size,
> +			       unsigned long *bitmap, unsigned long base_iova,
> +			       unsigned long bitmap_pgshift);
> +
>   	/* Request/Free a list of reserved regions for a device */
>   	void (*get_resv_regions)(struct device *dev, struct list_head *list);
>   	void (*put_resv_regions)(struct device *dev, struct list_head *list);
> @@ -475,6 +497,17 @@ extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
>   int iommu_enable_nesting(struct iommu_domain *domain);
>   int iommu_set_pgtable_quirks(struct iommu_domain *domain,
>   		unsigned long quirks);
> +extern bool iommu_support_dirty_log(struct iommu_domain *domain);
> +extern int iommu_switch_dirty_log(struct iommu_domain *domain, bool enable,
> +				  unsigned long iova, size_t size, int prot);
> +extern int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova,
> +				size_t size, unsigned long *bitmap,
> +				unsigned long base_iova,
> +				unsigned long bitmap_pgshift);
> +extern int iommu_clear_dirty_log(struct iommu_domain *domain, unsigned long iova,
> +				 size_t dma_size, unsigned long *bitmap,
> +				 unsigned long base_iova,
> +				 unsigned long bitmap_pgshift);
>   
>   void iommu_set_dma_strict(bool val);
>   bool iommu_get_dma_strict(struct iommu_domain *domain);
> @@ -848,6 +881,36 @@ static inline int iommu_set_pgtable_quirks(struct iommu_domain *domain,
>   	return 0;
>   }
>   
> +static inline bool iommu_support_dirty_log(struct iommu_domain *domain)
> +{
> +	return false;
> +}
> +
> +static inline int iommu_switch_dirty_log(struct iommu_domain *domain,
> +					 bool enable, unsigned long iova,
> +					 size_t size, int prot)
> +{
> +	return -EINVAL;
> +}
> +
> +static inline int iommu_sync_dirty_log(struct iommu_domain *domain,
> +				       unsigned long iova, size_t size,
> +				       unsigned long *bitmap,
> +				       unsigned long base_iova,
> +				       unsigned long pgshift)
> +{
> +	return -EINVAL;
> +}
> +
> +static inline int iommu_clear_dirty_log(struct iommu_domain *domain,
> +					unsigned long iova, size_t size,
> +					unsigned long *bitmap,
> +					unsigned long base_iova,
> +					unsigned long pgshift)
> +{
> +	return -EINVAL;
> +}
> +
>   static inline int iommu_device_register(struct iommu_device *iommu,
>   					const struct iommu_ops *ops,
>   					struct device *hwdev)
> diff --git a/include/trace/events/iommu.h b/include/trace/events/iommu.h
> index 72b4582322ff..6436d693d357 100644
> --- a/include/trace/events/iommu.h
> +++ b/include/trace/events/iommu.h
> @@ -129,6 +129,69 @@ TRACE_EVENT(unmap,
>   	)
>   );
>   
> +TRACE_EVENT(switch_dirty_log,
> +
> +	TP_PROTO(unsigned long iova, size_t size, bool enable),
> +
> +	TP_ARGS(iova, size, enable),
> +
> +	TP_STRUCT__entry(
> +		__field(u64, iova)
> +		__field(size_t, size)
> +		__field(bool, enable)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->iova = iova;
> +		__entry->size = size;
> +		__entry->enable = enable;
> +	),
> +
> +	TP_printk("IOMMU: iova=0x%016llx size=%zu enable=%u",
> +			__entry->iova, __entry->size, __entry->enable
> +	)
> +);
> +
> +TRACE_EVENT(sync_dirty_log,
> +
> +	TP_PROTO(unsigned long iova, size_t size),
> +
> +	TP_ARGS(iova, size),
> +
> +	TP_STRUCT__entry(
> +		__field(u64, iova)
> +		__field(size_t, size)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->iova = iova;
> +		__entry->size = size;
> +	),
> +
> +	TP_printk("IOMMU: iova=0x%016llx size=%zu", __entry->iova,
> +			__entry->size)
> +);
> +
> +TRACE_EVENT(clear_dirty_log,
> +
> +	TP_PROTO(unsigned long iova, size_t size),
> +
> +	TP_ARGS(iova, size),
> +
> +	TP_STRUCT__entry(
> +		__field(u64, iova)
> +		__field(size_t, size)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->iova = iova;
> +		__entry->size = size;
> +	),
> +
> +	TP_printk("IOMMU: iova=0x%016llx size=%zu", __entry->iova,
> +			__entry->size)
> +);
> +
>   DECLARE_EVENT_CLASS(iommu_error,
>   
>   	TP_PROTO(struct device *dev, unsigned long iova, int flags),
> 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-08  3:46     ` Lu Baolu
  (?)
@ 2021-05-08  7:35       ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-08  7:35 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

Hi Baolu,

On 2021/5/8 11:46, Lu Baolu wrote:
> Hi Keqian,
> 
> On 5/7/21 6:21 PM, Keqian Zhu wrote:
>> Some types of IOMMU are capable of tracking DMA dirty log, such as
>> ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
>> dirty log tracking framework in the IOMMU base layer.
>>
>> Four new essential interfaces are added, and we maintain the status
>> of dirty log tracking in iommu_domain.
>> 1. iommu_support_dirty_log: Check whether domain supports dirty log tracking
>> 2. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
>> 3. iommu_sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
>> 4. iommu_clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
>>
>> Note: Don't concurrently call these interfaces with other ops that
>> access underlying page table.
>>
>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
>> ---
>>   drivers/iommu/iommu.c        | 201 +++++++++++++++++++++++++++++++++++
>>   include/linux/iommu.h        |  63 +++++++++++
>>   include/trace/events/iommu.h |  63 +++++++++++
>>   3 files changed, 327 insertions(+)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 808ab70d5df5..0d15620d1e90 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1940,6 +1940,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>>       domain->type = type;
>>       /* Assume all sizes by default; the driver may override this later */
>>       domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
>> +    mutex_init(&domain->switch_log_lock);
>>         return domain;
>>   }
>> @@ -2703,6 +2704,206 @@ int iommu_set_pgtable_quirks(struct iommu_domain *domain,
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_set_pgtable_quirks);
>>   +bool iommu_support_dirty_log(struct iommu_domain *domain)
>> +{
>> +    const struct iommu_ops *ops = domain->ops;
>> +
>> +    return ops->support_dirty_log && ops->support_dirty_log(domain);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_support_dirty_log);
> 
> I suppose this interface is to ask the vendor IOMMU driver to check
> whether each device/iommu in the domain supports dirty bit tracking.
> But what will happen if new devices with different tracking capability
> are added afterward?
Yep, this is considered in the vfio part. We will query again after attaching or
detaching devices from the domain.  When the domain becomes capable, we enable
dirty log for it. When it is no longer capable, we disable dirty log for it.
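
The re-query flow described above can be sketched roughly as follows. This is a minimal userspace model in C, not code from the series: the demo_* names and the struct are hypothetical, and it assumes (as Baolu's question suggests) that a domain is capable only when every attached device is. The comments mark where the real iommu_switch_dirty_log() would be called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a domain; none of these fields come from the patch. */
struct demo_domain {
	size_t nr_devs;		/* devices currently attached */
	size_t nr_capable;	/* attached devices that can track dirty pages */
	bool want_tracking;	/* the upper layer has asked for dirty logging */
	bool tracking;		/* dirty logging is currently enabled */
};

/* Stand-in for iommu_support_dirty_log(): assumed all-or-nothing. */
static bool demo_support_dirty_log(const struct demo_domain *d)
{
	return d->nr_devs > 0 && d->nr_capable == d->nr_devs;
}

/* Query the capability again and switch tracking to match it. */
static void demo_update_tracking(struct demo_domain *d)
{
	bool capable = demo_support_dirty_log(d);

	if (d->want_tracking && capable && !d->tracking)
		d->tracking = true;	/* would call iommu_switch_dirty_log(.., true, ..) */
	else if (d->tracking && (!capable || !d->want_tracking))
		d->tracking = false;	/* would call iommu_switch_dirty_log(.., false, ..) */
}

static void demo_attach(struct demo_domain *d, bool dev_capable)
{
	d->nr_devs++;
	if (dev_capable)
		d->nr_capable++;
	demo_update_tracking(d);
}

static void demo_detach(struct demo_domain *d, bool dev_capable)
{
	assert(d->nr_devs > 0);
	d->nr_devs--;
	if (dev_capable)
		d->nr_capable--;
	demo_update_tracking(d);
}
```

In this model, attaching a device that cannot track dirty pages turns tracking off for the whole domain, and detaching it turns tracking back on.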

> 
> To make things simple, is it possible to support this tracking only when
> all underlying IOMMUs support dirty bit tracking?
IIUC, the "all underlying IOMMUs" you refer to are system wide. I think this idea may have
two issues. 1) The target domain may contain only some of the system IOMMUs. 2) The
dirty tracking capability can depend on the capability of devices. For example,
we can track dirty log based on IOPF, which needs the capability of devices. That
is to say, we can make this framework more general.

> 
> Or, the more crazy idea is that we don't need to check this capability
> at all. If dirty bit tracking is not supported by hardware, just mark
> all pages dirty?
Yeah, I think this idea is nice :).

Still, one concern is that we may have other dirty tracking methods in the future.
If we can't track dirty pages through the IOMMU, we can still try other methods.

If there is no interface to check this capability, we have no chance to try
other methods. What do you think?
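
To make the trade-off concrete, here is a small hedged sketch in C of how a caller might use a capability query to choose between tracking methods, with "mark all pages dirty" as the last resort. The enum, the function names and the one-bit-per-page bitmap layout are assumptions made for illustration; nothing here is taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical set of tracking methods a caller could choose from. */
enum demo_method {
	DEMO_TRACK_IOMMU,	/* hardware dirty log in the IOMMU */
	DEMO_TRACK_OTHER,	/* some other, future tracking method */
	DEMO_TRACK_ALL_DIRTY,	/* no tracking: report every page as dirty */
};

/* With a capability query, the caller can fall back step by step. */
static enum demo_method demo_pick_method(bool iommu_capable, bool other_available)
{
	if (iommu_capable)
		return DEMO_TRACK_IOMMU;
	if (other_available)
		return DEMO_TRACK_OTHER;
	return DEMO_TRACK_ALL_DIRTY;
}

/* Last-resort fallback: set one bit per page for the whole range. */
static void demo_mark_all_dirty(unsigned char *bitmap, size_t nr_pages)
{
	size_t i;

	assert(bitmap != NULL);
	for (i = 0; i < nr_pages; i++)
		bitmap[i / 8] |= (unsigned char)(1u << (i % 8));
}
```

Without the query, only the first and last branches would be reachable from the caller's point of view, which is the concern raised above.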

> 
>> +
>> +int iommu_switch_dirty_log(struct iommu_domain *domain, bool enable,
>> +               unsigned long iova, size_t size, int prot)
>> +{
>> +    const struct iommu_ops *ops = domain->ops;
>> +    unsigned long orig_iova = iova;
>> +    unsigned int min_pagesz;
>> +    size_t orig_size = size;
>> +    bool flush = false;
>> +    int ret = 0;
>> +
>> +    if (unlikely(!ops->switch_dirty_log))
>> +        return -ENODEV;
>> +
>> +    min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
>> +    if (!IS_ALIGNED(iova | size, min_pagesz)) {
>> +        pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n",
>> +               iova, size, min_pagesz);
>> +        return -EINVAL;
>> +    }
>> +
>> +    mutex_lock(&domain->switch_log_lock);
>> +    if (enable && domain->dirty_log_tracking) {
>> +        ret = -EBUSY;
>> +        goto out;
>> +    } else if (!enable && !domain->dirty_log_tracking) {
>> +        ret = -EINVAL;
>> +        goto out;
>> +    }
>> +
>> +    pr_debug("switch_dirty_log %s for: iova 0x%lx size 0x%zx\n",
>> +         enable ? "enable" : "disable", iova, size);
>> +
>> +    while (size) {
>> +        size_t pgsize = iommu_pgsize(domain, iova, size);
>> +
>> +        flush = true;
>> +        ret = ops->switch_dirty_log(domain, enable, iova, pgsize, prot);
> 
> A callback per minimal page is expensive. How about using (pagesize,
> count), so that all pages with the same page size could be handled in a
> single indirect call? I remember commenting on this during the last review,
> but I don't mind doing it again.
Thanks for reminding me again :). I'll do that in the next version.
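
As an illustration of the (pagesize, count) idea, the following is a minimal userspace sketch in C. It is not the code that will appear in the next version: all demo_* names, the callback signature and the stats struct are hypothetical. It assumes the usual rule that the largest supported page size that is aligned at iova and fits in the remaining size is chosen, and it ends each run where a larger page size becomes usable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model. Returns the largest page size from pgsize_bitmap that
 * is usable at iova, and stores in *count how many consecutive pages of that
 * size can be handled in one call. Returns 0 if no supported size fits.
 */
static uint64_t demo_pgsize_count(uint64_t pgsize_bitmap, uint64_t iova,
				  uint64_t size, uint64_t *count)
{
	uint64_t pgsz = 0, next = 0, run = size;
	int bit;

	assert(count != NULL);
	*count = 0;

	for (bit = 0; bit < 64; bit++) {
		uint64_t cand = UINT64_C(1) << bit;

		if (!(pgsize_bitmap & cand))
			continue;
		if (cand > size || (iova & (cand - 1))) {
			next = cand;	/* smallest supported size that does not fit */
			break;
		}
		pgsz = cand;
	}
	if (!pgsz)
		return 0;

	if (next) {
		/* Stop the run where a larger page size becomes usable. */
		uint64_t offset = next - (iova & (next - 1));

		if (offset <= size && next <= size - offset)
			run = offset;
	}
	*count = run / pgsz;
	return pgsz;
}

typedef int (*demo_switch_fn)(uint64_t iova, uint64_t pgsize, uint64_t count,
			      void *data);

/* One indirect call per run of equally sized pages instead of one per page. */
static int demo_switch_range(uint64_t pgsize_bitmap, uint64_t iova,
			     uint64_t size, demo_switch_fn fn, void *data)
{
	while (size) {
		uint64_t count;
		uint64_t pgsz = demo_pgsize_count(pgsize_bitmap, iova, size, &count);
		int ret;

		if (!pgsz)
			return -1;	/* range is not aligned to any supported size */
		ret = fn(iova, pgsz, count, data);
		if (ret)
			return ret;
		iova += pgsz * count;
		size -= pgsz * count;
	}
	return 0;
}

/* Example callback that only counts indirect calls and pages. */
struct demo_stats {
	unsigned int calls;
	uint64_t pages;
};

static int demo_count_cb(uint64_t iova, uint64_t pgsize, uint64_t count, void *data)
{
	struct demo_stats *s = data;

	(void)iova;
	(void)pgsize;
	s->calls++;
	s->pages += count;
	return 0;
}
```

With only 4K pages supported, a 1M range is covered by a single callback that receives a count of 256, where a per-page loop would make 256 indirect calls.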

Thanks,
Keqian

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-08  7:35       ` Keqian Zhu
  (?)
@ 2021-05-10  1:08         ` Lu Baolu
  -1 siblings, 0 replies; 81+ messages in thread
From: Lu Baolu @ 2021-05-10  1:08 UTC (permalink / raw)
  To: Keqian Zhu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: baolu.lu, Alex Williamson, Kirti Wankhede, Cornelia Huck,
	Jonathan Cameron, wanghaibin.wang, jiangkunkun, yuzenghui,
	lushenming

Hi Keqian,

On 5/8/21 3:35 PM, Keqian Zhu wrote:
> Hi Baolu,
> 
> On 2021/5/8 11:46, Lu Baolu wrote:
>> Hi Keqian,
>>
>> On 5/7/21 6:21 PM, Keqian Zhu wrote:
>>> Some types of IOMMU are capable of tracking DMA dirty log, such as
>>> ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
>>> dirty log tracking framework in the IOMMU base layer.
>>>
>>> Four new essential interfaces are added, and we maintain the status
>>> of dirty log tracking in iommu_domain.
>>> 1. iommu_support_dirty_log: Check whether domain supports dirty log tracking
>>> 2. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
>>> 3. iommu_sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
>>> 4. iommu_clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
>>>
>>> Note: Don't concurrently call these interfaces with other ops that
>>> access underlying page table.
>>>
>>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>>> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
>>> ---
>>>    drivers/iommu/iommu.c        | 201 +++++++++++++++++++++++++++++++++++
>>>    include/linux/iommu.h        |  63 +++++++++++
>>>    include/trace/events/iommu.h |  63 +++++++++++
>>>    3 files changed, 327 insertions(+)
>>>
>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>> index 808ab70d5df5..0d15620d1e90 100644
>>> --- a/drivers/iommu/iommu.c
>>> +++ b/drivers/iommu/iommu.c
>>> @@ -1940,6 +1940,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>>>        domain->type = type;
>>>        /* Assume all sizes by default; the driver may override this later */
>>>        domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
>>> +    mutex_init(&domain->switch_log_lock);
>>>          return domain;
>>>    }
>>> @@ -2703,6 +2704,206 @@ int iommu_set_pgtable_quirks(struct iommu_domain *domain,
>>>    }
>>>    EXPORT_SYMBOL_GPL(iommu_set_pgtable_quirks);
>>>    +bool iommu_support_dirty_log(struct iommu_domain *domain)
>>> +{
>>> +    const struct iommu_ops *ops = domain->ops;
>>> +
>>> +    return ops->support_dirty_log && ops->support_dirty_log(domain);
>>> +}
>>> +EXPORT_SYMBOL_GPL(iommu_support_dirty_log);
>> I suppose this interface is to ask the vendor IOMMU driver to check
>> whether each device/iommu in the domain supports dirty bit tracking.
>> But what will happen if new devices with different tracking capability
>> are added afterward?
> Yep, this is considered in the vfio part. We will query again after attaching or
> detaching devices from the domain.  When the domain becomes capable, we enable
> dirty log for it. When it is no longer capable, we disable dirty log for it.

If that's the case, why not put this logic in the iommu subsystem so
that it doesn't need to be duplicated in different upper layers?

For example, add something like dirty_page_trackable to struct
iommu_domain and ask the vendor iommu driver to update it whenever a
device is added to or removed from the domain. It's also better to disallow
any domain attach/detach once dirty page tracking is on.
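
A rough sketch of this suggestion, as a userspace model in C, could look like the following. Only the name dirty_page_trackable comes from the text above, and dirty_log_tracking mirrors the field used in the patch; the struct, the demo_* helpers and the choice of error codes for attach/detach are assumptions for illustration. The -EBUSY and -EINVAL results for redundant enable and disable follow the checks in the quoted iommu_switch_dirty_log().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant part of struct iommu_domain. */
struct demo_iommu_domain {
	size_t nr_devs;			/* attached devices */
	size_t nr_capable;		/* attached devices that can track dirty pages */
	bool dirty_page_trackable;	/* kept up to date by the (modelled) driver */
	bool dirty_log_tracking;	/* tracking is currently on */
};

/* The driver refreshes the flag on every attach or detach. */
static void demo_update_trackable(struct demo_iommu_domain *d)
{
	d->dirty_page_trackable = d->nr_devs > 0 && d->nr_capable == d->nr_devs;
}

static int demo_attach_dev(struct demo_iommu_domain *d, bool dev_capable)
{
	if (d->dirty_log_tracking)
		return -EBUSY;		/* no attach while tracking is on */
	d->nr_devs++;
	if (dev_capable)
		d->nr_capable++;
	demo_update_trackable(d);
	return 0;
}

static int demo_detach_dev(struct demo_iommu_domain *d, bool dev_capable)
{
	assert(d->nr_devs > 0);
	if (d->dirty_log_tracking)
		return -EBUSY;		/* no detach while tracking is on */
	d->nr_devs--;
	if (dev_capable)
		d->nr_capable--;
	demo_update_trackable(d);
	return 0;
}

static int demo_switch_dirty_log(struct demo_iommu_domain *d, bool enable)
{
	if (enable && !d->dirty_page_trackable)
		return -ENODEV;
	if (enable == d->dirty_log_tracking)
		return enable ? -EBUSY : -EINVAL;
	d->dirty_log_tracking = enable;
	return 0;
}
```

Upper layers would then read a single flag instead of repeating the query after each attach or detach, at the cost of making attach and detach fail while tracking is enabled.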

> 
>> To make things simple, is it possible to support this tracking only when
>> all underlying IOMMUs support dirty bit tracking?
> IIUC, the "all underlying IOMMUs" you refer to are system wide. I think this idea may have
> two issues. 1) The target domain may contain only some of the system IOMMUs. 2) The
> dirty tracking capability can depend on the capability of devices. For example,
> we can track dirty log based on IOPF, which needs the capability of devices. That
> is to say, we can make this framework more general.

Yes. Fair enough. Thanks for sharing.

> 
>> Or, the more crazy idea is that we don't need to check this capability
>> at all. If dirty bit tracking is not supported by hardware, just mark
>> all pages dirty?
> Yeah, I think this idea is nice :).
> 
> Still, one concern is that we may have other dirty tracking methods in the future.
> If we can't track dirty pages through the IOMMU, we can still try other methods.
> 
> If there is no interface to check this capability, we have no chance to try
> other methods. What do you think?
> 

Agreed.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-10  1:08         ` Lu Baolu
  (?)
@ 2021-05-10 11:07           ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-10 11:07 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

Hi Baolu,

On 2021/5/10 9:08, Lu Baolu wrote:
> Hi Keqian,
> 
> On 5/8/21 3:35 PM, Keqian Zhu wrote:
>> Hi Baolu,
>>
>> On 2021/5/8 11:46, Lu Baolu wrote:
>>> Hi Keqian,
>>>
>>> On 5/7/21 6:21 PM, Keqian Zhu wrote:
>>>> Some types of IOMMU are capable of tracking DMA dirty log, such as
>>>> ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
>>>> dirty log tracking framework in the IOMMU base layer.
>>>>
>>>> Four new essential interfaces are added, and we maintain the status
>>>> of dirty log tracking in iommu_domain.
>>>> 1. iommu_support_dirty_log: Check whether domain supports dirty log tracking
>>>> 2. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
>>>> 3. iommu_sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
>>>> 4. iommu_clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
>>>>
>>>> Note: Don't concurrently call these interfaces with other ops that
>>>> access underlying page table.
>>>>
>>>> Signed-off-by: Keqian Zhu<zhukeqian1@huawei.com>
>>>> Signed-off-by: Kunkun Jiang<jiangkunkun@huawei.com>
>>>> ---
>>>>    drivers/iommu/iommu.c        | 201 +++++++++++++++++++++++++++++++++++
>>>>    include/linux/iommu.h        |  63 +++++++++++
>>>>    include/trace/events/iommu.h |  63 +++++++++++
>>>>    3 files changed, 327 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>>> index 808ab70d5df5..0d15620d1e90 100644
>>>> --- a/drivers/iommu/iommu.c
>>>> +++ b/drivers/iommu/iommu.c
>>>> @@ -1940,6 +1940,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>>>>        domain->type = type;
>>>>        /* Assume all sizes by default; the driver may override this later */
>>>>        domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
>>>> +    mutex_init(&domain->switch_log_lock);
>>>>          return domain;
>>>>    }
>>>> @@ -2703,6 +2704,206 @@ int iommu_set_pgtable_quirks(struct iommu_domain *domain,
>>>>    }
>>>>    EXPORT_SYMBOL_GPL(iommu_set_pgtable_quirks);
>>>>    +bool iommu_support_dirty_log(struct iommu_domain *domain)
>>>> +{
>>>> +    const struct iommu_ops *ops = domain->ops;
>>>> +
>>>> +    return ops->support_dirty_log && ops->support_dirty_log(domain);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(iommu_support_dirty_log);
>>> I suppose this interface is to ask the vendor IOMMU driver to check
>>> whether each device/iommu in the domain supports dirty bit tracking.
>>> But what will happen if new devices with different tracking capability
>>> are added afterward?
>> Yep, this is considered in the vfio part. We will query again after attaching or
>> detaching devices from the domain.  When the domain becomes capable, we enable
>> dirty log for it. When it becomes not capable, we disable dirty log for it.
> 
> If that's the case, why not put this logic in the iommu subsystem so
> that it doesn't need to be duplicated in different upper layers?
> 
> For example, add something like dirty_page_trackable in the struct of
> iommu_domain and ask the vendor iommu driver to update it once any
> device is added/removed to/from the domain. It's also better to disallow
If we do that, the upper layer still needs to query the capability from the domain and switch
dirty log tracking for it. Or do you mean the domain can switch dirty log tracking automatically
when its capability changes? If so, I think we lose some flexibility. The upper layer
may have its own policy, such as only enabling dirty log tracking when all domains are capable,
and disabling dirty log tracking when just one domain is not capable.
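
As a rough illustration only (not code from this series), the all-or-nothing
policy mentioned above could look like the sketch below. The struct and the
policy_* helpers are hypothetical stand-ins; in the real code the capability
would come from iommu_support_dirty_log() and the switch would go through
iommu_switch_dirty_log().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a domain as seen by the upper layer. */
struct policy_domain {
	bool capable;		/* what iommu_support_dirty_log() would report */
	bool tracking_on;	/* current dirty log tracking state */
};

/* True only if every domain in the array can track dirty pages. */
static bool policy_all_capable(const struct policy_domain *doms, size_t n)
{
	assert(doms != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (!doms[i].capable)
			return false;
	return true;
}

/*
 * Re-evaluate the policy, e.g. after a device attach or detach.
 * Tracking is switched on for all domains or for none of them.
 * Returns the resulting tracking state.
 */
static bool policy_update_tracking(struct policy_domain *doms, size_t n)
{
	bool enable = policy_all_capable(doms, n);

	for (size_t i = 0; i < n; i++)
		doms[i].tracking_on = enable;	/* models the switch call */
	return enable;
}
```

The point of the sketch is that the decision spans several domains, so it
cannot be made inside any single domain.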

> any domain attach/detach once the dirty page tracking is on.
Yep, this can greatly simplify our code logic, but I don't know whether our maintainers
will agree with that, as they may think that IOMMU dirty logging should not change the original
domain behavior.


Thanks,
Keqian

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-10 11:07           ` Keqian Zhu
  (?)
@ 2021-05-11  3:12             ` Lu Baolu
  -1 siblings, 0 replies; 81+ messages in thread
From: Lu Baolu @ 2021-05-11  3:12 UTC (permalink / raw)
  To: Keqian Zhu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: baolu.lu, Alex Williamson, Kirti Wankhede, Cornelia Huck,
	Jonathan Cameron, wanghaibin.wang, jiangkunkun, yuzenghui,
	lushenming

Hi Keqian,

On 5/10/21 7:07 PM, Keqian Zhu wrote:
>>>> I suppose this interface is to ask the vendor IOMMU driver to check
>>>> whether each device/iommu in the domain supports dirty bit tracking.
>>>> But what will happen if new devices with different tracking capability
>>>> are added afterward?
>>> Yep, this is considered in the vfio part. We will query again after attaching or
>>> detaching devices from the domain.  When the domain becomes capable, we enable
>>> dirty log for it. When it becomes not capable, we disable dirty log for it.
>> If that's the case, why not put this logic in the iommu subsystem so
>> that it doesn't need to be duplicated in different upper layers?
>>
>> For example, add something like dirty_page_trackable in the struct of
>> iommu_domain and ask the vendor iommu driver to update it once any
>> device is added/removed to/from the domain. It's also better to disallow
> If we do that, the upper layer still needs to query the capability from the domain and switch
> dirty log tracking for it. Or do you mean the domain can switch dirty log tracking automatically
> when its capability changes? If so, I think we lose some flexibility. The upper layer
> may have its own policy, such as only enabling dirty log tracking when all domains are capable,
> and disabling dirty log tracking when just one domain is not capable.

I may not have understood you.

Assume that dirty_page_trackable is an attribute of an iommu_domain.
This attribute might change when a device (with a different
capability) is added or removed, so it should be updated every time a
device is attached or detached. This work could be done by the vendor
iommu driver on the path of the dev_attach/dev_detach callbacks.

For upper layers, before starting page tracking, they check the
dirty_page_trackable attribute of the domain and start it only if it's
capable. Once page tracking is switched on, the vendor iommu driver
(or iommu core) should block further device attach/detach operations
until page tracking is stopped.
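
A minimal sketch of that flow, as a userspace model rather than kernel code.
Everything named trk_* is hypothetical, and two choices are assumptions of
the sketch: an empty domain counts as not trackable, and attach/detach fail
with -EBUSY while tracking is on.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a domain; field names are illustrative only. */
struct trk_domain {
	unsigned int nr_devs;		/* attached devices */
	unsigned int nr_capable;	/* attached devices that can track */
	bool dirty_page_trackable;	/* the attribute discussed above */
	bool tracking_on;
};

/* Recompute the attribute; done on every attach and detach. */
static void trk_update(struct trk_domain *d)
{
	d->dirty_page_trackable = d->nr_devs && d->nr_capable == d->nr_devs;
}

static int trk_attach(struct trk_domain *d, bool dev_capable)
{
	assert(d);
	if (d->tracking_on)
		return -EBUSY;	/* membership is frozen while tracking */
	d->nr_devs++;
	if (dev_capable)
		d->nr_capable++;
	trk_update(d);
	return 0;
}

static int trk_detach(struct trk_domain *d, bool dev_capable)
{
	assert(d && d->nr_devs);
	if (d->tracking_on)
		return -EBUSY;
	d->nr_devs--;
	if (dev_capable)
		d->nr_capable--;
	trk_update(d);
	return 0;
}

/* Upper layers may start tracking only on a trackable domain. */
static int trk_switch(struct trk_domain *d, bool enable)
{
	assert(d);
	if (enable && !d->dirty_page_trackable)
		return -EOPNOTSUPP;
	d->tracking_on = enable;
	return 0;
}
```

With this shape the upper layer never sees the capability change underneath
an active tracking session, which is the simplification being argued for.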

> 
>> any domain attach/detach once the dirty page tracking is on.
> Yep, this can greatly simplify our code logic, but I don't know whether our maintainers
> will agree with that, as they may think that IOMMU dirty logging should not change the original
> domain behavior.

The maintainers have the last word, but we need to work out a generic and
self-contained API set.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-11  3:12             ` Lu Baolu
  (?)
@ 2021-05-11  7:40               ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-11  7:40 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

Hi Baolu,

On 2021/5/11 11:12, Lu Baolu wrote:
> Hi Keqian,
> 
> On 5/10/21 7:07 PM, Keqian Zhu wrote:
>>>>> I suppose this interface is to ask the vendor IOMMU driver to check
>>>>> whether each device/iommu in the domain supports dirty bit tracking.
>>>>> But what will happen if new devices with different tracking capability
>>>>> are added afterward?
>>>> Yep, this is considered in the vfio part. We will query again after attaching or
>>>> detaching devices from the domain.  When the domain becomes capable, we enable
>>>> dirty log for it. When it becomes not capable, we disable dirty log for it.
>>> If that's the case, why not put this logic in the iommu subsystem so
>>> that it doesn't need to be duplicated in different upper layers?
>>>
>>> For example, add something like dirty_page_trackable in the struct of
>>> iommu_domain and ask the vendor iommu driver to update it once any
>>> device is added/removed to/from the domain. It's also better to disallow
>> If we do that, the upper layer still needs to query the capability from the domain and switch
>> dirty log tracking for it. Or do you mean the domain can switch dirty log tracking automatically
>> when its capability changes? If so, I think we lose some flexibility. The upper layer
>> may have its own policy, such as only enabling dirty log tracking when all domains are capable,
>> and disabling dirty log tracking when just one domain is not capable.
> 
> I may not have understood you.
> 
> Assume that dirty_page_trackable is an attribute of an iommu_domain.
> This attribute might change when a device (with a different
> capability) is added or removed, so it should be updated every time a
> device is attached or detached. This work could be done by the vendor
> iommu driver on the path of the dev_attach/dev_detach callbacks.
Yes, this is how I understood you.

> 
> For upper layers, before starting page tracking, they check the
> dirty_page_trackable attribute of the domain and start it only if it's
> capable. Once page tracking is switched on, the vendor iommu driver
> (or iommu core) should block further device attach/detach operations
> until page tracking is stopped.
But when a domain becomes capable after detaching a device, the upper layer
still needs to query it and enable dirty log for it...

To keep things coordinated, maybe the upper layer can register a notifier.
When the domain's capability changes, the upper layer does not need to query; instead
it just needs to implement a callback and apply its specific policy in the callback.
What do you think?
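
To make the idea concrete, here is a small userspace sketch of such a
notifier. All ntf_* names are hypothetical. In the kernel the existing
notifier chain API (struct notifier_block) would be the natural fit; a single
function pointer is used here only to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type, invoked when the capability changes. */
typedef void (*ntf_cap_fn)(bool capable, void *data);

/* Hypothetical model of a domain with one registered subscriber. */
struct ntf_domain {
	bool capable;
	ntf_cap_fn notify;
	void *notify_data;
};

/* The upper layer registers its callback once. */
static void ntf_register(struct ntf_domain *d, ntf_cap_fn fn, void *data)
{
	assert(d);
	d->notify = fn;
	d->notify_data = data;
}

/* Called from the modelled attach/detach path with the new capability. */
static void ntf_set_capable(struct ntf_domain *d, bool capable)
{
	assert(d);
	if (d->capable == capable)
		return;		/* nothing changed, so no notification */
	d->capable = capable;
	if (d->notify)
		d->notify(capable, d->notify_data);
}

/* Example upper-layer policy: track exactly while the domain is capable. */
static void ntf_upper_layer_cb(bool capable, void *data)
{
	bool *tracking_on = data;

	*tracking_on = capable;
}
```

The upper layer would call ntf_register() once and then react only inside
the callback, instead of querying the domain after each attach or detach.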

> 
>>
>>> any domain attach/detach once the dirty page tracking is on.
>> Yep, this can greatly simplify our code logic, but I don't know whether our maintainers
>> will agree with that, as they may think that IOMMU dirty logging should not change the original
>> domain behavior.
> 
> The maintainers have the last word, but we need to work out a generic and
> self-contained API set.
OK, I see.

Thanks,
Keqian

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
@ 2021-05-11  7:40               ` Keqian Zhu
  0 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-11  7:40 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: jiangkunkun, Cornelia Huck, Kirti Wankhede, lushenming,
	Alex Williamson, wanghaibin.wang

Hi Baolu,

On 2021/5/11 11:12, Lu Baolu wrote:
> Hi Keqian,
> 
> On 5/10/21 7:07 PM, Keqian Zhu wrote:
>>>>> I suppose this interface is to ask the vendor IOMMU driver to check
>>>>> whether each device/iommu in the domain supports dirty bit tracking.
>>>>> But what will happen if new devices with different tracking capability
>>>>> are added afterward?
>>>> Yep, this is considered in the vfio part. We will query again after attaching or
>>>> detaching devices from the domain.  When the domain becomes capable, we enable
>>>> dirty log for it. When it becomes not capable, we disable dirty log for it.
>>> If that's the case, why not putting this logic in the iommu subsystem so
>>> that it doesn't need to be duplicate in different upper layers?
>>>
>>> For example, add something like dirty_page_trackable in the struct of
>>> iommu_domain and ask the vendor iommu driver to update it once any
>>> device is added/removed to/from the domain. It's also better to disallow
>> If we do it, the upper layer still needs to query the capability from domain and switch
>> dirty log tracking for it. Or do you mean the domain can switch dirty log tracking automatically
>> when its capability change? If so, I think we're lack of some flexibility. The upper layer
>> may have it's own policy, such as only enable dirty log tracking when all domains are capable,
>> and disable dirty log tracking when just one domain is not capable.
> 
> I may not get you.
> 
> Assume that dirty_page_trackable is an attribution of an iommu_domain.
> This attribution might be changed once a new device (with different
> capability) added or removed. So it should be updated every time a new
> device is attached or detached. This work could be done by the vendor
> iommu driver on the path of dev_attach/dev_detach callback.
Yes, this is what I understand you.

> 
> For upper layers, before starting page tracking, they check the
> dirty_page_trackable attribution of the domain and start it only it's
> capable. Once the page tracking is switched on the vendor iommu driver
> (or iommu core) should block further device attach/detach operations
> until page tracking is stopped.
But when a domain becomes capable after detaching a device, the upper layer
still needs to query it and enable dirty log for it...

To make things coordinated, maybe the upper layer can register a notifier,
when the domain's capability change, the upper layer do not need to query, instead
they just need to realize a callback, and do their specific policy in the callback.
What do you think?

> 
>>
>>> any domain attach/detach once the dirty page tracking is on.
>> Yep, this can greatly simplify our code logic, but I don't know whether our maintainers
>> agree that, as they may think that IOMMU dirty logging should not change original domain
>> behaviors.
> 
> The maintainer owns the last word, but we need to work out a generic and
> self-contained API set.
OK, I see.

Thanks,
Keqian
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
@ 2021-05-11  7:40               ` Keqian Zhu
  0 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-11  7:40 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming

Hi Baolu,

On 2021/5/11 11:12, Lu Baolu wrote:
> Hi Keqian,
> 
> On 5/10/21 7:07 PM, Keqian Zhu wrote:
>>>>> I suppose this interface is to ask the vendor IOMMU driver to check
>>>>> whether each device/iommu in the domain supports dirty bit tracking.
>>>>> But what will happen if new devices with different tracking capability
>>>>> are added afterward?
>>>> Yep, this is considered in the vfio part. We will query again after attaching or
>>>> detaching devices from the domain.  When the domain becomes capable, we enable
>>>> dirty log for it. When it becomes not capable, we disable dirty log for it.
>>> If that's the case, why not putting this logic in the iommu subsystem so
>>> that it doesn't need to be duplicate in different upper layers?
>>>
>>> For example, add something like dirty_page_trackable in the struct of
>>> iommu_domain and ask the vendor iommu driver to update it once any
>>> device is added/removed to/from the domain. It's also better to disallow
>> If we do it, the upper layer still needs to query the capability from domain and switch
>> dirty log tracking for it. Or do you mean the domain can switch dirty log tracking automatically
>> when its capability change? If so, I think we're lack of some flexibility. The upper layer
>> may have it's own policy, such as only enable dirty log tracking when all domains are capable,
>> and disable dirty log tracking when just one domain is not capable.
> 
> I may not get you.
> 
> Assume that dirty_page_trackable is an attribution of an iommu_domain.
> This attribution might be changed once a new device (with different
> capability) added or removed. So it should be updated every time a new
> device is attached or detached. This work could be done by the vendor
> iommu driver on the path of dev_attach/dev_detach callback.
Yes, this is what I understand you.

> 
> For upper layers, before starting page tracking, they check the
> dirty_page_trackable attribution of the domain and start it only it's
> capable. Once the page tracking is switched on the vendor iommu driver
> (or iommu core) should block further device attach/detach operations
> until page tracking is stopped.
But when a domain becomes capable after detaching a device, the upper layer
still needs to query it and enable dirty log for it...

To make things coordinated, maybe the upper layer can register a notifier,
when the domain's capability change, the upper layer do not need to query, instead
they just need to realize a callback, and do their specific policy in the callback.
What do you think?

> 
>>
>>> any domain attach/detach once the dirty page tracking is on.
>> Yep, this can greatly simplify our code logic, but I don't know whether our maintainers
>> agree that, as they may think that IOMMU dirty logging should not change original domain
>> behaviors.
> 
> The maintainer owns the last word, but we need to work out a generic and
> self-contained API set.
OK, I see.

Thanks,
Keqian

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-11  7:40               ` Keqian Zhu
  (?)
@ 2021-05-12  3:20                 ` Lu Baolu
  -1 siblings, 0 replies; 81+ messages in thread
From: Lu Baolu @ 2021-05-12  3:20 UTC (permalink / raw)
  To: Keqian Zhu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: baolu.lu, Alex Williamson, Kirti Wankhede, Cornelia Huck,
	Jonathan Cameron, wanghaibin.wang, jiangkunkun, yuzenghui,
	lushenming

On 5/11/21 3:40 PM, Keqian Zhu wrote:
>> For upper layers, before starting page tracking, they check the
>> dirty_page_trackable attribution of the domain and start it only it's
>> capable. Once the page tracking is switched on the vendor iommu driver
>> (or iommu core) should block further device attach/detach operations
>> until page tracking is stopped.
> But when a domain becomes capable after detaching a device, the upper layer
> still needs to query it and enable dirty log for it...
> 
> To make things coordinated, maybe the upper layer can register a notifier,
> when the domain's capability change, the upper layer do not need to query, instead
> they just need to realize a callback, and do their specific policy in the callback.
> What do you think?
> 

That might be an option. But why not check the domain's attribute every
time a new tracking period is about to start?

Best regards,
baolu

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-12  3:20                 ` Lu Baolu
  (?)
@ 2021-05-12  8:44                   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-12  8:44 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming



On 2021/5/12 11:20, Lu Baolu wrote:
> On 5/11/21 3:40 PM, Keqian Zhu wrote:
>>> For upper layers, before starting page tracking, they check the
>>> dirty_page_trackable attribution of the domain and start it only it's
>>> capable. Once the page tracking is switched on the vendor iommu driver
>>> (or iommu core) should block further device attach/detach operations
>>> until page tracking is stopped.
>> But when a domain becomes capable after detaching a device, the upper layer
>> still needs to query it and enable dirty log for it...
>>
>> To make things coordinated, maybe the upper layer can register a notifier,
>> when the domain's capability change, the upper layer do not need to query, instead
>> they just need to realize a callback, and do their specific policy in the callback.
>> What do you think?
>>
> 
> That might be an option. But why not checking domain's attribution every
> time a new tracking period is about to start?
Hi Baolu,

I'll add an attribute to iommu_domain, and the vendor iommu driver will update
the attribute when devices are attached or detached.

The attribute should be protected by a lock, so the upper layer shouldn't access
the attribute directly. So iommu_domain_support_dirty_log() should still be
retained. Does this design look good to you?

Thanks,
Keqian
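
For readers following the design, here is a minimal user-space sketch of the
attribute described above. It is an illustration only, not code from the
series: struct mock_domain and the mock_* helpers are hypothetical stand-ins
for iommu_domain, the vendor driver's attach/detach callbacks and
iommu_domain_support_dirty_log(). The rule that an empty domain counts as not
capable is an assumption, and the locking under discussion is left out.

```c
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct iommu_domain. */
struct mock_domain {
	unsigned int nr_devices;   /* devices currently attached */
	unsigned int nr_incapable; /* attached devices without dirty tracking */
	bool dirty_log_capable;    /* cached attribute, refreshed on attach/detach */
};

/* Assumed policy: capable only if at least one device and all are capable. */
static void mock_update_capability(struct mock_domain *d)
{
	d->dirty_log_capable = d->nr_devices > 0 && d->nr_incapable == 0;
}

/* Stand-in for the vendor driver's attach path. */
void mock_attach_dev(struct mock_domain *d, bool dev_capable)
{
	d->nr_devices++;
	if (!dev_capable)
		d->nr_incapable++;
	mock_update_capability(d);
}

/* Stand-in for the vendor driver's detach path. */
void mock_detach_dev(struct mock_domain *d, bool dev_capable)
{
	d->nr_devices--;
	if (!dev_capable)
		d->nr_incapable--;
	mock_update_capability(d);
}

/* Accessor the upper layer would call instead of reading the field. */
bool mock_domain_support_dirty_log(const struct mock_domain *d)
{
	return d->dirty_log_capable;
}
```

In this model the upper layer only ever calls the accessor, so a lock could
later be added inside it without changing the callers.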

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-12  8:44                   ` Keqian Zhu
  (?)
@ 2021-05-12 11:36                     ` Lu Baolu
  -1 siblings, 0 replies; 81+ messages in thread
From: Lu Baolu @ 2021-05-12 11:36 UTC (permalink / raw)
  To: Keqian Zhu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: baolu.lu, Alex Williamson, Kirti Wankhede, Cornelia Huck,
	Jonathan Cameron, wanghaibin.wang, jiangkunkun, yuzenghui,
	lushenming

Hi keqian,

On 5/12/21 4:44 PM, Keqian Zhu wrote:
> 
> 
> On 2021/5/12 11:20, Lu Baolu wrote:
>> On 5/11/21 3:40 PM, Keqian Zhu wrote:
>>>> For upper layers, before starting page tracking, they check the
>>>> dirty_page_trackable attribution of the domain and start it only it's
>>>> capable. Once the page tracking is switched on the vendor iommu driver
>>>> (or iommu core) should block further device attach/detach operations
>>>> until page tracking is stopped.
>>> But when a domain becomes capable after detaching a device, the upper layer
>>> still needs to query it and enable dirty log for it...
>>>
>>> To make things coordinated, maybe the upper layer can register a notifier,
>>> when the domain's capability change, the upper layer do not need to query, instead
>>> they just need to realize a callback, and do their specific policy in the callback.
>>> What do you think?
>>>
>>
>> That might be an option. But why not checking domain's attribution every
>> time a new tracking period is about to start?
> Hi Baolu,
> 
> I'll add an attribution in iommu_domain, and the vendor iommu driver will update
> the attribution when attach/detach devices.
> 
> The attribute should be protected by a lock, so the upper layer shouldn't access
> the attribute directly. Then the iommu_domain_support_dirty_log() still should be
> retained. Does this design looks good to you?

Yes, that's what I was thinking of. But I am not sure whether it is worth
a lock here. It does not seem to be valid behavior for the upper layer to
attach or detach any device while doing dirty page tracking.

Best regards,
baolu
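
The rule suggested earlier in this thread (check the attribute before a
tracking period starts, and refuse attach/detach while tracking is on) could
look roughly like the user-space sketch below. This is only an assumption-laden
illustration: struct mock_domain, the mock_* functions and the chosen error
codes are hypothetical and are not taken from the series.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct iommu_domain. */
struct mock_domain {
	bool dirty_log_capable;  /* attribute maintained by the driver */
	bool dirty_log_tracking; /* true while a tracking period is active */
	unsigned int nr_devices; /* devices currently attached */
};

/* Start or stop tracking; starting requires a capable domain. */
int mock_switch_dirty_log(struct mock_domain *d, bool enable)
{
	if (enable && !d->dirty_log_capable)
		return -EINVAL; /* assumed error code */
	d->dirty_log_tracking = enable;
	return 0;
}

/* Attach is refused while tracking is active. */
int mock_attach_dev(struct mock_domain *d)
{
	if (d->dirty_log_tracking)
		return -EBUSY; /* assumed error code */
	d->nr_devices++;
	return 0;
}

/* Detach is refused while tracking is active. */
int mock_detach_dev(struct mock_domain *d)
{
	if (d->dirty_log_tracking)
		return -EBUSY; /* assumed error code */
	d->nr_devices--;
	return 0;
}
```

Because the device set cannot change during a tracking period in this model,
the capability attribute cannot change either, which is why a lock around it
may be unnecessary.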

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-12 11:36                     ` Lu Baolu
  (?)
@ 2021-05-13 10:58                       ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-13 10:58 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming



On 2021/5/12 19:36, Lu Baolu wrote:
> Hi keqian,
> 
> On 5/12/21 4:44 PM, Keqian Zhu wrote:
>>
>>
>> On 2021/5/12 11:20, Lu Baolu wrote:
>>> On 5/11/21 3:40 PM, Keqian Zhu wrote:
>>>>> For upper layers, before starting page tracking, they check the
>>>>> dirty_page_trackable attribution of the domain and start it only it's
>>>>> capable. Once the page tracking is switched on the vendor iommu driver
>>>>> (or iommu core) should block further device attach/detach operations
>>>>> until page tracking is stopped.
>>>> But when a domain becomes capable after detaching a device, the upper layer
>>>> still needs to query it and enable dirty log for it...
>>>>
>>>> To make things coordinated, maybe the upper layer can register a notifier,
>>>> when the domain's capability change, the upper layer do not need to query, instead
>>>> they just need to realize a callback, and do their specific policy in the callback.
>>>> What do you think?
>>>>
>>>
>>> That might be an option. But why not checking domain's attribution every
>>> time a new tracking period is about to start?
>> Hi Baolu,
>>
>> I'll add an attribution in iommu_domain, and the vendor iommu driver will update
>> the attribution when attach/detach devices.
>>
>> The attribute should be protected by a lock, so the upper layer shouldn't access
>> the attribute directly. Then the iommu_domain_support_dirty_log() still should be
>> retained. Does this design looks good to you?
> 
> Yes, that's what I was thinking of. But I am not sure whether it worth
> of a lock here. It seems not to be a valid behavior for upper layer to
> attach or detach any device while doing the dirty page tracking.
Hi Baolu,

Right, if the "detach|attach" interfaces and "dirty tracking" interfaces can be called concurrently,
a lock in iommu_domain_support_dirty_log() is still not enough. I will add another note for the dirty
tracking interfaces.

Do you have other suggestions? I would like to speed things up, so I plan to send out v5 next week.

Thanks,
Keqian

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-13 10:58                       ` Keqian Zhu
  (?)
@ 2021-05-13 12:02                         ` Lu Baolu
  -1 siblings, 0 replies; 81+ messages in thread
From: Lu Baolu @ 2021-05-13 12:02 UTC (permalink / raw)
  To: Keqian Zhu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: baolu.lu, Alex Williamson, Kirti Wankhede, Cornelia Huck,
	Jonathan Cameron, wanghaibin.wang, jiangkunkun, yuzenghui,
	lushenming

On 5/13/21 6:58 PM, Keqian Zhu wrote:
> 
> 
> On 2021/5/12 19:36, Lu Baolu wrote:
>> Hi keqian,
>>
>> On 5/12/21 4:44 PM, Keqian Zhu wrote:
>>>
>>>
>>> On 2021/5/12 11:20, Lu Baolu wrote:
>>>> On 5/11/21 3:40 PM, Keqian Zhu wrote:
>>>>>> For upper layers, before starting page tracking, they check the
>>>>>> dirty_page_trackable attribution of the domain and start it only it's
>>>>>> capable. Once the page tracking is switched on the vendor iommu driver
>>>>>> (or iommu core) should block further device attach/detach operations
>>>>>> until page tracking is stopped.
>>>>> But when a domain becomes capable after detaching a device, the upper layer
>>>>> still needs to query it and enable dirty log for it...
>>>>>
>>>>> To make things coordinated, maybe the upper layer can register a notifier,
>>>>> when the domain's capability change, the upper layer do not need to query, instead
>>>>> they just need to realize a callback, and do their specific policy in the callback.
>>>>> What do you think?
>>>>>
>>>>
>>>> That might be an option. But why not checking domain's attribution every
>>>> time a new tracking period is about to start?
>>> Hi Baolu,
>>>
>>> I'll add an attribution in iommu_domain, and the vendor iommu driver will update
>>> the attribution when attach/detach devices.
>>>
>>> The attribute should be protected by a lock, so the upper layer shouldn't access
>>> the attribute directly. Then the iommu_domain_support_dirty_log() still should be
>>> retained. Does this design looks good to you?
>>
>> Yes, that's what I was thinking of. But I am not sure whether it worth
>> of a lock here. It seems not to be a valid behavior for upper layer to
>> attach or detach any device while doing the dirty page tracking.
> Hi Baolu,
> 
> Right, if the "detach|attach" interfaces and "dirty tracking" interfaces can be called concurrently,
> a lock in iommu_domain_support_dirty_log() is still not enough. I will add another note for the dirty
> tracking interfaces.
> 
> Do you have other suggestions? I will accelerate the progress, so I plan to send out v5 next week.

No further comments except the nit below:

"iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking"

How about splitting it into
  - iommu_start_dirty_log()
  - iommu_stop_dirty_log()

Not a strong opinion anyway.

Best regards,
baolu
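
One way to picture the split, sketched in user space under stated assumptions:
the two proposed entry points can be thin wrappers over a single internal
switch helper. struct mock_domain and all mock_* names are hypothetical, and
the real iommu_switch_dirty_log() takes arguments not modelled here.

```c
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct iommu_domain. */
struct mock_domain {
	bool dirty_log_tracking; /* true while tracking is enabled */
};

/* Single internal helper, in the spirit of a combined switch interface. */
static int mock_switch_dirty_log(struct mock_domain *d, bool enable)
{
	d->dirty_log_tracking = enable;
	return 0;
}

/* Wrapper modelling the proposed start interface. */
int mock_start_dirty_log(struct mock_domain *d)
{
	return mock_switch_dirty_log(d, true);
}

/* Wrapper modelling the proposed stop interface. */
int mock_stop_dirty_log(struct mock_domain *d)
{
	return mock_switch_dirty_log(d, false);
}
```

Callers then read as start/stop pairs, while the shared logic stays in one
place.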

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
@ 2021-05-13 12:02                         ` Lu Baolu
  0 siblings, 0 replies; 81+ messages in thread
From: Lu Baolu @ 2021-05-13 12:02 UTC (permalink / raw)
  To: Keqian Zhu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: jiangkunkun, Cornelia Huck, Kirti Wankhede, lushenming,
	Alex Williamson, wanghaibin.wang

On 5/13/21 6:58 PM, Keqian Zhu wrote:
> 
> 
> On 2021/5/12 19:36, Lu Baolu wrote:
>> Hi keqian,
>>
>> On 5/12/21 4:44 PM, Keqian Zhu wrote:
>>>
>>>
>>> On 2021/5/12 11:20, Lu Baolu wrote:
>>>> On 5/11/21 3:40 PM, Keqian Zhu wrote:
>>>>>> For upper layers, before starting page tracking, they check the
>>>>>> dirty_page_trackable attribution of the domain and start it only it's
>>>>>> capable. Once the page tracking is switched on the vendor iommu driver
>>>>>> (or iommu core) should block further device attach/detach operations
>>>>>> until page tracking is stopped.
>>>>> But when a domain becomes capable after detaching a device, the upper layer
>>>>> still needs to query it and enable dirty log for it...
>>>>>
>>>>> To make things coordinated, maybe the upper layer can register a notifier,
>>>>> when the domain's capability change, the upper layer do not need to query, instead
>>>>> they just need to realize a callback, and do their specific policy in the callback.
>>>>> What do you think?
>>>>>
>>>>
>>>> That might be an option. But why not checking domain's attribution every
>>>> time a new tracking period is about to start?
>>> Hi Baolu,
>>>
>>> I'll add an attribution in iommu_domain, and the vendor iommu driver will update
>>> the attribution when attach/detach devices.
>>>
>>> The attribute should be protected by a lock, so the upper layer shouldn't access
>>> the attribute directly. Then the iommu_domain_support_dirty_log() still should be
>>> retained. Does this design looks good to you?
>>
>> Yes, that's what I was thinking of. But I am not sure whether it worth
>> of a lock here. It seems not to be a valid behavior for upper layer to
>> attach or detach any device while doing the dirty page tracking.
> Hi Baolu,
> 
> Right, if the "detach|attach" interfaces and "dirty tracking" interfaces can be called concurrently,
> a lock in iommu_domain_support_dirty_log() is still not enough. I will add another note for the dirty
> tracking interfaces.
> 
> Do you have other suggestions? I will accelerate the progress, so I plan to send out v5 next week.

No further comments expect below nit:

"iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking"

How about splitting it into
  - iommu_start_dirty_log()
  - iommu_stop_dirty_log()

Not a strong opinion anyway.

Best regards,
baolu
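
As a rough illustration of the approach discussed above (re-check the domain's attribute each time a new tracking period starts, and refuse attach/detach while tracking is on), here is a minimal user-space sketch in C. The `toy_domain` structure and all `toy_*` helpers are hypothetical stand-ins made up for this example; they are not the kernel's `iommu_domain` and not the API proposed in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for iommu_domain; not the kernel structure. */
struct toy_domain {
	int nr_devices;		/* devices currently attached */
	int nr_incapable;	/* attached devices lacking dirty tracking */
	bool dirty_trackable;	/* attribute updated on attach/detach */
	bool tracking;		/* true while a tracking period is active */
};

/* Recompute the attribute; only attach/detach call this. */
static void toy_update_attr(struct toy_domain *d)
{
	d->dirty_trackable = d->nr_devices > 0 && d->nr_incapable == 0;
}

/* Attach is refused while tracking is on, so the attribute cannot change. */
static int toy_attach(struct toy_domain *d, bool dev_capable)
{
	if (d->tracking)
		return -EBUSY;
	d->nr_devices++;
	if (!dev_capable)
		d->nr_incapable++;
	toy_update_attr(d);
	return 0;
}

/* Detach follows the same rule as attach. */
static int toy_detach(struct toy_domain *d, bool dev_capable)
{
	if (d->tracking)
		return -EBUSY;
	assert(d->nr_devices > 0);
	d->nr_devices--;
	if (!dev_capable)
		d->nr_incapable--;
	toy_update_attr(d);
	return 0;
}

/* The upper layer checks the attribute at the start of each period. */
static int toy_start_period(struct toy_domain *d)
{
	if (!d->dirty_trackable)
		return -EOPNOTSUPP;
	d->tracking = true;
	return 0;
}

static void toy_stop_period(struct toy_domain *d)
{
	d->tracking = false;
}
```

Because attach and detach are rejected during a tracking period, the attribute is stable while tracking runs, which is why a lock around the query may be unnecessary under this assumption.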


* Re: [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework
  2021-05-13 12:02                         ` Lu Baolu
  (?)
@ 2021-05-14  2:30                           ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-14  2:30 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, linux-arm-kernel, iommu, Robin Murphy,
	Will Deacon, Joerg Roedel, Jean-Philippe Brucker, Yi Sun,
	Tian Kevin
  Cc: Alex Williamson, Kirti Wankhede, Cornelia Huck, Jonathan Cameron,
	wanghaibin.wang, jiangkunkun, yuzenghui, lushenming



On 2021/5/13 20:02, Lu Baolu wrote:
> On 5/13/21 6:58 PM, Keqian Zhu wrote:
>>
>>
>> On 2021/5/12 19:36, Lu Baolu wrote:
>>> Hi keqian,
>>>
>>> On 5/12/21 4:44 PM, Keqian Zhu wrote:
>>>>
>>>>
>>>> On 2021/5/12 11:20, Lu Baolu wrote:
>>>>> On 5/11/21 3:40 PM, Keqian Zhu wrote:
>>>>>>> For upper layers, before starting page tracking, they check the
>>>>>>> dirty_page_trackable attribute of the domain and start it only if it's
>>>>>>> capable. Once page tracking is switched on, the vendor iommu driver
>>>>>>> (or iommu core) should block further device attach/detach operations
>>>>>>> until page tracking is stopped.
>>>>>> But when a domain becomes capable after detaching a device, the upper layer
>>>>>> still needs to query it and enable dirty log for it...
>>>>>>
>>>>>> To keep things coordinated, maybe the upper layer can register a notifier.
>>>>>> When the domain's capability changes, the upper layer does not need to query;
>>>>>> instead it just needs to implement a callback and apply its specific policy there.
>>>>>> What do you think?
>>>>>>
>>>>>
>>>>> That might be an option. But why not check the domain's attribute each
>>>>> time a new tracking period is about to start?
>>>> Hi Baolu,
>>>>
>>>> I'll add an attribute in iommu_domain, and the vendor iommu driver will update
>>>> the attribute when attaching/detaching devices.
>>>>
>>>> The attribute should be protected by a lock, so the upper layer shouldn't access
>>>> the attribute directly. So iommu_domain_support_dirty_log() should still be
>>>> retained. Does this design look good to you?
>>>
>>> Yes, that's what I was thinking of. But I am not sure whether it is worth
>>> a lock here. It does not seem to be valid behavior for the upper layer to
>>> attach or detach any device while doing dirty page tracking.
>> Hi Baolu,
>>
>> Right, if the "detach|attach" interfaces and "dirty tracking" interfaces can be called concurrently,
>> a lock in iommu_domain_support_dirty_log() is still not enough. I will add another note for the dirty
>> tracking interfaces.
>>
>> Do you have other suggestions? I would like to speed up progress, so I plan to send out v5 next week.
> 
> No further comments except the nit below:
> 
> "iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking"
> 
> How about splitting it into
>  - iommu_start_dirty_log()
>  - iommu_stop_dirty_log()
Yeah, actually that was my original version; the "switch" style was suggested by Yi Sun.
Anyway, I think both are OK, and the "switch" style can reduce some code.
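
To make the trade-off concrete, here is a minimal sketch in C of how the two styles can coexist: a single "switch" entry point holds the shared checks, and start/stop are thin wrappers on top. The `sketch_domain` type, the `sketch_*` function names, and the choice to return -EINVAL when the domain is already in the requested state are assumptions for illustration only, not the interface in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical domain state; only the on/off status is modelled here. */
struct sketch_domain {
	bool dirty_log_on;
};

/* "switch" style: one entry point, shared validation for both directions. */
static int sketch_switch_dirty_log(struct sketch_domain *d, bool enable)
{
	assert(d != NULL);
	if (d->dirty_log_on == enable)
		return -EINVAL;	/* assumed policy: reject a redundant request */
	/* A real driver would split or merge mappings at this point. */
	d->dirty_log_on = enable;
	return 0;
}

/* "start/stop" style, layered on top as thin wrappers. */
static int sketch_start_dirty_log(struct sketch_domain *d)
{
	return sketch_switch_dirty_log(d, true);
}

static int sketch_stop_dirty_log(struct sketch_domain *d)
{
	return sketch_switch_dirty_log(d, false);
}
```

With this layout the state check is written once, which is the code saving the "switch" style offers, while callers that prefer explicit names can still use the wrappers.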

Thanks,
Keqian

> 
> Not a strong opinion anyway.
> 
> Best regards,
> baolu
> .
> 


* Re: [RFC PATCH v4 00/13] iommu/smmuv3: Implement hardware dirty log tracking
  2021-05-07 10:21 ` Keqian Zhu
  (?)
@ 2021-05-17  8:46   ` Keqian Zhu
  -1 siblings, 0 replies; 81+ messages in thread
From: Keqian Zhu @ 2021-05-17  8:46 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, iommu, Robin Murphy, Will Deacon,
	Joerg Roedel, Jean-Philippe Brucker, Lu Baolu, Yi Sun,
	Tian Kevin
  Cc: jiangkunkun, Cornelia Huck, Kirti Wankhede, lushenming,
	Alex Williamson, wanghaibin.wang

Hi all,

The VFIO part is here: https://lore.kernel.org/kvm/20210507103608.39440-1-zhukeqian1@huawei.com/

Thanks,
Keqian

On 2021/5/7 18:21, Keqian Zhu wrote:
> Hi Robin, Will and everyone,
> 
> I think this series is relatively mature now. Please give your valuable suggestions,
> thanks!
> 
> 
> This patch series is split from the series[1] that contains both the IOMMU part and
> the VFIO part. The VFIO part will be sent out in another series.
> 
> [1] https://lore.kernel.org/linux-iommu/20210310090614.26668-1-zhukeqian1@huawei.com/
> 
> changelog:
> 
> v4:
>  - Modify the framework as suggested by Baolu, thanks!
>  - Add trace for iommu ops.
>  - Extract io-pgtable part.
> 
> v3:
>  - Merge start_dirty_log and stop_dirty_log into switch_dirty_log. (Yi Sun)
>  - Maintain the dirty log status in iommu_domain.
>  - Update commit message to make patch easier to review.
> 
> v2:
>  - Address all comments of RFC version, thanks to all of you ;-)
>  - Add a bugfix that starts dirty log for newly added dma ranges and domain.
> 
> 
> 
> Hi everyone,
> 
> This patch series introduces a framework for iommu dirty log tracking, and smmuv3
> implements this framework. This new feature can be used by VFIO dma dirty tracking.
> 
> Intention:
> 
> Some types of IOMMU are capable of tracking DMA dirty log, such as
> ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
> dirty log tracking framework in the IOMMU base layer.
> 
> Three new essential interfaces are added, and we maintain the status
> of dirty log tracking in iommu_domain.
> 1. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
> 2. iommu_sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap
> 3. iommu_clear_dirty_log: Clear dirty log of IOMMU by a mask bitmap
> 
> About SMMU HTTU:
> 
> HTTU (Hardware Translation Table Update) is a feature of ARM SMMUv3 that lets hardware
> update the access flag and/or dirty state of the TTD (Translation Table Descriptor).
> With HTTU, stage1 TTD is classified into 3 types:
>                         DBM bit             AP[2](readonly bit)
> 1. writable_clean         1                       1
> 2. writable_dirty         1                       0
> 3. readonly               0                       1
> 
> If HTTU_HD (manage dirty state) is enabled, smmu can change TTD from writable_clean to
> writable_dirty. Then software can scan TTD to sync dirty state into dirty bitmap. With
> this feature, we can track the dirty log of DMA continuously and precisely.
> 
> About this series:
> 
> Patch 1-3: Introduce dirty log tracking framework in the IOMMU base layer, and two common
>            interfaces that can be used by many types of iommu.
> 
> Patch 4-6: Add feature detection for smmu HTTU and enable HTTU for smmu stage1 mapping.
>            Also add feature detection for smmu BBML. We need to split block mappings when
>            starting dirty log tracking and merge page mappings when stopping it, which
>            requires a break-before-make procedure. But that might cause problems while the
>            TTD is live, because the I/O streams might not tolerate translation faults. So
>            BBML should be used.
> 
> Patch 7-12: Implement these interfaces for arm smmuv3.
> 
> Thanks,
> Keqian
> 
> Jean-Philippe Brucker (1):
>   iommu/arm-smmu-v3: Add support for Hardware Translation Table Update
> 
> Keqian Zhu (1):
>   iommu: Introduce dirty log tracking framework
> 
> Kunkun Jiang (11):
>   iommu/io-pgtable-arm: Add quirk ARM_HD and ARM_BBMLx
>   iommu/io-pgtable-arm: Add and realize split_block ops
>   iommu/io-pgtable-arm: Add and realize merge_page ops
>   iommu/io-pgtable-arm: Add and realize sync_dirty_log ops
>   iommu/io-pgtable-arm: Add and realize clear_dirty_log ops
>   iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping
>   iommu/arm-smmu-v3: Add feature detection for BBML
>   iommu/arm-smmu-v3: Realize switch_dirty_log iommu ops
>   iommu/arm-smmu-v3: Realize sync_dirty_log iommu ops
>   iommu/arm-smmu-v3: Realize clear_dirty_log iommu ops
>   iommu/arm-smmu-v3: Realize support_dirty_log iommu ops
> 
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |   2 +
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 268 +++++++++++-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  14 +
>  drivers/iommu/io-pgtable-arm.c                | 389 +++++++++++++++++-
>  drivers/iommu/iommu.c                         | 206 +++++++++-
>  include/linux/io-pgtable.h                    |  23 ++
>  include/linux/iommu.h                         |  65 +++
>  include/trace/events/iommu.h                  |  63 +++
>  8 files changed, 1026 insertions(+), 4 deletions(-)
> 
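
The descriptor classification quoted above can be sketched in C as follows. This is only an illustration under stated assumptions: it uses the VMSAv8-64 stage 1 bit positions (AP[2] at bit 7, DBM at bit 51), treats a flat array as the set of leaf descriptors, and assumes that a set bit in the mask means "clear this page". The `ttd_*` and `sketch_*` names are hypothetical, and a real implementation would need atomic descriptor updates and TLB invalidation, which are omitted here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define TTD_AP_RDONLY	(UINT64_C(1) << 7)	/* AP[2]: 1 = read-only */
#define TTD_DBM		(UINT64_C(1) << 51)	/* Dirty Bit Modifier */
#define WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)

/* TTD_OTHER covers DBM=0, AP[2]=0, which the quoted table does not list. */
enum ttd_state { TTD_OTHER, TTD_READONLY, TTD_WRITABLE_CLEAN, TTD_WRITABLE_DIRTY };

static enum ttd_state ttd_classify(uint64_t ttd)
{
	int dbm = (ttd & TTD_DBM) != 0;
	int ro = (ttd & TTD_AP_RDONLY) != 0;

	if (dbm)
		return ro ? TTD_WRITABLE_CLEAN : TTD_WRITABLE_DIRTY;
	return ro ? TTD_READONLY : TTD_OTHER;
}

/* Scan descriptors and set bit i in the bitmap for each dirty page i. */
static void sketch_sync_dirty_log(const uint64_t *ttds, size_t n,
				  unsigned long *bitmap)
{
	assert(ttds != NULL && bitmap != NULL);
	for (size_t i = 0; i < n; i++)
		if (ttd_classify(ttds[i]) == TTD_WRITABLE_DIRTY)
			bitmap[i / WORD_BITS] |= 1UL << (i % WORD_BITS);
}

/* For each page selected by the mask, move dirty back to writable_clean. */
static void sketch_clear_dirty_log(uint64_t *ttds, size_t n,
				   const unsigned long *mask)
{
	assert(ttds != NULL && mask != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!(mask[i / WORD_BITS] & (1UL << (i % WORD_BITS))))
			continue;
		if (ttd_classify(ttds[i]) == TTD_WRITABLE_DIRTY)
			ttds[i] |= TTD_AP_RDONLY;
	}
}
```

A typical round would call the sync helper to collect dirty pages into a bitmap and then pass that same bitmap as the mask to the clear helper, so the next round starts from writable_clean descriptors.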


end of thread, other threads:[~2021-05-17  8:49 UTC | newest]

Thread overview: 81+ messages
2021-05-07 10:21 [RFC PATCH v4 00/13] iommu/smmuv3: Implement hardware dirty log tracking Keqian Zhu
2021-05-07 10:21 ` [RFC PATCH v4 01/13] iommu: Introduce dirty log tracking framework Keqian Zhu
2021-05-08  3:46   ` Lu Baolu
2021-05-08  7:35     ` Keqian Zhu
2021-05-10  1:08       ` Lu Baolu
2021-05-10 11:07         ` Keqian Zhu
2021-05-11  3:12           ` Lu Baolu
2021-05-11  7:40             ` Keqian Zhu
2021-05-12  3:20               ` Lu Baolu
2021-05-12  8:44                 ` Keqian Zhu
2021-05-12 11:36                   ` Lu Baolu
2021-05-13 10:58                     ` Keqian Zhu
2021-05-13 12:02                       ` Lu Baolu
2021-05-14  2:30                         ` Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 02/13] iommu/io-pgtable-arm: Add quirk ARM_HD and ARM_BBMLx Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 03/13] iommu/io-pgtable-arm: Add and realize split_block ops Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 04/13] iommu/io-pgtable-arm: Add and realize merge_page ops Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 05/13] iommu/io-pgtable-arm: Add and realize sync_dirty_log ops Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 06/13] iommu/io-pgtable-arm: Add and realize clear_dirty_log ops Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 07/13] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 08/13] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 09/13] iommu/arm-smmu-v3: Add feature detection for BBML Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 10/13] iommu/arm-smmu-v3: Realize switch_dirty_log iommu ops Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 11/13] iommu/arm-smmu-v3: Realize sync_dirty_log " Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 12/13] iommu/arm-smmu-v3: Realize clear_dirty_log " Keqian Zhu
2021-05-07 10:22 ` [RFC PATCH v4 13/13] iommu/arm-smmu-v3: Realize support_dirty_log " Keqian Zhu
2021-05-17  8:46 ` [RFC PATCH v4 00/13] iommu/smmuv3: Implement hardware dirty log tracking Keqian Zhu
