iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap
@ 2020-12-16 10:36 Yong Wu
  2020-12-16 10:36 ` [PATCH v3 1/7] iommu: Move iotlb_sync_map out from __iommu_map Yong Wu
                   ` (6 more replies)
  0 siblings, 7 replies; 23+ messages in thread
From: Yong Wu @ 2020-12-16 10:36 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

This patchset is to improve tlb flushing performance in iommu_map/unmap
for MediaTek IOMMU.

For iommu_map, currently MediaTek IOMMU use IO_PGTABLE_QUIRK_TLBI_ON_MAP
to do tlb_flush for each a memory chunk. this is so unnecessary. we could
improve it by tlb flushing one time at the end of iommu_map.

For iommu_unmap, currently we have already improve this performance by
gather. But the current gather should take care its granule size. if the
granule size is different, it will do tlb flush and gather again. Our HW
don't care about granule size. thus I gather the range in our file.

After this patchset, we could achieve only tlb flushing once in iommu_map
and iommu_unmap.

Regardless of sg, for each a segment, I did a simple test:
  
  size = 20 * SZ_1M;
  /* the worst case, all are 4k mapping. */
  ret = iommu_map(domain, 0x5bb02000, 0x123f1000, size, IOMMU_READ);
  iommu_unmap(domain, 0x5bb02000, size);

This is the comparing time(unit is us):
              original-time  after-improve
   map-20M    59943           2347
   unmap-20M  264             36

This patchset also flush tlb once in the iommu_map_sg case.

patch [1/7][2/7][3/7] are for map while the others are for unmap.

This patchset base on:
  a) mt8192 iommu v5
https://lore.kernel.org/linux-iommu/20201209080102.26626-1-yong.wu@mediatek.com/T/#t
  b) iommu/io-pgtable: Remove tlb_flush_leaf
https://lore.kernel.org/linux-iommu/160744101816.3622130.16266834943434854326.b4-ty@kernel.org/T/#mc8fbc98bee8bca865d73c873275ab34fed1c25c7

change note:
v3: Refactor the unmap flow suggested by Robin.
     
v2: https://lore.kernel.org/linux-iommu/20201119061836.15238-1-yong.wu@mediatek.com/
    Refactor all the code.
    base on v5.10-rc1.

Yong Wu (7):
  iommu: Move iotlb_sync_map out from __iommu_map
  iommu: Add iova and size as parameters in iotlb_sync_map
  iommu/mediatek: Add iotlb_sync_map to sync whole the iova range
  iommu: Switch gather->end to unsigned long long
  iommu: Allow io_pgtable_tlb ops optional
  iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  iommu/mediatek: Remove the tlb-ops for v7s

 drivers/iommu/iommu.c      | 24 +++++++++++++++-----
 drivers/iommu/mtk_iommu.c  | 45 +++++++++++++++-----------------------
 drivers/iommu/tegra-gart.c |  3 ++-
 include/linux/io-pgtable.h |  8 ++++---
 include/linux/iommu.h      |  8 ++++---
 5 files changed, 49 insertions(+), 39 deletions(-)

-- 
2.18.0


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 1/7] iommu: Move iotlb_sync_map out from __iommu_map
  2020-12-16 10:36 [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap Yong Wu
@ 2020-12-16 10:36 ` Yong Wu
  2020-12-23  8:51   ` Christoph Hellwig
  2020-12-16 10:36 ` [PATCH v3 2/7] iommu: Add iova and size as parameters in iotlb_sync_map Yong Wu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Yong Wu @ 2020-12-16 10:36 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

In the end of __iommu_map, It alway call iotlb_sync_map.
This patch moves iotlb_sync_map out from __iommu_map since it is
unnecessary to call this for each sg segment especially iotlb_sync_map
is flush tlb all currently.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iommu.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8c470f451a32..decef851fa3a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2407,9 +2407,6 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
 		size -= pgsize;
 	}
 
-	if (ops->iotlb_sync_map)
-		ops->iotlb_sync_map(domain);
-
 	/* unroll mapping in case something went wrong */
 	if (ret)
 		iommu_unmap(domain, orig_iova, orig_size - size);
@@ -2422,15 +2419,29 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
 int iommu_map(struct iommu_domain *domain, unsigned long iova,
 	      phys_addr_t paddr, size_t size, int prot)
 {
+	const struct iommu_ops *ops = domain->ops;
+	int ret;
+
 	might_sleep();
-	return __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
+	ret = __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
+	if (ret == 0 && ops->iotlb_sync_map)
+		ops->iotlb_sync_map(domain);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_map);
 
 int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
 	      phys_addr_t paddr, size_t size, int prot)
 {
-	return __iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
+	const struct iommu_ops *ops = domain->ops;
+	int ret;
+
+	ret = __iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
+	if (ret == 0 && ops->iotlb_sync_map)
+		ops->iotlb_sync_map(domain);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_map_atomic);
 
@@ -2514,6 +2525,7 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			     struct scatterlist *sg, unsigned int nents, int prot,
 			     gfp_t gfp)
 {
+	const struct iommu_ops *ops = domain->ops;
 	size_t len = 0, mapped = 0;
 	phys_addr_t start;
 	unsigned int i = 0;
@@ -2544,6 +2556,8 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			sg = sg_next(sg);
 	}
 
+	if (ops->iotlb_sync_map)
+		ops->iotlb_sync_map(domain);
 	return mapped;
 
 out_err:
-- 
2.18.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 2/7] iommu: Add iova and size as parameters in iotlb_sync_map
  2020-12-16 10:36 [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap Yong Wu
  2020-12-16 10:36 ` [PATCH v3 1/7] iommu: Move iotlb_sync_map out from __iommu_map Yong Wu
@ 2020-12-16 10:36 ` Yong Wu
  2020-12-16 10:36 ` [PATCH v3 3/7] iommu/mediatek: Add iotlb_sync_map to sync whole the iova range Yong Wu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: Yong Wu @ 2020-12-16 10:36 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

iotlb_sync_map allow IOMMU drivers tlb sync after completing the whole
mapping. This patch adds iova and size as the parameters in it. then the
IOMMU driver could flush tlb with the whole range once after iova mapping
to improve performance.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iommu.c      | 6 +++---
 drivers/iommu/tegra-gart.c | 3 ++-
 include/linux/iommu.h      | 3 ++-
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index decef851fa3a..df87c8e825f7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2425,7 +2425,7 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 	might_sleep();
 	ret = __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
 	if (ret == 0 && ops->iotlb_sync_map)
-		ops->iotlb_sync_map(domain);
+		ops->iotlb_sync_map(domain, iova, size);
 
 	return ret;
 }
@@ -2439,7 +2439,7 @@ int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
 
 	ret = __iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
 	if (ret == 0 && ops->iotlb_sync_map)
-		ops->iotlb_sync_map(domain);
+		ops->iotlb_sync_map(domain, iova, size);
 
 	return ret;
 }
@@ -2557,7 +2557,7 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 	}
 
 	if (ops->iotlb_sync_map)
-		ops->iotlb_sync_map(domain);
+		ops->iotlb_sync_map(domain, iova, mapped);
 	return mapped;
 
 out_err:
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index fac720273889..d15d13a98ed1 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -261,7 +261,8 @@ static int gart_iommu_of_xlate(struct device *dev,
 	return 0;
 }
 
-static void gart_iommu_sync_map(struct iommu_domain *domain)
+static void gart_iommu_sync_map(struct iommu_domain *domain, unsigned long iova,
+				size_t size)
 {
 	FLUSH_GART_REGS(gart_handle);
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b95a6f8db6ff..794d4085edd3 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -244,7 +244,8 @@ struct iommu_ops {
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 		     size_t size, struct iommu_iotlb_gather *iotlb_gather);
 	void (*flush_iotlb_all)(struct iommu_domain *domain);
-	void (*iotlb_sync_map)(struct iommu_domain *domain);
+	void (*iotlb_sync_map)(struct iommu_domain *domain, unsigned long iova,
+			       size_t size);
 	void (*iotlb_sync)(struct iommu_domain *domain,
 			   struct iommu_iotlb_gather *iotlb_gather);
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova);
-- 
2.18.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 3/7] iommu/mediatek: Add iotlb_sync_map to sync whole the iova range
  2020-12-16 10:36 [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap Yong Wu
  2020-12-16 10:36 ` [PATCH v3 1/7] iommu: Move iotlb_sync_map out from __iommu_map Yong Wu
  2020-12-16 10:36 ` [PATCH v3 2/7] iommu: Add iova and size as parameters in iotlb_sync_map Yong Wu
@ 2020-12-16 10:36 ` Yong Wu
  2020-12-16 10:36 ` [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long Yong Wu
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: Yong Wu @ 2020-12-16 10:36 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

Remove IO_PGTABLE_QUIRK_TLBI_ON_MAP to avoid tlb sync for each a small
chunk memory, Use the new iotlb_sync_map to tlb_sync once for whole the
iova range of iommu_map.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/mtk_iommu.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 5878b11bc5e9..db7d43adb06b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -376,7 +376,6 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
 	dom->cfg = (struct io_pgtable_cfg) {
 		.quirks = IO_PGTABLE_QUIRK_ARM_NS |
 			IO_PGTABLE_QUIRK_NO_PERMS |
-			IO_PGTABLE_QUIRK_TLBI_ON_MAP |
 			IO_PGTABLE_QUIRK_ARM_MTK_EXT,
 		.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
 		.ias = MTK_IOMMU_HAS_FLAG(data->plat_data, IOVA_34_EN) ? 34 : 32,
@@ -531,6 +530,14 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
 				       dom->data);
 }
 
+static void mtk_iommu_sync_map(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+
+	mtk_iommu_tlb_flush_range_sync(iova, size, size, dom->data);
+}
+
 static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
 					  dma_addr_t iova)
 {
@@ -655,6 +662,7 @@ static const struct iommu_ops mtk_iommu_ops = {
 	.unmap		= mtk_iommu_unmap,
 	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
 	.iotlb_sync	= mtk_iommu_iotlb_sync,
+	.iotlb_sync_map	= mtk_iommu_sync_map,
 	.iova_to_phys	= mtk_iommu_iova_to_phys,
 	.probe_device	= mtk_iommu_probe_device,
 	.release_device	= mtk_iommu_release_device,
-- 
2.18.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long
  2020-12-16 10:36 [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap Yong Wu
                   ` (2 preceding siblings ...)
  2020-12-16 10:36 ` [PATCH v3 3/7] iommu/mediatek: Add iotlb_sync_map to sync whole the iova range Yong Wu
@ 2020-12-16 10:36 ` Yong Wu
  2020-12-16 11:03   ` David Laight
  2020-12-16 12:10   ` Robin Murphy
  2020-12-16 10:36 ` [PATCH v3 5/7] iommu: Allow io_pgtable_tlb ops optional Yong Wu
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 23+ messages in thread
From: Yong Wu @ 2020-12-16 10:36 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

Currently gather->end is "unsigned long" which may be overflow in
arch32 in the corner case: 0xfff00000 + 0x100000(iova + size).
Although it doesn't affect the size(end - start), it affects the checking
"gather->end < end"

Fixes: a7d20dc19d9e ("iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes")
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
 include/linux/iommu.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 794d4085edd3..6e907a95d981 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -178,7 +178,7 @@ enum iommu_dev_features {
  */
 struct iommu_iotlb_gather {
 	unsigned long		start;
-	unsigned long		end;
+	unsigned long long	end;
 	size_t			pgsize;
 };
 
@@ -537,7 +537,8 @@ static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
 					       struct iommu_iotlb_gather *gather,
 					       unsigned long iova, size_t size)
 {
-	unsigned long start = iova, end = start + size;
+	unsigned long start = iova;
+	unsigned long long end = start + size;
 
 	/*
 	 * If the new page is disjoint from the current range or is mapped at
-- 
2.18.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 5/7] iommu: Allow io_pgtable_tlb ops optional
  2020-12-16 10:36 [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap Yong Wu
                   ` (3 preceding siblings ...)
  2020-12-16 10:36 ` [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long Yong Wu
@ 2020-12-16 10:36 ` Yong Wu
  2021-01-18 16:18   ` Robin Murphy
  2020-12-16 10:36 ` [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once Yong Wu
  2020-12-16 10:36 ` [PATCH v3 7/7] iommu/mediatek: Remove the tlb-ops for v7s Yong Wu
  6 siblings, 1 reply; 23+ messages in thread
From: Yong Wu @ 2020-12-16 10:36 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

This patch allows io_pgtable_tlb ops could be null since the IOMMU drivers
may use the tlb ops from iommu framework.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
 include/linux/io-pgtable.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index adde9e49be08..c81796814afa 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -206,14 +206,16 @@ struct io_pgtable {
 
 static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
 {
-	iop->cfg.tlb->tlb_flush_all(iop->cookie);
+	if (iop->cfg.tlb && iop->cfg.tlb->tlb_flush_all)
+		iop->cfg.tlb->tlb_flush_all(iop->cookie);
 }
 
 static inline void
 io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
 			  size_t size, size_t granule)
 {
-	iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
+	if (iop->cfg.tlb && iop->cfg.tlb->tlb_flush_walk)
+		iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
 }
 
 static inline void
@@ -221,7 +223,7 @@ io_pgtable_tlb_add_page(struct io_pgtable *iop,
 			struct iommu_iotlb_gather * gather, unsigned long iova,
 			size_t granule)
 {
-	if (iop->cfg.tlb->tlb_add_page)
+	if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_page)
 		iop->cfg.tlb->tlb_add_page(gather, iova, granule, iop->cookie);
 }
 
-- 
2.18.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  2020-12-16 10:36 [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap Yong Wu
                   ` (4 preceding siblings ...)
  2020-12-16 10:36 ` [PATCH v3 5/7] iommu: Allow io_pgtable_tlb ops optional Yong Wu
@ 2020-12-16 10:36 ` Yong Wu
  2020-12-23  8:56   ` Tomasz Figa
  2021-01-18 16:35   ` Robin Murphy
  2020-12-16 10:36 ` [PATCH v3 7/7] iommu/mediatek: Remove the tlb-ops for v7s Yong Wu
  6 siblings, 2 replies; 23+ messages in thread
From: Yong Wu @ 2020-12-16 10:36 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

In current iommu_unmap, this code is:

	iommu_iotlb_gather_init(&iotlb_gather);
	ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
	iommu_iotlb_sync(domain, &iotlb_gather);

We could gather the whole iova range in __iommu_unmap, and then do tlb
synchronization in the iommu_iotlb_sync.

This patch implement this, Gather the range in mtk_iommu_unmap.
then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
we don't call iommu_iotlb_gather_add_page since our tlb synchronization
could be regardless of granule size.

In this way, gather->start is impossible ULONG_MAX, remove the checking.

This patch aims to do tlb synchronization *once* in the iommu_unmap.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
 drivers/iommu/mtk_iommu.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index db7d43adb06b..89cec51405cd 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -506,7 +506,12 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
 			      struct iommu_iotlb_gather *gather)
 {
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+	unsigned long long end = iova + size;
 
+	if (gather->start > iova)
+		gather->start = iova;
+	if (gather->end < end)
+		gather->end = end;
 	return dom->iop->unmap(dom->iop, iova, size, gather);
 }
 
@@ -523,9 +528,6 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
 	size_t length = gather->end - gather->start;
 
-	if (gather->start == ULONG_MAX)
-		return;
-
 	mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
 				       dom->data);
 }
-- 
2.18.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 7/7] iommu/mediatek: Remove the tlb-ops for v7s
  2020-12-16 10:36 [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap Yong Wu
                   ` (5 preceding siblings ...)
  2020-12-16 10:36 ` [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once Yong Wu
@ 2020-12-16 10:36 ` Yong Wu
  2021-01-18 16:39   ` Robin Murphy
  6 siblings, 1 reply; 23+ messages in thread
From: Yong Wu @ 2020-12-16 10:36 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon, Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

Until now, we have already used the tlb operations from iommu framework,
then the tlb operations for v7s can be removed.

Correspondingly, Switch the paramenter "cookie" to internal structure.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
 drivers/iommu/mtk_iommu.c | 27 ++++-----------------------
 1 file changed, 4 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 89cec51405cd..5656819cd4a1 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -206,10 +206,8 @@ static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom)
 	return container_of(dom, struct mtk_iommu_domain, domain);
 }
 
-static void mtk_iommu_tlb_flush_all(void *cookie)
+static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)
 {
-	struct mtk_iommu_data *data = cookie;
-
 	for_each_m4u(data) {
 		if (!pm_runtime_active(data->dev))
 			continue;
@@ -221,9 +219,9 @@ static void mtk_iommu_tlb_flush_all(void *cookie)
 }
 
 static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
-					   size_t granule, void *cookie)
+					   size_t granule,
+					   struct mtk_iommu_data *data)
 {
-	struct mtk_iommu_data *data = cookie;
 	unsigned long flags;
 	int ret;
 	u32 tmp;
@@ -250,7 +248,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 		if (ret) {
 			dev_warn(data->dev,
 				 "Partial TLB flush timed out, falling back to full flush\n");
-			mtk_iommu_tlb_flush_all(cookie);
+			mtk_iommu_tlb_flush_all(data);
 		}
 		/* Clear the CPE status */
 		writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
@@ -258,22 +256,6 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 	}
 }
 
-static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
-					    unsigned long iova, size_t granule,
-					    void *cookie)
-{
-	struct mtk_iommu_data *data = cookie;
-	struct iommu_domain *domain = &data->m4u_dom->domain;
-
-	iommu_iotlb_gather_add_page(domain, gather, iova, granule);
-}
-
-static const struct iommu_flush_ops mtk_iommu_flush_ops = {
-	.tlb_flush_all = mtk_iommu_tlb_flush_all,
-	.tlb_flush_walk = mtk_iommu_tlb_flush_range_sync,
-	.tlb_add_page = mtk_iommu_tlb_flush_page_nosync,
-};
-
 static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
 {
 	struct mtk_iommu_data *data = dev_id;
@@ -380,7 +362,6 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
 		.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
 		.ias = MTK_IOMMU_HAS_FLAG(data->plat_data, IOVA_34_EN) ? 34 : 32,
 		.oas = 35,
-		.tlb = &mtk_iommu_flush_ops,
 		.iommu_dev = data->dev,
 	};
 
-- 
2.18.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* RE: [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long
  2020-12-16 10:36 ` [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long Yong Wu
@ 2020-12-16 11:03   ` David Laight
  2020-12-17  2:26     ` Yong Wu
  2020-12-16 12:10   ` Robin Murphy
  1 sibling, 1 reply; 23+ messages in thread
From: David Laight @ 2020-12-16 11:03 UTC (permalink / raw)
  To: 'Yong Wu', Joerg Roedel, Will Deacon, Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

From: Yong Wu
> Sent: 16 December 2020 10:36
> 
> Currently gather->end is "unsigned long" which may be overflow in
> arch32 in the corner case: 0xfff00000 + 0x100000(iova + size).
> Although it doesn't affect the size(end - start), it affects the checking
> "gather->end < end"
> 
> Fixes: a7d20dc19d9e ("iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes")
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  include/linux/iommu.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 794d4085edd3..6e907a95d981 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -178,7 +178,7 @@ enum iommu_dev_features {
>   */
>  struct iommu_iotlb_gather {
>  	unsigned long		start;
> -	unsigned long		end;
> +	unsigned long long	end;
>  	size_t			pgsize;
>  };

Doesn't that add two pad words on many 32bit systems?
You probably ought to re-order the structure to keep the fields
on their natural boundaries.

I'm not sure what is being mapped here, but could it make sense
to just avoid using the highest addresses?
Then you never hit the problem.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long
  2020-12-16 10:36 ` [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long Yong Wu
  2020-12-16 11:03   ` David Laight
@ 2020-12-16 12:10   ` Robin Murphy
  2020-12-17  2:26     ` Yong Wu
  1 sibling, 1 reply; 23+ messages in thread
From: Robin Murphy @ 2020-12-16 12:10 UTC (permalink / raw)
  To: Yong Wu, Joerg Roedel, Will Deacon
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream,
	Tomasz Figa, linux-kernel, Krzysztof Kozlowski, chao.hao, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

On 2020-12-16 10:36, Yong Wu wrote:
> Currently gather->end is "unsigned long" which may be overflow in
> arch32 in the corner case: 0xfff00000 + 0x100000(iova + size).
> Although it doesn't affect the size(end - start), it affects the checking
> "gather->end < end"

This won't help the same situation at the top of a 64-bit address space, 
though, and now that we have TTBR1 support for AArch64 format that is 
definitely a thing. Better to just encode the end address as the actual 
end address, i.e. iova + size - 1. We don't lose anything other than the 
ability to encode zero-sized invalidations that don't make sense anyway.

Robin.

> Fixes: a7d20dc19d9e ("iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes")
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>   include/linux/iommu.h | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 794d4085edd3..6e907a95d981 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -178,7 +178,7 @@ enum iommu_dev_features {
>    */
>   struct iommu_iotlb_gather {
>   	unsigned long		start;
> -	unsigned long		end;
> +	unsigned long long	end;
>   	size_t			pgsize;
>   };
>   
> @@ -537,7 +537,8 @@ static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
>   					       struct iommu_iotlb_gather *gather,
>   					       unsigned long iova, size_t size)
>   {
> -	unsigned long start = iova, end = start + size;
> +	unsigned long start = iova;
> +	unsigned long long end = start + size;
>   
>   	/*
>   	 * If the new page is disjoint from the current range or is mapped at
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long
  2020-12-16 12:10   ` Robin Murphy
@ 2020-12-17  2:26     ` Yong Wu
  0 siblings, 0 replies; 23+ messages in thread
From: Yong Wu @ 2020-12-17  2:26 UTC (permalink / raw)
  To: Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream,
	Tomasz Figa, kernel-team, linux-kernel, Krzysztof Kozlowski,
	chao.hao, iommu, linux-mediatek, Matthias Brugger,
	Greg Kroah-Hartman, Will Deacon, linux-arm-kernel

On Wed, 2020-12-16 at 12:10 +0000, Robin Murphy wrote:
> On 2020-12-16 10:36, Yong Wu wrote:
> > Currently gather->end is "unsigned long" which may be overflow in
> > arch32 in the corner case: 0xfff00000 + 0x100000(iova + size).
> > Although it doesn't affect the size(end - start), it affects the checking
> > "gather->end < end"
> 
> This won't help the same situation at the top of a 64-bit address space, 
> though, and now that we have TTBR1 support for AArch64 format that is 
> definitely a thing. Better to just encode the end address as the actual 
> end address, i.e. iova + size - 1. We don't lose anything other than the 
> ability to encode zero-sized invalidations that don't make sense anyway.

Thanks for the suggestion. "iova + size - 1" is better. Also I will
change the "size" to "gather->end - gather->start + 1" in the iotlb_sync
of arm-smmu-v3.c and mtk_iommu.c.

> 
> Robin.
> 
> > Fixes: a7d20dc19d9e ("iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes")
> > Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> > ---
> >   include/linux/iommu.h | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 794d4085edd3..6e907a95d981 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -178,7 +178,7 @@ enum iommu_dev_features {
> >    */
> >   struct iommu_iotlb_gather {
> >   	unsigned long		start;
> > -	unsigned long		end;
> > +	unsigned long long	end;
> >   	size_t			pgsize;
> >   };
> >   
> > @@ -537,7 +537,8 @@ static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
> >   					       struct iommu_iotlb_gather *gather,
> >   					       unsigned long iova, size_t size)
> >   {
> > -	unsigned long start = iova, end = start + size;
> > +	unsigned long start = iova;
> > +	unsigned long long end = start + size;
> >   
> >   	/*
> >   	 * If the new page is disjoint from the current range or is mapped at
> > 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long
  2020-12-16 11:03   ` David Laight
@ 2020-12-17  2:26     ` Yong Wu
  0 siblings, 0 replies; 23+ messages in thread
From: Yong Wu @ 2020-12-17  2:26 UTC (permalink / raw)
  To: David Laight
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	Will Deacon, linux-kernel, Krzysztof Kozlowski, Tomasz Figa,
	iommu, linux-mediatek, Matthias Brugger, kernel-team,
	Greg Kroah-Hartman, Robin Murphy, linux-arm-kernel

Hi David,

On Wed, 2020-12-16 at 11:03 +0000, David Laight wrote:
> From: Yong Wu
> > Sent: 16 December 2020 10:36
> > 
> > Currently gather->end is "unsigned long" which may be overflow in
> > arch32 in the corner case: 0xfff00000 + 0x100000(iova + size).
> > Although it doesn't affect the size(end - start), it affects the checking
> > "gather->end < end"
> > 
> > Fixes: a7d20dc19d9e ("iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes")
> > Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> > ---
> >  include/linux/iommu.h | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 794d4085edd3..6e907a95d981 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -178,7 +178,7 @@ enum iommu_dev_features {
> >   */
> >  struct iommu_iotlb_gather {
> >  	unsigned long		start;
> > -	unsigned long		end;
> > +	unsigned long long	end;
> >  	size_t			pgsize;
> >  };
> 
> Doesn't that add two pad words on many 32bit systems?
> You probably ought to re-order the structure to keep the fields
> on their natural boundaries.
> 
> I'm not sure what is being mapped here, but could it make sense
> to just avoid using the highest addresses?
> Then you never hit the problem.

Thanks for your review. following Robin's suggesting, I will use "iova +
size - 1", then avoid this.

> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] iommu: Move iotlb_sync_map out from __iommu_map
  2020-12-16 10:36 ` [PATCH v3 1/7] iommu: Move iotlb_sync_map out from __iommu_map Yong Wu
@ 2020-12-23  8:51   ` Christoph Hellwig
  2020-12-24 11:27     ` Yong Wu
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2020-12-23  8:51 UTC (permalink / raw)
  To: Yong Wu
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream,
	Tomasz Figa, Will Deacon, linux-kernel, Krzysztof Kozlowski,
	chao.hao, iommu, linux-mediatek, Matthias Brugger, kernel-team,
	Greg Kroah-Hartman, Robin Murphy, linux-arm-kernel

On Wed, Dec 16, 2020 at 06:36:01PM +0800, Yong Wu wrote:
> In the end of __iommu_map, It alway call iotlb_sync_map.
> This patch moves iotlb_sync_map out from __iommu_map since it is
> unnecessary to call this for each sg segment especially iotlb_sync_map
> is flush tlb all currently.
> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>

What about adding a little helper that does the NULL check and method
call instead of duplicating it all over?
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  2020-12-16 10:36 ` [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once Yong Wu
@ 2020-12-23  8:56   ` Tomasz Figa
  2020-12-23 11:00     ` Robin Murphy
  2021-01-18 16:35   ` Robin Murphy
  1 sibling, 1 reply; 23+ messages in thread
From: Tomasz Figa @ 2020-12-23  8:56 UTC (permalink / raw)
  To: Yong Wu
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream,
	Tomasz Figa, Will Deacon, linux-kernel, Krzysztof Kozlowski,
	chao.hao, iommu, linux-mediatek, Matthias Brugger, kernel-team,
	Greg Kroah-Hartman, Robin Murphy, linux-arm-kernel

On Wed, Dec 16, 2020 at 06:36:06PM +0800, Yong Wu wrote:
> In current iommu_unmap, this code is:
> 
> 	iommu_iotlb_gather_init(&iotlb_gather);
> 	ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
> 	iommu_iotlb_sync(domain, &iotlb_gather);
> 
> We could gather the whole iova range in __iommu_unmap, and then do tlb
> synchronization in the iommu_iotlb_sync.
> 
> This patch implement this, Gather the range in mtk_iommu_unmap.
> then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
> we don't call iommu_iotlb_gather_add_page since our tlb synchronization
> could be regardless of granule size.
> 
> In this way, gather->start is impossible ULONG_MAX, remove the checking.
> 
> This patch aims to do tlb synchronization *once* in the iommu_unmap.
> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  drivers/iommu/mtk_iommu.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index db7d43adb06b..89cec51405cd 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -506,7 +506,12 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
>  			      struct iommu_iotlb_gather *gather)
>  {
>  	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> +	unsigned long long end = iova + size;
>  
> +	if (gather->start > iova)
> +		gather->start = iova;
> +	if (gather->end < end)
> +		gather->end = end;

I don't know how common the case is, but what happens if
gather->start...gather->end is a disjoint range from iova...end? E.g.

 | gather      | ..XXX... | iova |
 |             |          |      |
 gather->start |          iova   |
               gather->end       end

We would also end up invalidating the TLB for the XXX area, which could
affect the performance.

Also, why is the existing code in __arm_v7s_unmap() not enough? It seems
to call io_pgtable_tlb_add_page() already, so it should be batching the
flushes.

>  	return dom->iop->unmap(dom->iop, iova, size, gather);
>  }
>  
> @@ -523,9 +528,6 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
>  	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
>  	size_t length = gather->end - gather->start;
>  
> -	if (gather->start == ULONG_MAX)
> -		return;
> -
>  	mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
>  				       dom->data);
>  }
> -- 
> 2.18.0
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  2020-12-23  8:56   ` Tomasz Figa
@ 2020-12-23 11:00     ` Robin Murphy
  2021-01-08  9:56       ` Tomasz Figa
  2021-01-08  9:56       ` Tomasz Figa
  0 siblings, 2 replies; 23+ messages in thread
From: Robin Murphy @ 2020-12-23 11:00 UTC (permalink / raw)
  To: Tomasz Figa, Yong Wu
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream,
	Tomasz Figa, kernel-team, linux-kernel, Krzysztof Kozlowski,
	chao.hao, iommu, linux-mediatek, Matthias Brugger,
	Greg Kroah-Hartman, Will Deacon, linux-arm-kernel

On 2020-12-23 08:56, Tomasz Figa wrote:
> On Wed, Dec 16, 2020 at 06:36:06PM +0800, Yong Wu wrote:
>> In current iommu_unmap, this code is:
>>
>> 	iommu_iotlb_gather_init(&iotlb_gather);
>> 	ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
>> 	iommu_iotlb_sync(domain, &iotlb_gather);
>>
>> We could gather the whole iova range in __iommu_unmap, and then do tlb
>> synchronization in the iommu_iotlb_sync.
>>
>> This patch implement this, Gather the range in mtk_iommu_unmap.
>> then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
>> we don't call iommu_iotlb_gather_add_page since our tlb synchronization
>> could be regardless of granule size.
>>
>> In this way, gather->start is impossible ULONG_MAX, remove the checking.
>>
>> This patch aims to do tlb synchronization *once* in the iommu_unmap.
>>
>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>> ---
>>   drivers/iommu/mtk_iommu.c | 8 +++++---
>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>> index db7d43adb06b..89cec51405cd 100644
>> --- a/drivers/iommu/mtk_iommu.c
>> +++ b/drivers/iommu/mtk_iommu.c
>> @@ -506,7 +506,12 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
>>   			      struct iommu_iotlb_gather *gather)
>>   {
>>   	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
>> +	unsigned long long end = iova + size;
>>   
>> +	if (gather->start > iova)
>> +		gather->start = iova;
>> +	if (gather->end < end)
>> +		gather->end = end;
> 
> I don't know how common the case is, but what happens if
> gather->start...gather->end is a disjoint range from iova...end? E.g.
> 
>   | gather      | ..XXX... | iova |
>   |             |          |      |
>   gather->start |          iova   |
>                 gather->end       end
> 
> We would also end up invalidating the TLB for the XXX area, which could
> affect the performance.

Take a closer look at iommu_unmap() - the gather data is scoped to each 
individual call, so that can't possibly happen.

> Also, why is the existing code in __arm_v7s_unmap() not enough? It seems
> to call io_pgtable_tlb_add_page() already, so it should be batching the
> flushes.

Because if we leave io-pgtable in charge of maintenance it will also 
inject additional invalidations and syncs for the sake of strictly 
correct walk cache maintenance. Apparently we can get away without that 
on this hardware, so the fundamental purpose of this series is to 
sidestep it.

It's proven to be cleaner overall to devolve this kind of "non-standard" 
TLB maintenance back to drivers rather than try to cram yet more 
special-case complexity into io-pgtable itself. I'm planning to clean up 
the remains of the TLBI_ON_MAP quirk entirely after this.

Robin.

>>   	return dom->iop->unmap(dom->iop, iova, size, gather);
>>   }
>>   
>> @@ -523,9 +528,6 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
>>   	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
>>   	size_t length = gather->end - gather->start;
>>   
>> -	if (gather->start == ULONG_MAX)
>> -		return;
>> -
>>   	mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
>>   				       dom->data);
>>   }
>> -- 
>> 2.18.0
>>
>> _______________________________________________
>> iommu mailing list
>> iommu@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] iommu: Move iotlb_sync_map out from __iommu_map
  2020-12-23  8:51   ` Christoph Hellwig
@ 2020-12-24 11:27     ` Yong Wu
  0 siblings, 0 replies; 23+ messages in thread
From: Yong Wu @ 2020-12-24 11:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream,
	Tomasz Figa, Will Deacon, linux-kernel, Krzysztof Kozlowski,
	chao.hao, iommu, linux-mediatek, Matthias Brugger, kernel-team,
	Greg Kroah-Hartman, Robin Murphy, linux-arm-kernel

On Wed, 2020-12-23 at 08:51 +0000, Christoph Hellwig wrote:
> On Wed, Dec 16, 2020 at 06:36:01PM +0800, Yong Wu wrote:
> > In the end of __iommu_map, It alway call iotlb_sync_map.
> > This patch moves iotlb_sync_map out from __iommu_map since it is
> > unnecessary to call this for each sg segment especially iotlb_sync_map
> > is flush tlb all currently.
> > 
> > Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> > Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> 
> What about adding a little helper that does the NULL check and method
> call instead of duplicating it all over?

Thanks for the review. Of course OK.

Then the code like below. 
(If the helper name "_iommu_map" is not good, please tell me.)

+static int _iommu_map(struct iommu_domain *domain, unsigned long iova,
+		       phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+{
+	const struct iommu_ops *ops = domain->ops;
+	int ret;
+
+	ret = __iommu_map(domain, iova, paddr, size, prot, gfp);
+	if (ret == 0 && ops->iotlb_sync_map)
+		ops->iotlb_sync_map(domain);
+	return ret;
+}
+
 int iommu_map(struct iommu_domain *domain, unsigned long iova,
 	      phys_addr_t paddr, size_t size, int prot)
 {
 	might_sleep();
-	return __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
+	return _iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
 }
 EXPORT_SYMBOL_GPL(iommu_map);
 
 int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
 	      phys_addr_t paddr, size_t size, int prot)
 {
-	return __iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
+	return _iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
 }
 EXPORT_SYMBOL_GPL(iommu_map_atomic);

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  2020-12-23 11:00     ` Robin Murphy
@ 2021-01-08  9:56       ` Tomasz Figa
  2021-01-08  9:56       ` Tomasz Figa
  1 sibling, 0 replies; 23+ messages in thread
From: Tomasz Figa @ 2021-01-08  9:56 UTC (permalink / raw)
  To: Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream,
	kernel-team, Linux Kernel Mailing List, Krzysztof Kozlowski,
	chao.hao,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	moderated list:ARM/Mediatek SoC support,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Matthias Brugger, Greg Kroah-Hartman, Will Deacon

On Wed, Dec 23, 2020 at 8:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2020-12-23 08:56, Tomasz Figa wrote:
> > On Wed, Dec 16, 2020 at 06:36:06PM +0800, Yong Wu wrote:
> >> In current iommu_unmap, this code is:
> >>
> >>      iommu_iotlb_gather_init(&iotlb_gather);
> >>      ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
> >>      iommu_iotlb_sync(domain, &iotlb_gather);
> >>
> >> We could gather the whole iova range in __iommu_unmap, and then do tlb
> >> synchronization in the iommu_iotlb_sync.
> >>
> >> This patch implement this, Gather the range in mtk_iommu_unmap.
> >> then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
> >> we don't call iommu_iotlb_gather_add_page since our tlb synchronization
> >> could be regardless of granule size.
> >>
> >> In this way, gather->start is impossible ULONG_MAX, remove the checking.
> >>
> >> This patch aims to do tlb synchronization *once* in the iommu_unmap.
> >>
> >> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> >> ---
> >>   drivers/iommu/mtk_iommu.c | 8 +++++---
> >>   1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> >> index db7d43adb06b..89cec51405cd 100644
> >> --- a/drivers/iommu/mtk_iommu.c
> >> +++ b/drivers/iommu/mtk_iommu.c
> >> @@ -506,7 +506,12 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
> >>                            struct iommu_iotlb_gather *gather)
> >>   {
> >>      struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> >> +    unsigned long long end = iova + size;
> >>
> >> +    if (gather->start > iova)
> >> +            gather->start = iova;
> >> +    if (gather->end < end)
> >> +            gather->end = end;
> >
> > I don't know how common the case is, but what happens if
> > gather->start...gather->end is a disjoint range from iova...end? E.g.
> >
> >   | gather      | ..XXX... | iova |
> >   |             |          |      |
> >   gather->start |          iova   |
> >                 gather->end       end
> >
> > We would also end up invalidating the TLB for the XXX area, which could
> > affect the performance.
>
> Take a closer look at iommu_unmap() - the gather data is scoped to each
> individual call, so that can't possibly happen.
>
> > Also, why is the existing code in __arm_v7s_unmap() not enough? It seems
> > to call io_pgtable_tlb_add_page() already, so it should be batching the
> > flushes.
>
> Because if we leave io-pgtable in charge of maintenance it will also
> inject additional invalidations and syncs for the sake of strictly
> correct walk cache maintenance. Apparently we can get away without that
> on this hardware, so the fundamental purpose of this series is to
> sidestep it.
>
> It's proven to be cleaner overall to devolve this kind of "non-standard"
> TLB maintenance back to drivers rather than try to cram yet more
> special-case complexity into io-pgtable itself. I'm planning to clean up
> the remains of the TLBI_ON_MAP quirk entirely after this.
>
> Robin.
>
> >>      return dom->iop->unmap(dom->iop, iova, size, gather);
> >>   }
> >>
> >> @@ -523,9 +528,6 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
> >>      struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> >>      size_t length = gather->end - gather->start;
> >>
> >> -    if (gather->start == ULONG_MAX)
> >> -            return;
> >> -
> >>      mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
> >>                                     dom->data);
> >>   }
> >> --
> >> 2.18.0
> >>
> >> _______________________________________________
> >> iommu mailing list
> >> iommu@lists.linux-foundation.org
> >> https://lists.linuxfoundation.org/mailman/listinfo/iommu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  2020-12-23 11:00     ` Robin Murphy
  2021-01-08  9:56       ` Tomasz Figa
@ 2021-01-08  9:56       ` Tomasz Figa
  1 sibling, 0 replies; 23+ messages in thread
From: Tomasz Figa @ 2021-01-08  9:56 UTC (permalink / raw)
  To: Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream,
	kernel-team, Linux Kernel Mailing List, Krzysztof Kozlowski,
	chao.hao,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	moderated list:ARM/Mediatek SoC support,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Matthias Brugger, Greg Kroah-Hartman, Will Deacon

On Wed, Dec 23, 2020 at 8:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2020-12-23 08:56, Tomasz Figa wrote:
> > On Wed, Dec 16, 2020 at 06:36:06PM +0800, Yong Wu wrote:
> >> In current iommu_unmap, this code is:
> >>
> >>      iommu_iotlb_gather_init(&iotlb_gather);
> >>      ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
> >>      iommu_iotlb_sync(domain, &iotlb_gather);
> >>
> >> We could gather the whole iova range in __iommu_unmap, and then do tlb
> >> synchronization in the iommu_iotlb_sync.
> >>
> >> This patch implement this, Gather the range in mtk_iommu_unmap.
> >> then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
> >> we don't call iommu_iotlb_gather_add_page since our tlb synchronization
> >> could be regardless of granule size.
> >>
> >> In this way, gather->start is impossible ULONG_MAX, remove the checking.
> >>
> >> This patch aims to do tlb synchronization *once* in the iommu_unmap.
> >>
> >> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> >> ---
> >>   drivers/iommu/mtk_iommu.c | 8 +++++---
> >>   1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> >> index db7d43adb06b..89cec51405cd 100644
> >> --- a/drivers/iommu/mtk_iommu.c
> >> +++ b/drivers/iommu/mtk_iommu.c
> >> @@ -506,7 +506,12 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
> >>                            struct iommu_iotlb_gather *gather)
> >>   {
> >>      struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> >> +    unsigned long long end = iova + size;
> >>
> >> +    if (gather->start > iova)
> >> +            gather->start = iova;
> >> +    if (gather->end < end)
> >> +            gather->end = end;
> >
> > I don't know how common the case is, but what happens if
> > gather->start...gather->end is a disjoint range from iova...end? E.g.
> >
> >   | gather      | ..XXX... | iova |
> >   |             |          |      |
> >   gather->start |          iova   |
> >                 gather->end       end
> >
> > We would also end up invalidating the TLB for the XXX area, which could
> > affect the performance.
>
> Take a closer look at iommu_unmap() - the gather data is scoped to each
> individual call, so that can't possibly happen.
>
> > Also, why is the existing code in __arm_v7s_unmap() not enough? It seems
> > to call io_pgtable_tlb_add_page() already, so it should be batching the
> > flushes.
>
> Because if we leave io-pgtable in charge of maintenance it will also
> inject additional invalidations and syncs for the sake of strictly
> correct walk cache maintenance. Apparently we can get away without that
> on this hardware, so the fundamental purpose of this series is to
> sidestep it.
>
> It's proven to be cleaner overall to devolve this kind of "non-standard"
> TLB maintenance back to drivers rather than try to cram yet more
> special-case complexity into io-pgtable itself. I'm planning to clean up
> the remains of the TLBI_ON_MAP quirk entirely after this.

(Sorry, I sent an empty email accidentally.)

I see, thanks for clarifying. The patch looks good to me then.

Best regards,
Tomasz

>
> Robin.
>
> >>      return dom->iop->unmap(dom->iop, iova, size, gather);
> >>   }
> >>
> >> @@ -523,9 +528,6 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
> >>      struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> >>      size_t length = gather->end - gather->start;
> >>
> >> -    if (gather->start == ULONG_MAX)
> >> -            return;
> >> -
> >>      mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
> >>                                     dom->data);
> >>   }
> >> --
> >> 2.18.0
> >>
> >> _______________________________________________
> >> iommu mailing list
> >> iommu@lists.linux-foundation.org
> >> https://lists.linuxfoundation.org/mailman/listinfo/iommu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 5/7] iommu: Allow io_pgtable_tlb ops optional
  2020-12-16 10:36 ` [PATCH v3 5/7] iommu: Allow io_pgtable_tlb ops optional Yong Wu
@ 2021-01-18 16:18   ` Robin Murphy
  0 siblings, 0 replies; 23+ messages in thread
From: Robin Murphy @ 2021-01-18 16:18 UTC (permalink / raw)
  To: Yong Wu, Joerg Roedel, Will Deacon
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

On 2020-12-16 10:36, Yong Wu wrote:
> This patch allows io_pgtable_tlb ops could be null since the IOMMU drivers
> may use the tlb ops from iommu framework.

There's not much in it, but I guess this does make more sense overall 
than just making .tlb_flush_all optional and drivers having to provide a 
full set of NULL callbacks.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>   include/linux/io-pgtable.h | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index adde9e49be08..c81796814afa 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -206,14 +206,16 @@ struct io_pgtable {
>   
>   static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
>   {
> -	iop->cfg.tlb->tlb_flush_all(iop->cookie);
> +	if (iop->cfg.tlb && iop->cfg.tlb->tlb_flush_all)
> +		iop->cfg.tlb->tlb_flush_all(iop->cookie);
>   }
>   
>   static inline void
>   io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
>   			  size_t size, size_t granule)
>   {
> -	iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
> +	if (iop->cfg.tlb && iop->cfg.tlb->tlb_flush_walk)
> +		iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
>   }
>   
>   static inline void
> @@ -221,7 +223,7 @@ io_pgtable_tlb_add_page(struct io_pgtable *iop,
>   			struct iommu_iotlb_gather * gather, unsigned long iova,
>   			size_t granule)
>   {
> -	if (iop->cfg.tlb->tlb_add_page)
> +	if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_page)
>   		iop->cfg.tlb->tlb_add_page(gather, iova, granule, iop->cookie);
>   }
>   
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  2020-12-16 10:36 ` [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once Yong Wu
  2020-12-23  8:56   ` Tomasz Figa
@ 2021-01-18 16:35   ` Robin Murphy
  2021-01-18 16:58     ` Will Deacon
  1 sibling, 1 reply; 23+ messages in thread
From: Robin Murphy @ 2021-01-18 16:35 UTC (permalink / raw)
  To: Yong Wu, Joerg Roedel, Will Deacon
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

On 2020-12-16 10:36, Yong Wu wrote:
> In current iommu_unmap, this code is:
> 
> 	iommu_iotlb_gather_init(&iotlb_gather);
> 	ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
> 	iommu_iotlb_sync(domain, &iotlb_gather);
> 
> We could gather the whole iova range in __iommu_unmap, and then do tlb
> synchronization in the iommu_iotlb_sync.
> 
> This patch implement this, Gather the range in mtk_iommu_unmap.
> then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
> we don't call iommu_iotlb_gather_add_page since our tlb synchronization
> could be regardless of granule size.
> 
> In this way, gather->start is impossible ULONG_MAX, remove the checking.
> 
> This patch aims to do tlb synchronization *once* in the iommu_unmap.

Assuming the update to patch #4 simply results in "unsigned long end = 
iova + size - 1;" here,

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>   drivers/iommu/mtk_iommu.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index db7d43adb06b..89cec51405cd 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -506,7 +506,12 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
>   			      struct iommu_iotlb_gather *gather)
>   {
>   	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> +	unsigned long long end = iova + size;
>   
> +	if (gather->start > iova)
> +		gather->start = iova;
> +	if (gather->end < end)
> +		gather->end = end;
>   	return dom->iop->unmap(dom->iop, iova, size, gather);
>   }
>   
> @@ -523,9 +528,6 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
>   	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
>   	size_t length = gather->end - gather->start;
>   
> -	if (gather->start == ULONG_MAX)
> -		return;
> -
>   	mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
>   				       dom->data);
>   }
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 7/7] iommu/mediatek: Remove the tlb-ops for v7s
  2020-12-16 10:36 ` [PATCH v3 7/7] iommu/mediatek: Remove the tlb-ops for v7s Yong Wu
@ 2021-01-18 16:39   ` Robin Murphy
  0 siblings, 0 replies; 23+ messages in thread
From: Robin Murphy @ 2021-01-18 16:39 UTC (permalink / raw)
  To: Yong Wu, Joerg Roedel, Will Deacon
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

On 2020-12-16 10:36, Yong Wu wrote:
> Until now, we have already used the tlb operations from iommu framework,
> then the tlb operations for v7s can be removed.
> 
> Correspondingly, Switch the paramenter "cookie" to internal structure.

FWIW,

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>   drivers/iommu/mtk_iommu.c | 27 ++++-----------------------
>   1 file changed, 4 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 89cec51405cd..5656819cd4a1 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -206,10 +206,8 @@ static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom)
>   	return container_of(dom, struct mtk_iommu_domain, domain);
>   }
>   
> -static void mtk_iommu_tlb_flush_all(void *cookie)
> +static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)
>   {
> -	struct mtk_iommu_data *data = cookie;
> -
>   	for_each_m4u(data) {
>   		if (!pm_runtime_active(data->dev))
>   			continue;
> @@ -221,9 +219,9 @@ static void mtk_iommu_tlb_flush_all(void *cookie)
>   }
>   
>   static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
> -					   size_t granule, void *cookie)
> +					   size_t granule,
> +					   struct mtk_iommu_data *data)
>   {
> -	struct mtk_iommu_data *data = cookie;
>   	unsigned long flags;
>   	int ret;
>   	u32 tmp;
> @@ -250,7 +248,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
>   		if (ret) {
>   			dev_warn(data->dev,
>   				 "Partial TLB flush timed out, falling back to full flush\n");
> -			mtk_iommu_tlb_flush_all(cookie);
> +			mtk_iommu_tlb_flush_all(data);
>   		}
>   		/* Clear the CPE status */
>   		writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
> @@ -258,22 +256,6 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
>   	}
>   }
>   
> -static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
> -					    unsigned long iova, size_t granule,
> -					    void *cookie)
> -{
> -	struct mtk_iommu_data *data = cookie;
> -	struct iommu_domain *domain = &data->m4u_dom->domain;
> -
> -	iommu_iotlb_gather_add_page(domain, gather, iova, granule);
> -}
> -
> -static const struct iommu_flush_ops mtk_iommu_flush_ops = {
> -	.tlb_flush_all = mtk_iommu_tlb_flush_all,
> -	.tlb_flush_walk = mtk_iommu_tlb_flush_range_sync,
> -	.tlb_add_page = mtk_iommu_tlb_flush_page_nosync,
> -};
> -
>   static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
>   {
>   	struct mtk_iommu_data *data = dev_id;
> @@ -380,7 +362,6 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
>   		.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
>   		.ias = MTK_IOMMU_HAS_FLAG(data->plat_data, IOVA_34_EN) ? 34 : 32,
>   		.oas = 35,
> -		.tlb = &mtk_iommu_flush_ops,
>   		.iommu_dev = data->dev,
>   	};
>   
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  2021-01-18 16:35   ` Robin Murphy
@ 2021-01-18 16:58     ` Will Deacon
  2021-01-18 17:14       ` Robin Murphy
  0 siblings, 1 reply; 23+ messages in thread
From: Will Deacon @ 2021-01-18 16:58 UTC (permalink / raw)
  To: Robin Murphy
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream, chao.hao,
	linux-kernel, Krzysztof Kozlowski, Tomasz Figa, iommu,
	linux-mediatek, linux-arm-kernel, Matthias Brugger,
	Greg Kroah-Hartman, kernel-team

On Mon, Jan 18, 2021 at 04:35:22PM +0000, Robin Murphy wrote:
> On 2020-12-16 10:36, Yong Wu wrote:
> > In current iommu_unmap, this code is:
> > 
> > 	iommu_iotlb_gather_init(&iotlb_gather);
> > 	ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
> > 	iommu_iotlb_sync(domain, &iotlb_gather);
> > 
> > We could gather the whole iova range in __iommu_unmap, and then do tlb
> > synchronization in the iommu_iotlb_sync.
> > 
> > This patch implement this, Gather the range in mtk_iommu_unmap.
> > then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
> > we don't call iommu_iotlb_gather_add_page since our tlb synchronization
> > could be regardless of granule size.
> > 
> > In this way, gather->start is impossible ULONG_MAX, remove the checking.
> > 
> > This patch aims to do tlb synchronization *once* in the iommu_unmap.
> 
> Assuming the update to patch #4 simply results in "unsigned long end = iova
> + size - 1;" here,
> 
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>

There's a v4 here:

https://lore.kernel.org/r/20210107122909.16317-1-yong.wu@mediatek.com

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  2021-01-18 16:58     ` Will Deacon
@ 2021-01-18 17:14       ` Robin Murphy
  0 siblings, 0 replies; 23+ messages in thread
From: Robin Murphy @ 2021-01-18 17:14 UTC (permalink / raw)
  To: Will Deacon
  Cc: youlin.pei, anan.sun, Nicolas Boichat, srv_heupstream,
	Tomasz Figa, linux-kernel, Krzysztof Kozlowski, chao.hao, iommu,
	linux-mediatek, Matthias Brugger, Greg Kroah-Hartman,
	kernel-team, linux-arm-kernel

On 2021-01-18 16:58, Will Deacon wrote:
> On Mon, Jan 18, 2021 at 04:35:22PM +0000, Robin Murphy wrote:
>> On 2020-12-16 10:36, Yong Wu wrote:
>>> In current iommu_unmap, this code is:
>>>
>>> 	iommu_iotlb_gather_init(&iotlb_gather);
>>> 	ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
>>> 	iommu_iotlb_sync(domain, &iotlb_gather);
>>>
>>> We could gather the whole iova range in __iommu_unmap, and then do tlb
>>> synchronization in the iommu_iotlb_sync.
>>>
>>> This patch implement this, Gather the range in mtk_iommu_unmap.
>>> then iommu_iotlb_sync call tlb synchronization for the gathered iova range.
>>> we don't call iommu_iotlb_gather_add_page since our tlb synchronization
>>> could be regardless of granule size.
>>>
>>> In this way, gather->start is impossible ULONG_MAX, remove the checking.
>>>
>>> This patch aims to do tlb synchronization *once* in the iommu_unmap.
>>
>> Assuming the update to patch #4 simply results in "unsigned long end = iova
>> + size - 1;" here,
>>
>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> 
> There's a v4 here:
> 
> https://lore.kernel.org/r/20210107122909.16317-1-yong.wu@mediatek.com

Ha, so there is! Apparently I missed that in my post-holiday sweep last 
week and leant too heavily on the inbox-in-date-order assumption. Lemme 
just go catch up...

Thanks,
Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-01-18 17:14 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-16 10:36 [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap Yong Wu
2020-12-16 10:36 ` [PATCH v3 1/7] iommu: Move iotlb_sync_map out from __iommu_map Yong Wu
2020-12-23  8:51   ` Christoph Hellwig
2020-12-24 11:27     ` Yong Wu
2020-12-16 10:36 ` [PATCH v3 2/7] iommu: Add iova and size as parameters in iotlb_sync_map Yong Wu
2020-12-16 10:36 ` [PATCH v3 3/7] iommu/mediatek: Add iotlb_sync_map to sync whole the iova range Yong Wu
2020-12-16 10:36 ` [PATCH v3 4/7] iommu: Switch gather->end to unsigned long long Yong Wu
2020-12-16 11:03   ` David Laight
2020-12-17  2:26     ` Yong Wu
2020-12-16 12:10   ` Robin Murphy
2020-12-17  2:26     ` Yong Wu
2020-12-16 10:36 ` [PATCH v3 5/7] iommu: Allow io_pgtable_tlb ops optional Yong Wu
2021-01-18 16:18   ` Robin Murphy
2020-12-16 10:36 ` [PATCH v3 6/7] iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once Yong Wu
2020-12-23  8:56   ` Tomasz Figa
2020-12-23 11:00     ` Robin Murphy
2021-01-08  9:56       ` Tomasz Figa
2021-01-08  9:56       ` Tomasz Figa
2021-01-18 16:35   ` Robin Murphy
2021-01-18 16:58     ` Will Deacon
2021-01-18 17:14       ` Robin Murphy
2020-12-16 10:36 ` [PATCH v3 7/7] iommu/mediatek: Remove the tlb-ops for v7s Yong Wu
2021-01-18 16:39   ` Robin Murphy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).