linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/7] Improve tlb range flush
@ 2019-10-16  3:33 Yong Wu
  2019-10-16  3:33 ` [PATCH v4 1/7] iommu/mediatek: Correct the flush_iotlb_all callback Yong Wu
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Yong Wu @ 2019-10-16  3:33 UTC (permalink / raw)
  To: Matthias Brugger, Joerg Roedel, Will Deacon
  Cc: Evan Green, Robin Murphy, Tomasz Figa, linux-mediatek,
	srv_heupstream, linux-kernel, linux-arm-kernel, iommu, yong.wu,
	youlin.pei, Nicolas Boichat, anan.sun, cui.zhang, chao.hao,
	edison.hsieh

This patchset mainly fixes a tlb flush timeout issue and uses the new
iommu_gather mechanism to re-implement the tlb flush flow. It also
includes several cleanup patches for the tlb_flush code.
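
For reference, a condensed sketch of the flow this series converges on,
paraphrased from the diffs below with the HW programming reduced to
comments (an illustration, not the literal driver code):

/* One helper does flush + sync for the merged range, under tlb_lock. */
static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
					   size_t granule, void *cookie)
{
	struct mtk_iommu_data *data = cookie;
	unsigned long flags;

	for_each_m4u(data) {
		spin_lock_irqsave(&data->tlb_lock, flags);
		/*
		 * Select the TLB, write the iova range and trigger the
		 * range invalidation, then poll REG_MMU_CPE_DONE and
		 * clear it (warn and fall back to a full flush if the
		 * poll times out).
		 */
		spin_unlock_irqrestore(&data->tlb_lock, flags);
	}
}

/* unmap only gathers; iotlb_sync flushes the merged range once. */
static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
				 struct iommu_iotlb_gather *gather)
{
	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
	size_t length = gather->end - gather->start;

	if (gather->start == ULONG_MAX)	/* nothing was gathered */
		return;

	mtk_iommu_tlb_flush_range_sync(gather->start, length,
				       gather->pgsize, data);
}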

Change notes:

v4: 1. Add a new tlb_lock for tlb operations.
    2. Delete the pgtlock.
    3. Remove the "writel" patch.

v3: https://lore.kernel.org/linux-iommu/1571035101-4213-1-git-send-email-yong.wu@mediatek.com/T/#t
   1. Use the gather to implement the tlb_flush, as suggested by Tomasz.
   2. Add some cleanup patches.

v2:
https://lore.kernel.org/linux-iommu/1570627143-29441-1-git-send-email-yong.wu@mediatek.com/T/#t
   1. Rebase on v5.4-rc1.
   2. Only split into several patches.

v1:
https://lore.kernel.org/linux-iommu/CAAFQd5C3U7pZo4SSUJ52Q7E+0FaUoORQFbQC5RhCHBhi=NFYTw@mail.gmail.com/T/#t

Yong Wu (7):
  iommu/mediatek: Correct the flush_iotlb_all callback
  iommu/mediatek: Add a new tlb_lock for tlb_flush
  iommu/mediatek: Use gather to achieve the tlb range flush
  iommu/mediatek: Delete the leaf in the tlb_flush
  iommu/mediatek: Move the tlb_sync into tlb_flush
  iommu/mediatek: Get rid of the pgtlock
  iommu/mediatek: Reduce the tlb flush timeout value

 drivers/iommu/mtk_iommu.c | 88 +++++++++++++++--------------------------------
 drivers/iommu/mtk_iommu.h |  2 +-
 2 files changed, 29 insertions(+), 61 deletions(-)

-- 
1.9.1



* [PATCH v4 1/7] iommu/mediatek: Correct the flush_iotlb_all callback
  2019-10-16  3:33 [PATCH v4 0/7] Improve tlb range flush Yong Wu
@ 2019-10-16  3:33 ` Yong Wu
  2019-10-16  3:33 ` [PATCH v4 2/7] iommu/mediatek: Add a new tlb_lock for tlb_flush Yong Wu
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Yong Wu @ 2019-10-16  3:33 UTC (permalink / raw)
  To: Matthias Brugger, Joerg Roedel, Will Deacon
  Cc: Evan Green, Robin Murphy, Tomasz Figa, linux-mediatek,
	srv_heupstream, linux-kernel, linux-arm-kernel, iommu, yong.wu,
	youlin.pei, Nicolas Boichat, anan.sun, cui.zhang, chao.hao,
	edison.hsieh

Use the correct mtk_iommu_tlb_flush_all in the flush_iotlb_all callback,
instead of the tlb_sync that was used before.

Fixes: 4d689b619445 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API TLB sync")
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/mtk_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 67a483c..76b9388 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -447,7 +447,7 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
 
 static void mtk_iommu_flush_iotlb_all(struct iommu_domain *domain)
 {
-	mtk_iommu_tlb_sync(mtk_iommu_get_m4u_data());
+	mtk_iommu_tlb_flush_all(mtk_iommu_get_m4u_data());
 }
 
 static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
-- 
1.9.1



* [PATCH v4 2/7] iommu/mediatek: Add a new tlb_lock for tlb_flush
  2019-10-16  3:33 [PATCH v4 0/7] Improve tlb range flush Yong Wu
  2019-10-16  3:33 ` [PATCH v4 1/7] iommu/mediatek: Correct the flush_iotlb_all callback Yong Wu
@ 2019-10-16  3:33 ` Yong Wu
  2019-10-23 16:52   ` Will Deacon
  2019-10-16  3:33 ` [PATCH v4 3/7] iommu/mediatek: Use gather to achieve the tlb range flush Yong Wu
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Yong Wu @ 2019-10-16  3:33 UTC (permalink / raw)
  To: Matthias Brugger, Joerg Roedel, Will Deacon
  Cc: Evan Green, Robin Murphy, Tomasz Figa, linux-mediatek,
	srv_heupstream, linux-kernel, linux-arm-kernel, iommu, yong.wu,
	youlin.pei, Nicolas Boichat, anan.sun, cui.zhang, chao.hao,
	edison.hsieh

Commit 4d689b619445 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API
TLB sync") moved the tlb_sync of unmap from the v7s code into the iommu
framework and added a new function, "mtk_iommu_iotlb_sync". But that
function lacked a lock, so the variable "tlb_flush_active" could be
changed unexpectedly and we could see this warning log randomly:

mtk-iommu 10205000.iommu: Partial TLB flush timed out, falling back to
full flush

The HW strictly requires tlb_flush/tlb_sync in pairs, so this patch adds
a new tlb_lock for the tlb operations to fix this issue.

Fixes: 4d689b619445 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API TLB sync")
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
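Note (illustration only, not part of the commit message): one possible
interleaving of the unlocked flow that could trip the timeout, assuming
two unmap paths race on the shared mtk_iommu_data; the exact timing is
HW dependent, which is why the warning only shows up randomly:

/*
 *   CPU0 (unmap A)                    CPU1 (unmap B)
 *   tlb_add_flush_nosync()
 *     tlb_flush_active = true
 *                                     tlb_add_flush_nosync()
 *                                       tlb_flush_active = true
 *   tlb_sync()
 *     sees tlb_flush_active           tlb_sync()
 *     polls CPE_DONE, clears it         sees tlb_flush_active
 *     tlb_flush_active = false          keeps polling CPE_DONE, which
 *                                       CPU0 has already cleared and
 *                                       no flush is pending any more
 *                                       -> times out, warns and falls
 *                                          back to a full flush
 */
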
 drivers/iommu/mtk_iommu.c | 23 ++++++++++++++++++++++-
 drivers/iommu/mtk_iommu.h |  1 +
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 76b9388..c2f6c78 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -219,22 +219,37 @@ static void mtk_iommu_tlb_sync(void *cookie)
 static void mtk_iommu_tlb_flush_walk(unsigned long iova, size_t size,
 				     size_t granule, void *cookie)
 {
+	struct mtk_iommu_data *data = cookie;
+	unsigned long flags;
+
+	spin_lock_irqsave(&data->tlb_lock, flags);
 	mtk_iommu_tlb_add_flush_nosync(iova, size, granule, false, cookie);
 	mtk_iommu_tlb_sync(cookie);
+	spin_unlock_irqrestore(&data->tlb_lock, flags);
 }
 
 static void mtk_iommu_tlb_flush_leaf(unsigned long iova, size_t size,
 				     size_t granule, void *cookie)
 {
+	struct mtk_iommu_data *data = cookie;
+	unsigned long flags;
+
+	spin_lock_irqsave(&data->tlb_lock, flags);
 	mtk_iommu_tlb_add_flush_nosync(iova, size, granule, true, cookie);
 	mtk_iommu_tlb_sync(cookie);
+	spin_unlock_irqrestore(&data->tlb_lock, flags);
 }
 
 static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
 					    unsigned long iova, size_t granule,
 					    void *cookie)
 {
+	struct mtk_iommu_data *data = cookie;
+	unsigned long flags;
+
+	spin_lock_irqsave(&data->tlb_lock, flags);
 	mtk_iommu_tlb_add_flush_nosync(iova, granule, granule, true, cookie);
+	spin_unlock_irqrestore(&data->tlb_lock, flags);
 }
 
 static const struct iommu_flush_ops mtk_iommu_flush_ops = {
@@ -453,7 +468,12 @@ static void mtk_iommu_flush_iotlb_all(struct iommu_domain *domain)
 static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
 				 struct iommu_iotlb_gather *gather)
 {
-	mtk_iommu_tlb_sync(mtk_iommu_get_m4u_data());
+	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
+	unsigned long flags;
+
+	spin_lock_irqsave(&data->tlb_lock, flags);
+	mtk_iommu_tlb_sync(data);
+	spin_unlock_irqrestore(&data->tlb_lock, flags);
 }
 
 static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
@@ -733,6 +753,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
+	spin_lock_init(&data->tlb_lock);
 	list_add_tail(&data->list, &m4ulist);
 
 	if (!iommu_present(&platform_bus_type))
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index fc0f16e..8cae22d 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -58,6 +58,7 @@ struct mtk_iommu_data {
 	struct iommu_group		*m4u_group;
 	bool                            enable_4GB;
 	bool				tlb_flush_active;
+	spinlock_t			tlb_lock; /* lock for tlb range flush */
 
 	struct iommu_device		iommu;
 	const struct mtk_iommu_plat_data *plat_data;
-- 
1.9.1



* [PATCH v4 3/7] iommu/mediatek: Use gather to achieve the tlb range flush
  2019-10-16  3:33 [PATCH v4 0/7] Improve tlb range flush Yong Wu
  2019-10-16  3:33 ` [PATCH v4 1/7] iommu/mediatek: Correct the flush_iotlb_all callback Yong Wu
  2019-10-16  3:33 ` [PATCH v4 2/7] iommu/mediatek: Add a new tlb_lock for tlb_flush Yong Wu
@ 2019-10-16  3:33 ` Yong Wu
  2019-10-23 16:55   ` Will Deacon
  2019-10-16  3:33 ` [PATCH v4 4/7] iommu/mediatek: Delete the leaf in the tlb_flush Yong Wu
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Yong Wu @ 2019-10-16  3:33 UTC (permalink / raw)
  To: Matthias Brugger, Joerg Roedel, Will Deacon
  Cc: Evan Green, Robin Murphy, Tomasz Figa, linux-mediatek,
	srv_heupstream, linux-kernel, linux-arm-kernel, iommu, yong.wu,
	youlin.pei, Nicolas Boichat, anan.sun, cui.zhang, chao.hao,
	edison.hsieh

Use the iommu_gather mechanism to achieve the tlb range flush.
Gather the iova range in the "tlb_add_page", then flush the merged iova
range in iotlb_sync.

Suggested-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
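Note (for context, not part of the commit message): a paraphrased
sketch of what the core gather helper does. This is not verbatim kernel
code; see iommu_iotlb_gather_add_page() in include/linux/iommu.h for
the real thing. The sync-on-disjoint detail is the one Will points at
later in this thread.

/* Paraphrase of iommu_iotlb_gather_add_page() behaviour. */
static inline void gather_add_page_sketch(struct iommu_domain *domain,
					  struct iommu_iotlb_gather *gather,
					  unsigned long iova, size_t size)
{
	unsigned long start = iova, end = start + size;

	/*
	 * A page that is disjoint from the gathered range, or mapped at
	 * a different granule, forces a sync of what has been gathered
	 * so far (iommu_tlb_sync() also re-inits the gather window), so
	 * the gather can restart cleanly for the new range.
	 */
	if (gather->pgsize != size ||
	    end < gather->start || start > gather->end) {
		if (gather->pgsize)
			iommu_tlb_sync(domain, gather);
		gather->pgsize = size;
	}

	/* Otherwise just widen the pending [start, end) window. */
	if (gather->start > start)
		gather->start = start;
	if (gather->end < end)
		gather->end = end;
}
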
 drivers/iommu/mtk_iommu.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index c2f6c78..81ac95f 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -245,11 +245,9 @@ static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
 					    void *cookie)
 {
 	struct mtk_iommu_data *data = cookie;
-	unsigned long flags;
+	struct iommu_domain *domain = &data->m4u_dom->domain;
 
-	spin_lock_irqsave(&data->tlb_lock, flags);
-	mtk_iommu_tlb_add_flush_nosync(iova, granule, granule, true, cookie);
-	spin_unlock_irqrestore(&data->tlb_lock, flags);
+	iommu_iotlb_gather_add_page(domain, gather, iova, granule);
 }
 
 static const struct iommu_flush_ops mtk_iommu_flush_ops = {
@@ -469,9 +467,15 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
 				 struct iommu_iotlb_gather *gather)
 {
 	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
+	size_t length = gather->end - gather->start;
 	unsigned long flags;
 
+	if (gather->start == ULONG_MAX)
+		return;
+
 	spin_lock_irqsave(&data->tlb_lock, flags);
+	mtk_iommu_tlb_add_flush_nosync(gather->start, length, gather->pgsize,
+				       false, data);
 	mtk_iommu_tlb_sync(data);
 	spin_unlock_irqrestore(&data->tlb_lock, flags);
 }
-- 
1.9.1



* [PATCH v4 4/7] iommu/mediatek: Delete the leaf in the tlb_flush
  2019-10-16  3:33 [PATCH v4 0/7] Improve tlb range flush Yong Wu
                   ` (2 preceding siblings ...)
  2019-10-16  3:33 ` [PATCH v4 3/7] iommu/mediatek: Use gather to achieve the tlb range flush Yong Wu
@ 2019-10-16  3:33 ` Yong Wu
  2019-10-16  3:33 ` [PATCH v4 5/7] iommu/mediatek: Move the tlb_sync into tlb_flush Yong Wu
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Yong Wu @ 2019-10-16  3:33 UTC (permalink / raw)
  To: Matthias Brugger, Joerg Roedel, Will Deacon
  Cc: Evan Green, Robin Murphy, Tomasz Figa, linux-mediatek,
	srv_heupstream, linux-kernel, linux-arm-kernel, iommu, yong.wu,
	youlin.pei, Nicolas Boichat, anan.sun, cui.zhang, chao.hao,
	edison.hsieh

In our tlb range flush, we don't care about the "leaf" argument. Remove
it to simplify the code. No functional change.

The "granule" argument is also unnecessary for us, but keep it to
satisfy the prototype of tlb_flush_walk.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/mtk_iommu.c | 21 ++++-----------------
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 81ac95f..1d7254c 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -174,8 +174,7 @@ static void mtk_iommu_tlb_flush_all(void *cookie)
 }
 
 static void mtk_iommu_tlb_add_flush_nosync(unsigned long iova, size_t size,
-					   size_t granule, bool leaf,
-					   void *cookie)
+					   size_t granule, void *cookie)
 {
 	struct mtk_iommu_data *data = cookie;
 
@@ -223,19 +222,7 @@ static void mtk_iommu_tlb_flush_walk(unsigned long iova, size_t size,
 	unsigned long flags;
 
 	spin_lock_irqsave(&data->tlb_lock, flags);
-	mtk_iommu_tlb_add_flush_nosync(iova, size, granule, false, cookie);
-	mtk_iommu_tlb_sync(cookie);
-	spin_unlock_irqrestore(&data->tlb_lock, flags);
-}
-
-static void mtk_iommu_tlb_flush_leaf(unsigned long iova, size_t size,
-				     size_t granule, void *cookie)
-{
-	struct mtk_iommu_data *data = cookie;
-	unsigned long flags;
-
-	spin_lock_irqsave(&data->tlb_lock, flags);
-	mtk_iommu_tlb_add_flush_nosync(iova, size, granule, true, cookie);
+	mtk_iommu_tlb_add_flush_nosync(iova, size, granule, cookie);
 	mtk_iommu_tlb_sync(cookie);
 	spin_unlock_irqrestore(&data->tlb_lock, flags);
 }
@@ -253,7 +240,7 @@ static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
 static const struct iommu_flush_ops mtk_iommu_flush_ops = {
 	.tlb_flush_all = mtk_iommu_tlb_flush_all,
 	.tlb_flush_walk = mtk_iommu_tlb_flush_walk,
-	.tlb_flush_leaf = mtk_iommu_tlb_flush_leaf,
+	.tlb_flush_leaf = mtk_iommu_tlb_flush_walk,
 	.tlb_add_page = mtk_iommu_tlb_flush_page_nosync,
 };
 
@@ -475,7 +462,7 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
 
 	spin_lock_irqsave(&data->tlb_lock, flags);
 	mtk_iommu_tlb_add_flush_nosync(gather->start, length, gather->pgsize,
-				       false, data);
+				       data);
 	mtk_iommu_tlb_sync(data);
 	spin_unlock_irqrestore(&data->tlb_lock, flags);
 }
-- 
1.9.1



* [PATCH v4 5/7] iommu/mediatek: Move the tlb_sync into tlb_flush
  2019-10-16  3:33 [PATCH v4 0/7] Improve tlb range flush Yong Wu
                   ` (3 preceding siblings ...)
  2019-10-16  3:33 ` [PATCH v4 4/7] iommu/mediatek: Delete the leaf in the tlb_flush Yong Wu
@ 2019-10-16  3:33 ` Yong Wu
  2019-10-16  3:33 ` [PATCH v4 6/7] iommu/mediatek: Get rid of the pgtlock Yong Wu
  2019-10-16  3:33 ` [PATCH v4 7/7] iommu/mediatek: Reduce the tlb flush timeout value Yong Wu
  6 siblings, 0 replies; 12+ messages in thread
From: Yong Wu @ 2019-10-16  3:33 UTC (permalink / raw)
  To: Matthias Brugger, Joerg Roedel, Will Deacon
  Cc: Evan Green, Robin Murphy, Tomasz Figa, linux-mediatek,
	srv_heupstream, linux-kernel, linux-arm-kernel, iommu, yong.wu,
	youlin.pei, Nicolas Boichat, anan.sun, cui.zhang, chao.hao,
	edison.hsieh

Right now, tlb_add_flush_nosync and tlb_sync always appear together.
Merge the two functions into one (and move the tlb_lock into the new
function). No functional change.

Signed-off-by: Chao Hao <chao.hao@mediatek.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
 drivers/iommu/mtk_iommu.c | 45 ++++++++++-----------------------------------
 drivers/iommu/mtk_iommu.h |  1 -
 2 files changed, 10 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 1d7254c..0e5f41f 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -173,12 +173,16 @@ static void mtk_iommu_tlb_flush_all(void *cookie)
 	}
 }
 
-static void mtk_iommu_tlb_add_flush_nosync(unsigned long iova, size_t size,
+static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 					   size_t granule, void *cookie)
 {
 	struct mtk_iommu_data *data = cookie;
+	unsigned long flags;
+	int ret;
+	u32 tmp;
 
 	for_each_m4u(data) {
+		spin_lock_irqsave(&data->tlb_lock, flags);
 		writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
 			       data->base + REG_MMU_INV_SEL);
 
@@ -187,21 +191,8 @@ static void mtk_iommu_tlb_add_flush_nosync(unsigned long iova, size_t size,
 			       data->base + REG_MMU_INVLD_END_A);
 		writel_relaxed(F_MMU_INV_RANGE,
 			       data->base + REG_MMU_INVALIDATE);
-		data->tlb_flush_active = true;
-	}
-}
-
-static void mtk_iommu_tlb_sync(void *cookie)
-{
-	struct mtk_iommu_data *data = cookie;
-	int ret;
-	u32 tmp;
-
-	for_each_m4u(data) {
-		/* Avoid timing out if there's nothing to wait for */
-		if (!data->tlb_flush_active)
-			return;
 
+		/* tlb sync */
 		ret = readl_poll_timeout_atomic(data->base + REG_MMU_CPE_DONE,
 						tmp, tmp != 0, 10, 100000);
 		if (ret) {
@@ -211,22 +202,10 @@ static void mtk_iommu_tlb_sync(void *cookie)
 		}
 		/* Clear the CPE status */
 		writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
-		data->tlb_flush_active = false;
+		spin_unlock_irqrestore(&data->tlb_lock, flags);
 	}
 }
 
-static void mtk_iommu_tlb_flush_walk(unsigned long iova, size_t size,
-				     size_t granule, void *cookie)
-{
-	struct mtk_iommu_data *data = cookie;
-	unsigned long flags;
-
-	spin_lock_irqsave(&data->tlb_lock, flags);
-	mtk_iommu_tlb_add_flush_nosync(iova, size, granule, cookie);
-	mtk_iommu_tlb_sync(cookie);
-	spin_unlock_irqrestore(&data->tlb_lock, flags);
-}
-
 static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
 					    unsigned long iova, size_t granule,
 					    void *cookie)
@@ -239,8 +218,8 @@ static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
 
 static const struct iommu_flush_ops mtk_iommu_flush_ops = {
 	.tlb_flush_all = mtk_iommu_tlb_flush_all,
-	.tlb_flush_walk = mtk_iommu_tlb_flush_walk,
-	.tlb_flush_leaf = mtk_iommu_tlb_flush_walk,
+	.tlb_flush_walk = mtk_iommu_tlb_flush_range_sync,
+	.tlb_flush_leaf = mtk_iommu_tlb_flush_range_sync,
 	.tlb_add_page = mtk_iommu_tlb_flush_page_nosync,
 };
 
@@ -455,16 +434,12 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
 {
 	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
 	size_t length = gather->end - gather->start;
-	unsigned long flags;
 
 	if (gather->start == ULONG_MAX)
 		return;
 
-	spin_lock_irqsave(&data->tlb_lock, flags);
-	mtk_iommu_tlb_add_flush_nosync(gather->start, length, gather->pgsize,
+	mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
 				       data);
-	mtk_iommu_tlb_sync(data);
-	spin_unlock_irqrestore(&data->tlb_lock, flags);
 }
 
 static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 8cae22d..ea949a3 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -57,7 +57,6 @@ struct mtk_iommu_data {
 	struct mtk_iommu_domain		*m4u_dom;
 	struct iommu_group		*m4u_group;
 	bool                            enable_4GB;
-	bool				tlb_flush_active;
 	spinlock_t			tlb_lock; /* lock for tlb range flush */
 
 	struct iommu_device		iommu;
-- 
1.9.1



* [PATCH v4 6/7] iommu/mediatek: Get rid of the pgtlock
  2019-10-16  3:33 [PATCH v4 0/7] Improve tlb range flush Yong Wu
                   ` (4 preceding siblings ...)
  2019-10-16  3:33 ` [PATCH v4 5/7] iommu/mediatek: Move the tlb_sync into tlb_flush Yong Wu
@ 2019-10-16  3:33 ` Yong Wu
  2019-10-16  3:33 ` [PATCH v4 7/7] iommu/mediatek: Reduce the tlb flush timeout value Yong Wu
  6 siblings, 0 replies; 12+ messages in thread
From: Yong Wu @ 2019-10-16  3:33 UTC (permalink / raw)
  To: Matthias Brugger, Joerg Roedel, Will Deacon
  Cc: Evan Green, Robin Murphy, Tomasz Figa, linux-mediatek,
	srv_heupstream, linux-kernel, linux-arm-kernel, iommu, yong.wu,
	youlin.pei, Nicolas Boichat, anan.sun, cui.zhang, chao.hao,
	edison.hsieh

Now that we have the tlb_lock for the HW tlb flush, the pgtable code no
longer needs the external "pgtlock". This patch removes the "pgtlock".

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
 drivers/iommu/mtk_iommu.c | 25 +++----------------------
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 0e5f41f..c2b7ed5 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -101,8 +101,6 @@
 #define MTK_M4U_TO_PORT(id)		((id) & 0x1f)
 
 struct mtk_iommu_domain {
-	spinlock_t			pgtlock; /* lock for page table */
-
 	struct io_pgtable_cfg		cfg;
 	struct io_pgtable_ops		*iop;
 
@@ -295,8 +293,6 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
 {
 	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
 
-	spin_lock_init(&dom->pgtlock);
-
 	dom->cfg = (struct io_pgtable_cfg) {
 		.quirks = IO_PGTABLE_QUIRK_ARM_NS |
 			IO_PGTABLE_QUIRK_NO_PERMS |
@@ -395,18 +391,13 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
 {
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
 	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
-	unsigned long flags;
-	int ret;
 
 	/* The "4GB mode" M4U physically can not use the lower remap of Dram. */
 	if (data->enable_4GB)
 		paddr |= BIT_ULL(32);
 
-	spin_lock_irqsave(&dom->pgtlock, flags);
-	ret = dom->iop->map(dom->iop, iova, paddr, size, prot);
-	spin_unlock_irqrestore(&dom->pgtlock, flags);
-
-	return ret;
+	/* Synchronize with the tlb_lock */
+	return dom->iop->map(dom->iop, iova, paddr, size, prot);
 }
 
 static size_t mtk_iommu_unmap(struct iommu_domain *domain,
@@ -414,14 +405,8 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
 			      struct iommu_iotlb_gather *gather)
 {
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
-	unsigned long flags;
-	size_t unmapsz;
-
-	spin_lock_irqsave(&dom->pgtlock, flags);
-	unmapsz = dom->iop->unmap(dom->iop, iova, size, gather);
-	spin_unlock_irqrestore(&dom->pgtlock, flags);
 
-	return unmapsz;
+	return dom->iop->unmap(dom->iop, iova, size, gather);
 }
 
 static void mtk_iommu_flush_iotlb_all(struct iommu_domain *domain)
@@ -447,13 +432,9 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
 {
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
 	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
-	unsigned long flags;
 	phys_addr_t pa;
 
-	spin_lock_irqsave(&dom->pgtlock, flags);
 	pa = dom->iop->iova_to_phys(dom->iop, iova);
-	spin_unlock_irqrestore(&dom->pgtlock, flags);
-
 	if (data->enable_4GB && pa >= MTK_IOMMU_4GB_MODE_REMAP_BASE)
 		pa &= ~BIT_ULL(32);
 
-- 
1.9.1



* [PATCH v4 7/7] iommu/mediatek: Reduce the tlb flush timeout value
  2019-10-16  3:33 [PATCH v4 0/7] Improve tlb range flush Yong Wu
                   ` (5 preceding siblings ...)
  2019-10-16  3:33 ` [PATCH v4 6/7] iommu/mediatek: Get rid of the pgtlock Yong Wu
@ 2019-10-16  3:33 ` Yong Wu
  6 siblings, 0 replies; 12+ messages in thread
From: Yong Wu @ 2019-10-16  3:33 UTC (permalink / raw)
  To: Matthias Brugger, Joerg Roedel, Will Deacon
  Cc: Evan Green, Robin Murphy, Tomasz Figa, linux-mediatek,
	srv_heupstream, linux-kernel, linux-arm-kernel, iommu, yong.wu,
	youlin.pei, Nicolas Boichat, anan.sun, cui.zhang, chao.hao,
	edison.hsieh

Reduce the tlb flush timeout value from 100000us to 1000us (100ms to
1ms). The original value is so long that it hurts the multimedia
performance. This is only a minor improvement, not a fix for an issue.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
---
 drivers/iommu/mtk_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index c2b7ed5..8ca2e99 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -192,7 +192,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 
 		/* tlb sync */
 		ret = readl_poll_timeout_atomic(data->base + REG_MMU_CPE_DONE,
-						tmp, tmp != 0, 10, 100000);
+						tmp, tmp != 0, 10, 1000);
 		if (ret) {
 			dev_warn(data->dev,
 				 "Partial TLB flush timed out, falling back to full flush\n");
-- 
1.9.1



* Re: [PATCH v4 2/7] iommu/mediatek: Add a new tlb_lock for tlb_flush
  2019-10-16  3:33 ` [PATCH v4 2/7] iommu/mediatek: Add a new tlb_lock for tlb_flush Yong Wu
@ 2019-10-23 16:52   ` Will Deacon
  2019-10-24  7:22     ` Yong Wu
  0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2019-10-23 16:52 UTC (permalink / raw)
  To: Yong Wu
  Cc: Matthias Brugger, Joerg Roedel, Will Deacon, Evan Green,
	Robin Murphy, Tomasz Figa, linux-mediatek, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu, youlin.pei,
	Nicolas Boichat, anan.sun, cui.zhang, chao.hao, edison.hsieh

On Wed, Oct 16, 2019 at 11:33:07AM +0800, Yong Wu wrote:
> Commit 4d689b619445 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API
> TLB sync") moved the tlb_sync of unmap from the v7s code into the iommu
> framework and added a new function, "mtk_iommu_iotlb_sync". But that
> function lacked a lock, so the variable "tlb_flush_active" could be
> changed unexpectedly and we could see this warning log randomly:
> 
> mtk-iommu 10205000.iommu: Partial TLB flush timed out, falling back to
> full flush
> 
> The HW strictly requires tlb_flush/tlb_sync in pairs, so this patch adds
> a new tlb_lock for the tlb operations to fix this issue.
> 
> Fixes: 4d689b619445 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API TLB sync")
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  drivers/iommu/mtk_iommu.c | 23 ++++++++++++++++++++++-
>  drivers/iommu/mtk_iommu.h |  1 +
>  2 files changed, 23 insertions(+), 1 deletion(-)

[...]

>  static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
>  					    unsigned long iova, size_t granule,
>  					    void *cookie)
>  {
> +	struct mtk_iommu_data *data = cookie;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&data->tlb_lock, flags);
>  	mtk_iommu_tlb_add_flush_nosync(iova, granule, granule, true, cookie);
> +	spin_unlock_irqrestore(&data->tlb_lock, flags);

Given that you release the lock here, what prevents another nosync()
operation being issued before you've managed to do the sync()?

Will


* Re: [PATCH v4 3/7] iommu/mediatek: Use gather to achieve the tlb range flush
  2019-10-16  3:33 ` [PATCH v4 3/7] iommu/mediatek: Use gather to achieve the tlb range flush Yong Wu
@ 2019-10-23 16:55   ` Will Deacon
  2019-10-24  7:22     ` Yong Wu
  0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2019-10-23 16:55 UTC (permalink / raw)
  To: Yong Wu
  Cc: Matthias Brugger, Joerg Roedel, Will Deacon, Evan Green,
	Robin Murphy, Tomasz Figa, linux-mediatek, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu, youlin.pei,
	Nicolas Boichat, anan.sun, cui.zhang, chao.hao, edison.hsieh

On Wed, Oct 16, 2019 at 11:33:08AM +0800, Yong Wu wrote:
> Use the iommu_gather mechanism to achieve the tlb range flush.
> Gather the iova range in the "tlb_add_page", then flush the merged iova
> range in iotlb_sync.
> 
> Suggested-by: Tomasz Figa <tfiga@chromium.org>
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  drivers/iommu/mtk_iommu.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index c2f6c78..81ac95f 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -245,11 +245,9 @@ static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
>  					    void *cookie)
>  {
>  	struct mtk_iommu_data *data = cookie;
> -	unsigned long flags;
> +	struct iommu_domain *domain = &data->m4u_dom->domain;
>  
> -	spin_lock_irqsave(&data->tlb_lock, flags);
> -	mtk_iommu_tlb_add_flush_nosync(iova, granule, granule, true, cookie);
> -	spin_unlock_irqrestore(&data->tlb_lock, flags);
> +	iommu_iotlb_gather_add_page(domain, gather, iova, granule);

You need to be careful here, because iommu_iotlb_gather_add_page() can
call iommu_tlb_sync() in some situations and you don't hold the lock.

>  static const struct iommu_flush_ops mtk_iommu_flush_ops = {
> @@ -469,9 +467,15 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
>  				 struct iommu_iotlb_gather *gather)
>  {
>  	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> +	size_t length = gather->end - gather->start;
>  	unsigned long flags;
>  
> +	if (gather->start == ULONG_MAX)
> +		return;
> +
>  	spin_lock_irqsave(&data->tlb_lock, flags);
> +	mtk_iommu_tlb_add_flush_nosync(gather->start, length, gather->pgsize,
> +				       false, data);
>  	mtk_iommu_tlb_sync(data);
>  	spin_unlock_irqrestore(&data->tlb_lock, flags);

Modulo my comment above, this fixes my previous comment. Given that mainline
is already broken, I guess the runtime bisectability isn't a problem.

Will


* Re: [PATCH v4 2/7] iommu/mediatek: Add a new tlb_lock for tlb_flush
  2019-10-23 16:52   ` Will Deacon
@ 2019-10-24  7:22     ` Yong Wu
  0 siblings, 0 replies; 12+ messages in thread
From: Yong Wu @ 2019-10-24  7:22 UTC (permalink / raw)
  To: Will Deacon
  Cc: Matthias Brugger, Joerg Roedel, Will Deacon, Evan Green,
	Robin Murphy, Tomasz Figa, linux-mediatek, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu, youlin.pei,
	Nicolas Boichat, anan.sun, cui.zhang, chao.hao, edison.hsieh

On Wed, 2019-10-23 at 17:52 +0100, Will Deacon wrote:
> On Wed, Oct 16, 2019 at 11:33:07AM +0800, Yong Wu wrote:
> > Commit 4d689b619445 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API
> > TLB sync") moved the tlb_sync of unmap from the v7s code into the iommu
> > framework and added a new function, "mtk_iommu_iotlb_sync". But that
> > function lacked a lock, so the variable "tlb_flush_active" could be
> > changed unexpectedly and we could see this warning log randomly:
> > 
> > mtk-iommu 10205000.iommu: Partial TLB flush timed out, falling back to
> > full flush
> > 
> > The HW strictly requires tlb_flush/tlb_sync in pairs, so this patch adds
> > a new tlb_lock for the tlb operations to fix this issue.
> > 
> > Fixes: 4d689b619445 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API TLB sync")
> > Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> > ---
> >  drivers/iommu/mtk_iommu.c | 23 ++++++++++++++++++++++-
> >  drivers/iommu/mtk_iommu.h |  1 +
> >  2 files changed, 23 insertions(+), 1 deletion(-)
> 
> [...]
> 
> >  static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
> >  					    unsigned long iova, size_t granule,
> >  					    void *cookie)
> >  {
> > +	struct mtk_iommu_data *data = cookie;
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&data->tlb_lock, flags);
> >  	mtk_iommu_tlb_add_flush_nosync(iova, granule, granule, true, cookie);
> > +	spin_unlock_irqrestore(&data->tlb_lock, flags);
> 
> Given that you release the lock here, what prevents another nosync()
> operation being issued before you've managed to do the sync()?

This patch cannot guarantee that each flush_nosync() is always followed
by a sync().

The current mainline doesn't guarantee this either (in the
supersection/largepage case, the current v7s code calls flush_nosync 16
times and then calls tlb_sync once). In that situation there is no
guarantee that the previous tlb flushes have finished, but we never hit
the timeout issue (fortunately our HW always worked well there;
presumably it finished the previous tlb flushes quickly enough even
though we were not polling its completion flag).

We only got the timeout issue after commit 4d689b619445. This patch
only fixes that issue by adding a new lock in iotlb_sync (it doesn't
adjust the sequence of flush_nosync() and sync()).

> 
> Will




* Re: [PATCH v4 3/7] iommu/mediatek: Use gather to achieve the tlb range flush
  2019-10-23 16:55   ` Will Deacon
@ 2019-10-24  7:22     ` Yong Wu
  0 siblings, 0 replies; 12+ messages in thread
From: Yong Wu @ 2019-10-24  7:22 UTC (permalink / raw)
  To: Will Deacon
  Cc: Matthias Brugger, Joerg Roedel, Will Deacon, Evan Green,
	Robin Murphy, Tomasz Figa, linux-mediatek, srv_heupstream,
	linux-kernel, linux-arm-kernel, iommu, youlin.pei,
	Nicolas Boichat, anan.sun, cui.zhang, chao.hao, edison.hsieh

On Wed, 2019-10-23 at 17:55 +0100, Will Deacon wrote:
> On Wed, Oct 16, 2019 at 11:33:08AM +0800, Yong Wu wrote:
> > Use the iommu_gather mechanism to achieve the tlb range flush.
> > Gather the iova range in the "tlb_add_page", then flush the merged iova
> > range in iotlb_sync.
> > 
> > Suggested-by: Tomasz Figa <tfiga@chromium.org>
> > Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> > ---
> >  drivers/iommu/mtk_iommu.c | 12 ++++++++----
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index c2f6c78..81ac95f 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -245,11 +245,9 @@ static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
> >  					    void *cookie)
> >  {
> >  	struct mtk_iommu_data *data = cookie;
> > -	unsigned long flags;
> > +	struct iommu_domain *domain = &data->m4u_dom->domain;
> >  
> > -	spin_lock_irqsave(&data->tlb_lock, flags);
> > -	mtk_iommu_tlb_add_flush_nosync(iova, granule, granule, true, cookie);
> > -	spin_unlock_irqrestore(&data->tlb_lock, flags);
> > +	iommu_iotlb_gather_add_page(domain, gather, iova, granule);
> 
> You need to be careful here, because iommu_iotlb_gather_add_page() can
> call iommu_tlb_sync() in some situations and you don't hold the lock.

The mtk_iommu_iotlb_sync below already takes the lock internally, so I
removed the lock here.

> 
> >  static const struct iommu_flush_ops mtk_iommu_flush_ops = {
> > @@ -469,9 +467,15 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
> >  				 struct iommu_iotlb_gather *gather)
> >  {
> >  	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> > +	size_t length = gather->end - gather->start;
> >  	unsigned long flags;
> >  
> > +	if (gather->start == ULONG_MAX)
> > +		return;
> > +
> >  	spin_lock_irqsave(&data->tlb_lock, flags);
> > +	mtk_iommu_tlb_add_flush_nosync(gather->start, length, gather->pgsize,
> > +				       false, data);
> >  	mtk_iommu_tlb_sync(data);
> >  	spin_unlock_irqrestore(&data->tlb_lock, flags);
> 
> Modulo my comment above, this fixes my previous comment. Given that mainline
> is already broken, I guess the runtime bisectability isn't a problem.

As in my reply to [2/7], mainline is not broken after [2/7]; it only
goes back to the status it had before commit 4d689b619445.

After using the iommu_gather, the iova in this iotlb_sync is the merged
range, which is exactly what the tlb-flush/tlb-sync needs. That fixes
our potential issue (no tlb-sync for the previous tlb-flush range).

> 
> Will





Thread overview: 12+ messages
2019-10-16  3:33 [PATCH v4 0/7] Improve tlb range flush Yong Wu
2019-10-16  3:33 ` [PATCH v4 1/7] iommu/mediatek: Correct the flush_iotlb_all callback Yong Wu
2019-10-16  3:33 ` [PATCH v4 2/7] iommu/mediatek: Add a new tlb_lock for tlb_flush Yong Wu
2019-10-23 16:52   ` Will Deacon
2019-10-24  7:22     ` Yong Wu
2019-10-16  3:33 ` [PATCH v4 3/7] iommu/mediatek: Use gather to achieve the tlb range flush Yong Wu
2019-10-23 16:55   ` Will Deacon
2019-10-24  7:22     ` Yong Wu
2019-10-16  3:33 ` [PATCH v4 4/7] iommu/mediatek: Delete the leaf in the tlb_flush Yong Wu
2019-10-16  3:33 ` [PATCH v4 5/7] iommu/mediatek: Move the tlb_sync into tlb_flush Yong Wu
2019-10-16  3:33 ` [PATCH v4 6/7] iommu/mediatek: Get rid of the pgtlock Yong Wu
2019-10-16  3:33 ` [PATCH v4 7/7] iommu/mediatek: Reduce the tlb flush timeout value Yong Wu
