* [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
@ 2015-02-09 11:19 ` Tomasz Figa
  0 siblings, 0 replies; 24+ messages in thread
From: Tomasz Figa @ 2015-02-09 11:19 UTC (permalink / raw)
  To: iommu
  Cc: linux-arm-kernel, linux-kernel, linux-rockchip, Joerg Roedel,
	Heiko Stuebner, Daniel Kurtz, Tomasz Figa

Even though the code uses the dt_lock spinlock to serialize mapping
operations from different threads, it does not protect against IOMMU
accesses that might already be taking place and thus altering the
state of the IOTLB. This means that the current mapping code, which
first zaps the page table and only then updates it with the new
mapping, is prone to this race.

In addition, the current code assumes that mappings are always > 4 MiB
(which translates to 1024 PTEs) and so would always occupy entire
page tables. This is not true for mappings created by the V4L2
Videobuf2 DMA contig allocator.

This patch changes the mapping code to always zap the page table
after it is updated, which avoids the aforementioned race, and also
to zap the last page of the mapping to make sure that stale data is
not cached from an already existing mapping.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
---
 drivers/iommu/rockchip-iommu.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 6a8b1ec..b06fe76 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -544,6 +544,15 @@ static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
 	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
 }
 
+static void rk_iommu_zap_iova_first_last(struct rk_iommu_domain *rk_domain,
+					 dma_addr_t iova, size_t size)
+{
+	rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
+	if (size > SPAGE_SIZE)
+		rk_iommu_zap_iova(rk_domain, iova + size - SPAGE_SIZE,
+					SPAGE_SIZE);
+}
+
 static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
 				  dma_addr_t iova)
 {
@@ -568,12 +577,6 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
 	rk_table_flush(page_table, NUM_PT_ENTRIES);
 	rk_table_flush(dte_addr, 1);
 
-	/*
-	 * Zap the first iova of newly allocated page table so iommu evicts
-	 * old cached value of new dte from the iotlb.
-	 */
-	rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
-
 done:
 	pt_phys = rk_dte_pt_address(dte);
 	return (u32 *)phys_to_virt(pt_phys);
@@ -623,6 +626,14 @@ static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
 
 	rk_table_flush(pte_addr, pte_count);
 
+	/*
+	 * Zap the first and last iova to evict from iotlb any previously
+	 * mapped cachelines holding stale values for its dte and pte.
+	 * We only zap the first and last iova, since only they could have
+	 * dte or pte shared with an existing mapping.
+	 */
+	rk_iommu_zap_iova_first_last(rk_domain, iova, size);
+
 	return 0;
 unwind:
 	/* Unmap the range of iovas that we just mapped */
-- 
2.2.0.rc0.207.ga3a616c


^ permalink raw reply related	[flat|nested] 24+ messages in thread


* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
@ 2015-02-09 11:22   ` Tomasz Figa
  0 siblings, 0 replies; 24+ messages in thread
From: Tomasz Figa @ 2015-02-09 11:22 UTC (permalink / raw)
  To: iommu
  Cc: linux-arm-kernel, linux-kernel, open list:ARM/Rockchip SoC...,
	Joerg Roedel, Heiko Stuebner, Daniel Kurtz, Tomasz Figa

On Mon, Feb 9, 2015 at 8:19 PM, Tomasz Figa <tfiga@chromium.org> wrote:
> Even though the code uses the dt_lock spinlock to serialize mapping
> operations from different threads, it does not protect against IOMMU
> accesses that might already be taking place and thus altering the
> state of the IOTLB. This means that the current mapping code, which
> first zaps the page table and only then updates it with the new
> mapping, is prone to this race.

Oops, forgot to remove the "CHROMIUM" tag. Is that something that
could be fixed when applying (if the patch is otherwise okay) or
should I resend?

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 24+ messages in thread


* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
  2015-02-09 11:19 ` Tomasz Figa
@ 2015-02-10 22:21   ` Heiko Stübner
  -1 siblings, 0 replies; 24+ messages in thread
From: Heiko Stübner @ 2015-02-10 22:21 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-rockchip,
	Joerg Roedel, Daniel Kurtz

On Monday, 9 February 2015 at 20:19:21, Tomasz Figa wrote:
> Even though the code uses the dt_lock spinlock to serialize mapping
> operations from different threads, it does not protect against IOMMU
> accesses that might already be taking place and thus altering the
> state of the IOTLB. This means that the current mapping code, which
> first zaps the page table and only then updates it with the new
> mapping, is prone to this race.
> 
> In addition, the current code assumes that mappings are always > 4 MiB
> (which translates to 1024 PTEs) and so would always occupy entire
> page tables. This is not true for mappings created by the V4L2
> Videobuf2 DMA contig allocator.
> 
> This patch changes the mapping code to always zap the page table
> after it is updated, which avoids the aforementioned race, and also
> to zap the last page of the mapping to make sure that stale data is
> not cached from an already existing mapping.
> 
> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>

I don't know enough about iommu-magic yet to review this properly, but on my 
rk3288-firefly the whole display pipeline stays in working condition, down to 
x11 and es2gears, so

Tested-by: Heiko Stuebner <heiko@sntech.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread


* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
  2015-02-09 11:19 ` Tomasz Figa
@ 2015-03-03 13:36   ` Joerg Roedel
  -1 siblings, 0 replies; 24+ messages in thread
From: Joerg Roedel @ 2015-03-03 13:36 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-rockchip,
	Heiko Stuebner, Daniel Kurtz

On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
> Even though the code uses the dt_lock spinlock to serialize mapping
> operations from different threads, it does not protect against IOMMU
> accesses that might already be taking place and thus altering the
> state of the IOTLB. This means that the current mapping code, which
> first zaps the page table and only then updates it with the new
> mapping, is prone to this race.

Could you elaborate a bit on the race and why it is sufficient to zap
only the first and the last iova? From the description and the comments
in the patch this is not clear to me.


	Joerg


^ permalink raw reply	[flat|nested] 24+ messages in thread


* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
@ 2015-03-23  8:38     ` Tomasz Figa
  0 siblings, 0 replies; 24+ messages in thread
From: Tomasz Figa @ 2015-03-23  8:38 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: iommu, linux-arm-kernel, linux-kernel,
	open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Daniel Kurtz

Sorry, I had to dig my way out of my backlog.

On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel <joro@8bytes.org> wrote:
> On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
>> Even though the code uses the dt_lock spinlock to serialize mapping
>> operations from different threads, it does not protect against IOMMU
>> accesses that might already be taking place and thus altering the
>> state of the IOTLB. This means that the current mapping code, which
>> first zaps the page table and only then updates it with the new
>> mapping, is prone to this race.
>
> Could you elaborate a bit on the race and why it is sufficient to zap
> only the first and the last iova? From the description and the comments
> in the patch this is not clear to me.

Let's start with why it's sufficient to zap only the first and last iova.

While unmapping, the driver zaps all iovas belonging to the mapping,
so page tables not used by any mapping won't be cached. Now, when
the driver creates a mapping, it might end up occupying several page
tables. However, since the mapping area is virtually contiguous, only
the first and last page table can be shared with different mappings.
This means that only the first and last iovas can already be cached.
In fact, we could detect whether the first and last page tables are
shared and skip the zapping when they are not, but this wouldn't
really optimize much. Why invalidating one iova is enough to
invalidate the whole page table is unclear to me as well, but it
seems to be the correct way on this hardware.

As for the race, it's also mostly explained by the above. The already
running hardware can trigger page table look-ups in the IOMMU, and
thus caching of the page table, between our zapping and our update of
its contents. With this patch, zapping is performed after updating
the page table, so the race is gone.
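
To make the address math concrete, here is a minimal sketch of the
argument (illustrative only, not code from the driver; it assumes the
driver's 4 KiB small pages and 1024-entry page tables, so one page
table spans 4 MiB of iova space):

#include <stdint.h>
#include <stddef.h>

#define SPAGE_SIZE	(1 << 12)	/* 4 KiB small page */

/* Index of the page table (DTE index) that covers a given iova. */
static uint32_t iova_dte_index(uint64_t iova)
{
	return (uint32_t)(iova >> 22);
}

/*
 * A virtually contiguous mapping [iova, iova + size) touches DTE
 * indices first..last.  Every page table strictly between the two
 * is covered completely (all 1024 PTEs belong to this mapping), so
 * no other mapping can own entries there.  Only the tables at
 * 'first' and 'last' may also hold PTEs of neighbouring mappings,
 * which is why zapping the first and last iova is enough.
 */
static void mapping_pt_range(uint64_t iova, size_t size,
			     uint32_t *first, uint32_t *last)
{
	*first = iova_dte_index(iova);
	*last = iova_dte_index(iova + size - SPAGE_SIZE);
}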

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 24+ messages in thread


* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
@ 2015-03-23 12:29       ` Joerg Roedel
  0 siblings, 0 replies; 24+ messages in thread
From: Joerg Roedel @ 2015-03-23 12:29 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: iommu, linux-arm-kernel, linux-kernel,
	open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Daniel Kurtz

Hi Tomasz,

On Mon, Mar 23, 2015 at 05:38:45PM +0900, Tomasz Figa wrote:
> While unmapping, the driver zaps all iovas belonging to the mapping,
> so page tables not used by any mapping won't be cached. Now, when
> the driver creates a mapping, it might end up occupying several page
> tables. However, since the mapping area is virtually contiguous, only
> the first and last page table can be shared with different mappings.
> This means that only the first and last iovas can already be cached.
> In fact, we could detect whether the first and last page tables are
> shared and skip the zapping when they are not, but this wouldn't
> really optimize much. Why invalidating one iova is enough to
> invalidate the whole page table is unclear to me as well, but it
> seems to be the correct way on this hardware.
> 
> As for the race, it's also mostly explained by the above. The already
> running hardware can trigger page table look-ups in the IOMMU, and
> thus caching of the page table, between our zapping and our update of
> its contents. With this patch, zapping is performed after updating
> the page table, so the race is gone.

Okay, this makes sense. Can you add this information to the patch
changelog and resend please?

Thanks,

	Joerg


^ permalink raw reply	[flat|nested] 24+ messages in thread


* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
  2015-03-23  8:38     ` Tomasz Figa
@ 2021-12-09 15:18       ` Dafna Hirschfeld
  -1 siblings, 0 replies; 24+ messages in thread
From: Dafna Hirschfeld @ 2021-12-09 15:18 UTC (permalink / raw)
  To: Tomasz Figa, Joerg Roedel
  Cc: iommu, linux-arm-kernel, linux-kernel,
	open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Daniel Kurtz, Collabora Kernel ML



On 23.03.15 10:38, Tomasz Figa wrote:
> Sorry, I had to dig my way out of my backlog.
> 
> On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel <joro@8bytes.org> wrote:
>> On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
>>> Even though the code uses the dt_lock spinlock to serialize mapping
>>> operations from different threads, it does not protect against IOMMU
>>> accesses that might already be taking place and thus altering the
>>> state of the IOTLB. This means that the current mapping code, which
>>> first zaps the page table and only then updates it with the new
>>> mapping, is prone to this race.
>>
>> Could you elaborate a bit on the race and why it is sufficient to zap
>> only the first and the last iova? From the description and the comments
>> in the patch this is not clear to me.
> 
> Let's start with why it's sufficient to zap only the first and last
> iova.
> 
> While unmapping, the driver zaps all iovas belonging to the mapping,
> so page tables not used by any mapping won't be cached. Now, when
> the driver creates a mapping, it might end up occupying several page
> tables. However, since the mapping area is virtually contiguous, only
> the first and last page table can be shared with different mappings.
> This means that only the first and last iovas can already be cached.
> In fact, we could detect whether the first and last page tables are
> shared and skip the zapping when they are not, but this wouldn't
> really optimize much. Why invalidating one iova is enough to
> invalidate the whole page table is unclear to me as well, but it
> seems to be the correct way on this hardware.

Hi,
It seems to me that each mapping actually needs exactly one page
table: as the inline doc in rk_iommu_map states, the pgsize_bitmap
makes sure that each iova mapping fits entirely into one page table,
since the mapping size is at most 4M.

This actually means that if rk_dte_get_page_table does not allocate
a new page table, but returns one that is already partially used by
previous mappings, then two page tables might be required; however, I
think the iova allocation somehow makes sure that this will not be
the case.

If that were the case, the code would be buggy, because it would
mean that the loop in rk_iommu_map_iova writes past the end of the
page table returned by rk_dte_get_page_table (which we didn't
allocate).

So it seems to me that calling 'rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);'
as done before this patch should still be used, just moved from
rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now.
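
For reference, here is a minimal sketch of the alignment argument
behind the pgsize_bitmap point above (illustrative only; the helper
name is made up and not from the driver):

#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define PT_SPAN	(1ULL << 22)	/* one page table spans 4 MiB of iova */

/*
 * The iommu core splits every map request into chunks whose size is
 * a power of two taken from pgsize_bitmap and whose iova is aligned
 * to that size.  A naturally aligned power-of-two chunk of at most
 * 4 MiB can never cross a 4 MiB boundary, so it always lands inside
 * a single page table.
 */
static bool chunk_fits_one_pt(uint64_t iova, size_t size)
{
	return (iova / PT_SPAN) == ((iova + size - 1) / PT_SPAN);
}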

Thanks,
Dafna

> 
> As for the race, it's also mostly explained by the above. The already
> running hardware can trigger page table look-ups in the IOMMU, and
> thus caching of the page table, between our zapping and our update of
> its contents. With this patch, zapping is performed after updating
> the page table, so the race is gone.
> 
> Best regards,
> Tomasz
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
@ 2021-12-09 15:18       ` Dafna Hirschfeld
  0 siblings, 0 replies; 24+ messages in thread
From: Dafna Hirschfeld @ 2021-12-09 15:18 UTC (permalink / raw)
  To: Tomasz Figa, Joerg Roedel
  Cc: iommu, linux-arm-kernel, linux-kernel,
	open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Daniel Kurtz, Collabora Kernel ML



On 23.03.15 10:38, Tomasz Figa wrote:
> Sorry, I had to dig my way out through my backlog.
> 
> On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel <joro@8bytes.org> wrote:
>> On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
>>> Even though the code uses the dt_lock spin lock to serialize mapping
>>> operation from different threads, it does not protect from IOMMU
>>> accesses that might be already taking place and thus altering state
>>> of the IOTLB. This means that current mapping code which first zaps
>>> the page table and only then updates it with new mapping which is
>>> prone to mentioned race.
>>
>> Could you elabortate a bit on the race and why it is sufficient to zap
>> only the first and the last iova? From the description and the comments
>> in the patch this is not clear to me.
> 
> Let's start with why it's sufficient to zap only first and last iova.
> 
> While unmapping, the driver zaps all iovas belonging to the mapping,
> so the page tables not used by any mapping won't be cached. Now when
> the driver creates a mapping it might end up occupying several page
> tables. However, since the mapping area is virtually contiguous, only
> the first and last page table can be shared with different mappings.
> This means that only first and last iovas can be already cached. In
> fact, we could detect if first and last page tables are shared and do
> not zap at all, but this wouldn't really optimize too much. Why
> invalidating one iova is enough to invalidate the whole page table is
> unclear to me as well, but it seems to be the correct way on this
> hardware.

Hi,
It seems to me that actually each mapping needs exactly one page.
Since (as the inline doc in rk_iommu_map states) the pgsize_bitmap
makes sure that iova mappings fits exactly into one page table
since the mapping size is maximum 4M.

This actually means that if rk_dte_get_page_table does not allocate a
new page table but returns one that is already partially used from previous
mappings then two page tables might be required, but I think the iova
allocation somehow make sure that this will not be the case.

If it was the case then the code would be buggy because it means
that the loop in rk_iommu_map_iova will write behind the page table
given in rk_dte_get_page_table (which we didn't allocate)

So I it seems to me that calling 'rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);'
as done before this patch should be used, but be moved from
rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now

Thanks,
Dafna

> 
> As for the race, it's also kind of explained by the above. The already
> running hardware can trigger page table look-ups in the IOMMU and so
> caching of the page table between our zapping and updating its
> contents. With this patch zapping is performed after updating the page
> table so the race is gone.
> 
> Best regards,
> Tomasz
> 
>  From mboxrd@z Thu Jan  1 00:00:00 1970
> Return-Path: <linux-kernel-owner@vger.kernel.org>
> Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
> 	id S1753210AbbCWM3R (ORCPT <rfc822;w@1wt.eu>);
> 	Mon, 23 Mar 2015 08:29:17 -0400
> Received: from 8bytes.org ([81.169.241.247]:33957 "EHLO theia.8bytes.org"
> 	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
> 	id S1752552AbbCWM3M (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
> 	Mon, 23 Mar 2015 08:29:12 -0400
> Date: Mon, 23 Mar 2015 13:29:10 +0100
> From: Joerg Roedel <joro@8bytes.org>
> To: Tomasz Figa <tfiga@chromium.org>
> Cc: iommu@lists.linux-foundation.org,
>          "linux-arm-kernel@lists.infradead.org"
> 	<linux-arm-kernel@lists.infradead.org>,
>          "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
>          "open list:ARM/Rockchip SoC..." <linux-rockchip@lists.infradead.org>,
>          Heiko Stuebner <heiko@sntech.de>, Daniel Kurtz <djkurtz@chromium.org>
> Subject: Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table
>   state is coherent
> Message-ID: <20150323122910.GO4441@8bytes.org>
> References: <1423480761-33453-1-git-send-email-tfiga@chromium.org>
>   <20150303133659.GD10502@8bytes.org>
>   <CAAFQd5Abk6X7AVTFaNuUSiShn31pzwwTE3VjfLnE4kyziAjy2A@mail.gmail.com>
> MIME-Version: 1.0
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
> In-Reply-To: <CAAFQd5Abk6X7AVTFaNuUSiShn31pzwwTE3VjfLnE4kyziAjy2A@mail.gmail.com>
> User-Agent: Mutt/1.5.21 (2010-09-15)
> Sender: linux-kernel-owner@vger.kernel.org
> List-ID: <linux-kernel.vger.kernel.org>
> X-Mailing-List: linux-kernel@vger.kernel.org
> 
> Hi Tomasz,
> 
> On Mon, Mar 23, 2015 at 05:38:45PM +0900, Tomasz Figa wrote:
>> While unmapping, the driver zaps all iovas belonging to the mapping,
>> so the page tables not used by any mapping won't be cached. Now when
>> the driver creates a mapping it might end up occupying several page
>> tables. However, since the mapping area is virtually contiguous, only
>> the first and last page table can be shared with different mappings.
>> This means that only first and last iovas can be already cached. In
>> fact, we could detect if first and last page tables are shared and do
>> not zap at all, but this wouldn't really optimize too much. Why
>> invalidating one iova is enough to invalidate the whole page table is
>> unclear to me as well, but it seems to be the correct way on this
>> hardware.
>>
>> As for the race, it's also kind of explained by the above. The already
>> running hardware can trigger page table look-ups in the IOMMU and so
>> caching of the page table between our zapping and updating its
>> contents. With this patch zapping is performed after updating the page
>> table so the race is gone.
> 
> Okay, this makes sense. Can you add this information to the patch
> changelog and resend please?
> 
> Thanks,
> 
> 	Joerg
> 
> 
>  From mboxrd@z Thu Jan  1 00:00:00 1970
> From: Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> Subject: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is
> 	coherent
> Date: Mon,  9 Feb 2015 20:19:21 +0900
> Message-ID: <1423480761-33453-1-git-send-email-tfiga@chromium.org>
> Mime-Version: 1.0
> Content-Type: text/plain; charset="us-ascii"
> Content-Transfer-Encoding: 7bit
> Return-path: <iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
> List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/iommu>,
> 	<mailto:iommu-request-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org?subject=unsubscribe>
> List-Archive: <http://lists.linuxfoundation.org/pipermail/iommu/>
> List-Post: <mailto:iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
> List-Help: <mailto:iommu-request-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org?subject=help>
> List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/iommu>,
> 	<mailto:iommu-request-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org?subject=subscribe>
> Sender: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Errors-To: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Cc: Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Daniel Kurtz <djkurtz-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>, linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
> List-Id: iommu@lists.linux-foundation.org
> 
> Even though the code uses the dt_lock spin lock to serialize mapping
> operation from different threads, it does not protect from IOMMU
> accesses that might be already taking place and thus altering state
> of the IOTLB. This means that current mapping code which first zaps
> the page table and only then updates it with new mapping which is
> prone to mentioned race.
> 
> In addition, current code assumes that mappings are always > 4 MiB
> (which translates to 1024 PTEs) and so they would always occupy
> entire page tables. This is not true for mappings created by V4L2
> Videobuf2 DMA contig allocator.
> 
> This patch changes the mapping code to always zap the page table
> after it is updated, which avoids the aforementioned race and also
> zap the last page of the mapping to make sure that stale data is
> not cached from an already existing mapping.
> 
> Signed-off-by: Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> Reviewed-by: Daniel Kurtz <djkurtz-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> ---
>   drivers/iommu/rockchip-iommu.c | 23 +++++++++++++++++------
>   1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 6a8b1ec..b06fe76 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -544,6 +544,15 @@ static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
>   	spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
>   }
>   
> +static void rk_iommu_zap_iova_first_last(struct rk_iommu_domain *rk_domain,
> +					 dma_addr_t iova, size_t size)
> +{
> +	rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> +	if (size > SPAGE_SIZE)
> +		rk_iommu_zap_iova(rk_domain, iova + size - SPAGE_SIZE,
> +					SPAGE_SIZE);
> +}
> +
>   static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
>   				  dma_addr_t iova)
>   {
> @@ -568,12 +577,6 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
>   	rk_table_flush(page_table, NUM_PT_ENTRIES);
>   	rk_table_flush(dte_addr, 1);
>   
> -	/*
> -	 * Zap the first iova of newly allocated page table so iommu evicts
> -	 * old cached value of new dte from the iotlb.
> -	 */
> -	rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> -
>   done:
>   	pt_phys = rk_dte_pt_address(dte);
>   	return (u32 *)phys_to_virt(pt_phys);
> @@ -623,6 +626,14 @@ static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
>   
>   	rk_table_flush(pte_addr, pte_count);
>   
> +	/*
> +	 * Zap the first and last iova to evict from iotlb any previously
> +	 * mapped cachelines holding stale values for its dte and pte.
> +	 * We only zap the first and last iova, since only they could have
> +	 * dte or pte shared with an existing mapping.
> +	 */
> +	rk_iommu_zap_iova_first_last(rk_domain, iova, size);
> +
>   	return 0;
>   unwind:
>   	/* Unmap the range of iovas that we just mapped */
> 

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
@ 2021-12-09 15:18       ` Dafna Hirschfeld
  0 siblings, 0 replies; 24+ messages in thread
From: Dafna Hirschfeld @ 2021-12-09 15:18 UTC (permalink / raw)
  To: Tomasz Figa, Joerg Roedel
  Cc: Heiko Stuebner, linux-kernel, Daniel Kurtz,
	open list:ARM/Rockchip SoC...,
	iommu, Collabora Kernel ML, linux-arm-kernel



On 23.03.15 10:38, Tomasz Figa wrote:
> Sorry, I had to dig my way out through my backlog.
> 
> On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel <joro@8bytes.org> wrote:
>> On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
>>> Even though the code uses the dt_lock spin lock to serialize mapping
>>> operation from different threads, it does not protect from IOMMU
>>> accesses that might be already taking place and thus altering state
>>> of the IOTLB. This means that current mapping code which first zaps
>>> the page table and only then updates it with new mapping which is
>>> prone to mentioned race.
>>
>> Could you elabortate a bit on the race and why it is sufficient to zap
>> only the first and the last iova? From the description and the comments
>> in the patch this is not clear to me.
> 
> Let's start with why it's sufficient to zap only first and last iova.
> 
> While unmapping, the driver zaps all iovas belonging to the mapping,
> so the page tables not used by any mapping won't be cached. Now when
> the driver creates a mapping it might end up occupying several page
> tables. However, since the mapping area is virtually contiguous, only
> the first and last page table can be shared with different mappings.
> This means that only first and last iovas can be already cached. In
> fact, we could detect if first and last page tables are shared and do
> not zap at all, but this wouldn't really optimize too much. Why
> invalidating one iova is enough to invalidate the whole page table is
> unclear to me as well, but it seems to be the correct way on this
> hardware.

Hi,
It seems to me that actually each mapping needs exactly one page.
Since (as the inline doc in rk_iommu_map states) the pgsize_bitmap
makes sure that iova mappings fits exactly into one page table
since the mapping size is maximum 4M.

This actually means that if rk_dte_get_page_table does not allocate a
new page table but returns one that is already partially used from previous
mappings then two page tables might be required, but I think the iova
allocation somehow make sure that this will not be the case.

If it was the case then the code would be buggy because it means
that the loop in rk_iommu_map_iova will write behind the page table
given in rk_dte_get_page_table (which we didn't allocate)

So I it seems to me that calling 'rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);'
as done before this patch should be used, but be moved from
rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now

Thanks,
Dafna

> 
> As for the race, it's also kind of explained by the above. The already
> running hardware can trigger page table look-ups in the IOMMU and so
> caching of the page table between our zapping and updating its
> contents. With this patch zapping is performed after updating the page
> table so the race is gone.
> 
> Best regards,
> Tomasz
> 
>  From mboxrd@z Thu Jan  1 00:00:00 1970
> Return-Path: <linux-kernel-owner@vger.kernel.org>
> Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
> 	id S1753210AbbCWM3R (ORCPT <rfc822;w@1wt.eu>);
> 	Mon, 23 Mar 2015 08:29:17 -0400
> Received: from 8bytes.org ([81.169.241.247]:33957 "EHLO theia.8bytes.org"
> 	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
> 	id S1752552AbbCWM3M (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
> 	Mon, 23 Mar 2015 08:29:12 -0400
> Date: Mon, 23 Mar 2015 13:29:10 +0100
> From: Joerg Roedel <joro@8bytes.org>
> To: Tomasz Figa <tfiga@chromium.org>
> Cc: iommu@lists.linux-foundation.org,
>          "linux-arm-kernel@lists.infradead.org"
> 	<linux-arm-kernel@lists.infradead.org>,
>          "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
>          "open list:ARM/Rockchip SoC..." <linux-rockchip@lists.infradead.org>,
>          Heiko Stuebner <heiko@sntech.de>, Daniel Kurtz <djkurtz@chromium.org>
> Subject: Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table
>   state is coherent
> Message-ID: <20150323122910.GO4441@8bytes.org>
> References: <1423480761-33453-1-git-send-email-tfiga@chromium.org>
>   <20150303133659.GD10502@8bytes.org>
>   <CAAFQd5Abk6X7AVTFaNuUSiShn31pzwwTE3VjfLnE4kyziAjy2A@mail.gmail.com>
> MIME-Version: 1.0
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
> In-Reply-To: <CAAFQd5Abk6X7AVTFaNuUSiShn31pzwwTE3VjfLnE4kyziAjy2A@mail.gmail.com>
> User-Agent: Mutt/1.5.21 (2010-09-15)
> Sender: linux-kernel-owner@vger.kernel.org
> List-ID: <linux-kernel.vger.kernel.org>
> X-Mailing-List: linux-kernel@vger.kernel.org
> 
> Hi Tomasz,
> 
> On Mon, Mar 23, 2015 at 05:38:45PM +0900, Tomasz Figa wrote:
>> While unmapping, the driver zaps all iovas belonging to the mapping,
>> so the page tables not used by any mapping won't be cached. Now when
>> the driver creates a mapping it might end up occupying several page
>> tables. However, since the mapping area is virtually contiguous, only
>> the first and last page table can be shared with different mappings.
>> This means that only first and last iovas can be already cached. In
>> fact, we could detect if first and last page tables are shared and do
>> not zap at all, but this wouldn't really optimize too much. Why
>> invalidating one iova is enough to invalidate the whole page table is
>> unclear to me as well, but it seems to be the correct way on this
>> hardware.
>>
>> As for the race, it's also kind of explained by the above. The already
>> running hardware can trigger page table look-ups in the IOMMU and so
>> caching of the page table between our zapping and updating its
>> contents. With this patch zapping is performed after updating the page
>> table so the race is gone.
> 
> Okay, this makes sense. Can you add this information to the patch
> changelog and resend please?
> 
> Thanks,
> 
> 	Joerg
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
  2021-12-09 15:18       ` Dafna Hirschfeld
@ 2022-01-24  5:37         ` Tomasz Figa
  0 siblings, 0 replies; 24+ messages in thread
From: Tomasz Figa @ 2022-01-24  5:37 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: Joerg Roedel, iommu, linux-arm-kernel, linux-kernel,
	open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Daniel Kurtz, Collabora Kernel ML

Hi Dafna,

On Fri, Dec 10, 2021 at 12:18 AM Dafna Hirschfeld
<dafna.hirschfeld@collabora.com> wrote:
>
>
>
> On 23.03.15 10:38, Tomasz Figa wrote:
> > Sorry, I had to dig my way out through my backlog.
> >
> > On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel <joro@8bytes.org> wrote:
> >> On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
> >>> Even though the code uses the dt_lock spin lock to serialize mapping
> >>> operation from different threads, it does not protect from IOMMU
> >>> accesses that might be already taking place and thus altering state
> >>> of the IOTLB. This means that current mapping code which first zaps
> >>> the page table and only then updates it with new mapping which is
> >>> prone to mentioned race.
> >>
> >> Could you elabortate a bit on the race and why it is sufficient to zap
> >> only the first and the last iova? From the description and the comments
> >> in the patch this is not clear to me.
> >
> > Let's start with why it's sufficient to zap only first and last iova.
> >
> > While unmapping, the driver zaps all iovas belonging to the mapping,
> > so the page tables not used by any mapping won't be cached. Now when
> > the driver creates a mapping it might end up occupying several page
> > tables. However, since the mapping area is virtually contiguous, only
> > the first and last page table can be shared with different mappings.
> > This means that only first and last iovas can be already cached. In
> > fact, we could detect if first and last page tables are shared and do
> > not zap at all, but this wouldn't really optimize too much. Why
> > invalidating one iova is enough to invalidate the whole page table is
> > unclear to me as well, but it seems to be the correct way on this
> > hardware.
>
> Hi,
> It seems to me that actually each mapping needs exactly one page table.
> As the inline doc in rk_iommu_map states, the pgsize_bitmap makes sure
> that an iova mapping fits entirely into one page table, since the
> mapping size is at most 4M.
>
> Strictly speaking, if rk_dte_get_page_table does not allocate a new
> page table but returns one that is already partially used by previous
> mappings, then two page tables might be required; however, I think the
> iova allocation somehow makes sure that this is not the case.

Yes, it was exactly for that case. Note that the zap operation works
per IO page, not per IO page table (IOPT), and there is some
prefetching going on in the TLB of this IOMMU, so neighboring mappings
can interfere with each other.
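
To make the interference concrete, a made-up example (assuming 4 KiB IO
pages and 4 MiB of iova space covered per page table):

	/*
	 * Mapping A: iova 0x00400000..0x004fffff (ends inside PT #1)
	 * Mapping B: iova 0x00500000..0x007fffff (starts inside PT #1)
	 *
	 * While A is in use, the IOMMU may prefetch entries of PT #1
	 * into its TLB, including the still-invalid slots that B will
	 * later occupy. Zapping B's first iova after its PTEs have
	 * been written evicts those stale entries; zapping the last
	 * iova covers the same situation at the tail of the mapping.
	 */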

>
> If that were the case, the code would be buggy, because the loop in
> rk_iommu_map_iova would write past the end of the page table returned
> by rk_dte_get_page_table (memory which we did not allocate).

Sorry, I don't see how it could write past the end of the page table.
Could you give me an example?

>
> So it seems to me that the call 'rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);'
> used before this patch is the right one, but it should be moved from
> rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now.
>
> Thanks,
> Dafna
>
> >
> > As for the race, it's also kind of explained by the above. The already
> > running hardware can trigger page table look-ups in the IOMMU and so
> > caching of the page table between our zapping and updating its
> > contents. With this patch zapping is performed after updating the page
> > table so the race is gone.
> >
> > Best regards,
> > Tomasz
> >

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
@ 2022-01-24  5:37         ` Tomasz Figa
  0 siblings, 0 replies; 24+ messages in thread
From: Tomasz Figa @ 2022-01-24  5:37 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: Joerg Roedel, iommu, linux-arm-kernel, linux-kernel,
	open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Daniel Kurtz, Collabora Kernel ML

Hi Dafna,

On Fri, Dec 10, 2021 at 12:18 AM Dafna Hirschfeld
<dafna.hirschfeld@collabora.com> wrote:
>
>
>
> On 23.03.15 10:38, Tomasz Figa wrote:
> > Sorry, I had to dig my way out through my backlog.
> >
> > On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel <joro@8bytes.org> wrote:
> >> On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
> >>> Even though the code uses the dt_lock spin lock to serialize mapping
> >>> operation from different threads, it does not protect from IOMMU
> >>> accesses that might be already taking place and thus altering state
> >>> of the IOTLB. This means that current mapping code which first zaps
> >>> the page table and only then updates it with new mapping which is
> >>> prone to mentioned race.
> >>
> >> Could you elabortate a bit on the race and why it is sufficient to zap
> >> only the first and the last iova? From the description and the comments
> >> in the patch this is not clear to me.
> >
> > Let's start with why it's sufficient to zap only first and last iova.
> >
> > While unmapping, the driver zaps all iovas belonging to the mapping,
> > so the page tables not used by any mapping won't be cached. Now when
> > the driver creates a mapping it might end up occupying several page
> > tables. However, since the mapping area is virtually contiguous, only
> > the first and last page table can be shared with different mappings.
> > This means that only first and last iovas can be already cached. In
> > fact, we could detect if first and last page tables are shared and do
> > not zap at all, but this wouldn't really optimize too much. Why
> > invalidating one iova is enough to invalidate the whole page table is
> > unclear to me as well, but it seems to be the correct way on this
> > hardware.
>
> Hi,
> It seems to me that actually each mapping needs exactly one page.
> Since (as the inline doc in rk_iommu_map states) the pgsize_bitmap
> makes sure that iova mappings fits exactly into one page table
> since the mapping size is maximum 4M.
>
> This actually means that if rk_dte_get_page_table does not allocate a
> new page table but returns one that is already partially used from previous
> mappings then two page tables might be required, but I think the iova
> allocation somehow make sure that this will not be the case.

Yes, it was exactly for the case. Note that the zap operation is
per-IO-page and not per IOPT and there is some prefetching going on in
the TLB of this IOMMU. So neighboring mappings can interfere with each
other.

>
> If it was the case then the code would be buggy because it means
> that the loop in rk_iommu_map_iova will write behind the page table
> given in rk_dte_get_page_table (which we didn't allocate)

Sorry, I don't see how it could write behind the page table. Could you
give me an example?

>
> So I it seems to me that calling 'rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);'
> as done before this patch should be used, but be moved from
> rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now
>
> Thanks,
> Dafna
>
> >
> > As for the race, it's also kind of explained by the above. The already
> > running hardware can trigger page table look-ups in the IOMMU and so
> > caching of the page table between our zapping and updating its
> > contents. With this patch zapping is performed after updating the page
> > table so the race is gone.
> >
> > Best regards,
> > Tomasz
> >
> >  From mboxrd@z Thu Jan  1 00:00:00 1970
> > Return-Path: <linux-kernel-owner@vger.kernel.org>
> > Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
> >       id S1753210AbbCWM3R (ORCPT <rfc822;w@1wt.eu>);
> >       Mon, 23 Mar 2015 08:29:17 -0400
> > Received: from 8bytes.org ([81.169.241.247]:33957 "EHLO theia.8bytes.org"
> >       rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
> >       id S1752552AbbCWM3M (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
> >       Mon, 23 Mar 2015 08:29:12 -0400
> > Date: Mon, 23 Mar 2015 13:29:10 +0100
> > From: Joerg Roedel <joro@8bytes.org>
> > To: Tomasz Figa <tfiga@chromium.org>
> > Cc: iommu@lists.linux-foundation.org,
> >          "linux-arm-kernel@lists.infradead.org"
> >       <linux-arm-kernel@lists.infradead.org>,
> >          "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
> >          "open list:ARM/Rockchip SoC..." <linux-rockchip@lists.infradead.org>,
> >          Heiko Stuebner <heiko@sntech.de>, Daniel Kurtz <djkurtz@chromium.org>
> > Subject: Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table
> >   state is coherent
> > Message-ID: <20150323122910.GO4441@8bytes.org>
> > References: <1423480761-33453-1-git-send-email-tfiga@chromium.org>
> >   <20150303133659.GD10502@8bytes.org>
> >   <CAAFQd5Abk6X7AVTFaNuUSiShn31pzwwTE3VjfLnE4kyziAjy2A@mail.gmail.com>
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=us-ascii
> > Content-Disposition: inline
> > In-Reply-To: <CAAFQd5Abk6X7AVTFaNuUSiShn31pzwwTE3VjfLnE4kyziAjy2A@mail.gmail.com>
> > User-Agent: Mutt/1.5.21 (2010-09-15)
> > Sender: linux-kernel-owner@vger.kernel.org
> > List-ID: <linux-kernel.vger.kernel.org>
> > X-Mailing-List: linux-kernel@vger.kernel.org
> >
> > Hi Tomasz,
> >
> > On Mon, Mar 23, 2015 at 05:38:45PM +0900, Tomasz Figa wrote:
> >> While unmapping, the driver zaps all iovas belonging to the mapping,
> >> so the page tables not used by any mapping won't be cached. Now when
> >> the driver creates a mapping it might end up occupying several page
> >> tables. However, since the mapping area is virtually contiguous, only
> >> the first and last page table can be shared with different mappings.
> >> This means that only first and last iovas can be already cached. In
> >> fact, we could detect if first and last page tables are shared and do
> >> not zap at all, but this wouldn't really optimize too much. Why
> >> invalidating one iova is enough to invalidate the whole page table is
> >> unclear to me as well, but it seems to be the correct way on this
> >> hardware.
> >>
> >> As for the race, it's also kind of explained by the above. The already
> >> running hardware can trigger page table look-ups in the IOMMU and so
> >> caching of the page table between our zapping and updating its
> >> contents. With this patch zapping is performed after updating the page
> >> table so the race is gone.
> >
> > Okay, this makes sense. Can you add this information to the patch
> > changelog and resend please?
> >
> > Thanks,
> >
> >       Joerg
> >
> >
> >  From mboxrd@z Thu Jan  1 00:00:00 1970
> > From: Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> > Subject: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is
> >       coherent
> > Date: Mon,  9 Feb 2015 20:19:21 +0900
> > Message-ID: <1423480761-33453-1-git-send-email-tfiga@chromium.org>
> > Mime-Version: 1.0
> > Content-Type: text/plain; charset="us-ascii"
> > Content-Transfer-Encoding: 7bit
> > Return-path: <iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
> > List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/iommu>,
> >       <mailto:iommu-request-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org?subject=unsubscribe>
> > List-Archive: <http://lists.linuxfoundation.org/pipermail/iommu/>
> > List-Post: <mailto:iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
> > List-Help: <mailto:iommu-request-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org?subject=help>
> > List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/iommu>,
> >       <mailto:iommu-request-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org?subject=subscribe>
> > Sender: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > Errors-To: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > Cc: Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Daniel Kurtz <djkurtz-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>, linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
> > List-Id: iommu@lists.linux-foundation.org
> >
> > Even though the code uses the dt_lock spin lock to serialize mapping
> > operation from different threads, it does not protect from IOMMU
> > accesses that might be already taking place and thus altering state
> > of the IOTLB. This means that current mapping code which first zaps
> > the page table and only then updates it with new mapping which is
> > prone to mentioned race.
> >
> > In addition, current code assumes that mappings are always > 4 MiB
> > (which translates to 1024 PTEs) and so they would always occupy
> > entire page tables. This is not true for mappings created by V4L2
> > Videobuf2 DMA contig allocator.
> >
> > This patch changes the mapping code to always zap the page table
> > after it is updated, which avoids the aforementioned race and also
> > zap the last page of the mapping to make sure that stale data is
> > not cached from an already existing mapping.
> >
> > Signed-off-by: Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> > Reviewed-by: Daniel Kurtz <djkurtz-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> > ---
> >   drivers/iommu/rockchip-iommu.c | 23 +++++++++++++++++------
> >   1 file changed, 17 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> > index 6a8b1ec..b06fe76 100644
> > --- a/drivers/iommu/rockchip-iommu.c
> > +++ b/drivers/iommu/rockchip-iommu.c
> > @@ -544,6 +544,15 @@ static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
> >       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> >   }
> >
> > +static void rk_iommu_zap_iova_first_last(struct rk_iommu_domain *rk_domain,
> > +                                      dma_addr_t iova, size_t size)
> > +{
> > +     rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> > +     if (size > SPAGE_SIZE)
> > +             rk_iommu_zap_iova(rk_domain, iova + size - SPAGE_SIZE,
> > +                                     SPAGE_SIZE);
> > +}
> > +
> >   static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
> >                                 dma_addr_t iova)
> >   {
> > @@ -568,12 +577,6 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
> >       rk_table_flush(page_table, NUM_PT_ENTRIES);
> >       rk_table_flush(dte_addr, 1);
> >
> > -     /*
> > -      * Zap the first iova of newly allocated page table so iommu evicts
> > -      * old cached value of new dte from the iotlb.
> > -      */
> > -     rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> > -
> >   done:
> >       pt_phys = rk_dte_pt_address(dte);
> >       return (u32 *)phys_to_virt(pt_phys);
> > @@ -623,6 +626,14 @@ static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
> >
> >       rk_table_flush(pte_addr, pte_count);
> >
> > +     /*
> > +      * Zap the first and last iova to evict from iotlb any previously
> > +      * mapped cachelines holding stale values for its dte and pte.
> > +      * We only zap the first and last iova, since only they could have
> > +      * dte or pte shared with an existing mapping.
> > +      */
> > +     rk_iommu_zap_iova_first_last(rk_domain, iova, size);
> > +
> >       return 0;
> >   unwind:
> >       /* Unmap the range of iovas that we just mapped */
> >

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
@ 2022-01-24  5:37         ` Tomasz Figa
  0 siblings, 0 replies; 24+ messages in thread
From: Tomasz Figa @ 2022-01-24  5:37 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: Heiko Stuebner, linux-kernel, Daniel Kurtz,
	open list:ARM/Rockchip SoC...,
	iommu, Collabora Kernel ML, linux-arm-kernel

Hi Dafna,

On Fri, Dec 10, 2021 at 12:18 AM Dafna Hirschfeld
<dafna.hirschfeld@collabora.com> wrote:
>
>
>
> On 23.03.15 10:38, Tomasz Figa wrote:
> > Sorry, I had to dig my way out through my backlog.
> >
> > On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel <joro@8bytes.org> wrote:
> >> On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
> >>> Even though the code uses the dt_lock spin lock to serialize mapping
> >>> operation from different threads, it does not protect from IOMMU
> >>> accesses that might be already taking place and thus altering state
> >>> of the IOTLB. This means that current mapping code which first zaps
> >>> the page table and only then updates it with new mapping which is
> >>> prone to mentioned race.
> >>
> >> Could you elabortate a bit on the race and why it is sufficient to zap
> >> only the first and the last iova? From the description and the comments
> >> in the patch this is not clear to me.
> >
> > Let's start with why it's sufficient to zap only first and last iova.
> >
> > While unmapping, the driver zaps all iovas belonging to the mapping,
> > so the page tables not used by any mapping won't be cached. Now when
> > the driver creates a mapping it might end up occupying several page
> > tables. However, since the mapping area is virtually contiguous, only
> > the first and last page table can be shared with different mappings.
> > This means that only first and last iovas can be already cached. In
> > fact, we could detect if first and last page tables are shared and do
> > not zap at all, but this wouldn't really optimize too much. Why
> > invalidating one iova is enough to invalidate the whole page table is
> > unclear to me as well, but it seems to be the correct way on this
> > hardware.
>
> Hi,
> It seems to me that actually each mapping needs exactly one page.
> Since (as the inline doc in rk_iommu_map states) the pgsize_bitmap
> makes sure that iova mappings fits exactly into one page table
> since the mapping size is maximum 4M.
>
> This actually means that if rk_dte_get_page_table does not allocate a
> new page table but returns one that is already partially used from previous
> mappings then two page tables might be required, but I think the iova
> allocation somehow make sure that this will not be the case.

Yes, it was exactly for the case. Note that the zap operation is
per-IO-page and not per IOPT and there is some prefetching going on in
the TLB of this IOMMU. So neighboring mappings can interfere with each
other.

>
> If it was the case then the code would be buggy because it means
> that the loop in rk_iommu_map_iova will write behind the page table
> given in rk_dte_get_page_table (which we didn't allocate)

Sorry, I don't see how it could write behind the page table. Could you
give me an example?

>
> So I it seems to me that calling 'rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);'
> as done before this patch should be used, but be moved from
> rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now
>
> Thanks,
> Dafna
>
> >
> > As for the race, it's also kind of explained by the above. The already
> > running hardware can trigger page table look-ups in the IOMMU and so
> > caching of the page table between our zapping and updating its
> > contents. With this patch zapping is performed after updating the page
> > table so the race is gone.
> >
> > Best regards,
> > Tomasz
> >
> >  From mboxrd@z Thu Jan  1 00:00:00 1970
> > Return-Path: <linux-kernel-owner@vger.kernel.org>
> > Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
> >       id S1753210AbbCWM3R (ORCPT <rfc822;w@1wt.eu>);
> >       Mon, 23 Mar 2015 08:29:17 -0400
> > Received: from 8bytes.org ([81.169.241.247]:33957 "EHLO theia.8bytes.org"
> >       rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
> >       id S1752552AbbCWM3M (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
> >       Mon, 23 Mar 2015 08:29:12 -0400
> > Date: Mon, 23 Mar 2015 13:29:10 +0100
> > From: Joerg Roedel <joro@8bytes.org>
> > To: Tomasz Figa <tfiga@chromium.org>
> > Cc: iommu@lists.linux-foundation.org,
> >          "linux-arm-kernel@lists.infradead.org"
> >       <linux-arm-kernel@lists.infradead.org>,
> >          "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
> >          "open list:ARM/Rockchip SoC..." <linux-rockchip@lists.infradead.org>,
> >          Heiko Stuebner <heiko@sntech.de>, Daniel Kurtz <djkurtz@chromium.org>
> > Subject: Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table
> >   state is coherent
> > Message-ID: <20150323122910.GO4441@8bytes.org>
> > References: <1423480761-33453-1-git-send-email-tfiga@chromium.org>
> >   <20150303133659.GD10502@8bytes.org>
> >   <CAAFQd5Abk6X7AVTFaNuUSiShn31pzwwTE3VjfLnE4kyziAjy2A@mail.gmail.com>
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=us-ascii
> > Content-Disposition: inline
> > In-Reply-To: <CAAFQd5Abk6X7AVTFaNuUSiShn31pzwwTE3VjfLnE4kyziAjy2A@mail.gmail.com>
> > User-Agent: Mutt/1.5.21 (2010-09-15)
> > Sender: linux-kernel-owner@vger.kernel.org
> > List-ID: <linux-kernel.vger.kernel.org>
> > X-Mailing-List: linux-kernel@vger.kernel.org
> >
> > Hi Tomasz,
> >
> > On Mon, Mar 23, 2015 at 05:38:45PM +0900, Tomasz Figa wrote:
> >> While unmapping, the driver zaps all iovas belonging to the mapping,
> >> so the page tables not used by any mapping won't be cached. Now when
> >> the driver creates a mapping it might end up occupying several page
> >> tables. However, since the mapping area is virtually contiguous, only
> >> the first and last page table can be shared with different mappings.
> >> This means that only first and last iovas can be already cached. In
> >> fact, we could detect if first and last page tables are shared and do
> >> not zap at all, but this wouldn't really optimize too much. Why
> >> invalidating one iova is enough to invalidate the whole page table is
> >> unclear to me as well, but it seems to be the correct way on this
> >> hardware.
> >>
> >> As for the race, it's also kind of explained by the above. The already
> >> running hardware can trigger page table look-ups in the IOMMU and so
> >> caching of the page table between our zapping and updating its
> >> contents. With this patch zapping is performed after updating the page
> >> table so the race is gone.
> >
> > Okay, this makes sense. Can you add this information to the patch
> > changelog and resend please?
> >
> > Thanks,
> >
> >       Joerg
> >
> >
> >  From mboxrd@z Thu Jan  1 00:00:00 1970
> > From: Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> > Subject: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is
> >       coherent
> > Date: Mon,  9 Feb 2015 20:19:21 +0900
> > Message-ID: <1423480761-33453-1-git-send-email-tfiga@chromium.org>
> > Mime-Version: 1.0
> > Content-Type: text/plain; charset="us-ascii"
> > Content-Transfer-Encoding: 7bit
> > Return-path: <iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
> > List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/iommu>,
> >       <mailto:iommu-request-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org?subject=unsubscribe>
> > List-Archive: <http://lists.linuxfoundation.org/pipermail/iommu/>
> > List-Post: <mailto:iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
> > List-Help: <mailto:iommu-request-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org?subject=help>
> > List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/iommu>,
> >       <mailto:iommu-request-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org?subject=subscribe>
> > Sender: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > Errors-To: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > Cc: Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Daniel Kurtz <djkurtz-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>, linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
> > List-Id: iommu@lists.linux-foundation.org
> >
> > Even though the code uses the dt_lock spin lock to serialize mapping
> > operation from different threads, it does not protect from IOMMU
> > accesses that might be already taking place and thus altering state
> > of the IOTLB. This means that current mapping code which first zaps
> > the page table and only then updates it with new mapping which is
> > prone to mentioned race.
> >
> > In addition, current code assumes that mappings are always > 4 MiB
> > (which translates to 1024 PTEs) and so they would always occupy
> > entire page tables. This is not true for mappings created by V4L2
> > Videobuf2 DMA contig allocator.
> >
> > This patch changes the mapping code to always zap the page table
> > after it is updated, which avoids the aforementioned race and also
> > zap the last page of the mapping to make sure that stale data is
> > not cached from an already existing mapping.
> >
> > Signed-off-by: Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> > Reviewed-by: Daniel Kurtz <djkurtz-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> > ---
> >   drivers/iommu/rockchip-iommu.c | 23 +++++++++++++++++------
> >   1 file changed, 17 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> > index 6a8b1ec..b06fe76 100644
> > --- a/drivers/iommu/rockchip-iommu.c
> > +++ b/drivers/iommu/rockchip-iommu.c
> > @@ -544,6 +544,15 @@ static void rk_iommu_zap_iova(struct rk_iommu_domain *rk_domain,
> >       spin_unlock_irqrestore(&rk_domain->iommus_lock, flags);
> >   }
> >
> > +static void rk_iommu_zap_iova_first_last(struct rk_iommu_domain *rk_domain,
> > +                                      dma_addr_t iova, size_t size)
> > +{
> > +     rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> > +     if (size > SPAGE_SIZE)
> > +             rk_iommu_zap_iova(rk_domain, iova + size - SPAGE_SIZE,
> > +                                     SPAGE_SIZE);
> > +}
> > +
> >   static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
> >                                 dma_addr_t iova)
> >   {
> > @@ -568,12 +577,6 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
> >       rk_table_flush(page_table, NUM_PT_ENTRIES);
> >       rk_table_flush(dte_addr, 1);
> >
> > -     /*
> > -      * Zap the first iova of newly allocated page table so iommu evicts
> > -      * old cached value of new dte from the iotlb.
> > -      */
> > -     rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);
> > -
> >   done:
> >       pt_phys = rk_dte_pt_address(dte);
> >       return (u32 *)phys_to_virt(pt_phys);
> > @@ -623,6 +626,14 @@ static int rk_iommu_map_iova(struct rk_iommu_domain *rk_domain, u32 *pte_addr,
> >
> >       rk_table_flush(pte_addr, pte_count);
> >
> > +     /*
> > +      * Zap the first and last iova to evict from iotlb any previously
> > +      * mapped cachelines holding stale values for its dte and pte.
> > +      * We only zap the first and last iova, since only they could have
> > +      * dte or pte shared with an existing mapping.
> > +      */
> > +     rk_iommu_zap_iova_first_last(rk_domain, iova, size);
> > +
> >       return 0;
> >   unwind:
> >       /* Unmap the range of iovas that we just mapped */
> >
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
@ 2022-01-24  5:37         ` Tomasz Figa
  0 siblings, 0 replies; 24+ messages in thread
From: Tomasz Figa @ 2022-01-24  5:37 UTC (permalink / raw)
  To: Dafna Hirschfeld
  Cc: Joerg Roedel, iommu, linux-arm-kernel, linux-kernel,
	open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Daniel Kurtz, Collabora Kernel ML

Hi Dafna,

On Fri, Dec 10, 2021 at 12:18 AM Dafna Hirschfeld
<dafna.hirschfeld@collabora.com> wrote:
>
>
>
> On 23.03.15 10:38, Tomasz Figa wrote:
> > Sorry, I had to dig my way out through my backlog.
> >
> > On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel <joro@8bytes.org> wrote:
> >> On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:
> >>> Even though the code uses the dt_lock spin lock to serialize mapping
> >>> operation from different threads, it does not protect from IOMMU
> >>> accesses that might be already taking place and thus altering state
> >>> of the IOTLB. This means that current mapping code which first zaps
> >>> the page table and only then updates it with new mapping which is
> >>> prone to mentioned race.
> >>
> >> Could you elabortate a bit on the race and why it is sufficient to zap
> >> only the first and the last iova? From the description and the comments
> >> in the patch this is not clear to me.
> >
> > Let's start with why it's sufficient to zap only first and last iova.
> >
> > While unmapping, the driver zaps all iovas belonging to the mapping,
> > so the page tables not used by any mapping won't be cached. Now when
> > the driver creates a mapping it might end up occupying several page
> > tables. However, since the mapping area is virtually contiguous, only
> > the first and last page table can be shared with different mappings.
> > This means that only first and last iovas can be already cached. In
> > fact, we could detect if first and last page tables are shared and do
> > not zap at all, but this wouldn't really optimize too much. Why
> > invalidating one iova is enough to invalidate the whole page table is
> > unclear to me as well, but it seems to be the correct way on this
> > hardware.
>
> Hi,
> It seems to me that actually each mapping needs exactly one page.
> Since (as the inline doc in rk_iommu_map states) the pgsize_bitmap
> makes sure that iova mappings fits exactly into one page table
> since the mapping size is maximum 4M.
>
> This actually means that if rk_dte_get_page_table does not allocate a
> new page table but returns one that is already partially used from previous
> mappings then two page tables might be required, but I think the iova
> allocation somehow make sure that this will not be the case.

Yes, it was exactly for that case. Note that the zap operation is per
IO page, not per IO page table (IOPT), and there is some prefetching
going on in the TLB of this IOMMU, so neighboring mappings can
interfere with each other.
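
To make the argument concrete: with 4 KiB pages and 1024-entry page
tables, one page table covers a 4 MiB window of IOVA space, so a
virtually contiguous range can only share its first and last window
with a neighboring mapping. A standalone sketch of that arithmetic
(show_tables and the constants are illustrative stand-ins, not driver
code):

#include <stdio.h>

#define SPAGE_SIZE	4096UL
#define PT_ENTRIES	1024UL
#define PT_WINDOW	(PT_ENTRIES * SPAGE_SIZE)	/* 4 MiB per page table */

/* Report which page tables (DTE slots) an IOVA range touches.
 * Tables strictly between the first and last are fully owned by
 * this mapping; only the two boundary tables can also hold PTEs
 * of a neighboring mapping. */
static void show_tables(unsigned long iova, unsigned long size)
{
	unsigned long first = iova / PT_WINDOW;
	unsigned long last = (iova + size - 1) / PT_WINDOW;

	printf("tables %lu..%lu, shared candidates: %lu and %lu\n",
	       first, last, first, last);
}

int main(void)
{
	show_tables(0x003ff000UL, 0x5000UL);	/* crosses a 4 MiB boundary */
	return 0;
}

Zapping one iova inside each boundary window is then sufficient, given
the observation above that a single invalidation appears to flush the
whole page table on this hardware.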

>
> If that were the case, the code would be buggy, because it would mean
> that the loop in rk_iommu_map_iova writes past the end of the page table
> returned by rk_dte_get_page_table (which we didn't allocate).

Sorry, I don't see how it could write past the end of the page table.
Could you give me an example?

>
> So it seems to me that the call 'rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);'
> used before this patch should be kept, but moved from
> rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now.
>
> Thanks,
> Dafna
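
Assuming the IOMMU core's usual behavior of splitting a map request
into naturally aligned power-of-two chunks drawn from pgsize_bitmap,
a chunk of at most 4 MiB can indeed never straddle a 4 MiB page-table
window, which would rule out the out-of-bounds write discussed above.
A small standalone check of that property (fits_one_table is a made-up
name, and the splitting behavior is an assumption, not taken from the
driver):

#include <assert.h>
#include <stdbool.h>

#define SPAGE_SIZE	4096UL
#define PT_WINDOW	(1024UL * SPAGE_SIZE)	/* 4 MiB covered by one table */

/* A naturally aligned power-of-two chunk no larger than PT_WINDOW
 * cannot straddle a PT_WINDOW boundary, because window boundaries
 * are themselves multiples of the chunk size. */
static bool fits_one_table(unsigned long iova, unsigned long size)
{
	assert(size && (size & (size - 1)) == 0 && size <= PT_WINDOW);
	assert((iova & (size - 1)) == 0);	/* natural alignment */
	return iova / PT_WINDOW == (iova + size - 1) / PT_WINDOW;
}

int main(void)
{
	assert(fits_one_table(0x00200000UL, 0x200000UL));	/* 2 MiB @ 2 MiB */
	assert(fits_one_table(0x00400000UL, 0x400000UL));	/* 4 MiB @ 4 MiB */
	return 0;
}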
>
> >
> > As for the race, it's also mostly explained by the above. Hardware that
> > is already running can trigger page table look-ups in the IOMMU, and
> > thus caching of the page table, between our zapping it and updating its
> > contents. With this patch, zapping is performed after updating the page
> > table, so the race is gone.
> >
> > Best regards,
> > Tomasz
> >
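
The ordering at the heart of the race can be contrasted directly. A
compilable sketch with stubbed-out helpers (write_ptes, flush_ptes and
iotlb_zap are made-up stand-ins for the driver's real operations):

#define SPAGE_SIZE 4096UL

static void write_ptes(unsigned long iova, unsigned long size) { (void)iova; (void)size; }
static void flush_ptes(unsigned long iova, unsigned long size) { (void)iova; (void)size; }
static void iotlb_zap(unsigned long iova, unsigned long size) { (void)iova; (void)size; }

/* Before the patch: zap first, update later. */
static void map_zap_first(unsigned long iova, unsigned long size)
{
	iotlb_zap(iova, SPAGE_SIZE);
	/* Race window: a device already running against this domain
	 * can perform a translation here, re-filling the IOTLB with
	 * the stale PTE we are about to overwrite. */
	write_ptes(iova, size);
	flush_ptes(iova, size);
}

/* After the patch: update first, zap last. */
static void map_zap_last(unsigned long iova, unsigned long size)
{
	write_ptes(iova, size);
	flush_ptes(iova, size);
	iotlb_zap(iova, SPAGE_SIZE);
	if (size > SPAGE_SIZE)
		iotlb_zap(iova + size - SPAGE_SIZE, SPAGE_SIZE);
}

int main(void)
{
	map_zap_first(0x100000UL, 0x3000UL);	/* racy ordering */
	map_zap_last(0x100000UL, 0x3000UL);	/* fixed ordering */
	return 0;
}

The final zap is what closes the window: even if the IOMMU prefetched a
stale entry while the PTEs were being written, it is evicted before the
mapping is considered established.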
> > From: Joerg Roedel <joro@8bytes.org>
> > Date: Mon, 23 Mar 2015 13:29:10 +0100
> > Subject: Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table
> >   state is coherent
> >
> > Hi Tomasz,
> >
> > On Mon, Mar 23, 2015 at 05:38:45PM +0900, Tomasz Figa wrote:
> >> While unmapping, the driver zaps all iovas belonging to the mapping,
> >> so page tables not used by any mapping won't be cached. Now, when
> >> the driver creates a mapping, it might end up occupying several page
> >> tables. However, since the mapping area is virtually contiguous, only
> >> the first and last page table can be shared with other mappings.
> >> This means that only the first and last iovas can already be cached. In
> >> fact, we could detect whether the first and last page tables are shared
> >> and skip zapping entirely if they are not, but that wouldn't be much of
> >> an optimization. Why invalidating one iova is enough to invalidate the
> >> whole page table is unclear to me as well, but it seems to be the
> >> correct behavior on this hardware.
> >>
> >> As for the race, it's also mostly explained by the above. Hardware that
> >> is already running can trigger page table look-ups in the IOMMU, and
> >> thus caching of the page table, between our zapping it and updating its
> >> contents. With this patch, zapping is performed after updating the page
> >> table, so the race is gone.
> >
> > Okay, this makes sense. Can you add this information to the patch
> > changelog and resend please?
> >
> > Thanks,
> >
> >       Joerg

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-01-24  5:39 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-09 11:19 [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent Tomasz Figa
2015-02-09 11:19 ` Tomasz Figa
2015-02-09 11:19 ` Tomasz Figa
2015-02-09 11:22 ` Tomasz Figa
2015-02-09 11:22   ` Tomasz Figa
2015-02-09 11:22   ` Tomasz Figa
2015-02-10 22:21 ` Heiko Stübner
2015-02-10 22:21   ` Heiko Stübner
2015-03-03 13:36 ` Joerg Roedel
2015-03-03 13:36   ` Joerg Roedel
2015-03-23  8:38   ` Tomasz Figa
2015-03-23  8:38     ` Tomasz Figa
2015-03-23  8:38     ` Tomasz Figa
2015-03-23 12:29     ` Joerg Roedel
2015-03-23 12:29       ` Joerg Roedel
2015-03-23 12:29       ` Joerg Roedel
2021-12-09 15:18     ` Dafna Hirschfeld
2021-12-09 15:18       ` Dafna Hirschfeld
2021-12-09 15:18       ` Dafna Hirschfeld
2021-12-09 15:18       ` Dafna Hirschfeld
2022-01-24  5:37       ` Tomasz Figa
2022-01-24  5:37         ` Tomasz Figa
2022-01-24  5:37         ` Tomasz Figa
2022-01-24  5:37         ` Tomasz Figa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.