* [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
From: Hsin-Yi Wang @ 2021-11-23 11:21 UTC
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, iommu, linux-kernel, Rob Herring,
	Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky, tfiga

The default IO_TLB_SEGSIZE (128 slabs) may not be enough for some use cases.
This series adds support for customizing io_tlb_segsize for each
restricted-dma-pool.

Example use case:

mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs, and the
mtk-isp drivers also need to allocate memory with 200+ slabs. Both are
larger than the default IO_TLB_SEGSIZE of 128 slabs.
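
As a rough illustration of the numbers above, the sketch below works out
the largest contiguous mapping a given segment size allows, and the
smallest power-of-two segment size that would cover a request. It assumes
2 KiB slabs (IO_TLB_SHIFT of 11, as in include/linux/swiotlb.h).
max_mapping_bytes() and round_up_pow2() are hypothetical helper names for
this example, not kernel functions, and 200 stands in for the "200+" figure.

```c
#include <assert.h>
#include <stddef.h>

#define IO_TLB_SHIFT	11			/* assumed: 2 KiB per slab */
#define IO_TLB_SIZE	((size_t)1 << IO_TLB_SHIFT)
#define IO_TLB_SEGSIZE	128			/* default slabs per segment */

/* Largest contiguous bounce-buffer mapping for a given segment size. */
size_t max_mapping_bytes(unsigned int segsize)
{
	return IO_TLB_SIZE * segsize;
}

/* Hypothetical helper: smallest power of two that is >= n. */
unsigned int round_up_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n >= 1);
	while (p < n)
		p <<= 1;
	return p;
}
```

Under these assumptions the default segment size allows 128 * 2 KiB =
256 KiB per mapping, a 2560-slab request (5 MiB) would need a segment size
of 4096, and a 200-slab request would need 256.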

[1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
[2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
[3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/

Hsin-Yi Wang (3):
  dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
  dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
  arm64: dts: mt8183: use restricted swiotlb for scp mem

 .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
 .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
 include/linux/swiotlb.h                       |  1 +
 kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
 4 files changed, 37 insertions(+), 10 deletions(-)

-- 
2.34.0.rc2.393.gf8c9666880-goog



* [PATCH 1/3] dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
From: Hsin-Yi Wang @ 2021-11-23 11:21 UTC
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, iommu, linux-kernel, Rob Herring,
	Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky, tfiga

The default IO_TLB_SEGSIZE is 128, but some use cases require more slabs
in a single allocation; without them, swiotlb_find_slots() will fail.

This patch allows each mem pool to set its own io-tlb-segsize
through a DT property.
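
A minimal userspace model of why a request larger than the segment size
cannot succeed. It assumes the allocator only accepts a start index whose
free-run counter covers the whole request; the real swiotlb_find_slots()
also handles alignment, locking and wrap-around, all of which are omitted
here. model_init_slots() and model_find_slots() are hypothetical names
used only for this sketch.

```c
#include <stddef.h>

/*
 * Mirror of the per-slot initialisation in the patch: each counter holds
 * the number of free slots from index i up to the end of its segment, so
 * no counter can ever exceed segsize. segsize must be a power of two.
 */
void model_init_slots(unsigned int *list, size_t nslabs, unsigned int segsize)
{
	size_t i;

	for (i = 0; i < nslabs; i++)
		list[i] = segsize - (unsigned int)(i & (segsize - 1));
}

/*
 * Simplified search: return the first index whose free run covers
 * nslots, or -1 when no such index exists.
 */
long model_find_slots(const unsigned int *list, size_t nslabs,
		      unsigned int nslots)
{
	size_t i;

	for (i = 0; i < nslabs; i++)
		if (list[i] >= nslots)
			return (long)i;
	return -1;
}
```

In this model an 8192-slab pool initialised with a segment size of 128
has no index that can satisfy a 2560-slot request, while the same pool
initialised with a segment size of 4096 satisfies it at index 0.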

Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
---
 include/linux/swiotlb.h |  1 +
 kernel/dma/swiotlb.c    | 34 ++++++++++++++++++++++++++--------
 2 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 569272871375c4..73b3312f23e65b 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -95,6 +95,7 @@ struct io_tlb_mem {
 	unsigned long nslabs;
 	unsigned long used;
 	unsigned int index;
+	unsigned int io_tlb_segsize;
 	spinlock_t lock;
 	struct dentry *debugfs;
 	bool late_alloc;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 8e840fbbed7c7a..021eef1844ca4c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -145,9 +145,10 @@ void swiotlb_print_info(void)
 	       (mem->nslabs << IO_TLB_SHIFT) >> 20);
 }
 
-static inline unsigned long io_tlb_offset(unsigned long val)
+static inline unsigned long io_tlb_offset(unsigned long val,
+					  unsigned long io_tlb_segsize)
 {
-	return val & (IO_TLB_SEGSIZE - 1);
+	return val & (io_tlb_segsize - 1);
 }
 
 static inline unsigned long nr_slots(u64 val)
@@ -186,13 +187,16 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
 	mem->end = mem->start + bytes;
 	mem->index = 0;
 	mem->late_alloc = late_alloc;
+	if (!mem->io_tlb_segsize)
+		mem->io_tlb_segsize = IO_TLB_SEGSIZE;
 
 	if (swiotlb_force == SWIOTLB_FORCE)
 		mem->force_bounce = true;
 
 	spin_lock_init(&mem->lock);
 	for (i = 0; i < mem->nslabs; i++) {
-		mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
+		mem->slots[i].list = mem->io_tlb_segsize -
+				     io_tlb_offset(i, mem->io_tlb_segsize);
 		mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
 		mem->slots[i].alloc_size = 0;
 	}
@@ -523,7 +527,7 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
 			alloc_size - (offset + ((i - index) << IO_TLB_SHIFT));
 	}
 	for (i = index - 1;
-	     io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 &&
+	     io_tlb_offset(i, mem->io_tlb_segsize) != mem->io_tlb_segsize - 1 &&
 	     mem->slots[i].list; i--)
 		mem->slots[i].list = ++count;
 
@@ -603,7 +607,7 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
 	 * with slots below and above the pool being returned.
 	 */
 	spin_lock_irqsave(&mem->lock, flags);
-	if (index + nslots < ALIGN(index + 1, IO_TLB_SEGSIZE))
+	if (index + nslots < ALIGN(index + 1, mem->io_tlb_segsize))
 		count = mem->slots[index + nslots].list;
 	else
 		count = 0;
@@ -623,8 +627,8 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
 	 * available (non zero)
 	 */
 	for (i = index - 1;
-	     io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 && mem->slots[i].list;
-	     i--)
+	     io_tlb_offset(i, mem->io_tlb_segsize) != mem->io_tlb_segsize - 1 &&
+	     mem->slots[i].list; i--)
 		mem->slots[i].list = ++count;
 	mem->used -= nslots;
 	spin_unlock_irqrestore(&mem->lock, flags);
@@ -701,7 +705,9 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
 
 size_t swiotlb_max_mapping_size(struct device *dev)
 {
-	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE;
+	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
+
+	return ((size_t)IO_TLB_SIZE) * mem->io_tlb_segsize;
 }
 
 bool is_swiotlb_active(struct device *dev)
@@ -788,6 +794,7 @@ static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
 {
 	struct io_tlb_mem *mem = rmem->priv;
 	unsigned long nslabs = rmem->size >> IO_TLB_SHIFT;
+	struct device_node *np;
 
 	/*
 	 * Since multiple devices can share the same pool, the private data,
@@ -808,6 +815,17 @@ static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
 
 		set_memory_decrypted((unsigned long)phys_to_virt(rmem->base),
 				     rmem->size >> PAGE_SHIFT);
+
+		np = of_find_node_by_phandle(rmem->phandle);
+		if (np) {
+			if (!of_property_read_u32(np, "io-tlb-segsize",
+						  &mem->io_tlb_segsize)) {
+				if (hweight32(mem->io_tlb_segsize) != 1)
+					mem->io_tlb_segsize = IO_TLB_SEGSIZE;
+			}
+			of_node_put(np);
+		}
+
 		swiotlb_init_io_tlb_mem(mem, rmem->base, nslabs, false);
 		mem->force_bounce = true;
 		mem->for_alloc = true;
-- 
2.34.0.rc2.393.gf8c9666880-goog
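
The hweight32() check in rmem_swiotlb_device_init() matters because
io_tlb_offset() masks the index with segsize - 1, and that mask only
equals "index modulo segsize" when segsize is a power of two. The
userspace sketch below models that selection logic. pick_segsize() is a
hypothetical stand-in for the property handling, with has_prop modelling
of_property_read_u32() returning 0; it is not code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define IO_TLB_SEGSIZE	128u	/* default slabs per segment */

/* True when exactly one bit is set, the same condition as hweight32(v) == 1. */
bool is_single_bit(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical model of the selection in the patch: use the DT value only
 * when it is present and a power of two, otherwise fall back to the default.
 */
uint32_t pick_segsize(bool has_prop, uint32_t prop)
{
	if (has_prop && is_single_bit(prop))
		return prop;
	return IO_TLB_SEGSIZE;
}

/* The mask io_tlb_offset() applies once the segment size is a parameter. */
uint32_t masked_offset(uint32_t idx, uint32_t segsize)
{
	return idx & (segsize - 1);
}
```

For example, masked_offset(4100, 4096) is 4, which matches 4100 % 4096,
but masked_offset(200, 200) is 192 rather than 0, which is why a value
such as 200 or 2560 falls back to the default in this model.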



* [PATCH 1/3] dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
From: Hsin-Yi Wang @ 2021-11-23 11:21 UTC
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, iommu, linux-kernel, Rob Herring,
	Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky, tfiga

The default IO_TLB_SEGSIZE is 128, but some use cases require more slabs
in a single allocation; without them, swiotlb_find_slots() will fail.

This patch allows each mem pool to set its own io-tlb-segsize
through a DT property.

Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
---
 include/linux/swiotlb.h |  1 +
 kernel/dma/swiotlb.c    | 34 ++++++++++++++++++++++++++--------
 2 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 569272871375c4..73b3312f23e65b 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -95,6 +95,7 @@ struct io_tlb_mem {
 	unsigned long nslabs;
 	unsigned long used;
 	unsigned int index;
+	unsigned int io_tlb_segsize;
 	spinlock_t lock;
 	struct dentry *debugfs;
 	bool late_alloc;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 8e840fbbed7c7a..021eef1844ca4c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -145,9 +145,10 @@ void swiotlb_print_info(void)
 	       (mem->nslabs << IO_TLB_SHIFT) >> 20);
 }
 
-static inline unsigned long io_tlb_offset(unsigned long val)
+static inline unsigned long io_tlb_offset(unsigned long val,
+					  unsigned long io_tlb_segsize)
 {
-	return val & (IO_TLB_SEGSIZE - 1);
+	return val & (io_tlb_segsize - 1);
 }
 
 static inline unsigned long nr_slots(u64 val)
@@ -186,13 +187,16 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
 	mem->end = mem->start + bytes;
 	mem->index = 0;
 	mem->late_alloc = late_alloc;
+	if (!mem->io_tlb_segsize)
+		mem->io_tlb_segsize = IO_TLB_SEGSIZE;
 
 	if (swiotlb_force == SWIOTLB_FORCE)
 		mem->force_bounce = true;
 
 	spin_lock_init(&mem->lock);
 	for (i = 0; i < mem->nslabs; i++) {
-		mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
+		mem->slots[i].list = mem->io_tlb_segsize -
+				     io_tlb_offset(i, mem->io_tlb_segsize);
 		mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
 		mem->slots[i].alloc_size = 0;
 	}
@@ -523,7 +527,7 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
 			alloc_size - (offset + ((i - index) << IO_TLB_SHIFT));
 	}
 	for (i = index - 1;
-	     io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 &&
+	     io_tlb_offset(i, mem->io_tlb_segsize) != mem->io_tlb_segsize - 1 &&
 	     mem->slots[i].list; i--)
 		mem->slots[i].list = ++count;
 
@@ -603,7 +607,7 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
 	 * with slots below and above the pool being returned.
 	 */
 	spin_lock_irqsave(&mem->lock, flags);
-	if (index + nslots < ALIGN(index + 1, IO_TLB_SEGSIZE))
+	if (index + nslots < ALIGN(index + 1, mem->io_tlb_segsize))
 		count = mem->slots[index + nslots].list;
 	else
 		count = 0;
@@ -623,8 +627,8 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
 	 * available (non zero)
 	 */
 	for (i = index - 1;
-	     io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 && mem->slots[i].list;
-	     i--)
+	     io_tlb_offset(i, mem->io_tlb_segsize) != mem->io_tlb_segsize - 1 &&
+	     mem->slots[i].list; i--)
 		mem->slots[i].list = ++count;
 	mem->used -= nslots;
 	spin_unlock_irqrestore(&mem->lock, flags);
@@ -701,7 +705,9 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
 
 size_t swiotlb_max_mapping_size(struct device *dev)
 {
-	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE;
+	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
+
+	return ((size_t)IO_TLB_SIZE) * mem->io_tlb_segsize;
 }
 
 bool is_swiotlb_active(struct device *dev)
@@ -788,6 +794,7 @@ static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
 {
 	struct io_tlb_mem *mem = rmem->priv;
 	unsigned long nslabs = rmem->size >> IO_TLB_SHIFT;
+	struct device_node *np;
 
 	/*
 	 * Since multiple devices can share the same pool, the private data,
@@ -808,6 +815,17 @@ static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
 
 		set_memory_decrypted((unsigned long)phys_to_virt(rmem->base),
 				     rmem->size >> PAGE_SHIFT);
+
+		np = of_find_node_by_phandle(rmem->phandle);
+		if (np) {
+			if (!of_property_read_u32(np, "io-tlb-segsize",
+						  &mem->io_tlb_segsize)) {
+				if (hweight32(mem->io_tlb_segsize) != 1)
+					mem->io_tlb_segsize = IO_TLB_SEGSIZE;
+			}
+			of_node_put(np);
+		}
+
 		swiotlb_init_io_tlb_mem(mem, rmem->base, nslabs, false);
 		mem->force_bounce = true;
 		mem->for_alloc = true;
-- 
2.34.0.rc2.393.gf8c9666880-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 2/3] dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
  2021-11-23 11:21 ` Hsin-Yi Wang
                     ` (2 preceding siblings ...)
  (?)
@ 2021-11-23 11:21   ` Hsin-Yi Wang
  -1 siblings, 0 replies; 50+ messages in thread
From: Hsin-Yi Wang @ 2021-11-23 11:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, iommu, linux-kernel, Rob Herring,
	Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky, tfiga

Add an io-tlb-segsize property so that each restricted-dma-pool can set
its own io_tlb_segsize, since some use cases require more slabs than the
default value (128).

Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
---
 .../bindings/reserved-memory/shared-dma-pool.yaml         | 8 ++++++++
 1 file changed, 8 insertions(+)
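
Note on the fallback rule in the description: the value actually used
can be sketched as below, assuming the behaviour of patch 1/3 (property
absent, or not a power of two, means IO_TLB_SEGSIZE). This is a userspace
illustration only. effective_segsize() and its has_prop flag are
hypothetical, and popcount32() is a portable stand-in for the kernel's
hweight32().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IO_TLB_SEGSIZE 128u /* default segment size, in slabs */

/* Count set bits; stand-in for the kernel's hweight32(). */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/*
 * Segment size a pool ends up with.
 * has_prop: whether io-tlb-segsize was present in the node (hypothetical flag).
 * prop:     the u32 value read from the property, ignored if !has_prop.
 */
static uint32_t effective_segsize(bool has_prop, uint32_t prop)
{
	/* The fallback itself must be a valid (power-of-two) segment size. */
	assert(popcount32(IO_TLB_SEGSIZE) == 1);

	if (!has_prop)
		return IO_TLB_SEGSIZE;
	if (popcount32(prop) != 1)
		return IO_TLB_SEGSIZE;
	return prop;
}
```

Under these assumptions, a value of 4096 is kept, while 2560 or 0 is
silently replaced by 128.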

diff --git a/Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml b/Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml
index a4bf757d6881de..6198bf6b76f0b2 100644
--- a/Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml
+++ b/Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml
@@ -56,6 +56,14 @@ properties:
       If this property is present, then Linux will use the region for
       the default pool of the consistent DMA allocator.
 
+  io-tlb-segsize:
+    type: u32
+    description: >
+      Each restricted-dma-pool can use this property to set its own
+      io_tlb_segsize. If not set, it will use the default value
+      IO_TLB_SEGSIZE defined in include/linux/swiotlb.h. The value has
+      to be a power of 2, otherwise it will fall back to IO_TLB_SEGSIZE.
+
 unevaluatedProperties: false
 
 examples:
-- 
2.34.0.rc2.393.gf8c9666880-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 3/3] arm64: dts: mt8183: use restricted swiotlb for scp mem
  2021-11-23 11:21 ` Hsin-Yi Wang
                     ` (2 preceding siblings ...)
  (?)
@ 2021-11-23 11:21   ` Hsin-Yi Wang
  -1 siblings, 0 replies; 50+ messages in thread
From: Hsin-Yi Wang @ 2021-11-23 11:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, iommu, linux-kernel, Rob Herring,
	Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky, tfiga

Use restricted-dma-pool for mtk_scp's reserved memory, and set
io-tlb-segsize to 4096, since the driver needs at least 2560 slabs to
allocate memory and the value has to be a power of 2.

Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
---
 arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
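
Note on the numbers: a short sketch of the arithmetic behind 4096,
assuming IO_TLB_SHIFT is 11 (2048-byte slabs), as in
include/linux/swiotlb.h at the time of this series. The helper names
roundup_pow2(), max_mapping_bytes() and region_slabs() are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define IO_TLB_SHIFT 11 /* assumed: each slab is 1 << 11 = 2048 bytes */

/* Smallest power of two >= n (n must be nonzero and at most 2^31). */
static uint32_t roundup_pow2(uint32_t n)
{
	uint32_t p = 1;

	assert(n != 0 && n <= (UINT32_C(1) << 31));
	while (p < n)
		p <<= 1;
	return p;
}

/* Largest single mapping, in bytes, for a segment size given in slabs. */
static uint64_t max_mapping_bytes(uint32_t segsize)
{
	return (uint64_t)segsize << IO_TLB_SHIFT;
}

/* Number of slabs in a reserved region of the given size in bytes. */
static uint64_t region_slabs(uint64_t region_bytes)
{
	return region_bytes >> IO_TLB_SHIFT;
}
```

With these assumptions, 2560 slabs is 5 MiB, the default of 128 slabs
caps a mapping at 256 KiB, and rounding 2560 up to a power of two gives
4096 slabs (8 MiB). The 0x2900000-byte (41 MiB) region holds 20992
slabs, which is 5 * 4096 + 512, so it is not a whole number of 4096-slab
segments. Whether that trailing partial segment matters depends on the
allocator, which this sketch does not model.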

diff --git a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
index 94c13c45919445..de94b2fd7f33e7 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
@@ -109,9 +109,9 @@ reserved_memory: reserved-memory {
 		ranges;
 
 		scp_mem_reserved: scp_mem_region {
-			compatible = "shared-dma-pool";
+			compatible = "restricted-dma-pool";
 			reg = <0 0x50000000 0 0x2900000>;
-			no-map;
+			io-tlb-segsize = <4096>;
 		};
 	};
 
-- 
2.34.0.rc2.393.gf8c9666880-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
  2021-11-23 11:21 ` Hsin-Yi Wang
                     ` (2 preceding siblings ...)
  (?)
@ 2021-11-23 11:58   ` Robin Murphy
  -1 siblings, 0 replies; 50+ messages in thread
From: Robin Murphy @ 2021-11-23 11:58 UTC (permalink / raw)
  To: Hsin-Yi Wang, Christoph Hellwig
  Cc: Marek Szyprowski, iommu, linux-kernel, Rob Herring,
	Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky, tfiga

On 2021-11-23 11:21, Hsin-Yi Wang wrote:
> Default IO_TLB_SEGSIZE (128) slabs may be not enough for some use cases.
> This series adds support to customize io_tlb_segsize for each
> restricted-dma-pool.
> 
> Example use case:
> 
> mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
> mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
> the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs.
> mtk-isp drivers also needs to allocate memory with 200+ slabs. Both are
> larger than the default IO_TLB_SEGSIZE (128) slabs.

Are drivers really doing streaming DMA mappings that large? If so, that 
seems like it might be worth trying to address in its own right for the 
sake of efficiency - allocating ~5MB of memory twice and copying it back 
and forth doesn't sound like the ideal thing to do.

If it's really about coherent DMA buffer allocation, I thought the plan 
was that devices which expect to use a significant amount and/or size of 
coherent buffers would continue to use a shared-dma-pool for that? It's 
still what the binding implies. My understanding was that 
swiotlb_alloc() is mostly just a fallback for the sake of drivers which 
mostly do streaming DMA but may allocate a handful of pages worth of 
coherent buffers here and there. Certainly looking at the mtk_scp 
driver, that seems like it shouldn't be going anywhere near SWIOTLB at all.

Robin.

> [1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
> [2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
> [3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/
> 
> Hsin-Yi Wang (3):
>    dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
>    dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
>    arm64: dts: mt8183: use restricted swiotlb for scp mem
> 
>   .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
>   .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
>   include/linux/swiotlb.h                       |  1 +
>   kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
>   4 files changed, 37 insertions(+), 10 deletions(-)
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 2/3] dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
  2021-11-23 11:21   ` Hsin-Yi Wang
                       ` (2 preceding siblings ...)
  (?)
@ 2021-11-23 16:34     ` Rob Herring
  -1 siblings, 0 replies; 50+ messages in thread
From: Rob Herring @ 2021-11-23 16:34 UTC (permalink / raw)
  To: Hsin-Yi Wang
  Cc: tfiga, -,
	Christoph Hellwig, Robin Murphy, linux-arm-kernel, Maxime Ripard,
	Rob Herring, senozhatsky, iommu, Matthias Brugger,
	Marek Szyprowski, devicetree, linux-kernel, linux-mediatek

On Tue, 23 Nov 2021 19:21:03 +0800, Hsin-Yi Wang wrote:
> Add a io-tlb-segsize property that each restricted-dma-pool can set its
> own io_tlb_segsize since some use cases require slabs larger than default
> value (128).
> 
> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
> ---
>  .../bindings/reserved-memory/shared-dma-pool.yaml         | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:

dtschema/dtc warnings/errors:
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml: properties:io-tlb-segsize:type: 'anyOf' conditional failed, one must be fixed:
	'u32' is not one of ['array', 'boolean', 'integer', 'null', 'number', 'object', 'string']
	'u32' is not of type 'array'
	from schema $id: http://json-schema.org/draft-07/schema#
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml: properties:io-tlb-segsize:type: 'u32' is not one of ['boolean', 'object']
	from schema $id: http://devicetree.org/meta-schemas/core.yaml#
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml: ignoring, error in schema: properties: io-tlb-segsize: type
warning: no schema found in file: ./Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml
Documentation/devicetree/bindings/display/msm/gpu.example.dt.yaml:0:0: /example-1/reserved-memory/gpu@8f200000: failed to match any schema with compatible: ['shared-dma-pool']
Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.example.dt.yaml:0:0: /example-0/reserved-memory/linux,cma: failed to match any schema with compatible: ['shared-dma-pool']
Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.example.dt.yaml:0:0: /example-0/reserved-memory/restricted-dma-pool@50000000: failed to match any schema with compatible: ['restricted-dma-pool']
Documentation/devicetree/bindings/dsp/fsl,dsp.example.dt.yaml:0:0: /example-1/vdev0buffer@94300000: failed to match any schema with compatible: ['shared-dma-pool']
Documentation/devicetree/bindings/remoteproc/ti,omap-remoteproc.example.dt.yaml:0:0: /example-0/reserved-memory/dsp-memory@98000000: failed to match any schema with compatible: ['shared-dma-pool']
Documentation/devicetree/bindings/remoteproc/ti,omap-remoteproc.example.dt.yaml:0:0: /example-1/reserved-memory/ipu-memory@95800000: failed to match any schema with compatible: ['shared-dma-pool']
Documentation/devicetree/bindings/remoteproc/ti,omap-remoteproc.example.dt.yaml:0:0: /example-2/reserved-memory/dsp1-memory@99000000: failed to match any schema with compatible: ['shared-dma-pool']
Documentation/devicetree/bindings/sound/google,cros-ec-codec.example.dt.yaml:0:0: /example-0/reserved-mem@52800000: failed to match any schema with compatible: ['shared-dma-pool']

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/patch/1558503

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.
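
The schema errors above come from declaring the property as `type: u32`, which is not a valid JSON Schema type. dt-schema bindings normally declare a single 32-bit cell by referencing the shared uint32 definition in types.yaml. A sketch of how the property entry might look under that convention follows; the description wording is an assumption and is not taken from the patch.

```yaml
  io-tlb-segsize:
    # Reference the common uint32 type instead of "type: u32".
    $ref: /schemas/types.yaml#/definitions/uint32
    description:
      # Hypothetical wording, for illustration only.
      Number of slabs per swiotlb segment for this restricted-dma-pool.
```

Re-running 'make dt_binding_check' after a change of this kind should show whether the schema now loads.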


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
  2021-11-23 11:58   ` Robin Murphy
                       ` (2 preceding siblings ...)
  (?)
@ 2021-11-24  3:55     ` Hsin-Yi Wang
  -1 siblings, 0 replies; 50+ messages in thread
From: Hsin-Yi Wang @ 2021-11-24  3:55 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Marek Szyprowski, iommu, linux-kernel,
	Rob Herring, Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky, tfiga

On Tue, Nov 23, 2021 at 7:58 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2021-11-23 11:21, Hsin-Yi Wang wrote:
> > Default IO_TLB_SEGSIZE (128) slabs may be not enough for some use cases.
> > This series adds support to customize io_tlb_segsize for each
> > restricted-dma-pool.
> >
> > Example use case:
> >
> > mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
> > mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
> > the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs.
> > mtk-isp drivers also needs to allocate memory with 200+ slabs. Both are
> > larger than the default IO_TLB_SEGSIZE (128) slabs.
>
> Are drivers really doing streaming DMA mappings that large? If so, that
> seems like it might be worth trying to address in its own right for the
> sake of efficiency - allocating ~5MB of memory twice and copying it back
> and forth doesn't sound like the ideal thing to do.
>
> If it's really about coherent DMA buffer allocation, I thought the plan
> was that devices which expect to use a significant amount and/or size of
> coherent buffers would continue to use a shared-dma-pool for that? It's
> still what the binding implies. My understanding was that
> swiotlb_alloc() is mostly just a fallback for the sake of drivers which
> mostly do streaming DMA but may allocate a handful of pages worth of
> coherent buffers here and there. Certainly looking at the mtk_scp
> driver, that seems like it shouldn't be going anywhere near SWIOTLB at all.
>
mtk_scp on its own can use the shared-dma-pool, which is what it currently does.
The reason we switched to restricted-dma-pool is that we want to use
the noncontiguous DMA API for mtk-isp. The noncontiguous DMA API is
designed for devices with an IOMMU; if a device doesn't have an
IOMMU, it will fall back to using swiotlb. But currently the noncontiguous
DMA API doesn't work with the shared-dma-pool.

vb2_dc_alloc() -> dma_alloc_noncontiguous() -> alloc_single_sgt() ->
__dma_alloc_pages() -> dma_direct_alloc_pages() ->
__dma_direct_alloc_pages() -> swiotlb_alloc().


> Robin.
>
> > [1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
> > [2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
> > [3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/
> >
> > Hsin-Yi Wang (3):
> >    dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
> >    dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
> >    arm64: dts: mt8183: use restricted swiotlb for scp mem
> >
> >   .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
> >   .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
> >   include/linux/swiotlb.h                       |  1 +
> >   kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
> >   4 files changed, 37 insertions(+), 10 deletions(-)
> >

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
@ 2021-11-24  3:55     ` Hsin-Yi Wang
  0 siblings, 0 replies; 50+ messages in thread
From: Hsin-Yi Wang @ 2021-11-24  3:55 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree, -,
	linux-kernel, Christoph Hellwig, senozhatsky, iommu, Rob Herring,
	linux-mediatek, Maxime Ripard, Matthias Brugger,
	linux-arm-kernel

On Tue, Nov 23, 2021 at 7:58 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2021-11-23 11:21, Hsin-Yi Wang wrote:
> > Default IO_TLB_SEGSIZE (128) slabs may be not enough for some use cases.
> > This series adds support to customize io_tlb_segsize for each
> > restricted-dma-pool.
> >
> > Example use case:
> >
> > mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
> > mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
> > the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs.
> > mtk-isp drivers also needs to allocate memory with 200+ slabs. Both are
> > larger than the default IO_TLB_SEGSIZE (128) slabs.
>
> Are drivers really doing streaming DMA mappings that large? If so, that
> seems like it might be worth trying to address in its own right for the
> sake of efficiency - allocating ~5MB of memory twice and copying it back
> and forth doesn't sound like the ideal thing to do.
>
> If it's really about coherent DMA buffer allocation, I thought the plan
> was that devices which expect to use a significant amount and/or size of
> coherent buffers would continue to use a shared-dma-pool for that? It's
> still what the binding implies. My understanding was that
> swiotlb_alloc() is mostly just a fallback for the sake of drivers which
> mostly do streaming DMA but may allocate a handful of pages worth of
> coherent buffers here and there. Certainly looking at the mtk_scp
> driver, that seems like it shouldn't be going anywhere near SWIOTLB at all.
>
mtk_scp on its own can use the shared-dma-pool, which it currently uses.
The reason we switched to restricted-dma-pool is that we want to use
the noncontiguous DMA API for mtk-isp. The noncontiguous DMA API is
designed for devices with iommu, and if a device doesn't have an
iommu, it will fallback using swiotlb. But currently noncontiguous DMA
API doesn't work with the shared-dma-pool.

vb2_dc_alloc() -> dma_alloc_noncontiguous() -> alloc_single_sgt() ->
__dma_alloc_pages() -> dma_direct_alloc_pages() ->
__dma_direct_alloc_pages() -> swiotlb_alloc().


> Robin.
>
> > [1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
> > [2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
> > [3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/
> >
> > Hsin-Yi Wang (3):
> >    dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
> >    dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
> >    arm64: dts: mt8183: use restricted swiotlb for scp mem
> >
> >   .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
> >   .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
> >   include/linux/swiotlb.h                       |  1 +
> >   kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
> >   4 files changed, 37 insertions(+), 10 deletions(-)
> >
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
@ 2021-11-24  3:55     ` Hsin-Yi Wang
  0 siblings, 0 replies; 50+ messages in thread
From: Hsin-Yi Wang @ 2021-11-24  3:55 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Marek Szyprowski, iommu, linux-kernel,
	Rob Herring, Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky, tfiga

On Tue, Nov 23, 2021 at 7:58 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2021-11-23 11:21, Hsin-Yi Wang wrote:
> > Default IO_TLB_SEGSIZE (128) slabs may be not enough for some use cases.
> > This series adds support to customize io_tlb_segsize for each
> > restricted-dma-pool.
> >
> > Example use case:
> >
> > mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
> > mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
> > the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs.
> > mtk-isp drivers also needs to allocate memory with 200+ slabs. Both are
> > larger than the default IO_TLB_SEGSIZE (128) slabs.
>
> Are drivers really doing streaming DMA mappings that large? If so, that
> seems like it might be worth trying to address in its own right for the
> sake of efficiency - allocating ~5MB of memory twice and copying it back
> and forth doesn't sound like the ideal thing to do.
>
> If it's really about coherent DMA buffer allocation, I thought the plan
> was that devices which expect to use a significant amount and/or size of
> coherent buffers would continue to use a shared-dma-pool for that? It's
> still what the binding implies. My understanding was that
> swiotlb_alloc() is mostly just a fallback for the sake of drivers which
> mostly do streaming DMA but may allocate a handful of pages worth of
> coherent buffers here and there. Certainly looking at the mtk_scp
> driver, that seems like it shouldn't be going anywhere near SWIOTLB at all.
>
mtk_scp on its own can use the shared-dma-pool, which is what it
currently uses. The reason we switched to restricted-dma-pool is that we
want to use the noncontiguous DMA API for mtk-isp. The noncontiguous DMA
API is designed for devices with an IOMMU; if a device doesn't have one,
it falls back to using swiotlb. But the noncontiguous DMA API currently
doesn't work with the shared-dma-pool.

vb2_dc_alloc() -> dma_alloc_noncontiguous() -> alloc_single_sgt() ->
__dma_alloc_pages() -> dma_direct_alloc_pages() ->
__dma_direct_alloc_pages() -> swiotlb_alloc().
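
For context on the sizes involved, here is a rough sketch of the
arithmetic. It assumes the 2 KiB slab size (IO_TLB_SHIFT = 11) and
IO_TLB_SEGSIZE = 128 from include/linux/swiotlb.h; the helper names are
made up for illustration and are not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match include/linux/swiotlb.h at the time of this thread. */
#define IO_TLB_SHIFT   11                          /* log2 of one slab */
#define IO_TLB_SIZE    ((size_t)1 << IO_TLB_SHIFT) /* 2048 bytes per slab */
#define IO_TLB_SEGSIZE 128                         /* max contiguous slabs */

/* Hypothetical helper: number of bytes covered by nslabs slabs. */
static size_t slabs_to_bytes(size_t nslabs)
{
	return nslabs << IO_TLB_SHIFT;
}

/* Hypothetical helper: slabs needed for a buffer, rounded up. */
static size_t bytes_to_slabs(size_t bytes)
{
	return (bytes + IO_TLB_SIZE - 1) >> IO_TLB_SHIFT;
}

/* A single swiotlb allocation has to fit within one segment. */
static bool fits_in_segment(size_t nslabs, size_t segsize)
{
	return nslabs <= segsize;
}
```

With these values, slabs_to_bytes(IO_TLB_SEGSIZE) is 256 KiB, the
largest single allocation the default segment size allows, while
slabs_to_bytes(2560) is 5 MiB, so fits_in_segment(2560, IO_TLB_SEGSIZE)
is false for the mtk-scp case.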


> Robin.
>
> > [1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
> > [2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
> > [3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/
> >
> > Hsin-Yi Wang (3):
> >    dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
> >    dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
> >    arm64: dts: mt8183: use restricted swiotlb for scp mem
> >
> >   .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
> >   .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
> >   include/linux/swiotlb.h                       |  1 +
> >   kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
> >   4 files changed, 37 insertions(+), 10 deletions(-)
> >

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
  2021-11-24  3:55     ` Hsin-Yi Wang
                         ` (2 preceding siblings ...)
  (?)
@ 2021-11-24 12:34       ` Robin Murphy
  -1 siblings, 0 replies; 50+ messages in thread
From: Robin Murphy @ 2021-11-24 12:34 UTC (permalink / raw)
  To: Hsin-Yi Wang
  Cc: devicetree, -,
	linux-kernel, Christoph Hellwig, senozhatsky, iommu, Rob Herring,
	linux-mediatek, Maxime Ripard, Matthias Brugger,
	linux-arm-kernel

On 2021-11-24 03:55, Hsin-Yi Wang wrote:
> On Tue, Nov 23, 2021 at 7:58 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> On 2021-11-23 11:21, Hsin-Yi Wang wrote:
>>> Default IO_TLB_SEGSIZE (128) slabs may not be enough for some use cases.
>>> This series adds support to customize io_tlb_segsize for each
>>> restricted-dma-pool.
>>>
>>> Example use case:
>>>
>>> mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
>>> mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
>>> the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs.
>>> mtk-isp drivers also need to allocate memory with 200+ slabs. Both are
>>> larger than the default IO_TLB_SEGSIZE (128) slabs.
>>
>> Are drivers really doing streaming DMA mappings that large? If so, that
>> seems like it might be worth trying to address in its own right for the
>> sake of efficiency - allocating ~5MB of memory twice and copying it back
>> and forth doesn't sound like the ideal thing to do.
>>
>> If it's really about coherent DMA buffer allocation, I thought the plan
>> was that devices which expect to use a significant amount and/or size of
>> coherent buffers would continue to use a shared-dma-pool for that? It's
>> still what the binding implies. My understanding was that
>> swiotlb_alloc() is mostly just a fallback for the sake of drivers which
>> mostly do streaming DMA but may allocate a handful of pages worth of
>> coherent buffers here and there. Certainly looking at the mtk_scp
>> driver, that seems like it shouldn't be going anywhere near SWIOTLB at all.
>>
> mtk_scp on its own can use the shared-dma-pool, which is what it
> currently uses. The reason we switched to restricted-dma-pool is that we
> want to use the noncontiguous DMA API for mtk-isp. The noncontiguous DMA
> API is designed for devices with an IOMMU; if a device doesn't have one,
> it falls back to using swiotlb. But the noncontiguous DMA API currently
> doesn't work with the shared-dma-pool.
> 
> vb2_dc_alloc() -> dma_alloc_noncontiguous() -> alloc_single_sgt() ->
> __dma_alloc_pages() -> dma_direct_alloc_pages() ->
> __dma_direct_alloc_pages() -> swiotlb_alloc().

OK, thanks for clarifying. My gut feeling is that drivers should 
probably only be calling the noncontiguous API when they *know* that 
they have a scatter-gather-capable device or IOMMU that can cope with 
it, but either way I'm still not convinced that it makes sense to hack 
up SWIOTLB with DT ABI baggage for an obscure fallback case. It would 
seem a lot more sensible to fix alloc_single_sgt() to not ignore 
per-device pools once it has effectively fallen back to the normal 
dma_alloc_attrs() flow, but I guess that's not technically guaranteed to 
uphold the assumption that we can allocate struct-page-backed memory.
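
A deliberately simplified userspace model of the two paths discussed
above may help. It is not kernel code: the struct, enum, and function
names are all invented for the sketch. It assumes the dma_alloc_attrs()
flow consults a per-device coherent pool first, while the
alloc_single_sgt() fallback goes straight to page allocation and so only
sees a restricted (SWIOTLB) pool.

```c
#include <stdbool.h>

/* Toy stand-in for a device; not the kernel's struct device. */
struct toy_dev {
	bool has_iommu_noncontig;  /* DMA ops offer a noncontiguous allocator */
	bool has_coherent_pool;    /* per-device shared-dma-pool assigned */
	bool has_restricted_pool;  /* per-device restricted-dma-pool assigned */
};

enum toy_backend {
	BACKEND_IOMMU,
	BACKEND_DEV_COHERENT_POOL,
	BACKEND_SWIOTLB_POOL,
	BACKEND_PAGE_ALLOCATOR,
};

/* Models the normal coherent flow: the per-device pool is tried first. */
static enum toy_backend toy_alloc_coherent(const struct toy_dev *dev)
{
	if (dev->has_coherent_pool)
		return BACKEND_DEV_COHERENT_POOL;
	if (dev->has_iommu_noncontig)
		return BACKEND_IOMMU;
	if (dev->has_restricted_pool)
		return BACKEND_SWIOTLB_POOL;
	return BACKEND_PAGE_ALLOCATOR;
}

/* Models the noncontiguous flow: without an IOMMU-backed allocator it
 * falls back to a single page allocation and never looks at the
 * per-device coherent pool. */
static enum toy_backend toy_alloc_noncontiguous(const struct toy_dev *dev)
{
	if (dev->has_iommu_noncontig)
		return BACKEND_IOMMU;
	if (dev->has_restricted_pool)
		return BACKEND_SWIOTLB_POOL;
	return BACKEND_PAGE_ALLOCATOR;
}
```

In this model a device with only a shared-dma-pool gets
BACKEND_DEV_COHERENT_POOL from toy_alloc_coherent() but
BACKEND_PAGE_ALLOCATOR from toy_alloc_noncontiguous(), which is the gap
in question; switching the device to a restricted pool is what routes
the fallback into SWIOTLB.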

Still, if we've got to the point of needing to use a SWIOTLB pool as 
nothing more than a bad reinvention of CMA, rather than an actual bounce 
buffer, that reeks of a fundamental design issue and adding more hacks 
on top to bodge around it is not the right way to go - we need to take a 
step back and properly reconsider how dma_alloc_noncontiguous() is 
supposed to interact with DMA protection schemes.

Thanks,
Robin.

>>> [1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
>>> [2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
>>> [3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/
>>>
>>> Hsin-Yi Wang (3):
>>>     dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
>>>     dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
>>>     arm64: dts: mt8183: use restricted swiotlb for scp mem
>>>
>>>    .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
>>>    .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
>>>    include/linux/swiotlb.h                       |  1 +
>>>    kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
>>>    4 files changed, 37 insertions(+), 10 deletions(-)
>>>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
@ 2021-11-24 12:34       ` Robin Murphy
  0 siblings, 0 replies; 50+ messages in thread
From: Robin Murphy @ 2021-11-24 12:34 UTC (permalink / raw)
  To: Hsin-Yi Wang
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, -,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig,
	senozhatsky-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Rob Herring,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Maxime Ripard,
	Matthias Brugger,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On 2021-11-24 03:55, Hsin-Yi Wang wrote:
> On Tue, Nov 23, 2021 at 7:58 PM Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote:
>>
>> On 2021-11-23 11:21, Hsin-Yi Wang wrote:
>>> Default IO_TLB_SEGSIZE (128) slabs may be not enough for some use cases.
>>> This series adds support to customize io_tlb_segsize for each
>>> restricted-dma-pool.
>>>
>>> Example use case:
>>>
>>> mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
>>> mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
>>> the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs.
>>> mtk-isp drivers also needs to allocate memory with 200+ slabs. Both are
>>> larger than the default IO_TLB_SEGSIZE (128) slabs.
>>
>> Are drivers really doing streaming DMA mappings that large? If so, that
>> seems like it might be worth trying to address in its own right for the
>> sake of efficiency - allocating ~5MB of memory twice and copying it back
>> and forth doesn't sound like the ideal thing to do.
>>
>> If it's really about coherent DMA buffer allocation, I thought the plan
>> was that devices which expect to use a significant amount and/or size of
>> coherent buffers would continue to use a shared-dma-pool for that? It's
>> still what the binding implies. My understanding was that
>> swiotlb_alloc() is mostly just a fallback for the sake of drivers which
>> mostly do streaming DMA but may allocate a handful of pages worth of
>> coherent buffers here and there. Certainly looking at the mtk_scp
>> driver, that seems like it shouldn't be going anywhere near SWIOTLB at all.
>>
> mtk_scp on its own can use the shared-dma-pool, which it currently uses.
> The reason we switched to restricted-dma-pool is that we want to use
> the noncontiguous DMA API for mtk-isp. The noncontiguous DMA API is
> designed for devices with iommu, and if a device doesn't have an
> iommu, it will fallback using swiotlb. But currently noncontiguous DMA
> API doesn't work with the shared-dma-pool.
> 
> vb2_dc_alloc() -> dma_alloc_noncontiguous() -> alloc_single_sgt() ->
> __dma_alloc_pages() -> dma_direct_alloc_pages() ->
> __dma_direct_alloc_pages() -> swiotlb_alloc().

OK, thanks for clarifying. My gut feeling is that drivers should 
probably only be calling the noncontiguous API when they *know* that 
they have a scatter-gather-capable device or IOMMU that can cope with 
it, but either way I'm still not convinced that it makes sense to hack 
up SWIOTLB with DT ABI baggage for an obscure fallback case. It would 
seem a lot more sensible to fix alloc_single_sgt() to not ignore 
per-device pools once it has effectively fallen back to the normal 
dma_alloc_attrs() flow, but I guess that's not technically guaranteed to 
uphold the assumption that we can allocate struct-page-backed memory.

Still, if we've got to the point of needing to use a SWIOTLB pool as 
nothing more than a bad reinvention of CMA, rather than an actual bounce 
buffer, that reeks of a fundamental design issue and adding more hacks 
on top to bodge around it is not the right way to go - we need to take a 
step back and properly reconsider how dma_alloc_noncontiguous() is 
supposed to interact with DMA protection schemes.

Thanks,
Robin.

>>> [1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
>>> [2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
>>> [3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/
>>>
>>> Hsin-Yi Wang (3):
>>>     dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
>>>     dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
>>>     arm64: dts: mt8183: use restricted swiotlb for scp mem
>>>
>>>    .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
>>>    .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
>>>    include/linux/swiotlb.h                       |  1 +
>>>    kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
>>>    4 files changed, 37 insertions(+), 10 deletions(-)
>>>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
  2021-11-23 11:58   ` Robin Murphy
                       ` (2 preceding siblings ...)
  (?)
@ 2021-11-25  7:35     ` Tomasz Figa
  -1 siblings, 0 replies; 50+ messages in thread
From: Tomasz Figa @ 2021-11-25  7:35 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree, -,
	linux-kernel, Christoph Hellwig, senozhatsky, iommu, Rob Herring,
	linux-mediatek, Maxime Ripard, Hsin-Yi Wang, Matthias Brugger,
	linux-arm-kernel

Hi Robin,

On Tue, Nov 23, 2021 at 8:59 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2021-11-23 11:21, Hsin-Yi Wang wrote:
> > Default IO_TLB_SEGSIZE (128) slabs may be not enough for some use cases.
> > This series adds support to customize io_tlb_segsize for each
> > restricted-dma-pool.
> >
> > Example use case:
> >
> > mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
> > mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
> > the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs.
> > mtk-isp drivers also needs to allocate memory with 200+ slabs. Both are
> > larger than the default IO_TLB_SEGSIZE (128) slabs.
>
> Are drivers really doing streaming DMA mappings that large? If so, that
> seems like it might be worth trying to address in its own right for the
> sake of efficiency - allocating ~5MB of memory twice and copying it back
> and forth doesn't sound like the ideal thing to do.
>
> If it's really about coherent DMA buffer allocation, I thought the plan
> was that devices which expect to use a significant amount and/or size of
> coherent buffers would continue to use a shared-dma-pool for that? It's
> still what the binding implies. My understanding was that
> swiotlb_alloc() is mostly just a fallback for the sake of drivers which
> mostly do streaming DMA but may allocate a handful of pages worth of
> coherent buffers here and there. Certainly looking at the mtk_scp
> driver, that seems like it shouldn't be going anywhere near SWIOTLB at all.

First, thanks a lot for taking a look at this patch series.

The drivers would do streaming DMA within a reserved region that is
the only memory accessible to them for security reasons. This seems to
exactly match the definition of the restricted pool as merged
recently.

The new dma_alloc_noncontiguous() API would allow allocating suitable
memory directly from the pool, which would eliminate the need to copy.
However, for a restricted pool, this would exercise the SWIOTLB
allocator, which currently suffers from the limitation as described by
Hsin-Yi. Since the allocator is fairly general purpose and is
already used for coherent allocations by the current restricted
pool implementation, I think it makes sense to lift the
limitation rather than trying to come up with yet another mechanism.
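
To make the slab figures in this thread concrete, here is a minimal
sketch. It assumes the values from include/linux/swiotlb.h at the time
of this series (IO_TLB_SHIFT of 11, i.e. 2 KiB per slab, and
IO_TLB_SEGSIZE of 128). The helper names are hypothetical and are not
kernel functions. Under those assumptions one segment covers at most
256 KiB, so neither the 2560-slab (5 MiB) request nor the 200-slab
(400 KiB) request fits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values, as in include/linux/swiotlb.h at the time of this thread. */
#define IO_TLB_SHIFT   11                     /* one slab is 2 KiB */
#define IO_TLB_SIZE    (1UL << IO_TLB_SHIFT)
#define IO_TLB_SEGSIZE 128                    /* max contiguous slabs per allocation */

/* Hypothetical helper: slabs needed to back `bytes`, rounded up. */
static size_t slabs_needed(size_t bytes)
{
	return (bytes + IO_TLB_SIZE - 1) >> IO_TLB_SHIFT;
}

/* Hypothetical helper: largest contiguous allocation for a segment size. */
static size_t max_contig_bytes(size_t segsize)
{
	return segsize << IO_TLB_SHIFT;
}

/* Hypothetical helper: true if `bytes` fits in one segment of `segsize` slabs. */
static bool fits_in_segment(size_t bytes, size_t segsize)
{
	return slabs_needed(bytes) <= segsize;
}
```

With these definitions, fits_in_segment(2560 * IO_TLB_SIZE,
IO_TLB_SEGSIZE) is false, while any request of 256 KiB or less fits.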

Best regards,
Tomasz

>
> Robin.
>
> > [1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
> > [2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
> > [3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/
> >
> > Hsin-Yi Wang (3):
> >    dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
> >    dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
> >    arm64: dts: mt8183: use restricted swiotlb for scp mem
> >
> >   .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
> >   .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
> >   include/linux/swiotlb.h                       |  1 +
> >   kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
> >   4 files changed, 37 insertions(+), 10 deletions(-)
> >

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
  2021-11-25  7:35     ` Tomasz Figa
                         ` (2 preceding siblings ...)
  (?)
@ 2021-12-03 13:07       ` Robin Murphy
  -1 siblings, 0 replies; 50+ messages in thread
From: Robin Murphy @ 2021-12-03 13:07 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Hsin-Yi Wang, Christoph Hellwig, Marek Szyprowski, iommu,
	linux-kernel, Rob Herring, Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky

On 2021-11-25 07:35, Tomasz Figa wrote:
> Hi Robin,
> 
> On Tue, Nov 23, 2021 at 8:59 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> On 2021-11-23 11:21, Hsin-Yi Wang wrote:
>>> Default IO_TLB_SEGSIZE (128) slabs may be not enough for some use cases.
>>> This series adds support to customize io_tlb_segsize for each
>>> restricted-dma-pool.
>>>
>>> Example use case:
>>>
>>> mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
>>> mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
>>> the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs.
>>> mtk-isp drivers also needs to allocate memory with 200+ slabs. Both are
>>> larger than the default IO_TLB_SEGSIZE (128) slabs.
>>
>> Are drivers really doing streaming DMA mappings that large? If so, that
>> seems like it might be worth trying to address in its own right for the
>> sake of efficiency - allocating ~5MB of memory twice and copying it back
>> and forth doesn't sound like the ideal thing to do.
>>
>> If it's really about coherent DMA buffer allocation, I thought the plan
>> was that devices which expect to use a significant amount and/or size of
>> coherent buffers would continue to use a shared-dma-pool for that? It's
>> still what the binding implies. My understanding was that
>> swiotlb_alloc() is mostly just a fallback for the sake of drivers which
>> mostly do streaming DMA but may allocate a handful of pages worth of
>> coherent buffers here and there. Certainly looking at the mtk_scp
>> driver, that seems like it shouldn't be going anywhere near SWIOTLB at all.
> 
> First, thanks a lot for taking a look at this patch series.
> 
> The drivers would do streaming DMA within a reserved region that is
> the only memory accessible to them for security reasons. This seems to
> exactly match the definition of the restricted pool as merged
> recently.

Huh? Of the drivers indicated, the SCP driver is doing nothing but 
coherent allocations, and I'm not entirely sure what those ISP driver 
patches are supposed to be doing but I suspect it's probably just buffer 
allocation too. I don't see any actual streaming DMA anywhere :/

> The new dma_alloc_noncontiguous() API would allow allocating suitable
> memory directly from the pool, which would eliminate the need to copy.

Can you clarify what's being copied, and where? I'm not all that 
familiar with the media APIs, but I thought it was all based around 
preallocated DMA buffers (the whole dedicated "videobuf" thing)? The few 
instances of actual streaming DMA I can see in drivers/media/ look to be 
mostly PCI drivers mapping private descriptors, whereas the MTK ISP 
appears to be entirely register-based.

> However, for a restricted pool, this would exercise the SWIOTLB
> allocator, which currently suffers from the limitation as described by
> Hsin-Yi. Since the allocator in general is quite general purpose and
> already used for coherent allocations as per the current restricted
> pool implementation, I think it indeed makes sense to lift the
> limitation, rather than trying to come up with yet another thing.

No, just fix the dma_alloc_noncontiguous() fallback case to split the 
allocation into dma_max_mapping_size() chunks. *That* makes sense.
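
The chunking idea can be sketched as plain arithmetic. This is only an
illustration under stated assumptions, not the kernel implementation:
the helper names are hypothetical, and the limit that
dma_max_mapping_size() would report is passed in as max_chunk. For a
SWIOTLB-backed device with 128 slabs of 2 KiB that limit would be
256 KiB, which is an assumption taken from the numbers in this thread.

```c
#include <stddef.h>

/* Hypothetical helper: number of chunks when no chunk may exceed max_chunk. */
static size_t chunk_count(size_t total, size_t max_chunk)
{
	if (max_chunk == 0)
		return 0;
	return (total + max_chunk - 1) / max_chunk;
}

/* Hypothetical helper: size of chunk `idx`, or 0 if idx is out of range. */
static size_t chunk_size(size_t total, size_t max_chunk, size_t idx)
{
	size_t offset;

	if (max_chunk == 0 || idx >= chunk_count(total, max_chunk))
		return 0;

	offset = idx * max_chunk;
	/* Every chunk is full-sized except possibly the last one. */
	return (total - offset < max_chunk) ? total - offset : max_chunk;
}
```

Under these assumptions a 5 MiB request splits into 20 chunks of
256 KiB, and a 400 KiB request splits into one 256 KiB chunk plus one
144 KiB chunk.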

Thanks,
Robin.

> 
> Best regards,
> Tomasz
> 
>>
>> Robin.
>>
>>> [1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
>>> [2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
>>> [3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/
>>>
>>> Hsin-Yi Wang (3):
>>>     dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
>>>     dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
>>>     arm64: dts: mt8183: use restricted swiotlb for scp mem
>>>
>>>    .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
>>>    .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
>>>    include/linux/swiotlb.h                       |  1 +
>>>    kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
>>>    4 files changed, 37 insertions(+), 10 deletions(-)
>>>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
@ 2021-12-03 13:07       ` Robin Murphy
  0 siblings, 0 replies; 50+ messages in thread
From: Robin Murphy @ 2021-12-03 13:07 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Hsin-Yi Wang, Christoph Hellwig, Marek Szyprowski, iommu,
	linux-kernel, Rob Herring, Maxime Ripard, -,
	devicetree, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	senozhatsky

On 2021-11-25 07:35, Tomasz Figa wrote:
> Hi Robin,
> 
> On Tue, Nov 23, 2021 at 8:59 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> On 2021-11-23 11:21, Hsin-Yi Wang wrote:
>>> Default IO_TLB_SEGSIZE (128) slabs may be not enough for some use cases.
>>> This series adds support to customize io_tlb_segsize for each
>>> restricted-dma-pool.
>>>
>>> Example use case:
>>>
>>> mtk-isp drivers[1] are controlled by mtk-scp[2] and allocate memory through
>>> mtk-scp. In order to use the noncontiguous DMA API[3], we need to use
>>> the swiotlb pool. mtk-scp needs to allocate memory with 2560 slabs.
>>> mtk-isp drivers also needs to allocate memory with 200+ slabs. Both are
>>> larger than the default IO_TLB_SEGSIZE (128) slabs.
>>
>> Are drivers really doing streaming DMA mappings that large? If so, that
>> seems like it might be worth trying to address in its own right for the
>> sake of efficiency - allocating ~5MB of memory twice and copying it back
>> and forth doesn't sound like the ideal thing to do.
>>
>> If it's really about coherent DMA buffer allocation, I thought the plan
>> was that devices which expect to use a significant amount and/or size of
>> coherent buffers would continue to use a shared-dma-pool for that? It's
>> still what the binding implies. My understanding was that
>> swiotlb_alloc() is mostly just a fallback for the sake of drivers which
>> mostly do streaming DMA but may allocate a handful of pages worth of
>> coherent buffers here and there. Certainly looking at the mtk_scp
>> driver, that seems like it shouldn't be going anywhere near SWIOTLB at all.
> 
> First, thanks a lot for taking a look at this patch series.
> 
> The drivers would do streaming DMA within a reserved region that is
> the only memory accessible to them for security reasons. This seems to
> exactly match the definition of the restricted pool as merged
> recently.

Huh? Of the drivers indicated, the SCP driver is doing nothing but 
coherent allocations, and I'm not entirely sure what those ISP driver 
patches are supposed to be doing but I suspect it's probably just buffer 
allocation too. I don't see any actual streaming DMA anywhere :/

> The new dma_alloc_noncontiguous() API would allow allocating suitable
> memory directly from the pool, which would eliminate the need to copy.

Can you clarify what's being copied, and where? I'm not all that 
familiar with the media APIs, but I thought it was all based around 
preallocated DMA buffers (the whole dedicated "videobuf" thing)? The few 
instances of actual streaming DMA I can see in drivers/media/ look to be 
mostly PCI drivers mapping private descriptors, whereas the MTK ISP 
appears to be entirely register-based.

> However, for a restricted pool, this would exercise the SWIOTLB
> allocator, which currently suffers from the limitation as described by
> Hsin-Yi. Since the allocator in general is quite general purpose and
> already used for coherent allocations as per the current restricted
> pool implementation, I think it indeed makes sense to lift the
> limitation, rather than trying to come up with yet another thing.

No, just fix the dma_alloc_noncontiguous() fallback case to split the 
allocation into dma_max_mapping_size() chunks. *That* makes sense.

Thanks,
Robin.

> 
> Best regards,
> Tomasz
> 
>>
>> Robin.
>>
>>> [1] (not in upstream) https://patchwork.kernel.org/project/linux-media/cover/20190611035344.29814-1-jungo.lin@mediatek.com/
>>> [2] https://elixir.bootlin.com/linux/latest/source/drivers/remoteproc/mtk_scp.c
>>> [3] https://patchwork.kernel.org/project/linux-media/cover/20210909112430.61243-1-senozhatsky@chromium.org/
>>>
>>> Hsin-Yi Wang (3):
>>>     dma: swiotlb: Allow restricted-dma-pool to customize IO_TLB_SEGSIZE
>>>     dt-bindings: Add io-tlb-segsize property for restricted-dma-pool
>>>     arm64: dts: mt8183: use restricted swiotlb for scp mem
>>>
>>>    .../reserved-memory/shared-dma-pool.yaml      |  8 +++++
>>>    .../arm64/boot/dts/mediatek/mt8183-kukui.dtsi |  4 +--
>>>    include/linux/swiotlb.h                       |  1 +
>>>    kernel/dma/swiotlb.c                          | 34 ++++++++++++++-----
>>>    4 files changed, 37 insertions(+), 10 deletions(-)
>>>
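
For reference, the suggestion above (split the dma_alloc_noncontiguous()
fallback into dma_max_mapping_size() chunks) can be sketched in plain
userspace C. This is only an illustration of the arithmetic, not kernel code
and not a proposed patch: model_max_mapping_size() and split_into_chunks()
are hypothetical names, and the constants assume the swiotlb defaults of the
time (2 KiB slabs, IO_TLB_SEGSIZE of 128), for which the swiotlb limit works
out to 256 KiB per mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed swiotlb defaults at the time of this thread. */
#define IO_TLB_SHIFT   11                     /* one slab is 2 KiB */
#define IO_TLB_SIZE    (1UL << IO_TLB_SHIFT)
#define IO_TLB_SEGSIZE 128UL                  /* max contiguous slabs */

/* Hypothetical stand-in for the limit dma_max_mapping_size() would report. */
static size_t model_max_mapping_size(void)
{
	return IO_TLB_SIZE * IO_TLB_SEGSIZE;  /* 256 KiB */
}

/*
 * Hypothetical helper: split a request of `size` bytes into pieces that
 * each fit within the mapping limit. Writes the piece sizes to chunk[]
 * and returns how many were written, or 0 if `size` is zero or the
 * pieces do not fit in max_chunks entries.
 */
static size_t split_into_chunks(size_t size, size_t *chunk, size_t max_chunks)
{
	size_t limit = model_max_mapping_size();
	size_t n = 0;

	assert(chunk != NULL);

	while (size > 0) {
		size_t len = size < limit ? size : limit;

		if (n == max_chunks)
			return 0;             /* caller's array is too small */
		chunk[n++] = len;
		size -= len;
	}
	return n;
}
```

Under these assumptions the 2560-slab (5 MiB) mtk-scp request becomes 20
chunks of 128 slabs each, and a 200-slab request becomes one 128-slab chunk
plus one 72-slab chunk, so neither would need a larger IO_TLB_SEGSIZE.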

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2021-12-03 13:09 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-23 11:21 [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE Hsin-Yi Wang
2021-11-23 11:21 ` [PATCH 1/3] dma: swiotlb: " Hsin-Yi Wang
2021-11-23 11:21 ` [PATCH 2/3] dt-bindings: Add io-tlb-segsize property for restricted-dma-pool Hsin-Yi Wang
2021-11-23 16:34   ` Rob Herring
2021-11-23 11:21 ` [PATCH 3/3] arm64: dts: mt8183: use restricted swiotlb for scp mem Hsin-Yi Wang
2021-11-23 11:58 ` [PATCH 0/3] Allow restricted-dma-pool to customize IO_TLB_SEGSIZE Robin Murphy
2021-11-24  3:55   ` Hsin-Yi Wang
2021-11-24 12:34     ` Robin Murphy
2021-11-25  7:35   ` Tomasz Figa
2021-12-03 13:07     ` Robin Murphy
