linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* move swiotlb noncoherent dma support from arm64 to generic code V2
@ 2018-10-08  8:02 Christoph Hellwig
  2018-10-08  8:02 ` [PATCH 01/10] swiotlb: remove a pointless comment Christoph Hellwig
                   ` (9 more replies)
  0 siblings, 10 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

Hi all,

this series starts with various swiotlb cleanups, then adds support for
non-cache coherent devices to the generic swiotlb support, and finally
switches arm64 to use the generic code.

Given that this series depends on patches in the dma-mapping tree, or
pending for it I've also published a git tree here:

    git://git.infradead.org/users/hch/misc.git swiotlb-noncoherent.2

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb-noncoherent.2

Changes since v1:
 - make sure arm64 always calls dma_direct_alloc_pages to avoid
   circular calling conventions
 - add a patch in to stop dipping into the swiotlb pool for coherent
   allocations - this just depletes the swiotlb pool for no good reason

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 01/10] swiotlb: remove a pointless comment
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-11 17:49   ` Robin Murphy
  2018-10-19  0:09   ` Konrad Rzeszutek Wilk
  2018-10-08  8:02 ` [PATCH 02/10] swiotlb: mark is_swiotlb_buffer static Christoph Hellwig
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

This comments describes an aspect of the map_sg interface that isn't
even exploited by swiotlb.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/swiotlb.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 4f8a6dbf0b60..9062b14bc7f4 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -925,12 +925,6 @@ swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
  * appropriate dma address and length.  They are obtained via
  * sg_dma_{address,length}(SG).
  *
- * NOTE: An implementation may be able to use a smaller number of
- *       DMA address/length pairs than there are SG table elements.
- *       (for example via virtual mapping capabilities)
- *       The routine returns the number of addr/length pairs actually
- *       used, at most nents.
- *
  * Device ownership issues as mentioned above for swiotlb_map_page are the
  * same here.
  */
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 02/10] swiotlb: mark is_swiotlb_buffer static
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
  2018-10-08  8:02 ` [PATCH 01/10] swiotlb: remove a pointless comment Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-11 17:54   ` Robin Murphy
  2018-10-19  0:12   ` Konrad Rzeszutek Wilk
  2018-10-08  8:02 ` [PATCH 03/10] swiotlb: do not panic on mapping failures Christoph Hellwig
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/swiotlb.h | 1 -
 kernel/dma/swiotlb.c    | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 965be92c33b5..7ef541ce8f34 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -121,7 +121,6 @@ static inline unsigned int swiotlb_max_segment(void) { return 0; }
 #endif
 
 extern void swiotlb_print_info(void);
-extern int is_swiotlb_buffer(phys_addr_t paddr);
 extern void swiotlb_set_max_segment(unsigned int);
 
 extern const struct dma_map_ops swiotlb_dma_ops;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 9062b14bc7f4..26d3af52956f 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -429,7 +429,7 @@ void __init swiotlb_exit(void)
 	max_segment = 0;
 }
 
-int is_swiotlb_buffer(phys_addr_t paddr)
+static int is_swiotlb_buffer(phys_addr_t paddr)
 {
 	return paddr >= io_tlb_start && paddr < io_tlb_end;
 }
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 03/10] swiotlb: do not panic on mapping failures
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
  2018-10-08  8:02 ` [PATCH 01/10] swiotlb: remove a pointless comment Christoph Hellwig
  2018-10-08  8:02 ` [PATCH 02/10] swiotlb: mark is_swiotlb_buffer static Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-11 18:06   ` Robin Murphy
  2018-10-19  0:17   ` Konrad Rzeszutek Wilk
  2018-10-08  8:02 ` [PATCH 04/10] swiotlb: remove the overflow buffer Christoph Hellwig
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

All properly written drivers now have error handling in the
dma_map_single / dma_map_page callers.  As swiotlb_tbl_map_single already
prints a useful warning when running out of swiotlb pool swace we can
also remove swiotlb_full entirely as it serves no purpose now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/swiotlb.c | 33 +--------------------------------
 1 file changed, 1 insertion(+), 32 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 26d3af52956f..69bf305ee5f8 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -761,34 +761,6 @@ static bool swiotlb_free_buffer(struct device *dev, size_t size,
 	return true;
 }
 
-static void
-swiotlb_full(struct device *dev, size_t size, enum dma_data_direction dir,
-	     int do_panic)
-{
-	if (swiotlb_force == SWIOTLB_NO_FORCE)
-		return;
-
-	/*
-	 * Ran out of IOMMU space for this operation. This is very bad.
-	 * Unfortunately the drivers cannot handle this operation properly.
-	 * unless they check for dma_mapping_error (most don't)
-	 * When the mapping is small enough return a static buffer to limit
-	 * the damage, or panic when the transfer is too big.
-	 */
-	dev_err_ratelimited(dev, "DMA: Out of SW-IOMMU space for %zu bytes\n",
-			    size);
-
-	if (size <= io_tlb_overflow || !do_panic)
-		return;
-
-	if (dir == DMA_BIDIRECTIONAL)
-		panic("DMA: Random memory could be DMA accessed\n");
-	if (dir == DMA_FROM_DEVICE)
-		panic("DMA: Random memory could be DMA written\n");
-	if (dir == DMA_TO_DEVICE)
-		panic("DMA: Random memory could be DMA read\n");
-}
-
 /*
  * Map a single buffer of the indicated size for DMA in streaming mode.  The
  * physical address to use is returned.
@@ -817,10 +789,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 
 	/* Oh well, have to allocate and map a bounce buffer. */
 	map = map_single(dev, phys, size, dir, attrs);
-	if (map == SWIOTLB_MAP_ERROR) {
-		swiotlb_full(dev, size, dir, 1);
+	if (map == SWIOTLB_MAP_ERROR)
 		return __phys_to_dma(dev, io_tlb_overflow_buffer);
-	}
 
 	dev_addr = __phys_to_dma(dev, map);
 
@@ -948,7 +918,6 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
 			if (map == SWIOTLB_MAP_ERROR) {
 				/* Don't panic here, we expect map_sg users
 				   to do proper error handling. */
-				swiotlb_full(hwdev, sg->length, dir, 0);
 				attrs |= DMA_ATTR_SKIP_CPU_SYNC;
 				swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
 						       attrs);
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 04/10] swiotlb: remove the overflow buffer
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2018-10-08  8:02 ` [PATCH 03/10] swiotlb: do not panic on mapping failures Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-11 18:19   ` Robin Murphy
                     ` (2 more replies)
  2018-10-08  8:02 ` [PATCH 05/10] swiotlb: merge swiotlb_unmap_page and unmap_single Christoph Hellwig
                   ` (5 subsequent siblings)
  9 siblings, 3 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

Like all other dma mapping drivers just return an error code instead
of an actual memory buffer.  The reason for the overflow buffer was
that at the time swiotlb was invented there was no way to check for
dma mapping errors, but this has long been fixed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/mm/dma-mapping.c       |  2 +-
 arch/powerpc/kernel/dma-swiotlb.c |  4 +--
 include/linux/dma-direct.h        |  2 ++
 include/linux/swiotlb.h           |  3 --
 kernel/dma/direct.c               |  2 --
 kernel/dma/swiotlb.c              | 59 ++-----------------------------
 6 files changed, 8 insertions(+), 64 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 072c51fb07d7..8d91b927e09e 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -324,7 +324,7 @@ static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
 static int __swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t addr)
 {
 	if (swiotlb)
-		return swiotlb_dma_mapping_error(hwdev, addr);
+		return dma_direct_mapping_error(hwdev, addr);
 	return 0;
 }
 
diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index 88f3963ca30f..5fc335f4d9cd 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -11,7 +11,7 @@
  *
  */
 
-#include <linux/dma-mapping.h>
+#include <linux/dma-direct.h>
 #include <linux/memblock.h>
 #include <linux/pfn.h>
 #include <linux/of_platform.h>
@@ -59,7 +59,7 @@ const struct dma_map_ops powerpc_swiotlb_dma_ops = {
 	.sync_single_for_device = swiotlb_sync_single_for_device,
 	.sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
 	.sync_sg_for_device = swiotlb_sync_sg_for_device,
-	.mapping_error = swiotlb_dma_mapping_error,
+	.mapping_error = dma_direct_mapping_error,
 	.get_required_mask = swiotlb_powerpc_get_required,
 };
 
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index fbca184ff5a0..bd73e7a91410 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -5,6 +5,8 @@
 #include <linux/dma-mapping.h>
 #include <linux/mem_encrypt.h>
 
+#define DIRECT_MAPPING_ERROR		0
+
 #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
 #include <asm/dma-direct.h>
 #else
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 7ef541ce8f34..f847c1b265c4 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -106,9 +106,6 @@ extern void
 swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
 			   int nelems, enum dma_data_direction dir);
 
-extern int
-swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr);
-
 extern int
 swiotlb_dma_supported(struct device *hwdev, u64 mask);
 
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 674a8da22844..12798abba55e 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -14,8 +14,6 @@
 #include <linux/pfn.h>
 #include <linux/set_memory.h>
 
-#define DIRECT_MAPPING_ERROR		0
-
 /*
  * Most architectures use ZONE_DMA for the first 16 Megabytes, but
  * some use it for entirely different regions:
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 69bf305ee5f8..11dbcd80b4a6 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -72,13 +72,6 @@ static phys_addr_t io_tlb_start, io_tlb_end;
  */
 static unsigned long io_tlb_nslabs;
 
-/*
- * When the IOMMU overflows we return a fallback buffer. This sets the size.
- */
-static unsigned long io_tlb_overflow = 32*1024;
-
-static phys_addr_t io_tlb_overflow_buffer;
-
 /*
  * This is a free list describing the number of free entries available from
  * each index
@@ -126,7 +119,6 @@ setup_io_tlb_npages(char *str)
 	return 0;
 }
 early_param("swiotlb", setup_io_tlb_npages);
-/* make io_tlb_overflow tunable too? */
 
 unsigned long swiotlb_nr_tbl(void)
 {
@@ -194,16 +186,10 @@ void __init swiotlb_update_mem_attributes(void)
 	bytes = PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT);
 	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
 	memset(vaddr, 0, bytes);
-
-	vaddr = phys_to_virt(io_tlb_overflow_buffer);
-	bytes = PAGE_ALIGN(io_tlb_overflow);
-	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
-	memset(vaddr, 0, bytes);
 }
 
 int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
 {
-	void *v_overflow_buffer;
 	unsigned long i, bytes;
 
 	bytes = nslabs << IO_TLB_SHIFT;
@@ -212,17 +198,6 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
 	io_tlb_start = __pa(tlb);
 	io_tlb_end = io_tlb_start + bytes;
 
-	/*
-	 * Get the overflow emergency buffer
-	 */
-	v_overflow_buffer = memblock_virt_alloc_low_nopanic(
-						PAGE_ALIGN(io_tlb_overflow),
-						PAGE_SIZE);
-	if (!v_overflow_buffer)
-		return -ENOMEM;
-
-	io_tlb_overflow_buffer = __pa(v_overflow_buffer);
-
 	/*
 	 * Allocate and initialize the free list array.  This array is used
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
@@ -330,7 +305,6 @@ int
 swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 {
 	unsigned long i, bytes;
-	unsigned char *v_overflow_buffer;
 
 	bytes = nslabs << IO_TLB_SHIFT;
 
@@ -341,19 +315,6 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 	set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
 	memset(tlb, 0, bytes);
 
-	/*
-	 * Get the overflow emergency buffer
-	 */
-	v_overflow_buffer = (void *)__get_free_pages(GFP_DMA,
-						     get_order(io_tlb_overflow));
-	if (!v_overflow_buffer)
-		goto cleanup2;
-
-	set_memory_decrypted((unsigned long)v_overflow_buffer,
-			io_tlb_overflow >> PAGE_SHIFT);
-	memset(v_overflow_buffer, 0, io_tlb_overflow);
-	io_tlb_overflow_buffer = virt_to_phys(v_overflow_buffer);
-
 	/*
 	 * Allocate and initialize the free list array.  This array is used
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
@@ -390,10 +351,6 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 	                                                 sizeof(int)));
 	io_tlb_list = NULL;
 cleanup3:
-	free_pages((unsigned long)v_overflow_buffer,
-		   get_order(io_tlb_overflow));
-	io_tlb_overflow_buffer = 0;
-cleanup2:
 	io_tlb_end = 0;
 	io_tlb_start = 0;
 	io_tlb_nslabs = 0;
@@ -407,8 +364,6 @@ void __init swiotlb_exit(void)
 		return;
 
 	if (late_alloc) {
-		free_pages((unsigned long)phys_to_virt(io_tlb_overflow_buffer),
-			   get_order(io_tlb_overflow));
 		free_pages((unsigned long)io_tlb_orig_addr,
 			   get_order(io_tlb_nslabs * sizeof(phys_addr_t)));
 		free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
@@ -416,8 +371,6 @@ void __init swiotlb_exit(void)
 		free_pages((unsigned long)phys_to_virt(io_tlb_start),
 			   get_order(io_tlb_nslabs << IO_TLB_SHIFT));
 	} else {
-		memblock_free_late(io_tlb_overflow_buffer,
-				   PAGE_ALIGN(io_tlb_overflow));
 		memblock_free_late(__pa(io_tlb_orig_addr),
 				   PAGE_ALIGN(io_tlb_nslabs * sizeof(phys_addr_t)));
 		memblock_free_late(__pa(io_tlb_list),
@@ -790,7 +743,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 	/* Oh well, have to allocate and map a bounce buffer. */
 	map = map_single(dev, phys, size, dir, attrs);
 	if (map == SWIOTLB_MAP_ERROR)
-		return __phys_to_dma(dev, io_tlb_overflow_buffer);
+		return DIRECT_MAPPING_ERROR;
 
 	dev_addr = __phys_to_dma(dev, map);
 
@@ -801,7 +754,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 	attrs |= DMA_ATTR_SKIP_CPU_SYNC;
 	swiotlb_tbl_unmap_single(dev, map, size, dir, attrs);
 
-	return __phys_to_dma(dev, io_tlb_overflow_buffer);
+	return DIRECT_MAPPING_ERROR;
 }
 
 /*
@@ -985,12 +938,6 @@ swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
 	swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE);
 }
 
-int
-swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
-{
-	return (dma_addr == __phys_to_dma(hwdev, io_tlb_overflow_buffer));
-}
-
 /*
  * Return whether the given device DMA address mask can be supported
  * properly.  For example, if your device can only drive the low 24-bits
@@ -1033,7 +980,7 @@ void swiotlb_free(struct device *dev, size_t size, void *vaddr,
 }
 
 const struct dma_map_ops swiotlb_dma_ops = {
-	.mapping_error		= swiotlb_dma_mapping_error,
+	.mapping_error		= dma_direct_mapping_error,
 	.alloc			= swiotlb_alloc,
 	.free			= swiotlb_free,
 	.sync_single_for_cpu	= swiotlb_sync_single_for_cpu,
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 05/10] swiotlb: merge swiotlb_unmap_page and unmap_single
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2018-10-08  8:02 ` [PATCH 04/10] swiotlb: remove the overflow buffer Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-18 17:44   ` Robin Murphy
  2018-10-19  0:25   ` Konrad Rzeszutek Wilk
  2018-10-08  8:02 ` [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs Christoph Hellwig
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/swiotlb.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 11dbcd80b4a6..15335f3a1bf3 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -765,9 +765,9 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
  * After this call, reads by the cpu to the buffer are guaranteed to see
  * whatever the device wrote there.
  */
-static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
-			 size_t size, enum dma_data_direction dir,
-			 unsigned long attrs)
+void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
+			size_t size, enum dma_data_direction dir,
+			unsigned long attrs)
 {
 	phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);
 
@@ -790,13 +790,6 @@ static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
 	dma_mark_clean(phys_to_virt(paddr), size);
 }
 
-void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
-			size_t size, enum dma_data_direction dir,
-			unsigned long attrs)
-{
-	unmap_single(hwdev, dev_addr, size, dir, attrs);
-}
-
 /*
  * Make physical memory consistent for a single streaming mode DMA translation
  * after a transfer.
@@ -900,7 +893,7 @@ swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
 	BUG_ON(dir == DMA_NONE);
 
 	for_each_sg(sgl, sg, nelems, i)
-		unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir,
+		swiotlb_unmap_page(hwdev, sg->dma_address, sg_dma_len(sg), dir,
 			     attrs);
 }
 
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2018-10-08  8:02 ` [PATCH 05/10] swiotlb: merge swiotlb_unmap_page and unmap_single Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-18 17:53   ` Robin Murphy
                     ` (2 more replies)
  2018-10-08  8:02 ` [PATCH 07/10] swiotlb: refactor swiotlb_map_page Christoph Hellwig
                   ` (3 subsequent siblings)
  9 siblings, 3 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

No need to duplicate the code - map_sg is equivalent to map_page
for each page in the scatterlist.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/swiotlb.c | 34 ++++++++++++----------------------
 1 file changed, 12 insertions(+), 22 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 15335f3a1bf3..15755d7a5242 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -845,37 +845,27 @@ swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
  * same here.
  */
 int
-swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
+swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl, int nelems,
 		     enum dma_data_direction dir, unsigned long attrs)
 {
 	struct scatterlist *sg;
 	int i;
 
-	BUG_ON(dir == DMA_NONE);
-
 	for_each_sg(sgl, sg, nelems, i) {
-		phys_addr_t paddr = sg_phys(sg);
-		dma_addr_t dev_addr = phys_to_dma(hwdev, paddr);
-
-		if (swiotlb_force == SWIOTLB_FORCE ||
-		    !dma_capable(hwdev, dev_addr, sg->length)) {
-			phys_addr_t map = map_single(hwdev, sg_phys(sg),
-						     sg->length, dir, attrs);
-			if (map == SWIOTLB_MAP_ERROR) {
-				/* Don't panic here, we expect map_sg users
-				   to do proper error handling. */
-				attrs |= DMA_ATTR_SKIP_CPU_SYNC;
-				swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
-						       attrs);
-				sg_dma_len(sgl) = 0;
-				return 0;
-			}
-			sg->dma_address = __phys_to_dma(hwdev, map);
-		} else
-			sg->dma_address = dev_addr;
+		sg->dma_address = swiotlb_map_page(dev, sg_page(sg), sg->offset,
+				sg->length, dir, attrs);
+		if (sg->dma_address == DIRECT_MAPPING_ERROR)
+			goto out_error;
 		sg_dma_len(sg) = sg->length;
 	}
+
 	return nelems;
+
+out_error:
+	swiotlb_unmap_sg_attrs(dev, sgl, i, dir,
+			attrs | DMA_ATTR_SKIP_CPU_SYNC);
+	sg_dma_len(sgl) = 0;
+	return 0;
 }
 
 /*
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 07/10] swiotlb: refactor swiotlb_map_page
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2018-10-08  8:02 ` [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-18 18:09   ` Robin Murphy
  2018-10-08  8:02 ` [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations Christoph Hellwig
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

Remove the somewhat useless map_single function, and replace it with a
swiotlb_bounce_page handler that handles everything related to actually
bouncing a page.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/swiotlb.c | 77 +++++++++++++++++++++-----------------------
 1 file changed, 36 insertions(+), 41 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 15755d7a5242..4d7a4d85d71e 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -543,26 +543,6 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 	return tlb_addr;
 }
 
-/*
- * Allocates bounce buffer and returns its physical address.
- */
-static phys_addr_t
-map_single(struct device *hwdev, phys_addr_t phys, size_t size,
-	   enum dma_data_direction dir, unsigned long attrs)
-{
-	dma_addr_t start_dma_addr;
-
-	if (swiotlb_force == SWIOTLB_NO_FORCE) {
-		dev_warn_ratelimited(hwdev, "Cannot do DMA to address %pa\n",
-				     &phys);
-		return SWIOTLB_MAP_ERROR;
-	}
-
-	start_dma_addr = __phys_to_dma(hwdev, io_tlb_start);
-	return swiotlb_tbl_map_single(hwdev, start_dma_addr, phys, size,
-				      dir, attrs);
-}
-
 /*
  * tlb_addr is the physical address of the bounce buffer to unmap.
  */
@@ -714,6 +694,34 @@ static bool swiotlb_free_buffer(struct device *dev, size_t size,
 	return true;
 }
 
+static dma_addr_t swiotlb_bounce_page(struct device *dev, phys_addr_t *phys,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+	dma_addr_t dma_addr;
+
+	if (unlikely(swiotlb_force == SWIOTLB_NO_FORCE)) {
+		dev_warn_ratelimited(dev,
+			"Cannot do DMA to address %pa\n", phys);
+		return DIRECT_MAPPING_ERROR;
+	}
+
+	/* Oh well, have to allocate and map a bounce buffer. */
+	*phys = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start),
+			*phys, size, dir, attrs);
+	if (*phys == SWIOTLB_MAP_ERROR)
+		return DIRECT_MAPPING_ERROR;
+
+	/* Ensure that the address returned is DMA'ble */
+	dma_addr = __phys_to_dma(dev, *phys);
+	if (unlikely(!dma_capable(dev, dma_addr, size))) {
+		swiotlb_tbl_unmap_single(dev, *phys, size, dir,
+			attrs | DMA_ATTR_SKIP_CPU_SYNC);
+		return DIRECT_MAPPING_ERROR;
+	}
+
+	return dma_addr;
+}
+
 /*
  * Map a single buffer of the indicated size for DMA in streaming mode.  The
  * physical address to use is returned.
@@ -726,8 +734,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 			    enum dma_data_direction dir,
 			    unsigned long attrs)
 {
-	phys_addr_t map, phys = page_to_phys(page) + offset;
-	dma_addr_t dev_addr = phys_to_dma(dev, phys);
+	phys_addr_t phys = page_to_phys(page) + offset;
+	dma_addr_t dma_addr = phys_to_dma(dev, phys);
 
 	BUG_ON(dir == DMA_NONE);
 	/*
@@ -735,26 +743,13 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 	 * we can safely return the device addr and not worry about bounce
 	 * buffering it.
 	 */
-	if (dma_capable(dev, dev_addr, size) && swiotlb_force != SWIOTLB_FORCE)
-		return dev_addr;
-
-	trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
-
-	/* Oh well, have to allocate and map a bounce buffer. */
-	map = map_single(dev, phys, size, dir, attrs);
-	if (map == SWIOTLB_MAP_ERROR)
-		return DIRECT_MAPPING_ERROR;
-
-	dev_addr = __phys_to_dma(dev, map);
-
-	/* Ensure that the address returned is DMA'ble */
-	if (dma_capable(dev, dev_addr, size))
-		return dev_addr;
-
-	attrs |= DMA_ATTR_SKIP_CPU_SYNC;
-	swiotlb_tbl_unmap_single(dev, map, size, dir, attrs);
+	if (!dma_capable(dev, dma_addr, size) ||
+	    swiotlb_force == SWIOTLB_FORCE) {
+		trace_swiotlb_bounced(dev, dma_addr, size, swiotlb_force);
+		dma_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
+	}
 
-	return DIRECT_MAPPING_ERROR;
+	return dma_addr;
 }
 
 /*
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2018-10-08  8:02 ` [PATCH 07/10] swiotlb: refactor swiotlb_map_page Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-12 17:04   ` Catalin Marinas
                     ` (2 more replies)
  2018-10-08  8:02 ` [PATCH 09/10] swiotlb: add support for non-coherent DMA Christoph Hellwig
  2018-10-08  8:02 ` [PATCH 10/10] arm64: use the generic swiotlb_dma_ops Christoph Hellwig
  9 siblings, 3 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

All architectures that support swiotlb also have a zone that backs up
these less than full addressing allocations (usually ZONE_DMA32).

Because of that it is rather pointless to fall back to the global swiotlb
buffer if the normal dma direct allocation failed - the only thing this
will do is to eat up bounce buffers that would be more useful to serve
streaming mappings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/mm/dma-mapping.c |   6 +--
 include/linux/swiotlb.h     |   5 --
 kernel/dma/swiotlb.c        | 105 +-----------------------------------
 3 files changed, 5 insertions(+), 111 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 8d91b927e09e..eee6cfcfde9e 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -112,7 +112,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
 		return addr;
 	}
 
-	ptr = swiotlb_alloc(dev, size, dma_handle, flags, attrs);
+	ptr = dma_direct_alloc_pages(dev, size, dma_handle, flags, attrs);
 	if (!ptr)
 		goto no_mem;
 
@@ -133,7 +133,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
 	return coherent_ptr;
 
 no_map:
-	swiotlb_free(dev, size, ptr, *dma_handle, attrs);
+	dma_direct_free_pages(dev, size, ptr, *dma_handle, attrs);
 no_mem:
 	return NULL;
 }
@@ -151,7 +151,7 @@ static void __dma_free(struct device *dev, size_t size,
 			return;
 		vunmap(vaddr);
 	}
-	swiotlb_free(dev, size, swiotlb_addr, dma_handle, attrs);
+	dma_direct_free_pages(dev, size, swiotlb_addr, dma_handle, attrs);
 }
 
 static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index f847c1b265c4..a387b59640a4 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -67,11 +67,6 @@ extern void swiotlb_tbl_sync_single(struct device *hwdev,
 
 /* Accessory functions. */
 
-void *swiotlb_alloc(struct device *hwdev, size_t size, dma_addr_t *dma_handle,
-		gfp_t flags, unsigned long attrs);
-void swiotlb_free(struct device *dev, size_t size, void *vaddr,
-		dma_addr_t dma_addr, unsigned long attrs);
-
 extern dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 				   unsigned long offset, size_t size,
 				   enum dma_data_direction dir,
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 4d7a4d85d71e..475a41eff3dc 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -622,78 +622,6 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 	}
 }
 
-static inline bool dma_coherent_ok(struct device *dev, dma_addr_t addr,
-		size_t size)
-{
-	u64 mask = DMA_BIT_MASK(32);
-
-	if (dev && dev->coherent_dma_mask)
-		mask = dev->coherent_dma_mask;
-	return addr + size - 1 <= mask;
-}
-
-static void *
-swiotlb_alloc_buffer(struct device *dev, size_t size, dma_addr_t *dma_handle,
-		unsigned long attrs)
-{
-	phys_addr_t phys_addr;
-
-	if (swiotlb_force == SWIOTLB_NO_FORCE)
-		goto out_warn;
-
-	phys_addr = swiotlb_tbl_map_single(dev,
-			__phys_to_dma(dev, io_tlb_start),
-			0, size, DMA_FROM_DEVICE, attrs);
-	if (phys_addr == SWIOTLB_MAP_ERROR)
-		goto out_warn;
-
-	*dma_handle = __phys_to_dma(dev, phys_addr);
-	if (!dma_coherent_ok(dev, *dma_handle, size))
-		goto out_unmap;
-
-	memset(phys_to_virt(phys_addr), 0, size);
-	return phys_to_virt(phys_addr);
-
-out_unmap:
-	dev_warn(dev, "hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
-		(unsigned long long)dev->coherent_dma_mask,
-		(unsigned long long)*dma_handle);
-
-	/*
-	 * DMA_TO_DEVICE to avoid memcpy in unmap_single.
-	 * DMA_ATTR_SKIP_CPU_SYNC is optional.
-	 */
-	swiotlb_tbl_unmap_single(dev, phys_addr, size, DMA_TO_DEVICE,
-			DMA_ATTR_SKIP_CPU_SYNC);
-out_warn:
-	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit()) {
-		dev_warn(dev,
-			"swiotlb: coherent allocation failed, size=%zu\n",
-			size);
-		dump_stack();
-	}
-	return NULL;
-}
-
-static bool swiotlb_free_buffer(struct device *dev, size_t size,
-		dma_addr_t dma_addr)
-{
-	phys_addr_t phys_addr = dma_to_phys(dev, dma_addr);
-
-	WARN_ON_ONCE(irqs_disabled());
-
-	if (!is_swiotlb_buffer(phys_addr))
-		return false;
-
-	/*
-	 * DMA_TO_DEVICE to avoid memcpy in swiotlb_tbl_unmap_single.
-	 * DMA_ATTR_SKIP_CPU_SYNC is optional.
-	 */
-	swiotlb_tbl_unmap_single(dev, phys_addr, size, DMA_TO_DEVICE,
-				 DMA_ATTR_SKIP_CPU_SYNC);
-	return true;
-}
-
 static dma_addr_t swiotlb_bounce_page(struct device *dev, phys_addr_t *phys,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
@@ -928,39 +856,10 @@ swiotlb_dma_supported(struct device *hwdev, u64 mask)
 	return __phys_to_dma(hwdev, io_tlb_end - 1) <= mask;
 }
 
-void *swiotlb_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
-		gfp_t gfp, unsigned long attrs)
-{
-	void *vaddr;
-
-	/* temporary workaround: */
-	if (gfp & __GFP_NOWARN)
-		attrs |= DMA_ATTR_NO_WARN;
-
-	/*
-	 * Don't print a warning when the first allocation attempt fails.
-	 * swiotlb_alloc_coherent() will print a warning when the DMA memory
-	 * allocation ultimately failed.
-	 */
-	gfp |= __GFP_NOWARN;
-
-	vaddr = dma_direct_alloc(dev, size, dma_handle, gfp, attrs);
-	if (!vaddr)
-		vaddr = swiotlb_alloc_buffer(dev, size, dma_handle, attrs);
-	return vaddr;
-}
-
-void swiotlb_free(struct device *dev, size_t size, void *vaddr,
-		dma_addr_t dma_addr, unsigned long attrs)
-{
-	if (!swiotlb_free_buffer(dev, size, dma_addr))
-		dma_direct_free(dev, size, vaddr, dma_addr, attrs);
-}
-
 const struct dma_map_ops swiotlb_dma_ops = {
 	.mapping_error		= dma_direct_mapping_error,
-	.alloc			= swiotlb_alloc,
-	.free			= swiotlb_free,
+	.alloc			= dma_direct_alloc,
+	.free			= dma_direct_free,
 	.sync_single_for_cpu	= swiotlb_sync_single_for_cpu,
 	.sync_single_for_device	= swiotlb_sync_single_for_device,
 	.sync_sg_for_cpu	= swiotlb_sync_sg_for_cpu,
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 09/10] swiotlb: add support for non-coherent DMA
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2018-10-08  8:02 ` [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-19  0:49   ` Konrad Rzeszutek Wilk
  2018-10-22 17:11   ` Robin Murphy
  2018-10-08  8:02 ` [PATCH 10/10] arm64: use the generic swiotlb_dma_ops Christoph Hellwig
  9 siblings, 2 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

Handle architectures that are not cache coherent directly in the main
swiotlb code by calling arch_sync_dma_for_{device,cpu} in all the right
places from the various dma_map/unmap/sync methods when the device is
non-coherent.

Because swiotlb now uses dma_direct_alloc for the coherent allocation
that side is already taken care of by the dma-direct code calling into
arch_dma_{alloc,free} for devices that are non-coherent.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/swiotlb.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 475a41eff3dc..52885b274368 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -21,6 +21,7 @@
 
 #include <linux/cache.h>
 #include <linux/dma-direct.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/mm.h>
 #include <linux/export.h>
 #include <linux/spinlock.h>
@@ -677,6 +678,10 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 		dma_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
 	}
 
+	if (!dev_is_dma_coherent(dev) &&
+	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
+		arch_sync_dma_for_device(dev, phys, size, dir);
+
 	return dma_addr;
 }
 
@@ -696,6 +701,10 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 
 	BUG_ON(dir == DMA_NONE);
 
+	if (!dev_is_dma_coherent(hwdev) &&
+	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
+		arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
+
 	if (is_swiotlb_buffer(paddr)) {
 		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
 		return;
@@ -732,15 +741,17 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
 
 	BUG_ON(dir == DMA_NONE);
 
-	if (is_swiotlb_buffer(paddr)) {
+	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_CPU)
+		arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
+
+	if (is_swiotlb_buffer(paddr))
 		swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
-		return;
-	}
 
-	if (dir != DMA_FROM_DEVICE)
-		return;
+	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_DEVICE)
+		arch_sync_dma_for_device(hwdev, paddr, size, dir);
 
-	dma_mark_clean(phys_to_virt(paddr), size);
+	if (!is_swiotlb_buffer(paddr) && dir == DMA_FROM_DEVICE)
+		dma_mark_clean(phys_to_virt(paddr), size);
 }
 
 void
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 10/10] arm64: use the generic swiotlb_dma_ops
  2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2018-10-08  8:02 ` [PATCH 09/10] swiotlb: add support for non-coherent DMA Christoph Hellwig
@ 2018-10-08  8:02 ` Christoph Hellwig
  2018-10-12 13:01   ` Robin Murphy
  2018-10-22 17:52   ` Robin Murphy
  9 siblings, 2 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-08  8:02 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

Now that the generic swiotlb code supports non-coherent DMA we can switch
to it for arm64.  For that we need to refactor the existing
alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
and implement the standard arch_sync_dma_for_{device,cpu} hooks for
cache maintaincance in the streaming dma hooks, which also implies
using the generic dma_coherent flag in struct device.

Note that we need to keep the old is_device_dma_coherent function around
for now, so that the shared arm/arm64 Xen code keeps working.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/Kconfig                   |   4 +
 arch/arm64/include/asm/device.h      |   1 -
 arch/arm64/include/asm/dma-mapping.h |   7 +-
 arch/arm64/mm/dma-mapping.c          | 257 +++++----------------------
 4 files changed, 56 insertions(+), 213 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1b1a0e95c751..c4db5131d837 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -11,6 +11,8 @@ config ARM64
 	select ARCH_CLOCKSOURCE_DATA
 	select ARCH_HAS_DEBUG_VIRTUAL
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
+	select ARCH_HAS_DMA_COHERENT_TO_PFN
+	select ARCH_HAS_DMA_MMAP_PGPROT
 	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FAST_MULTIPLIER
@@ -24,6 +26,8 @@ config ARM64
 	select ARCH_HAS_SG_CHAIN
 	select ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_HAS_STRICT_MODULE_RWX
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
+	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYSCALL_WRAPPER
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 5a5fa47a6b18..3dd3d664c5c5 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -23,7 +23,6 @@ struct dev_archdata {
 #ifdef CONFIG_XEN
 	const struct dma_map_ops *dev_dma_ops;
 #endif
-	bool dma_coherent;
 };
 
 struct pdev_archdata {
diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index b7847eb8a7bb..c41f3fb1446c 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -44,10 +44,13 @@ void arch_teardown_dma_ops(struct device *dev);
 #define arch_teardown_dma_ops	arch_teardown_dma_ops
 #endif
 
-/* do not use this function in a driver */
+/*
+ * Do not use this function in a driver, it is only provided for
+ * arch/arm/mm/xen.c, which is used by arm64 as well.
+ */
 static inline bool is_device_dma_coherent(struct device *dev)
 {
-	return dev->archdata.dma_coherent;
+	return dev->dma_coherent;
 }
 
 #endif	/* __KERNEL__ */
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index eee6cfcfde9e..3c75d69b54e7 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -25,6 +25,7 @@
 #include <linux/slab.h>
 #include <linux/genalloc.h>
 #include <linux/dma-direct.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/dma-contiguous.h>
 #include <linux/vmalloc.h>
 #include <linux/swiotlb.h>
@@ -32,16 +33,6 @@
 
 #include <asm/cacheflush.h>
 
-static int swiotlb __ro_after_init;
-
-static pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot,
-				 bool coherent)
-{
-	if (!coherent || (attrs & DMA_ATTR_WRITE_COMBINE))
-		return pgprot_writecombine(prot);
-	return prot;
-}
-
 static struct gen_pool *atomic_pool __ro_after_init;
 
 #define DEFAULT_DMA_COHERENT_POOL_SIZE  SZ_256K
@@ -91,18 +82,16 @@ static int __free_from_pool(void *start, size_t size)
 	return 1;
 }
 
-static void *__dma_alloc(struct device *dev, size_t size,
-			 dma_addr_t *dma_handle, gfp_t flags,
-			 unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+		gfp_t flags, unsigned long attrs)
 {
 	struct page *page;
 	void *ptr, *coherent_ptr;
-	bool coherent = is_device_dma_coherent(dev);
-	pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, false);
+	pgprot_t prot = pgprot_writecombine(PAGE_KERNEL);
 
 	size = PAGE_ALIGN(size);
 
-	if (!coherent && !gfpflags_allow_blocking(flags)) {
+	if (!gfpflags_allow_blocking(flags)) {
 		struct page *page = NULL;
 		void *addr = __alloc_from_pool(size, &page, flags);
 
@@ -116,10 +105,6 @@ static void *__dma_alloc(struct device *dev, size_t size,
 	if (!ptr)
 		goto no_mem;
 
-	/* no need for non-cacheable mapping if coherent */
-	if (coherent)
-		return ptr;
-
 	/* remove any dirty cache lines on the kernel alias */
 	__dma_flush_area(ptr, size);
 
@@ -138,125 +123,50 @@ static void *__dma_alloc(struct device *dev, size_t size,
 	return NULL;
 }
 
-static void __dma_free(struct device *dev, size_t size,
-		       void *vaddr, dma_addr_t dma_handle,
-		       unsigned long attrs)
-{
-	void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle));
-
-	size = PAGE_ALIGN(size);
-
-	if (!is_device_dma_coherent(dev)) {
-		if (__free_from_pool(vaddr, size))
-			return;
-		vunmap(vaddr);
-	}
-	dma_direct_free_pages(dev, size, swiotlb_addr, dma_handle, attrs);
-}
-
-static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
-				     unsigned long offset, size_t size,
-				     enum dma_data_direction dir,
-				     unsigned long attrs)
-{
-	dma_addr_t dev_addr;
-
-	dev_addr = swiotlb_map_page(dev, page, offset, size, dir, attrs);
-	if (!is_device_dma_coherent(dev) &&
-	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		__dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
-
-	return dev_addr;
-}
-
-
-static void __swiotlb_unmap_page(struct device *dev, dma_addr_t dev_addr,
-				 size_t size, enum dma_data_direction dir,
-				 unsigned long attrs)
+void arch_dma_free(struct device *dev, size_t size, void *vaddr,
+		dma_addr_t dma_handle, unsigned long attrs)
 {
-	if (!is_device_dma_coherent(dev) &&
-	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		__dma_unmap_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
-	swiotlb_unmap_page(dev, dev_addr, size, dir, attrs);
+	if (__free_from_pool(vaddr, PAGE_ALIGN(size)))
+		return;
+	vunmap(vaddr);
+	dma_direct_free_pages(dev, size, vaddr, dma_handle, attrs);
 }
 
-static int __swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
-				  int nelems, enum dma_data_direction dir,
-				  unsigned long attrs)
+long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,
+		dma_addr_t dma_addr)
 {
-	struct scatterlist *sg;
-	int i, ret;
-
-	ret = swiotlb_map_sg_attrs(dev, sgl, nelems, dir, attrs);
-	if (!is_device_dma_coherent(dev) &&
-	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		for_each_sg(sgl, sg, ret, i)
-			__dma_map_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
-				       sg->length, dir);
-
-	return ret;
+	return __phys_to_pfn(dma_to_phys(dev, dma_addr));
 }
 
-static void __swiotlb_unmap_sg_attrs(struct device *dev,
-				     struct scatterlist *sgl, int nelems,
-				     enum dma_data_direction dir,
-				     unsigned long attrs)
+pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
+		unsigned long attrs)
 {
-	struct scatterlist *sg;
-	int i;
-
-	if (!is_device_dma_coherent(dev) &&
-	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		for_each_sg(sgl, sg, nelems, i)
-			__dma_unmap_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
-					 sg->length, dir);
-	swiotlb_unmap_sg_attrs(dev, sgl, nelems, dir, attrs);
+	if (!dev_is_dma_coherent(dev) || (attrs & DMA_ATTR_WRITE_COMBINE))
+		return pgprot_writecombine(prot);
+	return prot;
 }
 
-static void __swiotlb_sync_single_for_cpu(struct device *dev,
-					  dma_addr_t dev_addr, size_t size,
-					  enum dma_data_direction dir)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	if (!is_device_dma_coherent(dev))
-		__dma_unmap_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
-	swiotlb_sync_single_for_cpu(dev, dev_addr, size, dir);
+	__dma_map_area(phys_to_virt(paddr), size, dir);
 }
 
-static void __swiotlb_sync_single_for_device(struct device *dev,
-					     dma_addr_t dev_addr, size_t size,
-					     enum dma_data_direction dir)
+void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	swiotlb_sync_single_for_device(dev, dev_addr, size, dir);
-	if (!is_device_dma_coherent(dev))
-		__dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
+	__dma_unmap_area(phys_to_virt(paddr), size, dir);
 }
 
-static void __swiotlb_sync_sg_for_cpu(struct device *dev,
-				      struct scatterlist *sgl, int nelems,
-				      enum dma_data_direction dir)
+static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
+				      struct page *page, size_t size)
 {
-	struct scatterlist *sg;
-	int i;
-
-	if (!is_device_dma_coherent(dev))
-		for_each_sg(sgl, sg, nelems, i)
-			__dma_unmap_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
-					 sg->length, dir);
-	swiotlb_sync_sg_for_cpu(dev, sgl, nelems, dir);
-}
+	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
 
-static void __swiotlb_sync_sg_for_device(struct device *dev,
-					 struct scatterlist *sgl, int nelems,
-					 enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int i;
+	if (!ret)
+		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
 
-	swiotlb_sync_sg_for_device(dev, sgl, nelems, dir);
-	if (!is_device_dma_coherent(dev))
-		for_each_sg(sgl, sg, nelems, i)
-			__dma_map_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
-				       sg->length, dir);
+	return ret;
 }
 
 static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
@@ -277,74 +187,6 @@ static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
 	return ret;
 }
 
-static int __swiotlb_mmap(struct device *dev,
-			  struct vm_area_struct *vma,
-			  void *cpu_addr, dma_addr_t dma_addr, size_t size,
-			  unsigned long attrs)
-{
-	int ret;
-	unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT;
-
-	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot,
-					     is_device_dma_coherent(dev));
-
-	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
-		return ret;
-
-	return __swiotlb_mmap_pfn(vma, pfn, size);
-}
-
-static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
-				      struct page *page, size_t size)
-{
-	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-
-	if (!ret)
-		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-
-	return ret;
-}
-
-static int __swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
-				 void *cpu_addr, dma_addr_t handle, size_t size,
-				 unsigned long attrs)
-{
-	struct page *page = phys_to_page(dma_to_phys(dev, handle));
-
-	return __swiotlb_get_sgtable_page(sgt, page, size);
-}
-
-static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
-{
-	if (swiotlb)
-		return swiotlb_dma_supported(hwdev, mask);
-	return 1;
-}
-
-static int __swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t addr)
-{
-	if (swiotlb)
-		return dma_direct_mapping_error(hwdev, addr);
-	return 0;
-}
-
-static const struct dma_map_ops arm64_swiotlb_dma_ops = {
-	.alloc = __dma_alloc,
-	.free = __dma_free,
-	.mmap = __swiotlb_mmap,
-	.get_sgtable = __swiotlb_get_sgtable,
-	.map_page = __swiotlb_map_page,
-	.unmap_page = __swiotlb_unmap_page,
-	.map_sg = __swiotlb_map_sg_attrs,
-	.unmap_sg = __swiotlb_unmap_sg_attrs,
-	.sync_single_for_cpu = __swiotlb_sync_single_for_cpu,
-	.sync_single_for_device = __swiotlb_sync_single_for_device,
-	.sync_sg_for_cpu = __swiotlb_sync_sg_for_cpu,
-	.sync_sg_for_device = __swiotlb_sync_sg_for_device,
-	.dma_supported = __swiotlb_dma_supported,
-	.mapping_error = __swiotlb_dma_mapping_error,
-};
-
 static int __init atomic_pool_init(void)
 {
 	pgprot_t prot = __pgprot(PROT_NORMAL_NC);
@@ -500,10 +342,6 @@ EXPORT_SYMBOL(dummy_dma_ops);
 
 static int __init arm64_dma_init(void)
 {
-	if (swiotlb_force == SWIOTLB_FORCE ||
-	    max_pfn > (arm64_dma_phys_limit >> PAGE_SHIFT))
-		swiotlb = 1;
-
 	WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
 		   TAINT_CPU_OUT_OF_SPEC,
 		   "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
@@ -528,7 +366,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 				 dma_addr_t *handle, gfp_t gfp,
 				 unsigned long attrs)
 {
-	bool coherent = is_device_dma_coherent(dev);
+	bool coherent = dev_is_dma_coherent(dev);
 	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
 	size_t iosize = size;
 	void *addr;
@@ -569,7 +407,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 			addr = NULL;
 		}
 	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
+		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
 		struct page *page;
 
 		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
@@ -596,7 +434,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 						    size >> PAGE_SHIFT);
 		}
 	} else {
-		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
+		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
 		struct page **pages;
 
 		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
@@ -658,8 +496,7 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
 	struct vm_struct *area;
 	int ret;
 
-	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot,
-					     is_device_dma_coherent(dev));
+	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
 
 	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
 		return ret;
@@ -709,11 +546,11 @@ static void __iommu_sync_single_for_cpu(struct device *dev,
 {
 	phys_addr_t phys;
 
-	if (is_device_dma_coherent(dev))
+	if (dev_is_dma_coherent(dev))
 		return;
 
 	phys = iommu_iova_to_phys(iommu_get_domain_for_dev(dev), dev_addr);
-	__dma_unmap_area(phys_to_virt(phys), size, dir);
+	arch_sync_dma_for_cpu(dev, phys, size, dir);
 }
 
 static void __iommu_sync_single_for_device(struct device *dev,
@@ -722,11 +559,11 @@ static void __iommu_sync_single_for_device(struct device *dev,
 {
 	phys_addr_t phys;
 
-	if (is_device_dma_coherent(dev))
+	if (dev_is_dma_coherent(dev))
 		return;
 
 	phys = iommu_iova_to_phys(iommu_get_domain_for_dev(dev), dev_addr);
-	__dma_map_area(phys_to_virt(phys), size, dir);
+	arch_sync_dma_for_device(dev, phys, size, dir);
 }
 
 static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
@@ -734,7 +571,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
 				   enum dma_data_direction dir,
 				   unsigned long attrs)
 {
-	bool coherent = is_device_dma_coherent(dev);
+	bool coherent = dev_is_dma_coherent(dev);
 	int prot = dma_info_to_prot(dir, coherent, attrs);
 	dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
 
@@ -762,11 +599,11 @@ static void __iommu_sync_sg_for_cpu(struct device *dev,
 	struct scatterlist *sg;
 	int i;
 
-	if (is_device_dma_coherent(dev))
+	if (dev_is_dma_coherent(dev))
 		return;
 
 	for_each_sg(sgl, sg, nelems, i)
-		__dma_unmap_area(sg_virt(sg), sg->length, dir);
+		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
 }
 
 static void __iommu_sync_sg_for_device(struct device *dev,
@@ -776,18 +613,18 @@ static void __iommu_sync_sg_for_device(struct device *dev,
 	struct scatterlist *sg;
 	int i;
 
-	if (is_device_dma_coherent(dev))
+	if (dev_is_dma_coherent(dev))
 		return;
 
 	for_each_sg(sgl, sg, nelems, i)
-		__dma_map_area(sg_virt(sg), sg->length, dir);
+		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
 }
 
 static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
 				int nelems, enum dma_data_direction dir,
 				unsigned long attrs)
 {
-	bool coherent = is_device_dma_coherent(dev);
+	bool coherent = dev_is_dma_coherent(dev);
 
 	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
 		__iommu_sync_sg_for_device(dev, sgl, nelems, dir);
@@ -879,9 +716,9 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 			const struct iommu_ops *iommu, bool coherent)
 {
 	if (!dev->dma_ops)
-		dev->dma_ops = &arm64_swiotlb_dma_ops;
+		dev->dma_ops = &swiotlb_dma_ops;
 
-	dev->archdata.dma_coherent = coherent;
+	dev->dma_coherent = coherent;
 	__iommu_setup_dma_ops(dev, dma_base, size, iommu);
 
 #ifdef CONFIG_XEN
-- 
2.19.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 01/10] swiotlb: remove a pointless comment
  2018-10-08  8:02 ` [PATCH 01/10] swiotlb: remove a pointless comment Christoph Hellwig
@ 2018-10-11 17:49   ` Robin Murphy
  2018-10-19  0:09   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 57+ messages in thread
From: Robin Murphy @ 2018-10-11 17:49 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/18 09:02, Christoph Hellwig wrote:
> This comments describes an aspect of the map_sg interface that isn't
> even exploited by swiotlb.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   kernel/dma/swiotlb.c | 6 ------
>   1 file changed, 6 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 4f8a6dbf0b60..9062b14bc7f4 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -925,12 +925,6 @@ swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
>    * appropriate dma address and length.  They are obtained via
>    * sg_dma_{address,length}(SG).
>    *
> - * NOTE: An implementation may be able to use a smaller number of
> - *       DMA address/length pairs than there are SG table elements.
> - *       (for example via virtual mapping capabilities)
> - *       The routine returns the number of addr/length pairs actually
> - *       used, at most nents.
> - *
>    * Device ownership issues as mentioned above for swiotlb_map_page are the
>    * same here.
>    */
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/10] swiotlb: mark is_swiotlb_buffer static
  2018-10-08  8:02 ` [PATCH 02/10] swiotlb: mark is_swiotlb_buffer static Christoph Hellwig
@ 2018-10-11 17:54   ` Robin Murphy
  2018-10-19  0:12   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 57+ messages in thread
From: Robin Murphy @ 2018-10-11 17:54 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/18 09:02, Christoph Hellwig wrote:

I agree there isn't really a good reason for external code to ever be 
poking at this, despite it being helpful for arch code trying to hack 
around awful coherency issues ;)

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   include/linux/swiotlb.h | 1 -
>   kernel/dma/swiotlb.c    | 2 +-
>   2 files changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 965be92c33b5..7ef541ce8f34 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -121,7 +121,6 @@ static inline unsigned int swiotlb_max_segment(void) { return 0; }
>   #endif
>   
>   extern void swiotlb_print_info(void);
> -extern int is_swiotlb_buffer(phys_addr_t paddr);
>   extern void swiotlb_set_max_segment(unsigned int);
>   
>   extern const struct dma_map_ops swiotlb_dma_ops;
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 9062b14bc7f4..26d3af52956f 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -429,7 +429,7 @@ void __init swiotlb_exit(void)
>   	max_segment = 0;
>   }
>   
> -int is_swiotlb_buffer(phys_addr_t paddr)
> +static int is_swiotlb_buffer(phys_addr_t paddr)
>   {
>   	return paddr >= io_tlb_start && paddr < io_tlb_end;
>   }
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 03/10] swiotlb: do not panic on mapping failures
  2018-10-08  8:02 ` [PATCH 03/10] swiotlb: do not panic on mapping failures Christoph Hellwig
@ 2018-10-11 18:06   ` Robin Murphy
  2018-10-19  0:18     ` Konrad Rzeszutek Wilk
  2018-10-19  0:17   ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 57+ messages in thread
From: Robin Murphy @ 2018-10-11 18:06 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/18 09:02, Christoph Hellwig wrote:
> All properly written drivers now have error handling in the
> dma_map_single / dma_map_page callers.  As swiotlb_tbl_map_single already
> prints a useful warning when running out of swiotlb pool swace we can
> also remove swiotlb_full entirely as it serves no purpose now.

$ git grep -l 'dma_map_\(page\|single\)' drivers/ | wc -l
385
$ git grep -l dma_mapping_error drivers/ | wc -l
396

Close enough, to first approximation :D

I agree the other (non-fatal) warning seems sufficient, and frankly the 
panic can be a bit of a pain for development sometimes.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   kernel/dma/swiotlb.c | 33 +--------------------------------
>   1 file changed, 1 insertion(+), 32 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 26d3af52956f..69bf305ee5f8 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -761,34 +761,6 @@ static bool swiotlb_free_buffer(struct device *dev, size_t size,
>   	return true;
>   }
>   
> -static void
> -swiotlb_full(struct device *dev, size_t size, enum dma_data_direction dir,
> -	     int do_panic)
> -{
> -	if (swiotlb_force == SWIOTLB_NO_FORCE)
> -		return;
> -
> -	/*
> -	 * Ran out of IOMMU space for this operation. This is very bad.
> -	 * Unfortunately the drivers cannot handle this operation properly.
> -	 * unless they check for dma_mapping_error (most don't)
> -	 * When the mapping is small enough return a static buffer to limit
> -	 * the damage, or panic when the transfer is too big.
> -	 */
> -	dev_err_ratelimited(dev, "DMA: Out of SW-IOMMU space for %zu bytes\n",
> -			    size);
> -
> -	if (size <= io_tlb_overflow || !do_panic)
> -		return;
> -
> -	if (dir == DMA_BIDIRECTIONAL)
> -		panic("DMA: Random memory could be DMA accessed\n");
> -	if (dir == DMA_FROM_DEVICE)
> -		panic("DMA: Random memory could be DMA written\n");
> -	if (dir == DMA_TO_DEVICE)
> -		panic("DMA: Random memory could be DMA read\n");
> -}
> -
>   /*
>    * Map a single buffer of the indicated size for DMA in streaming mode.  The
>    * physical address to use is returned.
> @@ -817,10 +789,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>   
>   	/* Oh well, have to allocate and map a bounce buffer. */
>   	map = map_single(dev, phys, size, dir, attrs);
> -	if (map == SWIOTLB_MAP_ERROR) {
> -		swiotlb_full(dev, size, dir, 1);
> +	if (map == SWIOTLB_MAP_ERROR)
>   		return __phys_to_dma(dev, io_tlb_overflow_buffer);
> -	}
>   
>   	dev_addr = __phys_to_dma(dev, map);
>   
> @@ -948,7 +918,6 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
>   			if (map == SWIOTLB_MAP_ERROR) {
>   				/* Don't panic here, we expect map_sg users
>   				   to do proper error handling. */
> -				swiotlb_full(hwdev, sg->length, dir, 0);
>   				attrs |= DMA_ATTR_SKIP_CPU_SYNC;
>   				swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
>   						       attrs);
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 04/10] swiotlb: remove the overflow buffer
  2018-10-08  8:02 ` [PATCH 04/10] swiotlb: remove the overflow buffer Christoph Hellwig
@ 2018-10-11 18:19   ` Robin Murphy
  2018-10-12 17:04   ` Catalin Marinas
  2018-10-19  0:23   ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 57+ messages in thread
From: Robin Murphy @ 2018-10-11 18:19 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/18 09:02, Christoph Hellwig wrote:
> Like all other dma mapping drivers just return an error code instead
> of an actual memory buffer.  The reason for the overflow buffer was
> that at the time swiotlb was invented there was no way to check for
> dma mapping errors, but this has long been fixed.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/arm64/mm/dma-mapping.c       |  2 +-
>   arch/powerpc/kernel/dma-swiotlb.c |  4 +--
>   include/linux/dma-direct.h        |  2 ++
>   include/linux/swiotlb.h           |  3 --
>   kernel/dma/direct.c               |  2 --
>   kernel/dma/swiotlb.c              | 59 ++-----------------------------
>   6 files changed, 8 insertions(+), 64 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 072c51fb07d7..8d91b927e09e 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -324,7 +324,7 @@ static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
>   static int __swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t addr)
>   {
>   	if (swiotlb)

 From a bit of archaeology, I think io_tlb_overflow_buffer being 
uninitialised was the reason this needed fixing in adbe7e26f425. Given 
that, you could as well just nuke this whole helper and install 
dma_direct_mapping_error in the ops like the other users, although I 
appreciate it's only really a bisection thing so may not matter much.

With the caveat that it's late and I may technically have missed 
something, but have at least found the branch to boot-test OK:

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> -		return swiotlb_dma_mapping_error(hwdev, addr);
> +		return dma_direct_mapping_error(hwdev, addr);
>   	return 0;
>   }
>   
> diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
> index 88f3963ca30f..5fc335f4d9cd 100644
> --- a/arch/powerpc/kernel/dma-swiotlb.c
> +++ b/arch/powerpc/kernel/dma-swiotlb.c
> @@ -11,7 +11,7 @@
>    *
>    */
>   
> -#include <linux/dma-mapping.h>
> +#include <linux/dma-direct.h>
>   #include <linux/memblock.h>
>   #include <linux/pfn.h>
>   #include <linux/of_platform.h>
> @@ -59,7 +59,7 @@ const struct dma_map_ops powerpc_swiotlb_dma_ops = {
>   	.sync_single_for_device = swiotlb_sync_single_for_device,
>   	.sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
>   	.sync_sg_for_device = swiotlb_sync_sg_for_device,
> -	.mapping_error = swiotlb_dma_mapping_error,
> +	.mapping_error = dma_direct_mapping_error,
>   	.get_required_mask = swiotlb_powerpc_get_required,
>   };
>   
> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index fbca184ff5a0..bd73e7a91410 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -5,6 +5,8 @@
>   #include <linux/dma-mapping.h>
>   #include <linux/mem_encrypt.h>
>   
> +#define DIRECT_MAPPING_ERROR		0
> +
>   #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
>   #include <asm/dma-direct.h>
>   #else
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 7ef541ce8f34..f847c1b265c4 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -106,9 +106,6 @@ extern void
>   swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
>   			   int nelems, enum dma_data_direction dir);
>   
> -extern int
> -swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr);
> -
>   extern int
>   swiotlb_dma_supported(struct device *hwdev, u64 mask);
>   
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 674a8da22844..12798abba55e 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -14,8 +14,6 @@
>   #include <linux/pfn.h>
>   #include <linux/set_memory.h>
>   
> -#define DIRECT_MAPPING_ERROR		0
> -
>   /*
>    * Most architectures use ZONE_DMA for the first 16 Megabytes, but
>    * some use it for entirely different regions:
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 69bf305ee5f8..11dbcd80b4a6 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -72,13 +72,6 @@ static phys_addr_t io_tlb_start, io_tlb_end;
>    */
>   static unsigned long io_tlb_nslabs;
>   
> -/*
> - * When the IOMMU overflows we return a fallback buffer. This sets the size.
> - */
> -static unsigned long io_tlb_overflow = 32*1024;
> -
> -static phys_addr_t io_tlb_overflow_buffer;
> -
>   /*
>    * This is a free list describing the number of free entries available from
>    * each index
> @@ -126,7 +119,6 @@ setup_io_tlb_npages(char *str)
>   	return 0;
>   }
>   early_param("swiotlb", setup_io_tlb_npages);
> -/* make io_tlb_overflow tunable too? */
>   
>   unsigned long swiotlb_nr_tbl(void)
>   {
> @@ -194,16 +186,10 @@ void __init swiotlb_update_mem_attributes(void)
>   	bytes = PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT);
>   	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
>   	memset(vaddr, 0, bytes);
> -
> -	vaddr = phys_to_virt(io_tlb_overflow_buffer);
> -	bytes = PAGE_ALIGN(io_tlb_overflow);
> -	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
> -	memset(vaddr, 0, bytes);
>   }
>   
>   int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
>   {
> -	void *v_overflow_buffer;
>   	unsigned long i, bytes;
>   
>   	bytes = nslabs << IO_TLB_SHIFT;
> @@ -212,17 +198,6 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
>   	io_tlb_start = __pa(tlb);
>   	io_tlb_end = io_tlb_start + bytes;
>   
> -	/*
> -	 * Get the overflow emergency buffer
> -	 */
> -	v_overflow_buffer = memblock_virt_alloc_low_nopanic(
> -						PAGE_ALIGN(io_tlb_overflow),
> -						PAGE_SIZE);
> -	if (!v_overflow_buffer)
> -		return -ENOMEM;
> -
> -	io_tlb_overflow_buffer = __pa(v_overflow_buffer);
> -
>   	/*
>   	 * Allocate and initialize the free list array.  This array is used
>   	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
> @@ -330,7 +305,6 @@ int
>   swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
>   {
>   	unsigned long i, bytes;
> -	unsigned char *v_overflow_buffer;
>   
>   	bytes = nslabs << IO_TLB_SHIFT;
>   
> @@ -341,19 +315,6 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
>   	set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
>   	memset(tlb, 0, bytes);
>   
> -	/*
> -	 * Get the overflow emergency buffer
> -	 */
> -	v_overflow_buffer = (void *)__get_free_pages(GFP_DMA,
> -						     get_order(io_tlb_overflow));
> -	if (!v_overflow_buffer)
> -		goto cleanup2;
> -
> -	set_memory_decrypted((unsigned long)v_overflow_buffer,
> -			io_tlb_overflow >> PAGE_SHIFT);
> -	memset(v_overflow_buffer, 0, io_tlb_overflow);
> -	io_tlb_overflow_buffer = virt_to_phys(v_overflow_buffer);
> -
>   	/*
>   	 * Allocate and initialize the free list array.  This array is used
>   	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
> @@ -390,10 +351,6 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
>   	                                                 sizeof(int)));
>   	io_tlb_list = NULL;
>   cleanup3:
> -	free_pages((unsigned long)v_overflow_buffer,
> -		   get_order(io_tlb_overflow));
> -	io_tlb_overflow_buffer = 0;
> -cleanup2:
>   	io_tlb_end = 0;
>   	io_tlb_start = 0;
>   	io_tlb_nslabs = 0;
> @@ -407,8 +364,6 @@ void __init swiotlb_exit(void)
>   		return;
>   
>   	if (late_alloc) {
> -		free_pages((unsigned long)phys_to_virt(io_tlb_overflow_buffer),
> -			   get_order(io_tlb_overflow));
>   		free_pages((unsigned long)io_tlb_orig_addr,
>   			   get_order(io_tlb_nslabs * sizeof(phys_addr_t)));
>   		free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
> @@ -416,8 +371,6 @@ void __init swiotlb_exit(void)
>   		free_pages((unsigned long)phys_to_virt(io_tlb_start),
>   			   get_order(io_tlb_nslabs << IO_TLB_SHIFT));
>   	} else {
> -		memblock_free_late(io_tlb_overflow_buffer,
> -				   PAGE_ALIGN(io_tlb_overflow));
>   		memblock_free_late(__pa(io_tlb_orig_addr),
>   				   PAGE_ALIGN(io_tlb_nslabs * sizeof(phys_addr_t)));
>   		memblock_free_late(__pa(io_tlb_list),
> @@ -790,7 +743,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>   	/* Oh well, have to allocate and map a bounce buffer. */
>   	map = map_single(dev, phys, size, dir, attrs);
>   	if (map == SWIOTLB_MAP_ERROR)
> -		return __phys_to_dma(dev, io_tlb_overflow_buffer);
> +		return DIRECT_MAPPING_ERROR;
>   
>   	dev_addr = __phys_to_dma(dev, map);
>   
> @@ -801,7 +754,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>   	attrs |= DMA_ATTR_SKIP_CPU_SYNC;
>   	swiotlb_tbl_unmap_single(dev, map, size, dir, attrs);
>   
> -	return __phys_to_dma(dev, io_tlb_overflow_buffer);
> +	return DIRECT_MAPPING_ERROR;
>   }
>   
>   /*
> @@ -985,12 +938,6 @@ swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
>   	swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE);
>   }
>   
> -int
> -swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
> -{
> -	return (dma_addr == __phys_to_dma(hwdev, io_tlb_overflow_buffer));
> -}
> -
>   /*
>    * Return whether the given device DMA address mask can be supported
>    * properly.  For example, if your device can only drive the low 24-bits
> @@ -1033,7 +980,7 @@ void swiotlb_free(struct device *dev, size_t size, void *vaddr,
>   }
>   
>   const struct dma_map_ops swiotlb_dma_ops = {
> -	.mapping_error		= swiotlb_dma_mapping_error,
> +	.mapping_error		= dma_direct_mapping_error,
>   	.alloc			= swiotlb_alloc,
>   	.free			= swiotlb_free,
>   	.sync_single_for_cpu	= swiotlb_sync_single_for_cpu,
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 10/10] arm64: use the generic swiotlb_dma_ops
  2018-10-08  8:02 ` [PATCH 10/10] arm64: use the generic swiotlb_dma_ops Christoph Hellwig
@ 2018-10-12 13:01   ` Robin Murphy
  2018-10-12 14:40     ` Christoph Hellwig
  2018-10-22 17:52   ` Robin Murphy
  1 sibling, 1 reply; 57+ messages in thread
From: Robin Murphy @ 2018-10-12 13:01 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/18 09:02, Christoph Hellwig wrote:
> Now that the generic swiotlb code supports non-coherent DMA we can switch
> to it for arm64.  For that we need to refactor the existing
> alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
> and implement the standard arch_sync_dma_for_{device,cpu} hooks for
> cache maintaincance in the streaming dma hooks, which also implies
> using the generic dma_coherent flag in struct device.
> 
> Note that we need to keep the old is_device_dma_coherent function around
> for now, so that the shared arm/arm64 Xen code keeps working.

OK, so when I said last night that it boot-tested OK, that much was 
true, but then I shut the board down as I left and got a megasplosion of 
bad page state BUGs, e.g.:

[ 3312.848182] BUG: Bad page state in process llvmpipe-5  pfn:8a78e
[ 3312.854138] page:ffff7e000029e380 count:2 mapcount:1 
mapping:ffff800034019268 index:0xd3
[ 3312.862170] flags: 0x10028(uptodate|lru|mappedtodisk)
[ 3312.867176] raw: 0000000000010028 ffff7e0000dc5508 ffff7e0000c7bb48 
ffff800034019268
[ 3312.874883] raw: 00000000000000d3 0000000000000000 0000000200000000 
ffff800036b91000
[ 3312.882608] page dumped because: page still charged to cgroup
[ 3312.888352] page->mem_cgroup:ffff800036b91000
[ 3312.892714] bad because of flags: 0x20(lru)
[ 3312.896959] Modules linked in:
[ 3312.900042] CPU: 0 PID: 283 Comm: llvmpipe-5 Tainted: G    B 
    4.19.0-rc3+ #801
[ 3312.908136] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Jul 10 2018
[ 3312.918810] Call trace:
[ 3312.921236]  dump_backtrace+0x0/0x1f0
[ 3312.924860]  show_stack+0x14/0x20
[ 3312.928142]  dump_stack+0x9c/0xbc
[ 3312.931422]  bad_page+0xe8/0x150
[ 3312.934615]  free_pages_check_bad+0x9c/0xa8
[ 3312.938754]  __free_pages_ok+0x27c/0x288
[ 3312.942635]  __free_pages+0x30/0x48
[ 3312.946086]  free_pages.part.22+0x1c/0x28
[ 3312.950053]  free_pages+0x14/0x20
[ 3312.953333]  dma_direct_free_pages+0x5c/0x68
[ 3312.957560]  arch_dma_free+0x5c/0x70
[ 3312.961095]  dma_direct_free+0x20/0x28
[ 3312.964805]  drm_gem_cma_free_object+0xb0/0x150
[ 3312.969290]  drm_gem_object_free+0x1c/0x58
[ 3312.973343]  drm_gem_object_put_unlocked+0x70/0x98
[ 3312.978084]  drm_gem_object_handle_put_unlocked+0x64/0xc0
[ 3312.983426]  drm_gem_object_release_handle+0x54/0x90
[ 3312.988339]  idr_for_each+0x70/0x130
[ 3312.991876]  drm_gem_release+0x28/0x40
[ 3312.995584]  drm_file_free.part.0+0x214/0x288
[ 3312.999895]  drm_release+0x94/0x118
[ 3313.003347]  __fput+0x8c/0x1c0
[ 3313.006368]  ____fput+0xc/0x18
[ 3313.009390]  task_work_run+0x98/0xb8
[ 3313.012927]  do_exit+0x2b8/0x970
[ 3313.016119]  do_group_exit+0x38/0xa0
[ 3313.019656]  get_signal+0xfc/0x4b0
[ 3313.023021]  do_signal+0x1a0/0x2a8
[ 3313.026386]  do_notify_resume+0xe8/0x128
[ 3313.030267]  work_pending+0x8/0x10
...
[ 3313.972527] BUG: Bad page state in process llvmpipe-5  pfn:8a794
[ 3313.978605] page:ffff7e000029e500 count:0 mapcount:-128 
mapping:0000000000000000 index:0x1
[ 3313.986845] flags: 0x0()
[ 3313.989393] raw: 0000000000000000 ffff7e0000185908 ffff7e0000933b08 
0000000000000000
[ 3313.997116] raw: 0000000000000001 0000000000000002 00000000ffffff7f 
0000000000000000
[ 3314.004824] page dumped because: nonzero mapcount
[ 3314.009523] Modules linked in:
[ 3314.012623] CPU: 0 PID: 283 Comm: llvmpipe-5 Tainted: G    B 
    4.19.0-rc3+ #801
[ 3314.020718] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Jul 10 2018
[ 3314.031390] Call trace:
[ 3314.033810]  dump_backtrace+0x0/0x1f0
[ 3314.037434]  show_stack+0x14/0x20
[ 3314.040713]  dump_stack+0x9c/0xbc
[ 3314.043993]  bad_page+0xe8/0x150
[ 3314.047186]  free_pages_check_bad+0x70/0xa8
[ 3314.051325]  __free_pages_ok+0x27c/0x288
[ 3314.055205]  __free_pages+0x30/0x48
[ 3314.058656]  free_pages.part.22+0x1c/0x28
[ 3314.062623]  free_pages+0x14/0x20
[ 3314.065902]  dma_direct_free_pages+0x5c/0x68
[ 3314.070127]  arch_dma_free+0x5c/0x70
[ 3314.073663]  dma_direct_free+0x20/0x28
[ 3314.077371]  drm_gem_cma_free_object+0xb0/0x150
[ 3314.081854]  drm_gem_object_free+0x1c/0x58
[ 3314.085906]  drm_gem_object_put_unlocked+0x70/0x98
[ 3314.090647]  drm_gem_object_handle_put_unlocked+0x64/0xc0
[ 3314.095990]  drm_gem_object_release_handle+0x54/0x90
[ 3314.100902]  idr_for_each+0x70/0x130
[ 3314.104439]  drm_gem_release+0x28/0x40
[ 3314.108148]  drm_file_free.part.0+0x214/0x288
[ 3314.112458]  drm_release+0x94/0x118
[ 3314.115909]  __fput+0x8c/0x1c0
[ 3314.118930]  ____fput+0xc/0x18
[ 3314.121950]  task_work_run+0x98/0xb8
[ 3314.125486]  do_exit+0x2b8/0x970
[ 3314.128678]  do_group_exit+0x38/0xa0
[ 3314.132214]  get_signal+0xfc/0x4b0
[ 3314.135579]  do_signal+0x1a0/0x2a8
[ 3314.138945]  do_notify_resume+0xe8/0x128
[ 3314.142825]  work_pending+0x8/0x10
...

and hundreds more with various different reasons.

AFAICS there's something about coherent allocations causing page 
corruptions (it doesn't seem triggerable without the DRM framebuffer 
being allocated, at least with the method I've been using). Predictably, 
though somewhat unhelpfully, it bisects to this commit, so the apparent 
alloc_coherent connection is as far as I can narrow it down for the moment.

Robin.

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/arm64/Kconfig                   |   4 +
>   arch/arm64/include/asm/device.h      |   1 -
>   arch/arm64/include/asm/dma-mapping.h |   7 +-
>   arch/arm64/mm/dma-mapping.c          | 257 +++++----------------------
>   4 files changed, 56 insertions(+), 213 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 1b1a0e95c751..c4db5131d837 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -11,6 +11,8 @@ config ARM64
>   	select ARCH_CLOCKSOURCE_DATA
>   	select ARCH_HAS_DEBUG_VIRTUAL
>   	select ARCH_HAS_DEVMEM_IS_ALLOWED
> +	select ARCH_HAS_DMA_COHERENT_TO_PFN
> +	select ARCH_HAS_DMA_MMAP_PGPROT
>   	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
>   	select ARCH_HAS_ELF_RANDOMIZE
>   	select ARCH_HAS_FAST_MULTIPLIER
> @@ -24,6 +26,8 @@ config ARM64
>   	select ARCH_HAS_SG_CHAIN
>   	select ARCH_HAS_STRICT_KERNEL_RWX
>   	select ARCH_HAS_STRICT_MODULE_RWX
> +	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
> +	select ARCH_HAS_SYNC_DMA_FOR_CPU
>   	select ARCH_HAS_SYSCALL_WRAPPER
>   	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
>   	select ARCH_HAVE_NMI_SAFE_CMPXCHG
> diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
> index 5a5fa47a6b18..3dd3d664c5c5 100644
> --- a/arch/arm64/include/asm/device.h
> +++ b/arch/arm64/include/asm/device.h
> @@ -23,7 +23,6 @@ struct dev_archdata {
>   #ifdef CONFIG_XEN
>   	const struct dma_map_ops *dev_dma_ops;
>   #endif
> -	bool dma_coherent;
>   };
>   
>   struct pdev_archdata {
> diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
> index b7847eb8a7bb..c41f3fb1446c 100644
> --- a/arch/arm64/include/asm/dma-mapping.h
> +++ b/arch/arm64/include/asm/dma-mapping.h
> @@ -44,10 +44,13 @@ void arch_teardown_dma_ops(struct device *dev);
>   #define arch_teardown_dma_ops	arch_teardown_dma_ops
>   #endif
>   
> -/* do not use this function in a driver */
> +/*
> + * Do not use this function in a driver, it is only provided for
> + * arch/arm/mm/xen.c, which is used by arm64 as well.
> + */
>   static inline bool is_device_dma_coherent(struct device *dev)
>   {
> -	return dev->archdata.dma_coherent;
> +	return dev->dma_coherent;
>   }
>   
>   #endif	/* __KERNEL__ */
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index eee6cfcfde9e..3c75d69b54e7 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -25,6 +25,7 @@
>   #include <linux/slab.h>
>   #include <linux/genalloc.h>
>   #include <linux/dma-direct.h>
> +#include <linux/dma-noncoherent.h>
>   #include <linux/dma-contiguous.h>
>   #include <linux/vmalloc.h>
>   #include <linux/swiotlb.h>
> @@ -32,16 +33,6 @@
>   
>   #include <asm/cacheflush.h>
>   
> -static int swiotlb __ro_after_init;
> -
> -static pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot,
> -				 bool coherent)
> -{
> -	if (!coherent || (attrs & DMA_ATTR_WRITE_COMBINE))
> -		return pgprot_writecombine(prot);
> -	return prot;
> -}
> -
>   static struct gen_pool *atomic_pool __ro_after_init;
>   
>   #define DEFAULT_DMA_COHERENT_POOL_SIZE  SZ_256K
> @@ -91,18 +82,16 @@ static int __free_from_pool(void *start, size_t size)
>   	return 1;
>   }
>   
> -static void *__dma_alloc(struct device *dev, size_t size,
> -			 dma_addr_t *dma_handle, gfp_t flags,
> -			 unsigned long attrs)
> +void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> +		gfp_t flags, unsigned long attrs)
>   {
>   	struct page *page;
>   	void *ptr, *coherent_ptr;
> -	bool coherent = is_device_dma_coherent(dev);
> -	pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, false);
> +	pgprot_t prot = pgprot_writecombine(PAGE_KERNEL);
>   
>   	size = PAGE_ALIGN(size);
>   
> -	if (!coherent && !gfpflags_allow_blocking(flags)) {
> +	if (!gfpflags_allow_blocking(flags)) {
>   		struct page *page = NULL;
>   		void *addr = __alloc_from_pool(size, &page, flags);
>   
> @@ -116,10 +105,6 @@ static void *__dma_alloc(struct device *dev, size_t size,
>   	if (!ptr)
>   		goto no_mem;
>   
> -	/* no need for non-cacheable mapping if coherent */
> -	if (coherent)
> -		return ptr;
> -
>   	/* remove any dirty cache lines on the kernel alias */
>   	__dma_flush_area(ptr, size);
>   
> @@ -138,125 +123,50 @@ static void *__dma_alloc(struct device *dev, size_t size,
>   	return NULL;
>   }
>   
> -static void __dma_free(struct device *dev, size_t size,
> -		       void *vaddr, dma_addr_t dma_handle,
> -		       unsigned long attrs)
> -{
> -	void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle));
> -
> -	size = PAGE_ALIGN(size);
> -
> -	if (!is_device_dma_coherent(dev)) {
> -		if (__free_from_pool(vaddr, size))
> -			return;
> -		vunmap(vaddr);
> -	}
> -	dma_direct_free_pages(dev, size, swiotlb_addr, dma_handle, attrs);
> -}
> -
> -static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
> -				     unsigned long offset, size_t size,
> -				     enum dma_data_direction dir,
> -				     unsigned long attrs)
> -{
> -	dma_addr_t dev_addr;
> -
> -	dev_addr = swiotlb_map_page(dev, page, offset, size, dir, attrs);
> -	if (!is_device_dma_coherent(dev) &&
> -	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		__dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> -
> -	return dev_addr;
> -}
> -
> -
> -static void __swiotlb_unmap_page(struct device *dev, dma_addr_t dev_addr,
> -				 size_t size, enum dma_data_direction dir,
> -				 unsigned long attrs)
> +void arch_dma_free(struct device *dev, size_t size, void *vaddr,
> +		dma_addr_t dma_handle, unsigned long attrs)
>   {
> -	if (!is_device_dma_coherent(dev) &&
> -	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		__dma_unmap_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> -	swiotlb_unmap_page(dev, dev_addr, size, dir, attrs);
> +	if (__free_from_pool(vaddr, PAGE_ALIGN(size)))
> +		return;
> +	vunmap(vaddr);
> +	dma_direct_free_pages(dev, size, vaddr, dma_handle, attrs);
>   }
>   
> -static int __swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
> -				  int nelems, enum dma_data_direction dir,
> -				  unsigned long attrs)
> +long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,
> +		dma_addr_t dma_addr)
>   {
> -	struct scatterlist *sg;
> -	int i, ret;
> -
> -	ret = swiotlb_map_sg_attrs(dev, sgl, nelems, dir, attrs);
> -	if (!is_device_dma_coherent(dev) &&
> -	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		for_each_sg(sgl, sg, ret, i)
> -			__dma_map_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
> -				       sg->length, dir);
> -
> -	return ret;
> +	return __phys_to_pfn(dma_to_phys(dev, dma_addr));
>   }
>   
> -static void __swiotlb_unmap_sg_attrs(struct device *dev,
> -				     struct scatterlist *sgl, int nelems,
> -				     enum dma_data_direction dir,
> -				     unsigned long attrs)
> +pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
> +		unsigned long attrs)
>   {
> -	struct scatterlist *sg;
> -	int i;
> -
> -	if (!is_device_dma_coherent(dev) &&
> -	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		for_each_sg(sgl, sg, nelems, i)
> -			__dma_unmap_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
> -					 sg->length, dir);
> -	swiotlb_unmap_sg_attrs(dev, sgl, nelems, dir, attrs);
> +	if (!dev_is_dma_coherent(dev) || (attrs & DMA_ATTR_WRITE_COMBINE))
> +		return pgprot_writecombine(prot);
> +	return prot;
>   }
>   
> -static void __swiotlb_sync_single_for_cpu(struct device *dev,
> -					  dma_addr_t dev_addr, size_t size,
> -					  enum dma_data_direction dir)
> +void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
> +		size_t size, enum dma_data_direction dir)
>   {
> -	if (!is_device_dma_coherent(dev))
> -		__dma_unmap_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> -	swiotlb_sync_single_for_cpu(dev, dev_addr, size, dir);
> +	__dma_map_area(phys_to_virt(paddr), size, dir);
>   }
>   
> -static void __swiotlb_sync_single_for_device(struct device *dev,
> -					     dma_addr_t dev_addr, size_t size,
> -					     enum dma_data_direction dir)
> +void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
> +		size_t size, enum dma_data_direction dir)
>   {
> -	swiotlb_sync_single_for_device(dev, dev_addr, size, dir);
> -	if (!is_device_dma_coherent(dev))
> -		__dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> +	__dma_unmap_area(phys_to_virt(paddr), size, dir);
>   }
>   
> -static void __swiotlb_sync_sg_for_cpu(struct device *dev,
> -				      struct scatterlist *sgl, int nelems,
> -				      enum dma_data_direction dir)
> +static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
> +				      struct page *page, size_t size)
>   {
> -	struct scatterlist *sg;
> -	int i;
> -
> -	if (!is_device_dma_coherent(dev))
> -		for_each_sg(sgl, sg, nelems, i)
> -			__dma_unmap_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
> -					 sg->length, dir);
> -	swiotlb_sync_sg_for_cpu(dev, sgl, nelems, dir);
> -}
> +	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
>   
> -static void __swiotlb_sync_sg_for_device(struct device *dev,
> -					 struct scatterlist *sgl, int nelems,
> -					 enum dma_data_direction dir)
> -{
> -	struct scatterlist *sg;
> -	int i;
> +	if (!ret)
> +		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
>   
> -	swiotlb_sync_sg_for_device(dev, sgl, nelems, dir);
> -	if (!is_device_dma_coherent(dev))
> -		for_each_sg(sgl, sg, nelems, i)
> -			__dma_map_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
> -				       sg->length, dir);
> +	return ret;
>   }
>   
>   static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
> @@ -277,74 +187,6 @@ static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
>   	return ret;
>   }
>   
> -static int __swiotlb_mmap(struct device *dev,
> -			  struct vm_area_struct *vma,
> -			  void *cpu_addr, dma_addr_t dma_addr, size_t size,
> -			  unsigned long attrs)
> -{
> -	int ret;
> -	unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT;
> -
> -	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot,
> -					     is_device_dma_coherent(dev));
> -
> -	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
> -		return ret;
> -
> -	return __swiotlb_mmap_pfn(vma, pfn, size);
> -}
> -
> -static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
> -				      struct page *page, size_t size)
> -{
> -	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> -
> -	if (!ret)
> -		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
> -
> -	return ret;
> -}
> -
> -static int __swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
> -				 void *cpu_addr, dma_addr_t handle, size_t size,
> -				 unsigned long attrs)
> -{
> -	struct page *page = phys_to_page(dma_to_phys(dev, handle));
> -
> -	return __swiotlb_get_sgtable_page(sgt, page, size);
> -}
> -
> -static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
> -{
> -	if (swiotlb)
> -		return swiotlb_dma_supported(hwdev, mask);
> -	return 1;
> -}
> -
> -static int __swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t addr)
> -{
> -	if (swiotlb)
> -		return dma_direct_mapping_error(hwdev, addr);
> -	return 0;
> -}
> -
> -static const struct dma_map_ops arm64_swiotlb_dma_ops = {
> -	.alloc = __dma_alloc,
> -	.free = __dma_free,
> -	.mmap = __swiotlb_mmap,
> -	.get_sgtable = __swiotlb_get_sgtable,
> -	.map_page = __swiotlb_map_page,
> -	.unmap_page = __swiotlb_unmap_page,
> -	.map_sg = __swiotlb_map_sg_attrs,
> -	.unmap_sg = __swiotlb_unmap_sg_attrs,
> -	.sync_single_for_cpu = __swiotlb_sync_single_for_cpu,
> -	.sync_single_for_device = __swiotlb_sync_single_for_device,
> -	.sync_sg_for_cpu = __swiotlb_sync_sg_for_cpu,
> -	.sync_sg_for_device = __swiotlb_sync_sg_for_device,
> -	.dma_supported = __swiotlb_dma_supported,
> -	.mapping_error = __swiotlb_dma_mapping_error,
> -};
> -
>   static int __init atomic_pool_init(void)
>   {
>   	pgprot_t prot = __pgprot(PROT_NORMAL_NC);
> @@ -500,10 +342,6 @@ EXPORT_SYMBOL(dummy_dma_ops);
>   
>   static int __init arm64_dma_init(void)
>   {
> -	if (swiotlb_force == SWIOTLB_FORCE ||
> -	    max_pfn > (arm64_dma_phys_limit >> PAGE_SHIFT))
> -		swiotlb = 1;
> -
>   	WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
>   		   TAINT_CPU_OUT_OF_SPEC,
>   		   "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
> @@ -528,7 +366,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>   				 dma_addr_t *handle, gfp_t gfp,
>   				 unsigned long attrs)
>   {
> -	bool coherent = is_device_dma_coherent(dev);
> +	bool coherent = dev_is_dma_coherent(dev);
>   	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
>   	size_t iosize = size;
>   	void *addr;
> @@ -569,7 +407,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>   			addr = NULL;
>   		}
>   	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> +		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
>   		struct page *page;
>   
>   		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
> @@ -596,7 +434,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>   						    size >> PAGE_SHIFT);
>   		}
>   	} else {
> -		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> +		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
>   		struct page **pages;
>   
>   		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> @@ -658,8 +496,7 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
>   	struct vm_struct *area;
>   	int ret;
>   
> -	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot,
> -					     is_device_dma_coherent(dev));
> +	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
>   
>   	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
>   		return ret;
> @@ -709,11 +546,11 @@ static void __iommu_sync_single_for_cpu(struct device *dev,
>   {
>   	phys_addr_t phys;
>   
> -	if (is_device_dma_coherent(dev))
> +	if (dev_is_dma_coherent(dev))
>   		return;
>   
>   	phys = iommu_iova_to_phys(iommu_get_domain_for_dev(dev), dev_addr);
> -	__dma_unmap_area(phys_to_virt(phys), size, dir);
> +	arch_sync_dma_for_cpu(dev, phys, size, dir);
>   }
>   
>   static void __iommu_sync_single_for_device(struct device *dev,
> @@ -722,11 +559,11 @@ static void __iommu_sync_single_for_device(struct device *dev,
>   {
>   	phys_addr_t phys;
>   
> -	if (is_device_dma_coherent(dev))
> +	if (dev_is_dma_coherent(dev))
>   		return;
>   
>   	phys = iommu_iova_to_phys(iommu_get_domain_for_dev(dev), dev_addr);
> -	__dma_map_area(phys_to_virt(phys), size, dir);
> +	arch_sync_dma_for_device(dev, phys, size, dir);
>   }
>   
>   static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
> @@ -734,7 +571,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
>   				   enum dma_data_direction dir,
>   				   unsigned long attrs)
>   {
> -	bool coherent = is_device_dma_coherent(dev);
> +	bool coherent = dev_is_dma_coherent(dev);
>   	int prot = dma_info_to_prot(dir, coherent, attrs);
>   	dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
>   
> @@ -762,11 +599,11 @@ static void __iommu_sync_sg_for_cpu(struct device *dev,
>   	struct scatterlist *sg;
>   	int i;
>   
> -	if (is_device_dma_coherent(dev))
> +	if (dev_is_dma_coherent(dev))
>   		return;
>   
>   	for_each_sg(sgl, sg, nelems, i)
> -		__dma_unmap_area(sg_virt(sg), sg->length, dir);
> +		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
>   }
>   
>   static void __iommu_sync_sg_for_device(struct device *dev,
> @@ -776,18 +613,18 @@ static void __iommu_sync_sg_for_device(struct device *dev,
>   	struct scatterlist *sg;
>   	int i;
>   
> -	if (is_device_dma_coherent(dev))
> +	if (dev_is_dma_coherent(dev))
>   		return;
>   
>   	for_each_sg(sgl, sg, nelems, i)
> -		__dma_map_area(sg_virt(sg), sg->length, dir);
> +		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
>   }
>   
>   static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
>   				int nelems, enum dma_data_direction dir,
>   				unsigned long attrs)
>   {
> -	bool coherent = is_device_dma_coherent(dev);
> +	bool coherent = dev_is_dma_coherent(dev);
>   
>   	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
>   		__iommu_sync_sg_for_device(dev, sgl, nelems, dir);
> @@ -879,9 +716,9 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
>   			const struct iommu_ops *iommu, bool coherent)
>   {
>   	if (!dev->dma_ops)
> -		dev->dma_ops = &arm64_swiotlb_dma_ops;
> +		dev->dma_ops = &swiotlb_dma_ops;
>   
> -	dev->archdata.dma_coherent = coherent;
> +	dev->dma_coherent = coherent;
>   	__iommu_setup_dma_ops(dev, dma_base, size, iommu);
>   
>   #ifdef CONFIG_XEN
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 10/10] arm64: use the generic swiotlb_dma_ops
  2018-10-12 13:01   ` Robin Murphy
@ 2018-10-12 14:40     ` Christoph Hellwig
  2018-10-12 17:05       ` Catalin Marinas
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-12 14:40 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Will Deacon, Catalin Marinas,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu, linux-kernel

On Fri, Oct 12, 2018 at 02:01:00PM +0100, Robin Murphy wrote:
> On 08/10/18 09:02, Christoph Hellwig wrote:
>> Now that the generic swiotlb code supports non-coherent DMA we can switch
>> to it for arm64.  For that we need to refactor the existing
>> alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
>> and implement the standard arch_sync_dma_for_{device,cpu} hooks for
>> cache maintaincance in the streaming dma hooks, which also implies
>> using the generic dma_coherent flag in struct device.
>>
>> Note that we need to keep the old is_device_dma_coherent function around
>> for now, so that the shared arm/arm64 Xen code keeps working.
>
> OK, so when I said last night that it boot-tested OK, that much was true, 
> but then I shut the board down as I left and got a megasplosion of bad page 
> state BUGs, e.g.:

I think this is because I am passing the wrong address to
dma_direct_free_pages.  Please try this patch on top:

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 3c75d69b54e7..4f0f92856c4c 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -126,10 +126,12 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, unsigned long attrs)
 {
-	if (__free_from_pool(vaddr, PAGE_ALIGN(size)))
-		return;
-	vunmap(vaddr);
-	dma_direct_free_pages(dev, size, vaddr, dma_handle, attrs);
+	if (!__free_from_pool(vaddr, PAGE_ALIGN(size)))
+		void *kaddr = phys_to_virt(dma_to_phys(dev, dma_handle));
+
+		vunmap(vaddr);
+		dma_direct_free_pages(dev, size, kaddr, dma_handle, attrs);
+	}
 }
 
 long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 04/10] swiotlb: remove the overflow buffer
  2018-10-08  8:02 ` [PATCH 04/10] swiotlb: remove the overflow buffer Christoph Hellwig
  2018-10-11 18:19   ` Robin Murphy
@ 2018-10-12 17:04   ` Catalin Marinas
  2018-10-19  0:23   ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 57+ messages in thread
From: Catalin Marinas @ 2018-10-12 17:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Robin Murphy, Konrad Rzeszutek Wilk, iommu,
	linux-kernel, linux-arm-kernel

On Mon, Oct 08, 2018 at 10:02:40AM +0200, Christoph Hellwig wrote:
> Like all other dma mapping drivers just return an error code instead
> of an actual memory buffer.  The reason for the overflow buffer was
> that at the time swiotlb was invented there was no way to check for
> dma mapping errors, but this has long been fixed.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  arch/arm64/mm/dma-mapping.c       |  2 +-
>  arch/powerpc/kernel/dma-swiotlb.c |  4 +--
>  include/linux/dma-direct.h        |  2 ++
>  include/linux/swiotlb.h           |  3 --
>  kernel/dma/direct.c               |  2 --
>  kernel/dma/swiotlb.c              | 59 ++-----------------------------
>  6 files changed, 8 insertions(+), 64 deletions(-)

I went through the patches and they look fine to me (with the vunmap
fix for the last patch). For the arm64 bits:

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations
  2018-10-08  8:02 ` [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations Christoph Hellwig
@ 2018-10-12 17:04   ` Catalin Marinas
  2018-10-19  0:40   ` Konrad Rzeszutek Wilk
  2018-10-19 16:45   ` Robin Murphy
  2 siblings, 0 replies; 57+ messages in thread
From: Catalin Marinas @ 2018-10-12 17:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Robin Murphy, Konrad Rzeszutek Wilk, iommu,
	linux-kernel, linux-arm-kernel

On Mon, Oct 08, 2018 at 10:02:44AM +0200, Christoph Hellwig wrote:
> All architectures that support swiotlb also have a zone that backs up
> these less than full addressing allocations (usually ZONE_DMA32).
> 
> Because of that it is rather pointless to fall back to the global swiotlb
> buffer if the normal dma direct allocation failed - the only thing this
> will do is to eat up bounce buffers that would be more useful to serve
> streaming mappings.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  arch/arm64/mm/dma-mapping.c |   6 +--
>  include/linux/swiotlb.h     |   5 --
>  kernel/dma/swiotlb.c        | 105 +-----------------------------------
>  3 files changed, 5 insertions(+), 111 deletions(-)

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 10/10] arm64: use the generic swiotlb_dma_ops
  2018-10-12 14:40     ` Christoph Hellwig
@ 2018-10-12 17:05       ` Catalin Marinas
  0 siblings, 0 replies; 57+ messages in thread
From: Catalin Marinas @ 2018-10-12 17:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Robin Murphy, Konrad Rzeszutek Wilk, Will Deacon, linux-kernel,
	iommu, linux-arm-kernel

On Fri, Oct 12, 2018 at 04:40:49PM +0200, Christoph Hellwig wrote:
> On Fri, Oct 12, 2018 at 02:01:00PM +0100, Robin Murphy wrote:
> > On 08/10/18 09:02, Christoph Hellwig wrote:
> >> Now that the generic swiotlb code supports non-coherent DMA we can switch
> >> to it for arm64.  For that we need to refactor the existing
> >> alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
> >> and implement the standard arch_sync_dma_for_{device,cpu} hooks for
> >> cache maintaincance in the streaming dma hooks, which also implies
> >> using the generic dma_coherent flag in struct device.
> >>
> >> Note that we need to keep the old is_device_dma_coherent function around
> >> for now, so that the shared arm/arm64 Xen code keeps working.
> >
> > OK, so when I said last night that it boot-tested OK, that much was true, 
> > but then I shut the board down as I left and got a megasplosion of bad page 
> > state BUGs, e.g.:
> 
> I think this is because I am passing the wrong address to
> dma_direct_free_pages.  Please try this patch on top:
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 3c75d69b54e7..4f0f92856c4c 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -126,10 +126,12 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>  void arch_dma_free(struct device *dev, size_t size, void *vaddr,
>  		dma_addr_t dma_handle, unsigned long attrs)
>  {
> -	if (__free_from_pool(vaddr, PAGE_ALIGN(size)))
> -		return;
> -	vunmap(vaddr);
> -	dma_direct_free_pages(dev, size, vaddr, dma_handle, attrs);
> +	if (!__free_from_pool(vaddr, PAGE_ALIGN(size)))
> +		void *kaddr = phys_to_virt(dma_to_phys(dev, dma_handle));
> +
> +		vunmap(vaddr);
> +		dma_direct_free_pages(dev, size, kaddr, dma_handle, attrs);
> +	}
>  }
>  
>  long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,

With this fix:

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 05/10] swiotlb: merge swiotlb_unmap_page and unmap_single
  2018-10-08  8:02 ` [PATCH 05/10] swiotlb: merge swiotlb_unmap_page and unmap_single Christoph Hellwig
@ 2018-10-18 17:44   ` Robin Murphy
  2018-10-19  0:25   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 57+ messages in thread
From: Robin Murphy @ 2018-10-18 17:44 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/18 09:02, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   kernel/dma/swiotlb.c | 15 ++++-----------
>   1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 11dbcd80b4a6..15335f3a1bf3 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -765,9 +765,9 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>    * After this call, reads by the cpu to the buffer are guaranteed to see
>    * whatever the device wrote there.
>    */
> -static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
> -			 size_t size, enum dma_data_direction dir,
> -			 unsigned long attrs)
> +void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> +			size_t size, enum dma_data_direction dir,
> +			unsigned long attrs)
>   {
>   	phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);
>   
> @@ -790,13 +790,6 @@ static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
>   	dma_mark_clean(phys_to_virt(paddr), size);
>   }
>   
> -void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> -			size_t size, enum dma_data_direction dir,
> -			unsigned long attrs)
> -{
> -	unmap_single(hwdev, dev_addr, size, dir, attrs);
> -}
> -
>   /*
>    * Make physical memory consistent for a single streaming mode DMA translation
>    * after a transfer.
> @@ -900,7 +893,7 @@ swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
>   	BUG_ON(dir == DMA_NONE);

In passing, I notice that this BUG_ON(), and its friends elsewhere in 
the file, appear entirely pointless, since the wrappers in dma-mapping.h 
have already made the equivalent check.

For this patch, though,

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

>   
>   	for_each_sg(sgl, sg, nelems, i)
> -		unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir,
> +		swiotlb_unmap_page(hwdev, sg->dma_address, sg_dma_len(sg), dir,
>   			     attrs);
>   }
>   
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-10-08  8:02 ` [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs Christoph Hellwig
@ 2018-10-18 17:53   ` Robin Murphy
  2018-10-19  0:33   ` Konrad Rzeszutek Wilk
  2018-11-07  1:27   ` John Stultz
  2 siblings, 0 replies; 57+ messages in thread
From: Robin Murphy @ 2018-10-18 17:53 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/18 09:02, Christoph Hellwig wrote:
> No need to duplicate the code - map_sg is equivalent to map_page
> for each page in the scatterlist.

Even better, this will also make map_sg actually fire the tracepoint 
(which I've certainly found handy in the past for debugging a mask 
cockup that led to excessive bouncing).

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   kernel/dma/swiotlb.c | 34 ++++++++++++----------------------
>   1 file changed, 12 insertions(+), 22 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 15335f3a1bf3..15755d7a5242 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -845,37 +845,27 @@ swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
>    * same here.
>    */
>   int
> -swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
> +swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl, int nelems,
>   		     enum dma_data_direction dir, unsigned long attrs)
>   {
>   	struct scatterlist *sg;
>   	int i;
>   
> -	BUG_ON(dir == DMA_NONE);
> -
>   	for_each_sg(sgl, sg, nelems, i) {
> -		phys_addr_t paddr = sg_phys(sg);
> -		dma_addr_t dev_addr = phys_to_dma(hwdev, paddr);
> -
> -		if (swiotlb_force == SWIOTLB_FORCE ||
> -		    !dma_capable(hwdev, dev_addr, sg->length)) {
> -			phys_addr_t map = map_single(hwdev, sg_phys(sg),
> -						     sg->length, dir, attrs);
> -			if (map == SWIOTLB_MAP_ERROR) {
> -				/* Don't panic here, we expect map_sg users
> -				   to do proper error handling. */
> -				attrs |= DMA_ATTR_SKIP_CPU_SYNC;
> -				swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
> -						       attrs);
> -				sg_dma_len(sgl) = 0;
> -				return 0;
> -			}
> -			sg->dma_address = __phys_to_dma(hwdev, map);
> -		} else
> -			sg->dma_address = dev_addr;
> +		sg->dma_address = swiotlb_map_page(dev, sg_page(sg), sg->offset,
> +				sg->length, dir, attrs);
> +		if (sg->dma_address == DIRECT_MAPPING_ERROR)
> +			goto out_error;
>   		sg_dma_len(sg) = sg->length;
>   	}
> +
>   	return nelems;
> +
> +out_error:
> +	swiotlb_unmap_sg_attrs(dev, sgl, i, dir,
> +			attrs | DMA_ATTR_SKIP_CPU_SYNC);
> +	sg_dma_len(sgl) = 0;
> +	return 0;
>   }
>   
>   /*
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 07/10] swiotlb: refactor swiotlb_map_page
  2018-10-08  8:02 ` [PATCH 07/10] swiotlb: refactor swiotlb_map_page Christoph Hellwig
@ 2018-10-18 18:09   ` Robin Murphy
  2018-10-19  0:37     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 57+ messages in thread
From: Robin Murphy @ 2018-10-18 18:09 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/18 09:02, Christoph Hellwig wrote:
> Remove the somewhat useless map_single function, and replace it with a
> swiotlb_bounce_page handler that handles everything related to actually
> bouncing a page.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   kernel/dma/swiotlb.c | 77 +++++++++++++++++++++-----------------------
>   1 file changed, 36 insertions(+), 41 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 15755d7a5242..4d7a4d85d71e 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -543,26 +543,6 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>   	return tlb_addr;
>   }
>   
> -/*
> - * Allocates bounce buffer and returns its physical address.
> - */
> -static phys_addr_t
> -map_single(struct device *hwdev, phys_addr_t phys, size_t size,
> -	   enum dma_data_direction dir, unsigned long attrs)
> -{
> -	dma_addr_t start_dma_addr;
> -
> -	if (swiotlb_force == SWIOTLB_NO_FORCE) {
> -		dev_warn_ratelimited(hwdev, "Cannot do DMA to address %pa\n",
> -				     &phys);
> -		return SWIOTLB_MAP_ERROR;
> -	}
> -
> -	start_dma_addr = __phys_to_dma(hwdev, io_tlb_start);
> -	return swiotlb_tbl_map_single(hwdev, start_dma_addr, phys, size,
> -				      dir, attrs);
> -}
> -
>   /*
>    * tlb_addr is the physical address of the bounce buffer to unmap.
>    */
> @@ -714,6 +694,34 @@ static bool swiotlb_free_buffer(struct device *dev, size_t size,
>   	return true;
>   }
>   
> +static dma_addr_t swiotlb_bounce_page(struct device *dev, phys_addr_t *phys,
> +		size_t size, enum dma_data_direction dir, unsigned long attrs)
> +{
> +	dma_addr_t dma_addr;
> +
> +	if (unlikely(swiotlb_force == SWIOTLB_NO_FORCE)) {
> +		dev_warn_ratelimited(dev,
> +			"Cannot do DMA to address %pa\n", phys);
> +		return DIRECT_MAPPING_ERROR;
> +	}
> +
> +	/* Oh well, have to allocate and map a bounce buffer. */
> +	*phys = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start),
> +			*phys, size, dir, attrs);
> +	if (*phys == SWIOTLB_MAP_ERROR)
> +		return DIRECT_MAPPING_ERROR;
> +
> +	/* Ensure that the address returned is DMA'ble */
> +	dma_addr = __phys_to_dma(dev, *phys);
> +	if (unlikely(!dma_capable(dev, dma_addr, size))) {
> +		swiotlb_tbl_unmap_single(dev, *phys, size, dir,
> +			attrs | DMA_ATTR_SKIP_CPU_SYNC);
> +		return DIRECT_MAPPING_ERROR;
> +	}
> +
> +	return dma_addr;
> +}
> +
>   /*
>    * Map a single buffer of the indicated size for DMA in streaming mode.  The
>    * physical address to use is returned.
> @@ -726,8 +734,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>   			    enum dma_data_direction dir,
>   			    unsigned long attrs)
>   {
> -	phys_addr_t map, phys = page_to_phys(page) + offset;
> -	dma_addr_t dev_addr = phys_to_dma(dev, phys);
> +	phys_addr_t phys = page_to_phys(page) + offset;
> +	dma_addr_t dma_addr = phys_to_dma(dev, phys);
>   
>   	BUG_ON(dir == DMA_NONE);
>   	/*
> @@ -735,26 +743,13 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>   	 * we can safely return the device addr and not worry about bounce
>   	 * buffering it.
>   	 */
> -	if (dma_capable(dev, dev_addr, size) && swiotlb_force != SWIOTLB_FORCE)
> -		return dev_addr;
> -
> -	trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
> -
> -	/* Oh well, have to allocate and map a bounce buffer. */
> -	map = map_single(dev, phys, size, dir, attrs);
> -	if (map == SWIOTLB_MAP_ERROR)
> -		return DIRECT_MAPPING_ERROR;
> -
> -	dev_addr = __phys_to_dma(dev, map);
> -
> -	/* Ensure that the address returned is DMA'ble */
> -	if (dma_capable(dev, dev_addr, size))
> -		return dev_addr;
> -
> -	attrs |= DMA_ATTR_SKIP_CPU_SYNC;
> -	swiotlb_tbl_unmap_single(dev, map, size, dir, attrs);
> +	if (!dma_capable(dev, dma_addr, size) ||
> +	    swiotlb_force == SWIOTLB_FORCE) {
> +		trace_swiotlb_bounced(dev, dma_addr, size, swiotlb_force);
> +		dma_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
> +	}

FWIW I prefer the inverse condition and early return of the original 
code here, which also then allows a tail-call to swiotlb_bounce_page() 
(and saves a couple of lines), but it's no biggie.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

>   
> -	return DIRECT_MAPPING_ERROR;
> +	return dma_addr;
>   }
>   
>   /*
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 01/10] swiotlb: remove a pointless comment
  2018-10-08  8:02 ` [PATCH 01/10] swiotlb: remove a pointless comment Christoph Hellwig
  2018-10-11 17:49   ` Robin Murphy
@ 2018-10-19  0:09   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy, linux-arm-kernel,
	iommu, linux-kernel

On Mon, Oct 08, 2018 at 10:02:37AM +0200, Christoph Hellwig wrote:
> This comments describes an aspect of the map_sg interface that isn't
> even exploited by swiotlb.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!
> ---
>  kernel/dma/swiotlb.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 4f8a6dbf0b60..9062b14bc7f4 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -925,12 +925,6 @@ swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
>   * appropriate dma address and length.  They are obtained via
>   * sg_dma_{address,length}(SG).
>   *
> - * NOTE: An implementation may be able to use a smaller number of
> - *       DMA address/length pairs than there are SG table elements.
> - *       (for example via virtual mapping capabilities)
> - *       The routine returns the number of addr/length pairs actually
> - *       used, at most nents.
> - *
>   * Device ownership issues as mentioned above for swiotlb_map_page are the
>   * same here.
>   */
> -- 
> 2.19.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/10] swiotlb: mark is_swiotlb_buffer static
  2018-10-08  8:02 ` [PATCH 02/10] swiotlb: mark is_swiotlb_buffer static Christoph Hellwig
  2018-10-11 17:54   ` Robin Murphy
@ 2018-10-19  0:12   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy, linux-arm-kernel,
	iommu, linux-kernel

On Mon, Oct 08, 2018 at 10:02:38AM +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!
> ---
>  include/linux/swiotlb.h | 1 -
>  kernel/dma/swiotlb.c    | 2 +-
>  2 files changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 965be92c33b5..7ef541ce8f34 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -121,7 +121,6 @@ static inline unsigned int swiotlb_max_segment(void) { return 0; }
>  #endif
>  
>  extern void swiotlb_print_info(void);
> -extern int is_swiotlb_buffer(phys_addr_t paddr);
>  extern void swiotlb_set_max_segment(unsigned int);
>  
>  extern const struct dma_map_ops swiotlb_dma_ops;
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 9062b14bc7f4..26d3af52956f 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -429,7 +429,7 @@ void __init swiotlb_exit(void)
>  	max_segment = 0;
>  }
>  
> -int is_swiotlb_buffer(phys_addr_t paddr)
> +static int is_swiotlb_buffer(phys_addr_t paddr)
>  {
>  	return paddr >= io_tlb_start && paddr < io_tlb_end;
>  }
> -- 
> 2.19.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 03/10] swiotlb: do not panic on mapping failures
  2018-10-08  8:02 ` [PATCH 03/10] swiotlb: do not panic on mapping failures Christoph Hellwig
  2018-10-11 18:06   ` Robin Murphy
@ 2018-10-19  0:17   ` Konrad Rzeszutek Wilk
  2018-10-19  6:04     ` Christoph Hellwig
  1 sibling, 1 reply; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy, linux-arm-kernel,
	iommu, linux-kernel

On Mon, Oct 08, 2018 at 10:02:39AM +0200, Christoph Hellwig wrote:
> All properly written drivers now have error handling in the
> dma_map_single / dma_map_page callers.  As swiotlb_tbl_map_single already
> prints a useful warning when running out of swiotlb pool swace we can

s/swace/space/

> also remove swiotlb_full entirely as it serves no purpose now.

The panic albeit was useful for those drivers that didn't handle it.

Thoughts? I am kind of thinking screw them but then I am not too thrilled
about silent data corruption errors.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  kernel/dma/swiotlb.c | 33 +--------------------------------
>  1 file changed, 1 insertion(+), 32 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 26d3af52956f..69bf305ee5f8 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -761,34 +761,6 @@ static bool swiotlb_free_buffer(struct device *dev, size_t size,
>  	return true;
>  }
>  
> -static void
> -swiotlb_full(struct device *dev, size_t size, enum dma_data_direction dir,
> -	     int do_panic)
> -{
> -	if (swiotlb_force == SWIOTLB_NO_FORCE)
> -		return;
> -
> -	/*
> -	 * Ran out of IOMMU space for this operation. This is very bad.
> -	 * Unfortunately the drivers cannot handle this operation properly.
> -	 * unless they check for dma_mapping_error (most don't)
> -	 * When the mapping is small enough return a static buffer to limit
> -	 * the damage, or panic when the transfer is too big.
> -	 */
> -	dev_err_ratelimited(dev, "DMA: Out of SW-IOMMU space for %zu bytes\n",
> -			    size);
> -
> -	if (size <= io_tlb_overflow || !do_panic)
> -		return;
> -
> -	if (dir == DMA_BIDIRECTIONAL)
> -		panic("DMA: Random memory could be DMA accessed\n");
> -	if (dir == DMA_FROM_DEVICE)
> -		panic("DMA: Random memory could be DMA written\n");
> -	if (dir == DMA_TO_DEVICE)
> -		panic("DMA: Random memory could be DMA read\n");
> -}
> -
>  /*
>   * Map a single buffer of the indicated size for DMA in streaming mode.  The
>   * physical address to use is returned.
> @@ -817,10 +789,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>  
>  	/* Oh well, have to allocate and map a bounce buffer. */
>  	map = map_single(dev, phys, size, dir, attrs);
> -	if (map == SWIOTLB_MAP_ERROR) {
> -		swiotlb_full(dev, size, dir, 1);
> +	if (map == SWIOTLB_MAP_ERROR)
>  		return __phys_to_dma(dev, io_tlb_overflow_buffer);
> -	}
>  
>  	dev_addr = __phys_to_dma(dev, map);
>  
> @@ -948,7 +918,6 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
>  			if (map == SWIOTLB_MAP_ERROR) {
>  				/* Don't panic here, we expect map_sg users
>  				   to do proper error handling. */
> -				swiotlb_full(hwdev, sg->length, dir, 0);
>  				attrs |= DMA_ATTR_SKIP_CPU_SYNC;
>  				swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
>  						       attrs);
> -- 
> 2.19.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 03/10] swiotlb: do not panic on mapping failures
  2018-10-11 18:06   ` Robin Murphy
@ 2018-10-19  0:18     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:18 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Will Deacon, Catalin Marinas,
	linux-arm-kernel, iommu, linux-kernel

On Thu, Oct 11, 2018 at 07:06:31PM +0100, Robin Murphy wrote:
> On 08/10/18 09:02, Christoph Hellwig wrote:
> > All properly written drivers now have error handling in the
> > dma_map_single / dma_map_page callers.  As swiotlb_tbl_map_single already
> > prints a useful warning when running out of swiotlb pool swace we can
> > also remove swiotlb_full entirely as it serves no purpose now.
> 
> $ git grep -l 'dma_map_\(page\|single\)' drivers/ | wc -l
> 385
> $ git grep -l dma_mapping_error drivers/ | wc -l
> 396
> 
> Close enough, to first approximation :D
> 
> I agree the other (non-fatal) warning seems sufficient, and frankly the
> panic can be a bit of a pain for development sometimes.

What would you recommend instead?
> 
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >   kernel/dma/swiotlb.c | 33 +--------------------------------
> >   1 file changed, 1 insertion(+), 32 deletions(-)
> > 
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > index 26d3af52956f..69bf305ee5f8 100644
> > --- a/kernel/dma/swiotlb.c
> > +++ b/kernel/dma/swiotlb.c
> > @@ -761,34 +761,6 @@ static bool swiotlb_free_buffer(struct device *dev, size_t size,
> >   	return true;
> >   }
> > -static void
> > -swiotlb_full(struct device *dev, size_t size, enum dma_data_direction dir,
> > -	     int do_panic)
> > -{
> > -	if (swiotlb_force == SWIOTLB_NO_FORCE)
> > -		return;
> > -
> > -	/*
> > -	 * Ran out of IOMMU space for this operation. This is very bad.
> > -	 * Unfortunately the drivers cannot handle this operation properly.
> > -	 * unless they check for dma_mapping_error (most don't)
> > -	 * When the mapping is small enough return a static buffer to limit
> > -	 * the damage, or panic when the transfer is too big.
> > -	 */
> > -	dev_err_ratelimited(dev, "DMA: Out of SW-IOMMU space for %zu bytes\n",
> > -			    size);
> > -
> > -	if (size <= io_tlb_overflow || !do_panic)
> > -		return;
> > -
> > -	if (dir == DMA_BIDIRECTIONAL)
> > -		panic("DMA: Random memory could be DMA accessed\n");
> > -	if (dir == DMA_FROM_DEVICE)
> > -		panic("DMA: Random memory could be DMA written\n");
> > -	if (dir == DMA_TO_DEVICE)
> > -		panic("DMA: Random memory could be DMA read\n");
> > -}
> > -
> >   /*
> >    * Map a single buffer of the indicated size for DMA in streaming mode.  The
> >    * physical address to use is returned.
> > @@ -817,10 +789,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
> >   	/* Oh well, have to allocate and map a bounce buffer. */
> >   	map = map_single(dev, phys, size, dir, attrs);
> > -	if (map == SWIOTLB_MAP_ERROR) {
> > -		swiotlb_full(dev, size, dir, 1);
> > +	if (map == SWIOTLB_MAP_ERROR)
> >   		return __phys_to_dma(dev, io_tlb_overflow_buffer);
> > -	}
> >   	dev_addr = __phys_to_dma(dev, map);
> > @@ -948,7 +918,6 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
> >   			if (map == SWIOTLB_MAP_ERROR) {
> >   				/* Don't panic here, we expect map_sg users
> >   				   to do proper error handling. */
> > -				swiotlb_full(hwdev, sg->length, dir, 0);
> >   				attrs |= DMA_ATTR_SKIP_CPU_SYNC;
> >   				swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
> >   						       attrs);
> > 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 04/10] swiotlb: remove the overflow buffer
  2018-10-08  8:02 ` [PATCH 04/10] swiotlb: remove the overflow buffer Christoph Hellwig
  2018-10-11 18:19   ` Robin Murphy
  2018-10-12 17:04   ` Catalin Marinas
@ 2018-10-19  0:23   ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy, linux-arm-kernel,
	iommu, linux-kernel

On Mon, Oct 08, 2018 at 10:02:40AM +0200, Christoph Hellwig wrote:
> Like all other dma mapping drivers just return an error code instead
> of an actual memory buffer.  The reason for the overflow buffer was
> that at the time swiotlb was invented there was no way to check for
> dma mapping errors, but this has long been fixed.

RIP overflow buffer.

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  arch/arm64/mm/dma-mapping.c       |  2 +-
>  arch/powerpc/kernel/dma-swiotlb.c |  4 +--
>  include/linux/dma-direct.h        |  2 ++
>  include/linux/swiotlb.h           |  3 --
>  kernel/dma/direct.c               |  2 --
>  kernel/dma/swiotlb.c              | 59 ++-----------------------------
>  6 files changed, 8 insertions(+), 64 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 072c51fb07d7..8d91b927e09e 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -324,7 +324,7 @@ static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
>  static int __swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t addr)
>  {
>  	if (swiotlb)
> -		return swiotlb_dma_mapping_error(hwdev, addr);
> +		return dma_direct_mapping_error(hwdev, addr);
>  	return 0;
>  }
>  
> diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
> index 88f3963ca30f..5fc335f4d9cd 100644
> --- a/arch/powerpc/kernel/dma-swiotlb.c
> +++ b/arch/powerpc/kernel/dma-swiotlb.c
> @@ -11,7 +11,7 @@
>   *
>   */
>  
> -#include <linux/dma-mapping.h>
> +#include <linux/dma-direct.h>
>  #include <linux/memblock.h>
>  #include <linux/pfn.h>
>  #include <linux/of_platform.h>
> @@ -59,7 +59,7 @@ const struct dma_map_ops powerpc_swiotlb_dma_ops = {
>  	.sync_single_for_device = swiotlb_sync_single_for_device,
>  	.sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
>  	.sync_sg_for_device = swiotlb_sync_sg_for_device,
> -	.mapping_error = swiotlb_dma_mapping_error,
> +	.mapping_error = dma_direct_mapping_error,
>  	.get_required_mask = swiotlb_powerpc_get_required,
>  };
>  
> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index fbca184ff5a0..bd73e7a91410 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -5,6 +5,8 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/mem_encrypt.h>
>  
> +#define DIRECT_MAPPING_ERROR		0
> +
>  #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
>  #include <asm/dma-direct.h>
>  #else
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 7ef541ce8f34..f847c1b265c4 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -106,9 +106,6 @@ extern void
>  swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
>  			   int nelems, enum dma_data_direction dir);
>  
> -extern int
> -swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr);
> -
>  extern int
>  swiotlb_dma_supported(struct device *hwdev, u64 mask);
>  
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 674a8da22844..12798abba55e 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -14,8 +14,6 @@
>  #include <linux/pfn.h>
>  #include <linux/set_memory.h>
>  
> -#define DIRECT_MAPPING_ERROR		0
> -
>  /*
>   * Most architectures use ZONE_DMA for the first 16 Megabytes, but
>   * some use it for entirely different regions:
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 69bf305ee5f8..11dbcd80b4a6 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -72,13 +72,6 @@ static phys_addr_t io_tlb_start, io_tlb_end;
>   */
>  static unsigned long io_tlb_nslabs;
>  
> -/*
> - * When the IOMMU overflows we return a fallback buffer. This sets the size.
> - */
> -static unsigned long io_tlb_overflow = 32*1024;
> -
> -static phys_addr_t io_tlb_overflow_buffer;
> -
>  /*
>   * This is a free list describing the number of free entries available from
>   * each index
> @@ -126,7 +119,6 @@ setup_io_tlb_npages(char *str)
>  	return 0;
>  }
>  early_param("swiotlb", setup_io_tlb_npages);
> -/* make io_tlb_overflow tunable too? */
>  
>  unsigned long swiotlb_nr_tbl(void)
>  {
> @@ -194,16 +186,10 @@ void __init swiotlb_update_mem_attributes(void)
>  	bytes = PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT);
>  	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
>  	memset(vaddr, 0, bytes);
> -
> -	vaddr = phys_to_virt(io_tlb_overflow_buffer);
> -	bytes = PAGE_ALIGN(io_tlb_overflow);
> -	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
> -	memset(vaddr, 0, bytes);
>  }
>  
>  int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
>  {
> -	void *v_overflow_buffer;
>  	unsigned long i, bytes;
>  
>  	bytes = nslabs << IO_TLB_SHIFT;
> @@ -212,17 +198,6 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
>  	io_tlb_start = __pa(tlb);
>  	io_tlb_end = io_tlb_start + bytes;
>  
> -	/*
> -	 * Get the overflow emergency buffer
> -	 */
> -	v_overflow_buffer = memblock_virt_alloc_low_nopanic(
> -						PAGE_ALIGN(io_tlb_overflow),
> -						PAGE_SIZE);
> -	if (!v_overflow_buffer)
> -		return -ENOMEM;
> -
> -	io_tlb_overflow_buffer = __pa(v_overflow_buffer);
> -
>  	/*
>  	 * Allocate and initialize the free list array.  This array is used
>  	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
> @@ -330,7 +305,6 @@ int
>  swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
>  {
>  	unsigned long i, bytes;
> -	unsigned char *v_overflow_buffer;
>  
>  	bytes = nslabs << IO_TLB_SHIFT;
>  
> @@ -341,19 +315,6 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
>  	set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
>  	memset(tlb, 0, bytes);
>  
> -	/*
> -	 * Get the overflow emergency buffer
> -	 */
> -	v_overflow_buffer = (void *)__get_free_pages(GFP_DMA,
> -						     get_order(io_tlb_overflow));
> -	if (!v_overflow_buffer)
> -		goto cleanup2;
> -
> -	set_memory_decrypted((unsigned long)v_overflow_buffer,
> -			io_tlb_overflow >> PAGE_SHIFT);
> -	memset(v_overflow_buffer, 0, io_tlb_overflow);
> -	io_tlb_overflow_buffer = virt_to_phys(v_overflow_buffer);
> -
>  	/*
>  	 * Allocate and initialize the free list array.  This array is used
>  	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
> @@ -390,10 +351,6 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
>  	                                                 sizeof(int)));
>  	io_tlb_list = NULL;
>  cleanup3:
> -	free_pages((unsigned long)v_overflow_buffer,
> -		   get_order(io_tlb_overflow));
> -	io_tlb_overflow_buffer = 0;
> -cleanup2:
>  	io_tlb_end = 0;
>  	io_tlb_start = 0;
>  	io_tlb_nslabs = 0;
> @@ -407,8 +364,6 @@ void __init swiotlb_exit(void)
>  		return;
>  
>  	if (late_alloc) {
> -		free_pages((unsigned long)phys_to_virt(io_tlb_overflow_buffer),
> -			   get_order(io_tlb_overflow));
>  		free_pages((unsigned long)io_tlb_orig_addr,
>  			   get_order(io_tlb_nslabs * sizeof(phys_addr_t)));
>  		free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
> @@ -416,8 +371,6 @@ void __init swiotlb_exit(void)
>  		free_pages((unsigned long)phys_to_virt(io_tlb_start),
>  			   get_order(io_tlb_nslabs << IO_TLB_SHIFT));
>  	} else {
> -		memblock_free_late(io_tlb_overflow_buffer,
> -				   PAGE_ALIGN(io_tlb_overflow));
>  		memblock_free_late(__pa(io_tlb_orig_addr),
>  				   PAGE_ALIGN(io_tlb_nslabs * sizeof(phys_addr_t)));
>  		memblock_free_late(__pa(io_tlb_list),
> @@ -790,7 +743,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>  	/* Oh well, have to allocate and map a bounce buffer. */
>  	map = map_single(dev, phys, size, dir, attrs);
>  	if (map == SWIOTLB_MAP_ERROR)
> -		return __phys_to_dma(dev, io_tlb_overflow_buffer);
> +		return DIRECT_MAPPING_ERROR;
>  
>  	dev_addr = __phys_to_dma(dev, map);
>  
> @@ -801,7 +754,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>  	attrs |= DMA_ATTR_SKIP_CPU_SYNC;
>  	swiotlb_tbl_unmap_single(dev, map, size, dir, attrs);
>  
> -	return __phys_to_dma(dev, io_tlb_overflow_buffer);
> +	return DIRECT_MAPPING_ERROR;
>  }
>  
>  /*
> @@ -985,12 +938,6 @@ swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
>  	swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE);
>  }
>  
> -int
> -swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
> -{
> -	return (dma_addr == __phys_to_dma(hwdev, io_tlb_overflow_buffer));
> -}
> -
>  /*
>   * Return whether the given device DMA address mask can be supported
>   * properly.  For example, if your device can only drive the low 24-bits
> @@ -1033,7 +980,7 @@ void swiotlb_free(struct device *dev, size_t size, void *vaddr,
>  }
>  
>  const struct dma_map_ops swiotlb_dma_ops = {
> -	.mapping_error		= swiotlb_dma_mapping_error,
> +	.mapping_error		= dma_direct_mapping_error,
>  	.alloc			= swiotlb_alloc,
>  	.free			= swiotlb_free,
>  	.sync_single_for_cpu	= swiotlb_sync_single_for_cpu,
> -- 
> 2.19.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 05/10] swiotlb: merge swiotlb_unmap_page and unmap_single
  2018-10-08  8:02 ` [PATCH 05/10] swiotlb: merge swiotlb_unmap_page and unmap_single Christoph Hellwig
  2018-10-18 17:44   ` Robin Murphy
@ 2018-10-19  0:25   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy, linux-arm-kernel,
	iommu, linux-kernel

On Mon, Oct 08, 2018 at 10:02:41AM +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!
> ---
>  kernel/dma/swiotlb.c | 15 ++++-----------
>  1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 11dbcd80b4a6..15335f3a1bf3 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -765,9 +765,9 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>   * After this call, reads by the cpu to the buffer are guaranteed to see
>   * whatever the device wrote there.
>   */
> -static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
> -			 size_t size, enum dma_data_direction dir,
> -			 unsigned long attrs)
> +void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> +			size_t size, enum dma_data_direction dir,
> +			unsigned long attrs)
>  {
>  	phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);
>  
> @@ -790,13 +790,6 @@ static void unmap_single(struct device *hwdev, dma_addr_t dev_addr,
>  	dma_mark_clean(phys_to_virt(paddr), size);
>  }
>  
> -void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> -			size_t size, enum dma_data_direction dir,
> -			unsigned long attrs)
> -{
> -	unmap_single(hwdev, dev_addr, size, dir, attrs);
> -}
> -
>  /*
>   * Make physical memory consistent for a single streaming mode DMA translation
>   * after a transfer.
> @@ -900,7 +893,7 @@ swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
>  	BUG_ON(dir == DMA_NONE);
>  
>  	for_each_sg(sgl, sg, nelems, i)
> -		unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir,
> +		swiotlb_unmap_page(hwdev, sg->dma_address, sg_dma_len(sg), dir,
>  			     attrs);
>  }
>  
> -- 
> 2.19.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-10-08  8:02 ` [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs Christoph Hellwig
  2018-10-18 17:53   ` Robin Murphy
@ 2018-10-19  0:33   ` Konrad Rzeszutek Wilk
  2018-11-07  1:27   ` John Stultz
  2 siblings, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy, linux-arm-kernel,
	iommu, linux-kernel

On Mon, Oct 08, 2018 at 10:02:42AM +0200, Christoph Hellwig wrote:
> No need to duplicate the code - map_sg is equivalent to map_page
> for each page in the scatterlist.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!
> ---
>  kernel/dma/swiotlb.c | 34 ++++++++++++----------------------
>  1 file changed, 12 insertions(+), 22 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 15335f3a1bf3..15755d7a5242 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -845,37 +845,27 @@ swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
>   * same here.
>   */
>  int
> -swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
> +swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl, int nelems,
>  		     enum dma_data_direction dir, unsigned long attrs)
>  {
>  	struct scatterlist *sg;
>  	int i;
>  
> -	BUG_ON(dir == DMA_NONE);
> -
>  	for_each_sg(sgl, sg, nelems, i) {
> -		phys_addr_t paddr = sg_phys(sg);
> -		dma_addr_t dev_addr = phys_to_dma(hwdev, paddr);
> -
> -		if (swiotlb_force == SWIOTLB_FORCE ||
> -		    !dma_capable(hwdev, dev_addr, sg->length)) {
> -			phys_addr_t map = map_single(hwdev, sg_phys(sg),
> -						     sg->length, dir, attrs);
> -			if (map == SWIOTLB_MAP_ERROR) {
> -				/* Don't panic here, we expect map_sg users
> -				   to do proper error handling. */
> -				attrs |= DMA_ATTR_SKIP_CPU_SYNC;
> -				swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir,
> -						       attrs);
> -				sg_dma_len(sgl) = 0;
> -				return 0;
> -			}
> -			sg->dma_address = __phys_to_dma(hwdev, map);
> -		} else
> -			sg->dma_address = dev_addr;
> +		sg->dma_address = swiotlb_map_page(dev, sg_page(sg), sg->offset,
> +				sg->length, dir, attrs);
> +		if (sg->dma_address == DIRECT_MAPPING_ERROR)
> +			goto out_error;
>  		sg_dma_len(sg) = sg->length;
>  	}
> +
>  	return nelems;
> +
> +out_error:
> +	swiotlb_unmap_sg_attrs(dev, sgl, i, dir,
> +			attrs | DMA_ATTR_SKIP_CPU_SYNC);
> +	sg_dma_len(sgl) = 0;
> +	return 0;
>  }
>  
>  /*
> -- 
> 2.19.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 07/10] swiotlb: refactor swiotlb_map_page
  2018-10-18 18:09   ` Robin Murphy
@ 2018-10-19  0:37     ` Konrad Rzeszutek Wilk
  2018-10-19  6:52       ` Christoph Hellwig
  0 siblings, 1 reply; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:37 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Will Deacon, Catalin Marinas,
	linux-arm-kernel, iommu, linux-kernel

> > +	if (!dma_capable(dev, dma_addr, size) ||
> > +	    swiotlb_force == SWIOTLB_FORCE) {
> > +		trace_swiotlb_bounced(dev, dma_addr, size, swiotlb_force);
> > +		dma_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
> > +	}
> 
> FWIW I prefer the inverse condition and early return of the original code
> here, which also then allows a tail-call to swiotlb_bounce_page() (and saves
> a couple of lines), but it's no biggie.
> 
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>

I agree with Robin - it certainly makes it easier to read.

With that small change:
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!

> 
> > -	return DIRECT_MAPPING_ERROR;
> > +	return dma_addr;
> >   }
> >   /*
> > 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations
  2018-10-08  8:02 ` [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations Christoph Hellwig
  2018-10-12 17:04   ` Catalin Marinas
@ 2018-10-19  0:40   ` Konrad Rzeszutek Wilk
  2018-10-19 16:45   ` Robin Murphy
  2 siblings, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy, linux-arm-kernel,
	iommu, linux-kernel

On Mon, Oct 08, 2018 at 10:02:44AM +0200, Christoph Hellwig wrote:
> All architectures that support swiotlb also have a zone that backs up
> these less than full addressing allocations (usually ZONE_DMA32).
> 
> Because of that it is rather pointless to fall back to the global swiotlb
> buffer if the normal dma direct allocation failed - the only thing this
> will do is to eat up bounce buffers that would be more useful to serve
> streaming mappings.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!
> ---
>  arch/arm64/mm/dma-mapping.c |   6 +--
>  include/linux/swiotlb.h     |   5 --
>  kernel/dma/swiotlb.c        | 105 +-----------------------------------
>  3 files changed, 5 insertions(+), 111 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 8d91b927e09e..eee6cfcfde9e 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -112,7 +112,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
>  		return addr;
>  	}
>  
> -	ptr = swiotlb_alloc(dev, size, dma_handle, flags, attrs);
> +	ptr = dma_direct_alloc_pages(dev, size, dma_handle, flags, attrs);
>  	if (!ptr)
>  		goto no_mem;
>  
> @@ -133,7 +133,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
>  	return coherent_ptr;
>  
>  no_map:
> -	swiotlb_free(dev, size, ptr, *dma_handle, attrs);
> +	dma_direct_free_pages(dev, size, ptr, *dma_handle, attrs);
>  no_mem:
>  	return NULL;
>  }
> @@ -151,7 +151,7 @@ static void __dma_free(struct device *dev, size_t size,
>  			return;
>  		vunmap(vaddr);
>  	}
> -	swiotlb_free(dev, size, swiotlb_addr, dma_handle, attrs);
> +	dma_direct_free_pages(dev, size, swiotlb_addr, dma_handle, attrs);
>  }
>  
>  static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index f847c1b265c4..a387b59640a4 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -67,11 +67,6 @@ extern void swiotlb_tbl_sync_single(struct device *hwdev,
>  
>  /* Accessory functions. */
>  
> -void *swiotlb_alloc(struct device *hwdev, size_t size, dma_addr_t *dma_handle,
> -		gfp_t flags, unsigned long attrs);
> -void swiotlb_free(struct device *dev, size_t size, void *vaddr,
> -		dma_addr_t dma_addr, unsigned long attrs);
> -
>  extern dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>  				   unsigned long offset, size_t size,
>  				   enum dma_data_direction dir,
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 4d7a4d85d71e..475a41eff3dc 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -622,78 +622,6 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
>  	}
>  }
>  
> -static inline bool dma_coherent_ok(struct device *dev, dma_addr_t addr,
> -		size_t size)
> -{
> -	u64 mask = DMA_BIT_MASK(32);
> -
> -	if (dev && dev->coherent_dma_mask)
> -		mask = dev->coherent_dma_mask;
> -	return addr + size - 1 <= mask;
> -}
> -
> -static void *
> -swiotlb_alloc_buffer(struct device *dev, size_t size, dma_addr_t *dma_handle,
> -		unsigned long attrs)
> -{
> -	phys_addr_t phys_addr;
> -
> -	if (swiotlb_force == SWIOTLB_NO_FORCE)
> -		goto out_warn;
> -
> -	phys_addr = swiotlb_tbl_map_single(dev,
> -			__phys_to_dma(dev, io_tlb_start),
> -			0, size, DMA_FROM_DEVICE, attrs);
> -	if (phys_addr == SWIOTLB_MAP_ERROR)
> -		goto out_warn;
> -
> -	*dma_handle = __phys_to_dma(dev, phys_addr);
> -	if (!dma_coherent_ok(dev, *dma_handle, size))
> -		goto out_unmap;
> -
> -	memset(phys_to_virt(phys_addr), 0, size);
> -	return phys_to_virt(phys_addr);
> -
> -out_unmap:
> -	dev_warn(dev, "hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
> -		(unsigned long long)dev->coherent_dma_mask,
> -		(unsigned long long)*dma_handle);
> -
> -	/*
> -	 * DMA_TO_DEVICE to avoid memcpy in unmap_single.
> -	 * DMA_ATTR_SKIP_CPU_SYNC is optional.
> -	 */
> -	swiotlb_tbl_unmap_single(dev, phys_addr, size, DMA_TO_DEVICE,
> -			DMA_ATTR_SKIP_CPU_SYNC);
> -out_warn:
> -	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit()) {
> -		dev_warn(dev,
> -			"swiotlb: coherent allocation failed, size=%zu\n",
> -			size);
> -		dump_stack();
> -	}
> -	return NULL;
> -}
> -
> -static bool swiotlb_free_buffer(struct device *dev, size_t size,
> -		dma_addr_t dma_addr)
> -{
> -	phys_addr_t phys_addr = dma_to_phys(dev, dma_addr);
> -
> -	WARN_ON_ONCE(irqs_disabled());
> -
> -	if (!is_swiotlb_buffer(phys_addr))
> -		return false;
> -
> -	/*
> -	 * DMA_TO_DEVICE to avoid memcpy in swiotlb_tbl_unmap_single.
> -	 * DMA_ATTR_SKIP_CPU_SYNC is optional.
> -	 */
> -	swiotlb_tbl_unmap_single(dev, phys_addr, size, DMA_TO_DEVICE,
> -				 DMA_ATTR_SKIP_CPU_SYNC);
> -	return true;
> -}
> -
>  static dma_addr_t swiotlb_bounce_page(struct device *dev, phys_addr_t *phys,
>  		size_t size, enum dma_data_direction dir, unsigned long attrs)
>  {
> @@ -928,39 +856,10 @@ swiotlb_dma_supported(struct device *hwdev, u64 mask)
>  	return __phys_to_dma(hwdev, io_tlb_end - 1) <= mask;
>  }
>  
> -void *swiotlb_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> -		gfp_t gfp, unsigned long attrs)
> -{
> -	void *vaddr;
> -
> -	/* temporary workaround: */
> -	if (gfp & __GFP_NOWARN)
> -		attrs |= DMA_ATTR_NO_WARN;
> -
> -	/*
> -	 * Don't print a warning when the first allocation attempt fails.
> -	 * swiotlb_alloc_coherent() will print a warning when the DMA memory
> -	 * allocation ultimately failed.
> -	 */
> -	gfp |= __GFP_NOWARN;
> -
> -	vaddr = dma_direct_alloc(dev, size, dma_handle, gfp, attrs);
> -	if (!vaddr)
> -		vaddr = swiotlb_alloc_buffer(dev, size, dma_handle, attrs);
> -	return vaddr;
> -}
> -
> -void swiotlb_free(struct device *dev, size_t size, void *vaddr,
> -		dma_addr_t dma_addr, unsigned long attrs)
> -{
> -	if (!swiotlb_free_buffer(dev, size, dma_addr))
> -		dma_direct_free(dev, size, vaddr, dma_addr, attrs);
> -}
> -
>  const struct dma_map_ops swiotlb_dma_ops = {
>  	.mapping_error		= dma_direct_mapping_error,
> -	.alloc			= swiotlb_alloc,
> -	.free			= swiotlb_free,
> +	.alloc			= dma_direct_alloc,
> +	.free			= dma_direct_free,
>  	.sync_single_for_cpu	= swiotlb_sync_single_for_cpu,
>  	.sync_single_for_device	= swiotlb_sync_single_for_device,
>  	.sync_sg_for_cpu	= swiotlb_sync_sg_for_cpu,
> -- 
> 2.19.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 09/10] swiotlb: add support for non-coherent DMA
  2018-10-08  8:02 ` [PATCH 09/10] swiotlb: add support for non-coherent DMA Christoph Hellwig
@ 2018-10-19  0:49   ` Konrad Rzeszutek Wilk
  2018-10-22 17:11   ` Robin Murphy
  1 sibling, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19  0:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy, linux-arm-kernel,
	iommu, linux-kernel

On Mon, Oct 08, 2018 at 10:02:45AM +0200, Christoph Hellwig wrote:
> Handle architectures that are not cache coherent directly in the main
> swiotlb code by calling arch_sync_dma_for_{device,cpu} in all the right
> places from the various dma_map/unmap/sync methods when the device is
> non-coherent.
> 
> Because swiotlb now uses dma_direct_alloc for the coherent allocation
> that side is already taken care of by the dma-direct code calling into
> arch_dma_{alloc,free} for devices that are non-coherent.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!
> ---
>  kernel/dma/swiotlb.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 475a41eff3dc..52885b274368 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -21,6 +21,7 @@
>  
>  #include <linux/cache.h>
>  #include <linux/dma-direct.h>
> +#include <linux/dma-noncoherent.h>
>  #include <linux/mm.h>
>  #include <linux/export.h>
>  #include <linux/spinlock.h>
> @@ -677,6 +678,10 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>  		dma_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
>  	}
>  
> +	if (!dev_is_dma_coherent(dev) &&
> +	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> +		arch_sync_dma_for_device(dev, phys, size, dir);
> +
>  	return dma_addr;
>  }
>  
> @@ -696,6 +701,10 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>  
>  	BUG_ON(dir == DMA_NONE);
>  
> +	if (!dev_is_dma_coherent(hwdev) &&
> +	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> +		arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
> +
>  	if (is_swiotlb_buffer(paddr)) {
>  		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
>  		return;
> @@ -732,15 +741,17 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
>  
>  	BUG_ON(dir == DMA_NONE);
>  
> -	if (is_swiotlb_buffer(paddr)) {
> +	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_CPU)
> +		arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
> +
> +	if (is_swiotlb_buffer(paddr))
>  		swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
> -		return;
> -	}
>  
> -	if (dir != DMA_FROM_DEVICE)
> -		return;
> +	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_DEVICE)
> +		arch_sync_dma_for_device(hwdev, paddr, size, dir);
>  
> -	dma_mark_clean(phys_to_virt(paddr), size);
> +	if (!is_swiotlb_buffer(paddr) && dir == DMA_FROM_DEVICE)
> +		dma_mark_clean(phys_to_virt(paddr), size);
>  }
>  
>  void
> -- 
> 2.19.0
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 03/10] swiotlb: do not panic on mapping failures
  2018-10-19  0:17   ` Konrad Rzeszutek Wilk
@ 2018-10-19  6:04     ` Christoph Hellwig
  2018-10-19 13:45       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-19  6:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Christoph Hellwig, Will Deacon, Catalin Marinas, Robin Murphy,
	linux-arm-kernel, iommu, linux-kernel

On Thu, Oct 18, 2018 at 08:17:14PM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Oct 08, 2018 at 10:02:39AM +0200, Christoph Hellwig wrote:
> > All properly written drivers now have error handling in the
> > dma_map_single / dma_map_page callers.  As swiotlb_tbl_map_single already
> > prints a useful warning when running out of swiotlb pool swace we can
> 
> s/swace/space/
> 
> > also remove swiotlb_full entirely as it serves no purpose now.
> 
> The panic albeit was useful for those drivers that didn't handle it.
> 
> Thoughts? I am kind of thinking screw them but then I am not too thrilled
> about silent data corruption errors.

I can add an (optional) panic back in common code so that the different
mapping types are covered.

Btw, are you fine with taking this through the dma mapping tree for
this merge window?  I'd probably leave it in linux-next for another
week or so and don't send it with the first pull request on Monday.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 07/10] swiotlb: refactor swiotlb_map_page
  2018-10-19  0:37     ` Konrad Rzeszutek Wilk
@ 2018-10-19  6:52       ` Christoph Hellwig
  2018-10-19 13:46         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-19  6:52 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Robin Murphy, Christoph Hellwig, Will Deacon, Catalin Marinas,
	linux-arm-kernel, iommu, linux-kernel

On Thu, Oct 18, 2018 at 08:37:15PM -0400, Konrad Rzeszutek Wilk wrote:
> > > +	if (!dma_capable(dev, dma_addr, size) ||
> > > +	    swiotlb_force == SWIOTLB_FORCE) {
> > > +		trace_swiotlb_bounced(dev, dma_addr, size, swiotlb_force);
> > > +		dma_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
> > > +	}
> > 
> > FWIW I prefer the inverse condition and early return of the original code
> > here, which also then allows a tail-call to swiotlb_bounce_page() (and saves
> > a couple of lines), but it's no biggie.
> > 
> > Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> 
> I agree with Robin - it certainly makes it easier to read.
> 
> With that small change:
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

So I did this edit, and in this patch it does indeed look much cleaner.
But in patch 9 we introduce the cache maintainance, and have to invert
the condition again if we don't want a goto mess:

---
From e840ec23360788d54a8ebd2ebc7cd0f0ef8bdb01 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 19 Oct 2018 08:51:53 +0200
Subject: swiotlb: add support for non-coherent DMA

Handle architectures that are not cache coherent directly in the main
swiotlb code by calling arch_sync_dma_for_{device,cpu} in all the right
places from the various dma_map/unmap/sync methods when the device is
non-coherent.

Because swiotlb now uses dma_direct_alloc for the coherent allocation
that side is already taken care of by the dma-direct code calling into
arch_dma_{alloc,free} for devices that are non-coherent.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/swiotlb.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 1a01b0ac0a5e..ebecaf255ea2 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -21,6 +21,7 @@
 
 #include <linux/cache.h>
 #include <linux/dma-direct.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/mm.h>
 #include <linux/export.h>
 #include <linux/spinlock.h>
@@ -671,11 +672,17 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
 	 * we can safely return the device addr and not worry about bounce
 	 * buffering it.
 	 */
-	if (dma_capable(dev, dev_addr, size) && swiotlb_force != SWIOTLB_FORCE)
-		return dev_addr;
+	if (!dma_capable(dev, dev_addr, size) ||
+	    swiotlb_force == SWIOTLB_FORCE) {
+		trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
+		dev_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
+	}
+
+	if (!dev_is_dma_coherent(dev) &&
+	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
+		arch_sync_dma_for_device(dev, phys, size, dir);
 
-	trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
-	return swiotlb_bounce_page(dev, &phys, size, dir, attrs);
+	return dev_addr;
 }
 
 /*
@@ -694,6 +701,10 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
 
 	BUG_ON(dir == DMA_NONE);
 
+	if (!dev_is_dma_coherent(hwdev) &&
+	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
+		arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
+
 	if (is_swiotlb_buffer(paddr)) {
 		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
 		return;
@@ -730,15 +741,17 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
 
 	BUG_ON(dir == DMA_NONE);
 
-	if (is_swiotlb_buffer(paddr)) {
+	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_CPU)
+		arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
+
+	if (is_swiotlb_buffer(paddr))
 		swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
-		return;
-	}
 
-	if (dir != DMA_FROM_DEVICE)
-		return;
+	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_DEVICE)
+		arch_sync_dma_for_device(hwdev, paddr, size, dir);
 
-	dma_mark_clean(phys_to_virt(paddr), size);
+	if (!is_swiotlb_buffer(paddr) && dir == DMA_FROM_DEVICE)
+		dma_mark_clean(phys_to_virt(paddr), size);
 }
 
 void
-- 
2.19.1


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 03/10] swiotlb: do not panic on mapping failures
  2018-10-19  6:04     ` Christoph Hellwig
@ 2018-10-19 13:45       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19 13:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy, linux-arm-kernel,
	iommu, linux-kernel

On Fri, Oct 19, 2018 at 08:04:25AM +0200, Christoph Hellwig wrote:
> On Thu, Oct 18, 2018 at 08:17:14PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Mon, Oct 08, 2018 at 10:02:39AM +0200, Christoph Hellwig wrote:
> > > All properly written drivers now have error handling in the
> > > dma_map_single / dma_map_page callers.  As swiotlb_tbl_map_single already
> > > prints a useful warning when running out of swiotlb pool swace we can
> > 
> > s/swace/space/
> > 
> > > also remove swiotlb_full entirely as it serves no purpose now.
> > 
> > The panic albeit was useful for those drivers that didn't handle it.
> > 
> > Thoughts? I am kind of thinking screw them but then I am not too thrilled
> > about silent data corruption errors.
> 
> I can add an (optional) panic back in common code so that the different
> mapping types are covered.

Thank you.
> 
> Btw, are you fine with taking this through the dma mapping tree for
> this merge window?  I'd probably leave it in linux-next for another

Yes, please.
> week or so and don't send it with the first pull request on Monday.

<nods>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 07/10] swiotlb: refactor swiotlb_map_page
  2018-10-19  6:52       ` Christoph Hellwig
@ 2018-10-19 13:46         ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-10-19 13:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Robin Murphy, Will Deacon, Catalin Marinas, linux-arm-kernel,
	iommu, linux-kernel

On Fri, Oct 19, 2018 at 08:52:58AM +0200, Christoph Hellwig wrote:
> On Thu, Oct 18, 2018 at 08:37:15PM -0400, Konrad Rzeszutek Wilk wrote:
> > > > +	if (!dma_capable(dev, dma_addr, size) ||
> > > > +	    swiotlb_force == SWIOTLB_FORCE) {
> > > > +		trace_swiotlb_bounced(dev, dma_addr, size, swiotlb_force);
> > > > +		dma_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
> > > > +	}
> > > 
> > > FWIW I prefer the inverse condition and early return of the original code
> > > here, which also then allows a tail-call to swiotlb_bounce_page() (and saves
> > > a couple of lines), but it's no biggie.
> > > 
> > > Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> > 
> > I agree with Robin - it certainly makes it easier to read.
> > 
> > With that small change:
> > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> So I did this edit, and in this patch it does indeed look much cleaner.
> But in patch 9 we introduce the cache maintainance, and have to invert
> the condition again if we don't want a goto mess:

Right. In which case please leave this patch as it is. And please
plaster the Reviewed-by on the patch. Thank you!

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations
  2018-10-08  8:02 ` [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations Christoph Hellwig
  2018-10-12 17:04   ` Catalin Marinas
  2018-10-19  0:40   ` Konrad Rzeszutek Wilk
@ 2018-10-19 16:45   ` Robin Murphy
  2 siblings, 0 replies; 57+ messages in thread
From: Robin Murphy @ 2018-10-19 16:45 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/2018 09:02, Christoph Hellwig wrote:
> All architectures that support swiotlb also have a zone that backs up
> these less than full addressing allocations (usually ZONE_DMA32).
> 
> Because of that it is rather pointless to fall back to the global swiotlb
> buffer if the normal dma direct allocation failed - the only thing this
> will do is to eat up bounce buffers that would be more useful to serve
> streaming mappings.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/arm64/mm/dma-mapping.c |   6 +--
>   include/linux/swiotlb.h     |   5 --
>   kernel/dma/swiotlb.c        | 105 +-----------------------------------
>   3 files changed, 5 insertions(+), 111 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 8d91b927e09e..eee6cfcfde9e 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -112,7 +112,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
>   		return addr;
>   	}
>   
> -	ptr = swiotlb_alloc(dev, size, dma_handle, flags, attrs);
> +	ptr = dma_direct_alloc_pages(dev, size, dma_handle, flags, attrs);
>   	if (!ptr)
>   		goto no_mem;
>   
> @@ -133,7 +133,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
>   	return coherent_ptr;
>   
>   no_map:
> -	swiotlb_free(dev, size, ptr, *dma_handle, attrs);
> +	dma_direct_free_pages(dev, size, ptr, *dma_handle, attrs);
>   no_mem:
>   	return NULL;
>   }
> @@ -151,7 +151,7 @@ static void __dma_free(struct device *dev, size_t size,
>   			return;
>   		vunmap(vaddr);
>   	}
> -	swiotlb_free(dev, size, swiotlb_addr, dma_handle, attrs);
> +	dma_direct_free_pages(dev, size, swiotlb_addr, dma_handle, attrs);
>   }
>   
>   static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index f847c1b265c4..a387b59640a4 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -67,11 +67,6 @@ extern void swiotlb_tbl_sync_single(struct device *hwdev,
>   
>   /* Accessory functions. */
>   
> -void *swiotlb_alloc(struct device *hwdev, size_t size, dma_addr_t *dma_handle,
> -		gfp_t flags, unsigned long attrs);
> -void swiotlb_free(struct device *dev, size_t size, void *vaddr,
> -		dma_addr_t dma_addr, unsigned long attrs);
> -
>   extern dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>   				   unsigned long offset, size_t size,
>   				   enum dma_data_direction dir,
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 4d7a4d85d71e..475a41eff3dc 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -622,78 +622,6 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
>   	}
>   }
>   
> -static inline bool dma_coherent_ok(struct device *dev, dma_addr_t addr,
> -		size_t size)
> -{
> -	u64 mask = DMA_BIT_MASK(32);
> -
> -	if (dev && dev->coherent_dma_mask)
> -		mask = dev->coherent_dma_mask;
> -	return addr + size - 1 <= mask;
> -}
> -
> -static void *
> -swiotlb_alloc_buffer(struct device *dev, size_t size, dma_addr_t *dma_handle,
> -		unsigned long attrs)
> -{
> -	phys_addr_t phys_addr;
> -
> -	if (swiotlb_force == SWIOTLB_NO_FORCE)
> -		goto out_warn;
> -
> -	phys_addr = swiotlb_tbl_map_single(dev,
> -			__phys_to_dma(dev, io_tlb_start),
> -			0, size, DMA_FROM_DEVICE, attrs);
> -	if (phys_addr == SWIOTLB_MAP_ERROR)
> -		goto out_warn;
> -
> -	*dma_handle = __phys_to_dma(dev, phys_addr);
> -	if (!dma_coherent_ok(dev, *dma_handle, size))
> -		goto out_unmap;
> -
> -	memset(phys_to_virt(phys_addr), 0, size);
> -	return phys_to_virt(phys_addr);
> -
> -out_unmap:
> -	dev_warn(dev, "hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
> -		(unsigned long long)dev->coherent_dma_mask,
> -		(unsigned long long)*dma_handle);
> -
> -	/*
> -	 * DMA_TO_DEVICE to avoid memcpy in unmap_single.
> -	 * DMA_ATTR_SKIP_CPU_SYNC is optional.
> -	 */
> -	swiotlb_tbl_unmap_single(dev, phys_addr, size, DMA_TO_DEVICE,
> -			DMA_ATTR_SKIP_CPU_SYNC);
> -out_warn:
> -	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit()) {
> -		dev_warn(dev,
> -			"swiotlb: coherent allocation failed, size=%zu\n",
> -			size);
> -		dump_stack();
> -	}
> -	return NULL;
> -}
> -
> -static bool swiotlb_free_buffer(struct device *dev, size_t size,
> -		dma_addr_t dma_addr)
> -{
> -	phys_addr_t phys_addr = dma_to_phys(dev, dma_addr);
> -
> -	WARN_ON_ONCE(irqs_disabled());
> -
> -	if (!is_swiotlb_buffer(phys_addr))
> -		return false;
> -
> -	/*
> -	 * DMA_TO_DEVICE to avoid memcpy in swiotlb_tbl_unmap_single.
> -	 * DMA_ATTR_SKIP_CPU_SYNC is optional.
> -	 */
> -	swiotlb_tbl_unmap_single(dev, phys_addr, size, DMA_TO_DEVICE,
> -				 DMA_ATTR_SKIP_CPU_SYNC);
> -	return true;
> -}
> -
>   static dma_addr_t swiotlb_bounce_page(struct device *dev, phys_addr_t *phys,
>   		size_t size, enum dma_data_direction dir, unsigned long attrs)
>   {
> @@ -928,39 +856,10 @@ swiotlb_dma_supported(struct device *hwdev, u64 mask)
>   	return __phys_to_dma(hwdev, io_tlb_end - 1) <= mask;
>   }
>   
> -void *swiotlb_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> -		gfp_t gfp, unsigned long attrs)
> -{
> -	void *vaddr;
> -
> -	/* temporary workaround: */
> -	if (gfp & __GFP_NOWARN)
> -		attrs |= DMA_ATTR_NO_WARN;
> -
> -	/*
> -	 * Don't print a warning when the first allocation attempt fails.
> -	 * swiotlb_alloc_coherent() will print a warning when the DMA memory
> -	 * allocation ultimately failed.
> -	 */
> -	gfp |= __GFP_NOWARN;
> -
> -	vaddr = dma_direct_alloc(dev, size, dma_handle, gfp, attrs);
> -	if (!vaddr)
> -		vaddr = swiotlb_alloc_buffer(dev, size, dma_handle, attrs);
> -	return vaddr;
> -}
> -
> -void swiotlb_free(struct device *dev, size_t size, void *vaddr,
> -		dma_addr_t dma_addr, unsigned long attrs)
> -{
> -	if (!swiotlb_free_buffer(dev, size, dma_addr))
> -		dma_direct_free(dev, size, vaddr, dma_addr, attrs);
> -}
> -
>   const struct dma_map_ops swiotlb_dma_ops = {
>   	.mapping_error		= dma_direct_mapping_error,
> -	.alloc			= swiotlb_alloc,
> -	.free			= swiotlb_free,
> +	.alloc			= dma_direct_alloc,
> +	.free			= dma_direct_free,
>   	.sync_single_for_cpu	= swiotlb_sync_single_for_cpu,
>   	.sync_single_for_device	= swiotlb_sync_single_for_device,
>   	.sync_sg_for_cpu	= swiotlb_sync_sg_for_cpu,
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 09/10] swiotlb: add support for non-coherent DMA
  2018-10-08  8:02 ` [PATCH 09/10] swiotlb: add support for non-coherent DMA Christoph Hellwig
  2018-10-19  0:49   ` Konrad Rzeszutek Wilk
@ 2018-10-22 17:11   ` Robin Murphy
  2018-10-26  8:04     ` Christoph Hellwig
  1 sibling, 1 reply; 57+ messages in thread
From: Robin Murphy @ 2018-10-22 17:11 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/2018 09:02, Christoph Hellwig wrote:
> Handle architectures that are not cache coherent directly in the main
> swiotlb code by calling arch_sync_dma_for_{device,cpu} in all the right
> places from the various dma_map/unmap/sync methods when the device is
> non-coherent.
> 
> Because swiotlb now uses dma_direct_alloc for the coherent allocation
> that side is already taken care of by the dma-direct code calling into
> arch_dma_{alloc,free} for devices that are non-coherent.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   kernel/dma/swiotlb.c | 23 +++++++++++++++++------
>   1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 475a41eff3dc..52885b274368 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -21,6 +21,7 @@
>   
>   #include <linux/cache.h>
>   #include <linux/dma-direct.h>
> +#include <linux/dma-noncoherent.h>
>   #include <linux/mm.h>
>   #include <linux/export.h>
>   #include <linux/spinlock.h>
> @@ -677,6 +678,10 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
>   		dma_addr = swiotlb_bounce_page(dev, &phys, size, dir, attrs);
>   	}
>   
> +	if (!dev_is_dma_coherent(dev) &&
> +	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)

Nit: other parts of the file are already using the "!(...)" style rather 
than "(...) == 0".

> +		arch_sync_dma_for_device(dev, phys, size, dir);
> +
>   	return dma_addr;
>   }
>   
> @@ -696,6 +701,10 @@ void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>   
>   	BUG_ON(dir == DMA_NONE);
>   
> +	if (!dev_is_dma_coherent(hwdev) &&
> +	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> +		arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
> +
>   	if (is_swiotlb_buffer(paddr)) {
>   		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
>   		return;
> @@ -732,15 +741,17 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
>   
>   	BUG_ON(dir == DMA_NONE);
>   
> -	if (is_swiotlb_buffer(paddr)) {
> +	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_CPU)
> +		arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
> +
> +	if (is_swiotlb_buffer(paddr))
>   		swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
> -		return;
> -	}
>   
> -	if (dir != DMA_FROM_DEVICE)
> -		return;
> +	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_DEVICE)
> +		arch_sync_dma_for_device(hwdev, paddr, size, dir);
>   
> -	dma_mark_clean(phys_to_virt(paddr), size);
> +	if (!is_swiotlb_buffer(paddr) && dir == DMA_FROM_DEVICE)
> +		dma_mark_clean(phys_to_virt(paddr), size);
>   }

All these "if"s end up pretty hard to follow at first glance :(

I had a quick play at moving the cache maintenance here out into the 
callers, which comes out arguably looking perhaps a little cleaner (only 
+1 source line overall, and actually reduces text size by 32 bytes for 
my build), but sadly I can't really see any way of doing the equivalent 
for map/unmap short of duplicating the whole 3-line arch_sync_*() block, 
which just makes for a different readability problem. As you mentioned 
on patch #7, I guess this really is just one of those things which has 
no nice solution, so cosmetics aside,

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

FWIW, below is my "cleanup" attempt (diff on top of the 
swiotlb-noncoherent.3 branch).

Robin.

----->8-----
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 52885b274368..43ee29969fdd 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -741,23 +741,24 @@ swiotlb_sync_single(struct device *hwdev, 
dma_addr_t dev_addr,

  	BUG_ON(dir == DMA_NONE);

-	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_CPU)
-		arch_sync_dma_for_cpu(hwdev, paddr, size, dir);
-
-	if (is_swiotlb_buffer(paddr))
+	if (is_swiotlb_buffer(paddr)) {
  		swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
+		return;
+	}

-	if (!dev_is_dma_coherent(hwdev) && target == SYNC_FOR_DEVICE)
-		arch_sync_dma_for_device(hwdev, paddr, size, dir);
+	if (dir != DMA_FROM_DEVICE)
+		return;

-	if (!is_swiotlb_buffer(paddr) && dir == DMA_FROM_DEVICE)
-		dma_mark_clean(phys_to_virt(paddr), size);
+	dma_mark_clean(phys_to_virt(paddr), size);
  }

  void
  swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr,
  			    size_t size, enum dma_data_direction dir)
  {
+	if (!dev_is_dma_coherent(hwdev))
+		arch_sync_dma_for_cpu(hwdev, dma_to_phys(hwdev, dev_addr),
+				      size, dir);
  	swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU);
  }

@@ -766,6 +767,9 @@ swiotlb_sync_single_for_device(struct device *hwdev, 
dma_addr_t dev_addr,
  			       size_t size, enum dma_data_direction dir)
  {
  	swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE);
+	if (!dev_is_dma_coherent(hwdev))
+		arch_sync_dma_for_device(hwdev, dma_to_phys(hwdev, dev_addr),
+					 size, dir);
  }

  /*
@@ -828,31 +832,28 @@ swiotlb_unmap_sg_attrs(struct device *hwdev, 
struct scatterlist *sgl,
   * The same as swiotlb_sync_single_* but for a scatter-gather list, 
same rules
   * and usage.
   */
-static void
-swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl,
-		int nelems, enum dma_data_direction dir,
-		enum dma_sync_target target)
+void
+swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sgl,
+			int nelems, enum dma_data_direction dir)
  {
  	struct scatterlist *sg;
  	int i;

  	for_each_sg(sgl, sg, nelems, i)
-		swiotlb_sync_single(hwdev, sg->dma_address,
-				    sg_dma_len(sg), dir, target);
+		swiotlb_sync_single_for_cpu(hwdev, sg->dma_address,
+					    sg_dma_len(sg), dir);
  }

  void
-swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg,
-			int nelems, enum dma_data_direction dir)
-{
-	swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU);
-}
-
-void
-swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
+swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sgl,
  			   int nelems, enum dma_data_direction dir)
  {
-	swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE);
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nelems, i)
+		swiotlb_sync_single_for_device(hwdev, sg->dma_address,
+					       sg_dma_len(sg), dir);
  }

  /*

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 10/10] arm64: use the generic swiotlb_dma_ops
  2018-10-08  8:02 ` [PATCH 10/10] arm64: use the generic swiotlb_dma_ops Christoph Hellwig
  2018-10-12 13:01   ` Robin Murphy
@ 2018-10-22 17:52   ` Robin Murphy
  2018-10-26 12:44     ` Christoph Hellwig
  1 sibling, 1 reply; 57+ messages in thread
From: Robin Murphy @ 2018-10-22 17:52 UTC (permalink / raw)
  To: Christoph Hellwig, Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk
  Cc: linux-arm-kernel, iommu, linux-kernel

On 08/10/2018 09:02, Christoph Hellwig wrote:
> Now that the generic swiotlb code supports non-coherent DMA we can switch
> to it for arm64.  For that we need to refactor the existing
> alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
> and implement the standard arch_sync_dma_for_{device,cpu} hooks for
> cache maintaincance in the streaming dma hooks, which also implies
> using the generic dma_coherent flag in struct device.
> 
> Note that we need to keep the old is_device_dma_coherent function around
> for now, so that the shared arm/arm64 Xen code keeps working.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/arm64/Kconfig                   |   4 +
>   arch/arm64/include/asm/device.h      |   1 -
>   arch/arm64/include/asm/dma-mapping.h |   7 +-
>   arch/arm64/mm/dma-mapping.c          | 257 +++++----------------------
>   4 files changed, 56 insertions(+), 213 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 1b1a0e95c751..c4db5131d837 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -11,6 +11,8 @@ config ARM64
>   	select ARCH_CLOCKSOURCE_DATA
>   	select ARCH_HAS_DEBUG_VIRTUAL
>   	select ARCH_HAS_DEVMEM_IS_ALLOWED
> +	select ARCH_HAS_DMA_COHERENT_TO_PFN
> +	select ARCH_HAS_DMA_MMAP_PGPROT
>   	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
>   	select ARCH_HAS_ELF_RANDOMIZE
>   	select ARCH_HAS_FAST_MULTIPLIER
> @@ -24,6 +26,8 @@ config ARM64
>   	select ARCH_HAS_SG_CHAIN
>   	select ARCH_HAS_STRICT_KERNEL_RWX
>   	select ARCH_HAS_STRICT_MODULE_RWX
> +	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
> +	select ARCH_HAS_SYNC_DMA_FOR_CPU
>   	select ARCH_HAS_SYSCALL_WRAPPER
>   	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
>   	select ARCH_HAVE_NMI_SAFE_CMPXCHG
> diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
> index 5a5fa47a6b18..3dd3d664c5c5 100644
> --- a/arch/arm64/include/asm/device.h
> +++ b/arch/arm64/include/asm/device.h
> @@ -23,7 +23,6 @@ struct dev_archdata {
>   #ifdef CONFIG_XEN
>   	const struct dma_map_ops *dev_dma_ops;
>   #endif
> -	bool dma_coherent;
>   };
>   
>   struct pdev_archdata {
> diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
> index b7847eb8a7bb..c41f3fb1446c 100644
> --- a/arch/arm64/include/asm/dma-mapping.h
> +++ b/arch/arm64/include/asm/dma-mapping.h
> @@ -44,10 +44,13 @@ void arch_teardown_dma_ops(struct device *dev);
>   #define arch_teardown_dma_ops	arch_teardown_dma_ops
>   #endif
>   
> -/* do not use this function in a driver */
> +/*
> + * Do not use this function in a driver, it is only provided for
> + * arch/arm/mm/xen.c, which is used by arm64 as well.
> + */
>   static inline bool is_device_dma_coherent(struct device *dev)
>   {
> -	return dev->archdata.dma_coherent;
> +	return dev->dma_coherent;
>   }
>   
>   #endif	/* __KERNEL__ */
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index eee6cfcfde9e..3c75d69b54e7 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -25,6 +25,7 @@
>   #include <linux/slab.h>
>   #include <linux/genalloc.h>
>   #include <linux/dma-direct.h>
> +#include <linux/dma-noncoherent.h>
>   #include <linux/dma-contiguous.h>
>   #include <linux/vmalloc.h>
>   #include <linux/swiotlb.h>
> @@ -32,16 +33,6 @@
>   
>   #include <asm/cacheflush.h>
>   
> -static int swiotlb __ro_after_init;
> -
> -static pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot,
> -				 bool coherent)
> -{
> -	if (!coherent || (attrs & DMA_ATTR_WRITE_COMBINE))
> -		return pgprot_writecombine(prot);
> -	return prot;
> -}
> -
>   static struct gen_pool *atomic_pool __ro_after_init;
>   
>   #define DEFAULT_DMA_COHERENT_POOL_SIZE  SZ_256K
> @@ -91,18 +82,16 @@ static int __free_from_pool(void *start, size_t size)
>   	return 1;
>   }
>   
> -static void *__dma_alloc(struct device *dev, size_t size,
> -			 dma_addr_t *dma_handle, gfp_t flags,
> -			 unsigned long attrs)
> +void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> +		gfp_t flags, unsigned long attrs)
>   {
>   	struct page *page;
>   	void *ptr, *coherent_ptr;
> -	bool coherent = is_device_dma_coherent(dev);
> -	pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, false);
> +	pgprot_t prot = pgprot_writecombine(PAGE_KERNEL);
>   
>   	size = PAGE_ALIGN(size);
>   
> -	if (!coherent && !gfpflags_allow_blocking(flags)) {
> +	if (!gfpflags_allow_blocking(flags)) {
>   		struct page *page = NULL;
>   		void *addr = __alloc_from_pool(size, &page, flags);
>   
> @@ -116,10 +105,6 @@ static void *__dma_alloc(struct device *dev, size_t size,
>   	if (!ptr)
>   		goto no_mem;
>   
> -	/* no need for non-cacheable mapping if coherent */
> -	if (coherent)
> -		return ptr;
> -
>   	/* remove any dirty cache lines on the kernel alias */
>   	__dma_flush_area(ptr, size);
>   
> @@ -138,125 +123,50 @@ static void *__dma_alloc(struct device *dev, size_t size,
>   	return NULL;
>   }
>   
> -static void __dma_free(struct device *dev, size_t size,
> -		       void *vaddr, dma_addr_t dma_handle,
> -		       unsigned long attrs)
> -{
> -	void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle));
> -
> -	size = PAGE_ALIGN(size);
> -
> -	if (!is_device_dma_coherent(dev)) {
> -		if (__free_from_pool(vaddr, size))
> -			return;
> -		vunmap(vaddr);
> -	}
> -	dma_direct_free_pages(dev, size, swiotlb_addr, dma_handle, attrs);
> -}
> -
> -static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
> -				     unsigned long offset, size_t size,
> -				     enum dma_data_direction dir,
> -				     unsigned long attrs)
> -{
> -	dma_addr_t dev_addr;
> -
> -	dev_addr = swiotlb_map_page(dev, page, offset, size, dir, attrs);
> -	if (!is_device_dma_coherent(dev) &&
> -	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		__dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> -
> -	return dev_addr;
> -}
> -
> -
> -static void __swiotlb_unmap_page(struct device *dev, dma_addr_t dev_addr,
> -				 size_t size, enum dma_data_direction dir,
> -				 unsigned long attrs)
> +void arch_dma_free(struct device *dev, size_t size, void *vaddr,
> +		dma_addr_t dma_handle, unsigned long attrs)
>   {
> -	if (!is_device_dma_coherent(dev) &&
> -	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		__dma_unmap_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> -	swiotlb_unmap_page(dev, dev_addr, size, dir, attrs);
> +	if (__free_from_pool(vaddr, PAGE_ALIGN(size)))
> +		return;
> +	vunmap(vaddr);
> +	dma_direct_free_pages(dev, size, vaddr, dma_handle, attrs);
>   }
>   
> -static int __swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
> -				  int nelems, enum dma_data_direction dir,
> -				  unsigned long attrs)
> +long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,

I realise I'm spotting this a bit late, but does this interface really 
need somewhat-atypical signed PFNs?

> +		dma_addr_t dma_addr)
>   {
> -	struct scatterlist *sg;
> -	int i, ret;
> -
> -	ret = swiotlb_map_sg_attrs(dev, sgl, nelems, dir, attrs);
> -	if (!is_device_dma_coherent(dev) &&
> -	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		for_each_sg(sgl, sg, ret, i)
> -			__dma_map_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
> -				       sg->length, dir);
> -
> -	return ret;
> +	return __phys_to_pfn(dma_to_phys(dev, dma_addr));
>   }
>   
> -static void __swiotlb_unmap_sg_attrs(struct device *dev,
> -				     struct scatterlist *sgl, int nelems,
> -				     enum dma_data_direction dir,
> -				     unsigned long attrs)
> +pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
> +		unsigned long attrs)
>   {
> -	struct scatterlist *sg;
> -	int i;
> -
> -	if (!is_device_dma_coherent(dev) &&
> -	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		for_each_sg(sgl, sg, nelems, i)
> -			__dma_unmap_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
> -					 sg->length, dir);
> -	swiotlb_unmap_sg_attrs(dev, sgl, nelems, dir, attrs);
> +	if (!dev_is_dma_coherent(dev) || (attrs & DMA_ATTR_WRITE_COMBINE))
> +		return pgprot_writecombine(prot);
> +	return prot;
>   }
>   
> -static void __swiotlb_sync_single_for_cpu(struct device *dev,
> -					  dma_addr_t dev_addr, size_t size,
> -					  enum dma_data_direction dir)
> +void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
> +		size_t size, enum dma_data_direction dir)
>   {
> -	if (!is_device_dma_coherent(dev))
> -		__dma_unmap_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> -	swiotlb_sync_single_for_cpu(dev, dev_addr, size, dir);
> +	__dma_map_area(phys_to_virt(paddr), size, dir);
>   }
>   
> -static void __swiotlb_sync_single_for_device(struct device *dev,
> -					     dma_addr_t dev_addr, size_t size,
> -					     enum dma_data_direction dir)
> +void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
> +		size_t size, enum dma_data_direction dir)
>   {
> -	swiotlb_sync_single_for_device(dev, dev_addr, size, dir);
> -	if (!is_device_dma_coherent(dev))
> -		__dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> +	__dma_unmap_area(phys_to_virt(paddr), size, dir);
>   }
>   
> -static void __swiotlb_sync_sg_for_cpu(struct device *dev,
> -				      struct scatterlist *sgl, int nelems,
> -				      enum dma_data_direction dir)
> +static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
> +				      struct page *page, size_t size)

If __iommu_get_sgtable() is now the only user, I'd say just inline these 
remnants there (or let that case call dma_common_sgtable() if that's 
what swiotlb_dma_ops are now relying on). Either way the "__swiotlb" is 
now a complete misnomer.

Other than that, and the aforementioned fixing of arch_dma_free(), it 
seems OK and has survived at least some light testing:

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

Thanks,
Robin.

>   {
> -	struct scatterlist *sg;
> -	int i;
> -
> -	if (!is_device_dma_coherent(dev))
> -		for_each_sg(sgl, sg, nelems, i)
> -			__dma_unmap_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
> -					 sg->length, dir);
> -	swiotlb_sync_sg_for_cpu(dev, sgl, nelems, dir);
> -}
> +	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
>   
> -static void __swiotlb_sync_sg_for_device(struct device *dev,
> -					 struct scatterlist *sgl, int nelems,
> -					 enum dma_data_direction dir)
> -{
> -	struct scatterlist *sg;
> -	int i;
> +	if (!ret)
> +		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
>   
> -	swiotlb_sync_sg_for_device(dev, sgl, nelems, dir);
> -	if (!is_device_dma_coherent(dev))
> -		for_each_sg(sgl, sg, nelems, i)
> -			__dma_map_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
> -				       sg->length, dir);
> +	return ret;
>   }
>   
>   static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
> @@ -277,74 +187,6 @@ static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
>   	return ret;
>   }
>   
> -static int __swiotlb_mmap(struct device *dev,
> -			  struct vm_area_struct *vma,
> -			  void *cpu_addr, dma_addr_t dma_addr, size_t size,
> -			  unsigned long attrs)
> -{
> -	int ret;
> -	unsigned long pfn = dma_to_phys(dev, dma_addr) >> PAGE_SHIFT;
> -
> -	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot,
> -					     is_device_dma_coherent(dev));
> -
> -	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
> -		return ret;
> -
> -	return __swiotlb_mmap_pfn(vma, pfn, size);
> -}
> -
> -static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
> -				      struct page *page, size_t size)
> -{
> -	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> -
> -	if (!ret)
> -		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
> -
> -	return ret;
> -}
> -
> -static int __swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
> -				 void *cpu_addr, dma_addr_t handle, size_t size,
> -				 unsigned long attrs)
> -{
> -	struct page *page = phys_to_page(dma_to_phys(dev, handle));
> -
> -	return __swiotlb_get_sgtable_page(sgt, page, size);
> -}
> -
> -static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
> -{
> -	if (swiotlb)
> -		return swiotlb_dma_supported(hwdev, mask);
> -	return 1;
> -}
> -
> -static int __swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t addr)
> -{
> -	if (swiotlb)
> -		return dma_direct_mapping_error(hwdev, addr);
> -	return 0;
> -}
> -
> -static const struct dma_map_ops arm64_swiotlb_dma_ops = {
> -	.alloc = __dma_alloc,
> -	.free = __dma_free,
> -	.mmap = __swiotlb_mmap,
> -	.get_sgtable = __swiotlb_get_sgtable,
> -	.map_page = __swiotlb_map_page,
> -	.unmap_page = __swiotlb_unmap_page,
> -	.map_sg = __swiotlb_map_sg_attrs,
> -	.unmap_sg = __swiotlb_unmap_sg_attrs,
> -	.sync_single_for_cpu = __swiotlb_sync_single_for_cpu,
> -	.sync_single_for_device = __swiotlb_sync_single_for_device,
> -	.sync_sg_for_cpu = __swiotlb_sync_sg_for_cpu,
> -	.sync_sg_for_device = __swiotlb_sync_sg_for_device,
> -	.dma_supported = __swiotlb_dma_supported,
> -	.mapping_error = __swiotlb_dma_mapping_error,
> -};
> -
>   static int __init atomic_pool_init(void)
>   {
>   	pgprot_t prot = __pgprot(PROT_NORMAL_NC);
> @@ -500,10 +342,6 @@ EXPORT_SYMBOL(dummy_dma_ops);
>   
>   static int __init arm64_dma_init(void)
>   {
> -	if (swiotlb_force == SWIOTLB_FORCE ||
> -	    max_pfn > (arm64_dma_phys_limit >> PAGE_SHIFT))
> -		swiotlb = 1;
> -
>   	WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
>   		   TAINT_CPU_OUT_OF_SPEC,
>   		   "ARCH_DMA_MINALIGN smaller than CTR_EL0.CWG (%d < %d)",
> @@ -528,7 +366,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>   				 dma_addr_t *handle, gfp_t gfp,
>   				 unsigned long attrs)
>   {
> -	bool coherent = is_device_dma_coherent(dev);
> +	bool coherent = dev_is_dma_coherent(dev);
>   	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
>   	size_t iosize = size;
>   	void *addr;
> @@ -569,7 +407,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>   			addr = NULL;
>   		}
>   	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> +		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
>   		struct page *page;
>   
>   		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
> @@ -596,7 +434,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>   						    size >> PAGE_SHIFT);
>   		}
>   	} else {
> -		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> +		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
>   		struct page **pages;
>   
>   		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> @@ -658,8 +496,7 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
>   	struct vm_struct *area;
>   	int ret;
>   
> -	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot,
> -					     is_device_dma_coherent(dev));
> +	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
>   
>   	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
>   		return ret;
> @@ -709,11 +546,11 @@ static void __iommu_sync_single_for_cpu(struct device *dev,
>   {
>   	phys_addr_t phys;
>   
> -	if (is_device_dma_coherent(dev))
> +	if (dev_is_dma_coherent(dev))
>   		return;
>   
>   	phys = iommu_iova_to_phys(iommu_get_domain_for_dev(dev), dev_addr);
> -	__dma_unmap_area(phys_to_virt(phys), size, dir);
> +	arch_sync_dma_for_cpu(dev, phys, size, dir);
>   }
>   
>   static void __iommu_sync_single_for_device(struct device *dev,
> @@ -722,11 +559,11 @@ static void __iommu_sync_single_for_device(struct device *dev,
>   {
>   	phys_addr_t phys;
>   
> -	if (is_device_dma_coherent(dev))
> +	if (dev_is_dma_coherent(dev))
>   		return;
>   
>   	phys = iommu_iova_to_phys(iommu_get_domain_for_dev(dev), dev_addr);
> -	__dma_map_area(phys_to_virt(phys), size, dir);
> +	arch_sync_dma_for_device(dev, phys, size, dir);
>   }
>   
>   static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
> @@ -734,7 +571,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
>   				   enum dma_data_direction dir,
>   				   unsigned long attrs)
>   {
> -	bool coherent = is_device_dma_coherent(dev);
> +	bool coherent = dev_is_dma_coherent(dev);
>   	int prot = dma_info_to_prot(dir, coherent, attrs);
>   	dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
>   
> @@ -762,11 +599,11 @@ static void __iommu_sync_sg_for_cpu(struct device *dev,
>   	struct scatterlist *sg;
>   	int i;
>   
> -	if (is_device_dma_coherent(dev))
> +	if (dev_is_dma_coherent(dev))
>   		return;
>   
>   	for_each_sg(sgl, sg, nelems, i)
> -		__dma_unmap_area(sg_virt(sg), sg->length, dir);
> +		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
>   }
>   
>   static void __iommu_sync_sg_for_device(struct device *dev,
> @@ -776,18 +613,18 @@ static void __iommu_sync_sg_for_device(struct device *dev,
>   	struct scatterlist *sg;
>   	int i;
>   
> -	if (is_device_dma_coherent(dev))
> +	if (dev_is_dma_coherent(dev))
>   		return;
>   
>   	for_each_sg(sgl, sg, nelems, i)
> -		__dma_map_area(sg_virt(sg), sg->length, dir);
> +		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
>   }
>   
>   static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
>   				int nelems, enum dma_data_direction dir,
>   				unsigned long attrs)
>   {
> -	bool coherent = is_device_dma_coherent(dev);
> +	bool coherent = dev_is_dma_coherent(dev);
>   
>   	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
>   		__iommu_sync_sg_for_device(dev, sgl, nelems, dir);
> @@ -879,9 +716,9 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
>   			const struct iommu_ops *iommu, bool coherent)
>   {
>   	if (!dev->dma_ops)
> -		dev->dma_ops = &arm64_swiotlb_dma_ops;
> +		dev->dma_ops = &swiotlb_dma_ops;
>   
> -	dev->archdata.dma_coherent = coherent;
> +	dev->dma_coherent = coherent;
>   	__iommu_setup_dma_ops(dev, dma_base, size, iommu);
>   
>   #ifdef CONFIG_XEN
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 09/10] swiotlb: add support for non-coherent DMA
  2018-10-22 17:11   ` Robin Murphy
@ 2018-10-26  8:04     ` Christoph Hellwig
  2018-10-26  9:59       ` Robin Murphy
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-26  8:04 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Will Deacon, Catalin Marinas,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu, linux-kernel

On Mon, Oct 22, 2018 at 06:11:04PM +0100, Robin Murphy wrote:
>>   +	if (!dev_is_dma_coherent(dev) &&
>> +	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
>
> Nit: other parts of the file are already using the "!(...)" style rather 
> than "(...) == 0".

Indeed.  Probably some kind of copy and past error..

> I had a quick play at moving the cache maintenance here out into the 
> callers, which comes out arguably looking perhaps a little cleaner (only +1 
> source line overall, and actually reduces text size by 32 bytes for my 
> build), but sadly I can't really see any way of doing the equivalent for 
> map/unmap short of duplicating the whole 3-line arch_sync_*() block, which 
> just makes for a different readability problem. As you mentioned on patch 
> #7, I guess this really is just one of those things which has no nice 
> solution, so cosmetics aside,

It looks pretty ok, but given that I want to send this version to
Linus for the merge window I don't really dare to touch it.

The master plan for 4.21 is to actually merge swiotlb into dma-direct
and to allow direct calls to dma-direct given how expensive indirect
calls are with spectre, and dma mappings show up badly in various
benchmarks.  I'll see if I can do that in a way following what you've
done.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 09/10] swiotlb: add support for non-coherent DMA
  2018-10-26  8:04     ` Christoph Hellwig
@ 2018-10-26  9:59       ` Robin Murphy
  0 siblings, 0 replies; 57+ messages in thread
From: Robin Murphy @ 2018-10-26  9:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Konrad Rzeszutek Wilk,
	linux-arm-kernel, iommu, linux-kernel

On 2018-10-26 9:04 am, Christoph Hellwig wrote:
> On Mon, Oct 22, 2018 at 06:11:04PM +0100, Robin Murphy wrote:
>>>    +	if (!dev_is_dma_coherent(dev) &&
>>> +	    (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
>>
>> Nit: other parts of the file are already using the "!(...)" style rather
>> than "(...) == 0".
> 
> Indeed.  Probably some kind of copy and past error..
> 
>> I had a quick play at moving the cache maintenance here out into the
>> callers, which comes out arguably looking perhaps a little cleaner (only +1
>> source line overall, and actually reduces text size by 32 bytes for my
>> build), but sadly I can't really see any way of doing the equivalent for
>> map/unmap short of duplicating the whole 3-line arch_sync_*() block, which
>> just makes for a different readability problem. As you mentioned on patch
>> #7, I guess this really is just one of those things which has no nice
>> solution, so cosmetics aside,
> 
> It looks pretty ok, but given that I want to send this version to
> Linus for the merge window I don't really dare to touch it.
> 
> The master plan for 4.21 is to actually merge swiotlb into dma-direct
> and to allow direct calls to dma-direct given how expensive indirect
> calls are with spectre, and dma mappings show up badly in various
> benchmarks.  I'll see if I can do that in a way following what you've
> done.

Right, if we're going to be coming back here soon to make it efficient, 
there's even less need to worry about how beautiful it is at this point 
when we're just making it functional. I'll save up all my optimisation 
urges for the next round :)

Robin.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 10/10] arm64: use the generic swiotlb_dma_ops
  2018-10-22 17:52   ` Robin Murphy
@ 2018-10-26 12:44     ` Christoph Hellwig
  0 siblings, 0 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-10-26 12:44 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Will Deacon, Catalin Marinas,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu, linux-kernel

On Mon, Oct 22, 2018 at 06:52:51PM +0100, Robin Murphy wrote:
>> scatterlist *sgl,
>> -				  int nelems, enum dma_data_direction dir,
>> -				  unsigned long attrs)
>> +long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,
>
> I realise I'm spotting this a bit late, but does this interface really need 
> somewhat-atypical signed PFNs?

It has been a while since I wrote this code, and I can't remember the
reason for it except that I'm pretty sure I copy and pasted it from
existing code somewhere.

Switching it to unsigned long should be fine, and patches are welcome.

>> +static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
>> +				      struct page *page, size_t size)
>
> If __iommu_get_sgtable() is now the only user, I'd say just inline these 
> remnants there (or let that case call dma_common_sgtable() if that's what 
> swiotlb_dma_ops are now relying on). Either way the "__swiotlb" is now a 
> complete misnomer.

It is.  But we hopefully kill both the only user and this helper in
the 4.21 merge window when moving the iommu dma api wrappers to common
code.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-10-08  8:02 ` [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs Christoph Hellwig
  2018-10-18 17:53   ` Robin Murphy
  2018-10-19  0:33   ` Konrad Rzeszutek Wilk
@ 2018-11-07  1:27   ` John Stultz
  2018-11-09  7:49     ` Christoph Hellwig
  2 siblings, 1 reply; 57+ messages in thread
From: John Stultz @ 2018-11-07  1:27 UTC (permalink / raw)
  To: hch
  Cc: Will Deacon, Catalin Marinas, robin.murphy, konrad.wilk,
	linux-arm-kernel, iommu, Linux Kernel Mailing List,
	Valentin Schneider

On Mon, Oct 8, 2018 at 1:04 AM Christoph Hellwig <hch@lst.de> wrote:
>
> No need to duplicate the code - map_sg is equivalent to map_page
> for each page in the scatterlist.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  kernel/dma/swiotlb.c | 34 ++++++++++++----------------------
>  1 file changed, 12 insertions(+), 22 deletions(-)

Hey all,

So, I've found this patch seems to break userspace booting on the
HiKey960 board. Initially I thought this was an issue with the mali
drivers, and have worked w/ the mali team to try to find a solution,
but I've since found that booting just the upstream kernel (with no
graphics support) will see userland hang/block unless this patch is
reverted.

When I see the hangs, it seems like the filesystems are stuck or
something, as kernel messages still show up and sometimes I can get to
a shell, but commands that I run in that shell (like ls) just hang. I
don't see any other error messages.

Reverting this patch then gets it work. In order to cleanly revert the
patch, I have to revert the following set:
  "arm64: use the generic swiotlb_dma_ops"
  "swiotlb: add support for non-coherent DMA"
  "swiotlb: don't dip into swiotlb pool for coherent allocations"
  "swiotlb: refactor swiotlb_map_page"
  "swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs"

But at that point if I just re-apply "swiotlb: use swiotlb_map_page in
swiotlb_map_sg_attrs", I reproduce the hangs.

Any suggestions for how to further debug what might be going wrong
would be appreciated!

thanks
-john

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-07  1:27   ` John Stultz
@ 2018-11-09  7:49     ` Christoph Hellwig
  2018-11-09 16:37       ` Robin Murphy
  2018-11-13  0:07       ` John Stultz
  0 siblings, 2 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-11-09  7:49 UTC (permalink / raw)
  To: John Stultz
  Cc: hch, Will Deacon, Catalin Marinas, robin.murphy, konrad.wilk,
	linux-arm-kernel, iommu, Linux Kernel Mailing List,
	Valentin Schneider

On Tue, Nov 06, 2018 at 05:27:14PM -0800, John Stultz wrote:
> But at that point if I just re-apply "swiotlb: use swiotlb_map_page in
> swiotlb_map_sg_attrs", I reproduce the hangs.
> 
> Any suggestions for how to further debug what might be going wrong
> would be appreciated!

Very odd.  In the end map_sg and map_page are defined to do the same
things to start with.  The only real issue we had in this area was:

"[PATCH v2] of/device: Really only set bus DMA mask when appropriate"

so with current mainline + that you still see a problem, and if you
rever the commit we are replying to it still goes away?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-09  7:49     ` Christoph Hellwig
@ 2018-11-09 16:37       ` Robin Murphy
  2018-11-19 19:36         ` Robin Murphy
  2018-11-13  0:07       ` John Stultz
  1 sibling, 1 reply; 57+ messages in thread
From: Robin Murphy @ 2018-11-09 16:37 UTC (permalink / raw)
  To: Christoph Hellwig, John Stultz
  Cc: Will Deacon, Catalin Marinas, konrad.wilk, linux-arm-kernel,
	iommu, Linux Kernel Mailing List, Valentin Schneider

On 09/11/2018 07:49, Christoph Hellwig wrote:
> On Tue, Nov 06, 2018 at 05:27:14PM -0800, John Stultz wrote:
>> But at that point if I just re-apply "swiotlb: use swiotlb_map_page in
>> swiotlb_map_sg_attrs", I reproduce the hangs.
>>
>> Any suggestions for how to further debug what might be going wrong
>> would be appreciated!
> 
> Very odd.  In the end map_sg and map_page are defined to do the same
> things to start with.  The only real issue we had in this area was:
> 
> "[PATCH v2] of/device: Really only set bus DMA mask when appropriate"
> 
> so with current mainline + that you still see a problem, and if you
> rever the commit we are replying to it still goes away?

OK, after quite a bit of trying I have managed to provoke a 
similar-looking problem with straight 4.20-rc1 on my Juno board - so far 
my "reproducer" is to decompress a ~10GB .tar.xz off an external USB 
hard disk, wherein after somewhere between 5 minutes and half an hour or 
so it tends to falls over with xz choking on corrupt data and/or a USB 
error.

 From the presentation, this really smells like there's some corner in 
which we're either missing cache maintenance or doing it to the wrong 
address - I've not seen any issues with Juno's main PCIe-attached I/O, 
but the EHCI here is non-coherent (and 32-bit, so the bus_dma_mask thing 
doesn't matter) as are the HiKey UFS and SD controller.

I'll keep digging...

Robin.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-09  7:49     ` Christoph Hellwig
  2018-11-09 16:37       ` Robin Murphy
@ 2018-11-13  0:07       ` John Stultz
  2018-11-13  0:26         ` John Stultz
  2018-11-14 14:13         ` Christoph Hellwig
  1 sibling, 2 replies; 57+ messages in thread
From: John Stultz @ 2018-11-13  0:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu,
	Linux Kernel Mailing List, Valentin Schneider

On Thu, Nov 8, 2018 at 11:49 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Tue, Nov 06, 2018 at 05:27:14PM -0800, John Stultz wrote:
>> But at that point if I just re-apply "swiotlb: use swiotlb_map_page in
>> swiotlb_map_sg_attrs", I reproduce the hangs.
>>
>> Any suggestions for how to further debug what might be going wrong
>> would be appreciated!
>
> Very odd.  In the end map_sg and map_page are defined to do the same
> things to start with.  The only real issue we had in this area was:
>
> "[PATCH v2] of/device: Really only set bus DMA mask when appropriate"
>
> so with current mainline + that you still see a problem, and if you
> rever the commit we are replying to it still goes away?

Just to confirm, as of 4.20-rc2 (which includes the of/device patch
above), I'm still seeing this issue, but it isn't as reliable to
reproduce as before.

With "swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs" reverted
(along with the dependent swiotlb patches) it doesn't seem to trigger
(no matter what I try).  But re-applying that patch it tends to
trigger by itself at boot up, but sometimes I have to run "find /" to
trigger the io hang/stall.

thanks
-john

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-13  0:07       ` John Stultz
@ 2018-11-13  0:26         ` John Stultz
  2018-11-14 14:13         ` Christoph Hellwig
  1 sibling, 0 replies; 57+ messages in thread
From: John Stultz @ 2018-11-13  0:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu,
	Linux Kernel Mailing List, Valentin Schneider

On Mon, Nov 12, 2018 at 4:07 PM, John Stultz <john.stultz@linaro.org> wrote:
> On Thu, Nov 8, 2018 at 11:49 PM, Christoph Hellwig <hch@lst.de> wrote:
>> On Tue, Nov 06, 2018 at 05:27:14PM -0800, John Stultz wrote:
>>> But at that point if I just re-apply "swiotlb: use swiotlb_map_page in
>>> swiotlb_map_sg_attrs", I reproduce the hangs.
>>>
>>> Any suggestions for how to further debug what might be going wrong
>>> would be appreciated!
>>
>> Very odd.  In the end map_sg and map_page are defined to do the same
>> things to start with.  The only real issue we had in this area was:
>>
>> "[PATCH v2] of/device: Really only set bus DMA mask when appropriate"
>>
>> so with current mainline + that you still see a problem, and if you
>> rever the commit we are replying to it still goes away?
>
> Just to confirm, as of 4.20-rc2 (which includes the of/device patch
> above), I'm still seeing this issue, but it isn't as reliable to
> reproduce as before.
>
> With "swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs" reverted
> (along with the dependent swiotlb patches) it doesn't seem to trigger
> (no matter what I try).  But re-applying that patch it tends to
> trigger by itself at boot up, but sometimes I have to run "find /" to
> trigger the io hang/stall.

Also, I wanted to mention I've not seen this issue on the original
hikey board. So far only on the hikey960, which might mean this is
tied to something in the hisi UFS driver?

thanks
-john

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-13  0:07       ` John Stultz
  2018-11-13  0:26         ` John Stultz
@ 2018-11-14 14:13         ` Christoph Hellwig
  2018-11-14 16:12           ` Christoph Hellwig
  1 sibling, 1 reply; 57+ messages in thread
From: Christoph Hellwig @ 2018-11-14 14:13 UTC (permalink / raw)
  To: John Stultz
  Cc: Christoph Hellwig, Will Deacon, Catalin Marinas, Robin Murphy,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu,
	Linux Kernel Mailing List, Valentin Schneider

Does the patch below make a difference for you?  Assigning an
address to the S/G list is the only functional difference I could
spot.  Drivers really should never look at the S/G list on an
error return, but..

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 75c82f050c9e..a896f46d0c31 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -788,7 +788,7 @@ swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl, int nelems,
 	for_each_sg(sgl, sg, nelems, i) {
 		dma_addr_t dma_addr;
 
-		dma_add = swiotlb_map_page(dev, sg_page(sg), sg->offset,
+		dma_addr = swiotlb_map_page(dev, sg_page(sg), sg->offset,
 				sg->length, dir, attrs);
 		if (dma_addr == DIRECT_MAPPING_ERROR)
 			goto out_error;

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-14 14:13         ` Christoph Hellwig
@ 2018-11-14 16:12           ` Christoph Hellwig
  2018-11-19 23:22             ` John Stultz
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Hellwig @ 2018-11-14 16:12 UTC (permalink / raw)
  To: John Stultz
  Cc: Christoph Hellwig, Will Deacon, Catalin Marinas, Robin Murphy,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu,
	Linux Kernel Mailing List, Valentin Schneider

On Wed, Nov 14, 2018 at 03:13:11PM +0100, Christoph Hellwig wrote:
> Does the patch below make a difference for you?  Assigning an
> address to the S/G list is the only functional difference I could
> spot.  Drivers really should never look at the S/G list on an
> error return, but..

And that was obviously just the second half of the patch, so here
is the full one:

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 5731daa09a32..a896f46d0c31 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -786,10 +786,13 @@ swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl, int nelems,
 	int i;
 
 	for_each_sg(sgl, sg, nelems, i) {
-		sg->dma_address = swiotlb_map_page(dev, sg_page(sg), sg->offset,
+		dma_addr_t dma_addr;
+
+		dma_addr = swiotlb_map_page(dev, sg_page(sg), sg->offset,
 				sg->length, dir, attrs);
-		if (sg->dma_address == DIRECT_MAPPING_ERROR)
+		if (dma_addr == DIRECT_MAPPING_ERROR)
 			goto out_error;
+		sg->dma_address = dma_addr;
 		sg_dma_len(sg) = sg->length;
 	}
 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-09 16:37       ` Robin Murphy
@ 2018-11-19 19:36         ` Robin Murphy
  2018-11-20  9:22           ` Christoph Hellwig
  0 siblings, 1 reply; 57+ messages in thread
From: Robin Murphy @ 2018-11-19 19:36 UTC (permalink / raw)
  To: Christoph Hellwig, John Stultz
  Cc: konrad.wilk, Catalin Marinas, Will Deacon,
	Linux Kernel Mailing List, iommu, Valentin Schneider,
	linux-arm-kernel

On 09/11/2018 16:37, Robin Murphy wrote:
> On 09/11/2018 07:49, Christoph Hellwig wrote:
>> On Tue, Nov 06, 2018 at 05:27:14PM -0800, John Stultz wrote:
>>> But at that point if I just re-apply "swiotlb: use swiotlb_map_page in
>>> swiotlb_map_sg_attrs", I reproduce the hangs.
>>>
>>> Any suggestions for how to further debug what might be going wrong
>>> would be appreciated!
>>
>> Very odd.  In the end map_sg and map_page are defined to do the same
>> things to start with.  The only real issue we had in this area was:
>>
>> "[PATCH v2] of/device: Really only set bus DMA mask when appropriate"
>>
>> so with current mainline + that you still see a problem, and if you
>> rever the commit we are replying to it still goes away?
> 
> OK, after quite a bit of trying I have managed to provoke a 
> similar-looking problem with straight 4.20-rc1 on my Juno board - so far 
> my "reproducer" is to decompress a ~10GB .tar.xz off an external USB 
> hard disk, wherein after somewhere between 5 minutes and half an hour or 
> so it tends to falls over with xz choking on corrupt data and/or a USB 
> error.
> 
>  From the presentation, this really smells like there's some corner in 
> which we're either missing cache maintenance or doing it to the wrong 
> address - I've not seen any issues with Juno's main PCIe-attached I/O, 
> but the EHCI here is non-coherent (and 32-bit, so the bus_dma_mask thing 
> doesn't matter) as are the HiKey UFS and SD controller.
> 
> I'll keep digging...

OK, having brought my Hikey to life and reproduced John's stall with 
rc1, what's going on is that at some point dma_map_sg() returns 0, which 
causes the SCSI/UFS layer to go round in circles repeatedly trying to 
map the same list(s) equally unsuccessfully.

Why does dma_map_sg() fail? Turns out what we all managed to overlook is 
that this patch *does* introduce a subtle change in behaviour, in that 
previously the non-bounced case assigned dev_addr to sg->dma_address 
without looking at it; now with the swiotlb_map_page() call we check the 
return value against DIRECT_MAPPING_ERROR regardless of whether it was 
bounced or not.

Flash back to the other thread when I said "...but I suspect there may 
well be non-IOMMU platforms where DMA to physical address 0 is a thing 
:("? I have the 3GB Hikey where all the RAM is below 32 bits so SWIOTLB 
never ever bounces, but sure enough, guess where that RAM starts...

So in fact it looks like patch #4 technically introduces the first 
instance of this problem, we're just getting lucky not to hit it with a 
map_page/map_single case such that direct_mapping_error() would wrongly 
report failure for page 0. The bad news (for me) is that that can't have 
anything to do with my apparent memory corruption thing above, so now I 
still need to figure out what the hell is going on there.

Robin.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-14 16:12           ` Christoph Hellwig
@ 2018-11-19 23:22             ` John Stultz
  2018-11-20  9:25               ` Christoph Hellwig
  0 siblings, 1 reply; 57+ messages in thread
From: John Stultz @ 2018-11-19 23:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Will Deacon, Catalin Marinas, Robin Murphy,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu, lkml,
	Valentin Schneider

On Wed, Nov 14, 2018 at 8:12 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Wed, Nov 14, 2018 at 03:13:11PM +0100, Christoph Hellwig wrote:
> > Does the patch below make a difference for you?  Assigning an
> > address to the S/G list is the only functional difference I could
> > spot.  Drivers really should never look at the S/G list on an
> > error return, but..
>
> And that was obviously just the second half of the patch, so here
> is the full one:
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 5731daa09a32..a896f46d0c31 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -786,10 +786,13 @@ swiotlb_map_sg_attrs(struct device *dev, struct scatterlist *sgl, int nelems,
>         int i;
>
>         for_each_sg(sgl, sg, nelems, i) {
> -               sg->dma_address = swiotlb_map_page(dev, sg_page(sg), sg->offset,
> +               dma_addr_t dma_addr;
> +
> +               dma_addr = swiotlb_map_page(dev, sg_page(sg), sg->offset,
>                                 sg->length, dir, attrs);
> -               if (sg->dma_address == DIRECT_MAPPING_ERROR)
> +               if (dma_addr == DIRECT_MAPPING_ERROR)
>                         goto out_error;
> +               sg->dma_address = dma_addr;
>                 sg_dma_len(sg) = sg->length;
>         }

I know Robin has already replied with more detailed info, but just to
close the loop as I'm finally home, applying this patch didn't seem to
help with the IO hangs I'm seeing w/ HiKey960.

thanks
-john

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-19 19:36         ` Robin Murphy
@ 2018-11-20  9:22           ` Christoph Hellwig
  0 siblings, 0 replies; 57+ messages in thread
From: Christoph Hellwig @ 2018-11-20  9:22 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, John Stultz, konrad.wilk, Catalin Marinas,
	Will Deacon, Linux Kernel Mailing List, iommu,
	Valentin Schneider, linux-arm-kernel

On Mon, Nov 19, 2018 at 07:36:44PM +0000, Robin Murphy wrote:
> OK, having brought my Hikey to life and reproduced John's stall with rc1, 
> what's going on is that at some point dma_map_sg() returns 0, which causes 
> the SCSI/UFS layer to go round in circles repeatedly trying to map the same 
> list(s) equally unsuccessfully.
>
> Why does dma_map_sg() fail? Turns out what we all managed to overlook is 
> that this patch *does* introduce a subtle change in behaviour, in that 
> previously the non-bounced case assigned dev_addr to sg->dma_address 
> without looking at it; now with the swiotlb_map_page() call we check the 
> return value against DIRECT_MAPPING_ERROR regardless of whether it was 
> bounced or not.
>
> Flash back to the other thread when I said "...but I suspect there may well 
> be non-IOMMU platforms where DMA to physical address 0 is a thing :("? I 
> have the 3GB Hikey where all the RAM is below 32 bits so SWIOTLB never ever 
> bounces, but sure enough, guess where that RAM starts...

What is PAGE_OFFSET on that machine?   We usually don't use kernel
virtual address 0 so that we can deal with 0 pointer derferences sanely,
but I guess we can get to physical address 0.

I guess the quick fix would be to move DMA_DIRECT_MAPPING_ERROR to all-F
ASAP..

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-19 23:22             ` John Stultz
@ 2018-11-20  9:25               ` Christoph Hellwig
  2018-11-23 18:27                 ` Will Deacon
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Hellwig @ 2018-11-20  9:25 UTC (permalink / raw)
  To: John Stultz
  Cc: Christoph Hellwig, Will Deacon, Catalin Marinas, Robin Murphy,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu, lkml,
	Valentin Schneider

On Mon, Nov 19, 2018 at 03:22:13PM -0800, John Stultz wrote:
> > +               sg->dma_address = dma_addr;
> >                 sg_dma_len(sg) = sg->length;
> >         }
> 
> I know Robin has already replied with more detailed info, but just to
> close the loop as I'm finally home, applying this patch didn't seem to
> help with the IO hangs I'm seeing w/ HiKey960.

If Robins observation is right this should fix the problem for you:

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index bd73e7a91410..1833f0c1fba0 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -5,7 +5,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/mem_encrypt.h>
 
-#define DIRECT_MAPPING_ERROR		0
+#define DIRECT_MAPPING_ERROR		(~(dma_addr_t)0x0)
 
 #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
 #include <asm/dma-direct.h>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-20  9:25               ` Christoph Hellwig
@ 2018-11-23 18:27                 ` Will Deacon
  2018-11-23 19:34                   ` Robin Murphy
  0 siblings, 1 reply; 57+ messages in thread
From: Will Deacon @ 2018-11-23 18:27 UTC (permalink / raw)
  To: Christoph Hellwig, john.stultz
  Cc: Catalin Marinas, Robin Murphy, Konrad Rzeszutek Wilk,
	linux-arm-kernel, iommu, lkml, Valentin Schneider

Hi John,

On Tue, Nov 20, 2018 at 10:25:16AM +0100, Christoph Hellwig wrote:
> On Mon, Nov 19, 2018 at 03:22:13PM -0800, John Stultz wrote:
> > > +               sg->dma_address = dma_addr;
> > >                 sg_dma_len(sg) = sg->length;
> > >         }
> > 
> > I know Robin has already replied with more detailed info, but just to
> > close the loop as I'm finally home, applying this patch didn't seem to
> > help with the IO hangs I'm seeing w/ HiKey960.
> 
> If Robins observation is right this should fix the problem for you:

Please could you give this diff a try and let us know whether the problem
persists with your board?

Thanks,

Will

> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index bd73e7a91410..1833f0c1fba0 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -5,7 +5,7 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/mem_encrypt.h>
>  
> -#define DIRECT_MAPPING_ERROR		0
> +#define DIRECT_MAPPING_ERROR		(~(dma_addr_t)0x0)
>  
>  #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
>  #include <asm/dma-direct.h>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-23 18:27                 ` Will Deacon
@ 2018-11-23 19:34                   ` Robin Murphy
  2018-11-26 19:31                     ` Will Deacon
  0 siblings, 1 reply; 57+ messages in thread
From: Robin Murphy @ 2018-11-23 19:34 UTC (permalink / raw)
  To: Will Deacon, Christoph Hellwig, john.stultz
  Cc: Catalin Marinas, Konrad Rzeszutek Wilk, linux-arm-kernel, iommu,
	lkml, Valentin Schneider

Hi Will,

On 2018-11-23 6:27 pm, Will Deacon wrote:
> Hi John,
> 
> On Tue, Nov 20, 2018 at 10:25:16AM +0100, Christoph Hellwig wrote:
>> On Mon, Nov 19, 2018 at 03:22:13PM -0800, John Stultz wrote:
>>>> +               sg->dma_address = dma_addr;
>>>>                  sg_dma_len(sg) = sg->length;
>>>>          }
>>>
>>> I know Robin has already replied with more detailed info, but just to
>>> close the loop as I'm finally home, applying this patch didn't seem to
>>> help with the IO hangs I'm seeing w/ HiKey960.
>>
>> If Robins observation is right this should fix the problem for you:
> 
> Please could you give this diff a try and let us know whether the problem
> persists with your board?

This is actually the exact same change that I ended up with in my 
fixes[1], which John has indeed confirmed.

(sorry, it's the thing I was telling you about the other day, but I 
neglected to include you on cc you when I sent the patches out the 
following morning)

Robin.

[1] https://lore.kernel.org/lkml/cover.1542812807.git.robin.murphy@arm.com/

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  2018-11-23 19:34                   ` Robin Murphy
@ 2018-11-26 19:31                     ` Will Deacon
  0 siblings, 0 replies; 57+ messages in thread
From: Will Deacon @ 2018-11-26 19:31 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, john.stultz, Catalin Marinas,
	Konrad Rzeszutek Wilk, linux-arm-kernel, iommu, lkml,
	Valentin Schneider

Hi Robin,

On Fri, Nov 23, 2018 at 07:34:15PM +0000, Robin Murphy wrote:
> On 2018-11-23 6:27 pm, Will Deacon wrote:
> >On Tue, Nov 20, 2018 at 10:25:16AM +0100, Christoph Hellwig wrote:
> >>On Mon, Nov 19, 2018 at 03:22:13PM -0800, John Stultz wrote:
> >>>>+               sg->dma_address = dma_addr;
> >>>>                 sg_dma_len(sg) = sg->length;
> >>>>         }
> >>>
> >>>I know Robin has already replied with more detailed info, but just to
> >>>close the loop as I'm finally home, applying this patch didn't seem to
> >>>help with the IO hangs I'm seeing w/ HiKey960.
> >>
> >>If Robins observation is right this should fix the problem for you:
> >
> >Please could you give this diff a try and let us know whether the problem
> >persists with your board?
> 
> This is actually the exact same change that I ended up with in my fixes[1],
> which John has indeed confirmed.
> 
> (sorry, it's the thing I was telling you about the other day, but I
> neglected to include you on cc you when I sent the patches out the following
> morning)

Thanks, I missed that series. I'll forget all about this problem then!

Will

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2018-11-26 19:31 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-08  8:02 move swiotlb noncoherent dma support from arm64 to generic code V2 Christoph Hellwig
2018-10-08  8:02 ` [PATCH 01/10] swiotlb: remove a pointless comment Christoph Hellwig
2018-10-11 17:49   ` Robin Murphy
2018-10-19  0:09   ` Konrad Rzeszutek Wilk
2018-10-08  8:02 ` [PATCH 02/10] swiotlb: mark is_swiotlb_buffer static Christoph Hellwig
2018-10-11 17:54   ` Robin Murphy
2018-10-19  0:12   ` Konrad Rzeszutek Wilk
2018-10-08  8:02 ` [PATCH 03/10] swiotlb: do not panic on mapping failures Christoph Hellwig
2018-10-11 18:06   ` Robin Murphy
2018-10-19  0:18     ` Konrad Rzeszutek Wilk
2018-10-19  0:17   ` Konrad Rzeszutek Wilk
2018-10-19  6:04     ` Christoph Hellwig
2018-10-19 13:45       ` Konrad Rzeszutek Wilk
2018-10-08  8:02 ` [PATCH 04/10] swiotlb: remove the overflow buffer Christoph Hellwig
2018-10-11 18:19   ` Robin Murphy
2018-10-12 17:04   ` Catalin Marinas
2018-10-19  0:23   ` Konrad Rzeszutek Wilk
2018-10-08  8:02 ` [PATCH 05/10] swiotlb: merge swiotlb_unmap_page and unmap_single Christoph Hellwig
2018-10-18 17:44   ` Robin Murphy
2018-10-19  0:25   ` Konrad Rzeszutek Wilk
2018-10-08  8:02 ` [PATCH 06/10] swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs Christoph Hellwig
2018-10-18 17:53   ` Robin Murphy
2018-10-19  0:33   ` Konrad Rzeszutek Wilk
2018-11-07  1:27   ` John Stultz
2018-11-09  7:49     ` Christoph Hellwig
2018-11-09 16:37       ` Robin Murphy
2018-11-19 19:36         ` Robin Murphy
2018-11-20  9:22           ` Christoph Hellwig
2018-11-13  0:07       ` John Stultz
2018-11-13  0:26         ` John Stultz
2018-11-14 14:13         ` Christoph Hellwig
2018-11-14 16:12           ` Christoph Hellwig
2018-11-19 23:22             ` John Stultz
2018-11-20  9:25               ` Christoph Hellwig
2018-11-23 18:27                 ` Will Deacon
2018-11-23 19:34                   ` Robin Murphy
2018-11-26 19:31                     ` Will Deacon
2018-10-08  8:02 ` [PATCH 07/10] swiotlb: refactor swiotlb_map_page Christoph Hellwig
2018-10-18 18:09   ` Robin Murphy
2018-10-19  0:37     ` Konrad Rzeszutek Wilk
2018-10-19  6:52       ` Christoph Hellwig
2018-10-19 13:46         ` Konrad Rzeszutek Wilk
2018-10-08  8:02 ` [PATCH 08/10] swiotlb: don't dip into swiotlb pool for coherent allocations Christoph Hellwig
2018-10-12 17:04   ` Catalin Marinas
2018-10-19  0:40   ` Konrad Rzeszutek Wilk
2018-10-19 16:45   ` Robin Murphy
2018-10-08  8:02 ` [PATCH 09/10] swiotlb: add support for non-coherent DMA Christoph Hellwig
2018-10-19  0:49   ` Konrad Rzeszutek Wilk
2018-10-22 17:11   ` Robin Murphy
2018-10-26  8:04     ` Christoph Hellwig
2018-10-26  9:59       ` Robin Murphy
2018-10-08  8:02 ` [PATCH 10/10] arm64: use the generic swiotlb_dma_ops Christoph Hellwig
2018-10-12 13:01   ` Robin Murphy
2018-10-12 14:40     ` Christoph Hellwig
2018-10-12 17:05       ` Catalin Marinas
2018-10-22 17:52   ` Robin Murphy
2018-10-26 12:44     ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).