* implement generic dma_map_ops for IOMMUs v3
@ 2019-04-22 17:59 Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 01/26] arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable Christoph Hellwig
                   ` (26 more replies)
  0 siblings, 27 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

Hi Robin,

please take a look at this series, which implements a completely generic
set of dma_map_ops for IOMMU drivers.  This is done by taking the
existing arm64 code, moving it to drivers/iommu and then massaging it
so that it can also work for architectures with DMA remapping.  This
should help future ports support IOMMUs more easily, and also allow
removing various custom IOMMU dma_map_ops implementations, as Tom was
planning to do for the AMD one.
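
To give a feel for the end state, the per-architecture hook boils down
to roughly the following once the generic ops are in place (sketch only,
modelled on the arm64 conversion in patch 07; other architectures would
look similar):

	void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
				const struct iommu_ops *iommu, bool coherent)
	{
		dev->dma_coherent = coherent;
		if (iommu)
			iommu_setup_dma_ops(dev, dma_base, size);
	}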

A git tree is also available at:

    git://git.infradead.org/users/hch/misc.git dma-iommu-ops.3

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-iommu-ops.3

Changes since v2:
 - address various review comments and include patches from Robin

Changes since v1:
 - only include other headers in dma-iommu.h if CONFIG_IOMMU_DMA is enabled
 - keep using a scatterlist in iommu_dma_alloc
 - split out mmap/sgtable fixes and move them early in the series
 - updated a few commit logs


* [PATCH 01/26] arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 02/26] arm64/iommu: improve mmap bounds checking Christoph Hellwig
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

DMA allocations that can't sleep may return non-remapped addresses, but
we do not properly handle them in the mmap and get_sgtable methods.
Resolve non-vmalloc addresses using virt_to_page to handle this corner
case.
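
For illustration (not part of the patch), this is the kind of caller
that ends up on the non-remapped path: a no-sleep allocation for a
coherent device comes straight from the page allocator, so the CPU
address is a plain lowmem pointer rather than a vmalloc remap:

	/* hypothetical caller, for illustration only */
	void *buf = dma_alloc_attrs(dev, size, &dma_handle, GFP_ATOMIC, 0);

	/*
	 * For a coherent device this is page_address() of the allocated
	 * pages, so a later dma_mmap_attrs() or dma_get_sgtable_attrs()
	 * call has to resolve it with virt_to_page() instead of
	 * find_vm_area().
	 */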

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 arch/arm64/mm/dma-mapping.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 78c0a72f822c..674860e3e478 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -249,6 +249,11 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
 	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
 		return ret;
 
+	if (!is_vmalloc_addr(cpu_addr)) {
+		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
+		return __swiotlb_mmap_pfn(vma, pfn, size);
+	}
+
 	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
 		/*
 		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
@@ -272,6 +277,11 @@ static int __iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
 	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	struct vm_struct *area = find_vm_area(cpu_addr);
 
+	if (!is_vmalloc_addr(cpu_addr)) {
+		struct page *page = virt_to_page(cpu_addr);
+		return __swiotlb_get_sgtable_page(sgt, page, size);
+	}
+
 	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
 		/*
 		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-- 
2.20.1



* [PATCH 02/26] arm64/iommu: improve mmap bounds checking
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 01/26] arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 12:35   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 03/26] dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence Christoph Hellwig
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

The nr_pages checks should be done for all mmap requests, not just those
using remap_pfn_range.
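
As a worked example (illustration only): for a 16K buffer nr_pages is 4,
so an mmap request with vm_pgoff == 3 covering two pages reaches past the
end of the buffer and must fail with -ENXIO no matter which mapping path
is taken.  The check added below, shown standalone:

	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;	/* 4 */
	unsigned long off = vma->vm_pgoff;				/* 3 */

	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)	/* 2 > 1 */
		return -ENXIO;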

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/mm/dma-mapping.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 674860e3e478..604c638b2787 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -73,19 +73,9 @@ static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
 static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
 			      unsigned long pfn, size_t size)
 {
-	int ret = -ENXIO;
-	unsigned long nr_vma_pages = vma_pages(vma);
-	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	unsigned long off = vma->vm_pgoff;
-
-	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
-		ret = remap_pfn_range(vma, vma->vm_start,
-				      pfn + off,
-				      vma->vm_end - vma->vm_start,
-				      vma->vm_page_prot);
-	}
-
-	return ret;
+	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
+			      vma->vm_end - vma->vm_start,
+			      vma->vm_page_prot);
 }
 #endif /* CONFIG_IOMMU_DMA */
 
@@ -241,6 +231,8 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
 			      void *cpu_addr, dma_addr_t dma_addr, size_t size,
 			      unsigned long attrs)
 {
+	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned long off = vma->vm_pgoff;
 	struct vm_struct *area;
 	int ret;
 
@@ -249,6 +241,9 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
 	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
 		return ret;
 
+	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
+		return -ENXIO;
+
 	if (!is_vmalloc_addr(cpu_addr)) {
 		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
 		return __swiotlb_mmap_pfn(vma, pfn, size);
-- 
2.20.1



* [PATCH 03/26] dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 01/26] arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 02/26] arm64/iommu: improve mmap bounds checking Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 04/26] iommu/dma: Cleanup dma-iommu.h Christoph Hellwig
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

Add a Kconfig symbol that indicates an architecture provides an
arch_dma_prep_coherent implementation, and provide a stub otherwise.

This will allow the generic dma-iommu code to use it while still
allowing the code to be built for cache-coherent architectures.
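
The practical effect is that common code can call the hook
unconditionally and let it compile away on fully coherent
architectures (sketch, not part of this patch):

	#include <linux/dma-noncoherent.h>

	/*
	 * No #ifdef needed in the caller; this is a no-op stub unless
	 * the architecture selects ARCH_HAS_DMA_PREP_COHERENT.
	 */
	arch_dma_prep_coherent(page, PAGE_SIZE);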

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 arch/arm64/Kconfig              | 1 +
 arch/csky/Kconfig               | 1 +
 include/linux/dma-noncoherent.h | 6 ++++++
 kernel/dma/Kconfig              | 3 +++
 4 files changed, 11 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7e34b9eba5de..adda078d6df7 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -13,6 +13,7 @@ config ARM64
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_DMA_COHERENT_TO_PFN
 	select ARCH_HAS_DMA_MMAP_PGPROT
+	select ARCH_HAS_DMA_PREP_COHERENT
 	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FAST_MULTIPLIER
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index 725a115759c9..2c3178848b7d 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -1,6 +1,7 @@
 config CSKY
 	def_bool y
 	select ARCH_32BIT_OFF_T
+	select ARCH_HAS_DMA_PREP_COHERENT
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select ARCH_USE_BUILTIN_BSWAP
diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h
index 69b36ed31a99..9741767e400f 100644
--- a/include/linux/dma-noncoherent.h
+++ b/include/linux/dma-noncoherent.h
@@ -72,6 +72,12 @@ static inline void arch_sync_dma_for_cpu_all(struct device *dev)
 }
 #endif /* CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL */
 
+#ifdef CONFIG_ARCH_HAS_DMA_PREP_COHERENT
 void arch_dma_prep_coherent(struct page *page, size_t size);
+#else
+static inline void arch_dma_prep_coherent(struct page *page, size_t size)
+{
+}
+#endif /* CONFIG_ARCH_HAS_DMA_PREP_COHERENT */
 
 #endif /* _LINUX_DMA_NONCOHERENT_H */
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index a06ba3013b3b..feff2d21d8ee 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -38,6 +38,9 @@ config ARCH_HAS_SYNC_DMA_FOR_CPU
 config ARCH_HAS_SYNC_DMA_FOR_CPU_ALL
 	bool
 
+config ARCH_HAS_DMA_PREP_COHERENT
+	bool
+
 config ARCH_HAS_DMA_COHERENT_TO_PFN
 	bool
 
-- 
2.20.1



* [PATCH 04/26] iommu/dma: Cleanup dma-iommu.h
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 03/26] dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 05/26] iommu/dma: Remove the flush_page callback Christoph Hellwig
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

There is no need for a __KERNEL__ guard outside uapi headers, so drop
it, and add a missing comment describing the #else cpp statement.  Last
but not least, include <linux/errno.h> instead of the asm version, which
is frowned upon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 include/linux/dma-iommu.h | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index e760dc5d1fa8..8741637941ca 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -16,9 +16,8 @@
 #ifndef __DMA_IOMMU_H
 #define __DMA_IOMMU_H
 
-#ifdef __KERNEL__
+#include <linux/errno.h>
 #include <linux/types.h>
-#include <asm/errno.h>
 
 #ifdef CONFIG_IOMMU_DMA
 #include <linux/dma-mapping.h>
@@ -74,7 +73,7 @@ void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
 void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
-#else
+#else /* CONFIG_IOMMU_DMA */
 
 struct iommu_domain;
 struct msi_msg;
@@ -108,5 +107,4 @@ static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_he
 }
 
 #endif	/* CONFIG_IOMMU_DMA */
-#endif	/* __KERNEL__ */
 #endif	/* __DMA_IOMMU_H */
-- 
2.20.1



* [PATCH 05/26] iommu/dma: Remove the flush_page callback
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 04/26] iommu/dma: Cleanup dma-iommu.h Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 06/26] iommu/dma: Use for_each_sg in iommu_dma_alloc Christoph Hellwig
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

We now have an arch_dma_prep_coherent architecture hook that is used
by the generic DMA remap allocator, and we should use the same
interface for the dma-iommu code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 arch/arm64/mm/dma-mapping.c | 8 +-------
 drivers/iommu/dma-iommu.c   | 8 +++-----
 include/linux/dma-iommu.h   | 3 +--
 3 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 604c638b2787..636fa7c64370 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -94,12 +94,6 @@ arch_initcall(arm64_dma_init);
 #include <linux/platform_device.h>
 #include <linux/amba/bus.h>
 
-/* Thankfully, all cache ops are by VA so we can ignore phys here */
-static void flush_page(struct device *dev, const void *virt, phys_addr_t phys)
-{
-	__dma_flush_area(virt, PAGE_SIZE);
-}
-
 static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 				 dma_addr_t *handle, gfp_t gfp,
 				 unsigned long attrs)
@@ -176,7 +170,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 		struct page **pages;
 
 		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
-					handle, flush_page);
+					handle);
 		if (!pages)
 			return NULL;
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..77d704c8f565 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -22,6 +22,7 @@
 #include <linux/acpi_iort.h>
 #include <linux/device.h>
 #include <linux/dma-iommu.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/gfp.h>
 #include <linux/huge_mm.h>
 #include <linux/iommu.h>
@@ -531,8 +532,6 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * @attrs: DMA attributes for this allocation
  * @prot: IOMMU mapping flags
  * @handle: Out argument for allocated DMA handle
- * @flush_page: Arch callback which must ensure PAGE_SIZE bytes from the
- *		given VA/PA are visible to the given non-coherent device.
  *
  * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
  * but an IOMMU which supports smaller pages might not map the whole thing.
@@ -541,8 +540,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  *	   or NULL on failure.
  */
 struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
-		unsigned long attrs, int prot, dma_addr_t *handle,
-		void (*flush_page)(struct device *, const void *, phys_addr_t))
+		unsigned long attrs, int prot, dma_addr_t *handle)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -586,7 +584,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 		 */
 		sg_miter_start(&miter, sgt.sgl, sgt.orig_nents, SG_MITER_FROM_SG);
 		while (sg_miter_next(&miter))
-			flush_page(dev, miter.addr, page_to_phys(miter.page));
+			arch_dma_prep_coherent(miter.page, PAGE_SIZE);
 		sg_miter_stop(&miter);
 	}
 
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 8741637941ca..3216447178a7 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -44,8 +44,7 @@ int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
  * the arch code to take care of attributes and cache maintenance
  */
 struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
-		unsigned long attrs, int prot, dma_addr_t *handle,
-		void (*flush_page)(struct device *, const void *, phys_addr_t));
+		unsigned long attrs, int prot, dma_addr_t *handle);
 void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 		dma_addr_t *handle);
 
-- 
2.20.1



* [PATCH 06/26] iommu/dma: Use for_each_sg in iommu_dma_alloc
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 05/26] iommu/dma: Remove the flush_page callback Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code Christoph Hellwig
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

arch_dma_prep_coherent can handle physically contiguous ranges larger
than PAGE_SIZE just fine, which means we don't need a page-based
iterator.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77d704c8f565..f915cb7c46e6 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -577,15 +577,11 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 		goto out_free_iova;
 
 	if (!(prot & IOMMU_CACHE)) {
-		struct sg_mapping_iter miter;
-		/*
-		 * The CPU-centric flushing implied by SG_MITER_TO_SG isn't
-		 * sufficient here, so skip it by using the "wrong" direction.
-		 */
-		sg_miter_start(&miter, sgt.sgl, sgt.orig_nents, SG_MITER_FROM_SG);
-		while (sg_miter_next(&miter))
-			arch_dma_prep_coherent(miter.page, PAGE_SIZE);
-		sg_miter_stop(&miter);
+		struct scatterlist *sg;
+		int i;
+
+		for_each_sg(sgt.sgl, sg, sgt.orig_nents, i)
+			arch_dma_prep_coherent(sg_page(sg), sg->length);
 	}
 
 	if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, prot)
-- 
2.20.1



* [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 06/26] iommu/dma: Use for_each_sg in iommu_dma_alloc Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 12:56   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 08/26] iommu/dma: Move __iommu_dma_map Christoph Hellwig
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

There is nothing really arm64-specific in the iommu_dma_ops
implementation, so move it to dma-iommu.c and keep a lot of the symbols
self-contained.  Note that the implementation does depend on the
DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
IOMMU_DMA support depend on it, but this will be relaxed soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/mm/dma-mapping.c | 389 +-----------------------------------
 drivers/iommu/Kconfig       |   1 +
 drivers/iommu/dma-iommu.c   | 388 ++++++++++++++++++++++++++++++++---
 include/linux/dma-iommu.h   |  43 +---
 4 files changed, 369 insertions(+), 452 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 636fa7c64370..d1661f78eb4d 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -27,6 +27,7 @@
 #include <linux/dma-direct.h>
 #include <linux/dma-noncoherent.h>
 #include <linux/dma-contiguous.h>
+#include <linux/dma-iommu.h>
 #include <linux/vmalloc.h>
 #include <linux/swiotlb.h>
 #include <linux/pci.h>
@@ -58,27 +59,6 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
 	__dma_flush_area(page_address(page), size);
 }
 
-#ifdef CONFIG_IOMMU_DMA
-static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
-				      struct page *page, size_t size)
-{
-	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-
-	if (!ret)
-		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-
-	return ret;
-}
-
-static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
-			      unsigned long pfn, size_t size)
-{
-	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
-			      vma->vm_end - vma->vm_start,
-			      vma->vm_page_prot);
-}
-#endif /* CONFIG_IOMMU_DMA */
-
 static int __init arm64_dma_init(void)
 {
 	WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
@@ -90,379 +70,18 @@ static int __init arm64_dma_init(void)
 arch_initcall(arm64_dma_init);
 
 #ifdef CONFIG_IOMMU_DMA
-#include <linux/dma-iommu.h>
-#include <linux/platform_device.h>
-#include <linux/amba/bus.h>
-
-static void *__iommu_alloc_attrs(struct device *dev, size_t size,
-				 dma_addr_t *handle, gfp_t gfp,
-				 unsigned long attrs)
-{
-	bool coherent = dev_is_dma_coherent(dev);
-	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
-	size_t iosize = size;
-	void *addr;
-
-	if (WARN(!dev, "cannot create IOMMU mapping for unknown device\n"))
-		return NULL;
-
-	size = PAGE_ALIGN(size);
-
-	/*
-	 * Some drivers rely on this, and we probably don't want the
-	 * possibility of stale kernel data being read by devices anyway.
-	 */
-	gfp |= __GFP_ZERO;
-
-	if (!gfpflags_allow_blocking(gfp)) {
-		struct page *page;
-		/*
-		 * In atomic context we can't remap anything, so we'll only
-		 * get the virtually contiguous buffer we need by way of a
-		 * physically contiguous allocation.
-		 */
-		if (coherent) {
-			page = alloc_pages(gfp, get_order(size));
-			addr = page ? page_address(page) : NULL;
-		} else {
-			addr = dma_alloc_from_pool(size, &page, gfp);
-		}
-		if (!addr)
-			return NULL;
-
-		*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-		if (*handle == DMA_MAPPING_ERROR) {
-			if (coherent)
-				__free_pages(page, get_order(size));
-			else
-				dma_free_from_pool(addr, size);
-			addr = NULL;
-		}
-	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-		struct page *page;
-
-		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-					get_order(size), gfp & __GFP_NOWARN);
-		if (!page)
-			return NULL;
-
-		*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-		if (*handle == DMA_MAPPING_ERROR) {
-			dma_release_from_contiguous(dev, page,
-						    size >> PAGE_SHIFT);
-			return NULL;
-		}
-		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
-						   prot,
-						   __builtin_return_address(0));
-		if (addr) {
-			if (!coherent)
-				__dma_flush_area(page_to_virt(page), iosize);
-			memset(addr, 0, size);
-		} else {
-			iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
-			dma_release_from_contiguous(dev, page,
-						    size >> PAGE_SHIFT);
-		}
-	} else {
-		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-		struct page **pages;
-
-		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
-					handle);
-		if (!pages)
-			return NULL;
-
-		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
-					      __builtin_return_address(0));
-		if (!addr)
-			iommu_dma_free(dev, pages, iosize, handle);
-	}
-	return addr;
-}
-
-static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
-			       dma_addr_t handle, unsigned long attrs)
-{
-	size_t iosize = size;
-
-	size = PAGE_ALIGN(size);
-	/*
-	 * @cpu_addr will be one of 4 things depending on how it was allocated:
-	 * - A remapped array of pages for contiguous allocations.
-	 * - A remapped array of pages from iommu_dma_alloc(), for all
-	 *   non-atomic allocations.
-	 * - A non-cacheable alias from the atomic pool, for atomic
-	 *   allocations by non-coherent devices.
-	 * - A normal lowmem address, for atomic allocations by
-	 *   coherent devices.
-	 * Hence how dodgy the below logic looks...
-	 */
-	if (dma_in_atomic_pool(cpu_addr, size)) {
-		iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
-		dma_free_from_pool(cpu_addr, size);
-	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		struct page *page = vmalloc_to_page(cpu_addr);
-
-		iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
-		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
-		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
-	} else if (is_vmalloc_addr(cpu_addr)){
-		struct vm_struct *area = find_vm_area(cpu_addr);
-
-		if (WARN_ON(!area || !area->pages))
-			return;
-		iommu_dma_free(dev, area->pages, iosize, &handle);
-		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
-	} else {
-		iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
-		__free_pages(virt_to_page(cpu_addr), get_order(size));
-	}
-}
-
-static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
-			      void *cpu_addr, dma_addr_t dma_addr, size_t size,
-			      unsigned long attrs)
-{
-	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	unsigned long off = vma->vm_pgoff;
-	struct vm_struct *area;
-	int ret;
-
-	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
-
-	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
-		return ret;
-
-	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
-		return -ENXIO;
-
-	if (!is_vmalloc_addr(cpu_addr)) {
-		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
-		return __swiotlb_mmap_pfn(vma, pfn, size);
-	}
-
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		/*
-		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-		 * hence in the vmalloc space.
-		 */
-		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
-		return __swiotlb_mmap_pfn(vma, pfn, size);
-	}
-
-	area = find_vm_area(cpu_addr);
-	if (WARN_ON(!area || !area->pages))
-		return -ENXIO;
-
-	return iommu_dma_mmap(area->pages, size, vma);
-}
-
-static int __iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
-			       void *cpu_addr, dma_addr_t dma_addr,
-			       size_t size, unsigned long attrs)
-{
-	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	struct vm_struct *area = find_vm_area(cpu_addr);
-
-	if (!is_vmalloc_addr(cpu_addr)) {
-		struct page *page = virt_to_page(cpu_addr);
-		return __swiotlb_get_sgtable_page(sgt, page, size);
-	}
-
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		/*
-		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-		 * hence in the vmalloc space.
-		 */
-		struct page *page = vmalloc_to_page(cpu_addr);
-		return __swiotlb_get_sgtable_page(sgt, page, size);
-	}
-
-	if (WARN_ON(!area || !area->pages))
-		return -ENXIO;
-
-	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
-					 GFP_KERNEL);
-}
-
-static void __iommu_sync_single_for_cpu(struct device *dev,
-					dma_addr_t dev_addr, size_t size,
-					enum dma_data_direction dir)
-{
-	phys_addr_t phys;
-
-	if (dev_is_dma_coherent(dev))
-		return;
-
-	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dev_addr);
-	arch_sync_dma_for_cpu(dev, phys, size, dir);
-}
-
-static void __iommu_sync_single_for_device(struct device *dev,
-					   dma_addr_t dev_addr, size_t size,
-					   enum dma_data_direction dir)
-{
-	phys_addr_t phys;
-
-	if (dev_is_dma_coherent(dev))
-		return;
-
-	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dev_addr);
-	arch_sync_dma_for_device(dev, phys, size, dir);
-}
-
-static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
-				   unsigned long offset, size_t size,
-				   enum dma_data_direction dir,
-				   unsigned long attrs)
-{
-	bool coherent = dev_is_dma_coherent(dev);
-	int prot = dma_info_to_prot(dir, coherent, attrs);
-	dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
-
-	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
-	    dev_addr != DMA_MAPPING_ERROR)
-		__dma_map_area(page_address(page) + offset, size, dir);
-
-	return dev_addr;
-}
-
-static void __iommu_unmap_page(struct device *dev, dma_addr_t dev_addr,
-			       size_t size, enum dma_data_direction dir,
-			       unsigned long attrs)
-{
-	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		__iommu_sync_single_for_cpu(dev, dev_addr, size, dir);
-
-	iommu_dma_unmap_page(dev, dev_addr, size, dir, attrs);
-}
-
-static void __iommu_sync_sg_for_cpu(struct device *dev,
-				    struct scatterlist *sgl, int nelems,
-				    enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int i;
-
-	if (dev_is_dma_coherent(dev))
-		return;
-
-	for_each_sg(sgl, sg, nelems, i)
-		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
-}
-
-static void __iommu_sync_sg_for_device(struct device *dev,
-				       struct scatterlist *sgl, int nelems,
-				       enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int i;
-
-	if (dev_is_dma_coherent(dev))
-		return;
-
-	for_each_sg(sgl, sg, nelems, i)
-		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
-}
-
-static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
-				int nelems, enum dma_data_direction dir,
-				unsigned long attrs)
-{
-	bool coherent = dev_is_dma_coherent(dev);
-
-	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		__iommu_sync_sg_for_device(dev, sgl, nelems, dir);
-
-	return iommu_dma_map_sg(dev, sgl, nelems,
-				dma_info_to_prot(dir, coherent, attrs));
-}
-
-static void __iommu_unmap_sg_attrs(struct device *dev,
-				   struct scatterlist *sgl, int nelems,
-				   enum dma_data_direction dir,
-				   unsigned long attrs)
-{
-	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		__iommu_sync_sg_for_cpu(dev, sgl, nelems, dir);
-
-	iommu_dma_unmap_sg(dev, sgl, nelems, dir, attrs);
-}
-
-static const struct dma_map_ops iommu_dma_ops = {
-	.alloc = __iommu_alloc_attrs,
-	.free = __iommu_free_attrs,
-	.mmap = __iommu_mmap_attrs,
-	.get_sgtable = __iommu_get_sgtable,
-	.map_page = __iommu_map_page,
-	.unmap_page = __iommu_unmap_page,
-	.map_sg = __iommu_map_sg_attrs,
-	.unmap_sg = __iommu_unmap_sg_attrs,
-	.sync_single_for_cpu = __iommu_sync_single_for_cpu,
-	.sync_single_for_device = __iommu_sync_single_for_device,
-	.sync_sg_for_cpu = __iommu_sync_sg_for_cpu,
-	.sync_sg_for_device = __iommu_sync_sg_for_device,
-	.map_resource = iommu_dma_map_resource,
-	.unmap_resource = iommu_dma_unmap_resource,
-};
-
-static int __init __iommu_dma_init(void)
-{
-	return iommu_dma_init();
-}
-arch_initcall(__iommu_dma_init);
-
-static void __iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-				  const struct iommu_ops *ops)
-{
-	struct iommu_domain *domain;
-
-	if (!ops)
-		return;
-
-	/*
-	 * The IOMMU core code allocates the default DMA domain, which the
-	 * underlying IOMMU driver needs to support via the dma-iommu layer.
-	 */
-	domain = iommu_get_domain_for_dev(dev);
-
-	if (!domain)
-		goto out_err;
-
-	if (domain->type == IOMMU_DOMAIN_DMA) {
-		if (iommu_dma_init_domain(domain, dma_base, size, dev))
-			goto out_err;
-
-		dev->dma_ops = &iommu_dma_ops;
-	}
-
-	return;
-
-out_err:
-	 pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA ops\n",
-		 dev_name(dev));
-}
-
 void arch_teardown_dma_ops(struct device *dev)
 {
 	dev->dma_ops = NULL;
 }
-
-#else
-
-static void __iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-				  const struct iommu_ops *iommu)
-{ }
-
-#endif  /* CONFIG_IOMMU_DMA */
+#endif
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 			const struct iommu_ops *iommu, bool coherent)
 {
 	dev->dma_coherent = coherent;
-	__iommu_setup_dma_ops(dev, dma_base, size, iommu);
+	if (iommu)
+		iommu_setup_dma_ops(dev, dma_base, size);
 
 #ifdef CONFIG_XEN
 	if (xen_initial_domain())
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816..bdc14baf2ee5 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -95,6 +95,7 @@ config IOMMU_DMA
 	select IOMMU_API
 	select IOMMU_IOVA
 	select NEED_SG_DMA_LENGTH
+	depends on DMA_DIRECT_REMAP
 
 config FSL_PAMU
 	bool "Freescale IOMMU support"
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f915cb7c46e6..622123551bba 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -21,6 +21,7 @@
 
 #include <linux/acpi_iort.h>
 #include <linux/device.h>
+#include <linux/dma-contiguous.h>
 #include <linux/dma-iommu.h>
 #include <linux/dma-noncoherent.h>
 #include <linux/gfp.h>
@@ -79,11 +80,6 @@ static struct iommu_dma_cookie *cookie_alloc(enum iommu_dma_cookie_type type)
 	return cookie;
 }
 
-int iommu_dma_init(void)
-{
-	return iova_cache_get();
-}
-
 /**
  * iommu_get_dma_cookie - Acquire DMA-API resources for a domain
  * @domain: IOMMU domain to prepare for DMA-API usage
@@ -285,7 +281,7 @@ static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
  * to ensure it is an invalid IOVA. It is safe to reinitialise a domain, but
  * any change which could make prior IOVAs invalid will fail.
  */
-int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
+static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 		u64 size, struct device *dev)
 {
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -336,7 +332,6 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 
 	return iova_reserve_iommu_regions(dev, domain);
 }
-EXPORT_SYMBOL(iommu_dma_init_domain);
 
 /**
  * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
@@ -347,7 +342,7 @@ EXPORT_SYMBOL(iommu_dma_init_domain);
  *
  * Return: corresponding IOMMU API page protection flags
  */
-int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
+static int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
 		     unsigned long attrs)
 {
 	int prot = coherent ? IOMMU_CACHE : 0;
@@ -506,17 +501,17 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
 }
 
 /**
- * iommu_dma_free - Free a buffer allocated by iommu_dma_alloc()
+ * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
  * @dev: Device which owns this buffer
- * @pages: Array of buffer pages as returned by iommu_dma_alloc()
+ * @pages: Array of buffer pages as returned by __iommu_dma_alloc()
  * @size: Size of buffer in bytes
  * @handle: DMA address of buffer
  *
  * Frees both the pages associated with the buffer, and the array
  * describing them
  */
-void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
-		dma_addr_t *handle)
+static void __iommu_dma_free(struct device *dev, struct page **pages,
+		size_t size, dma_addr_t *handle)
 {
 	__iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size);
 	__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
@@ -524,7 +519,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 }
 
 /**
- * iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
+ * __iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
  * @dev: Device to allocate memory for. Must be a real device
  *	 attached to an iommu_dma_domain
  * @size: Size of buffer in bytes
@@ -539,8 +534,8 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * Return: Array of struct page pointers describing the buffer,
  *	   or NULL on failure.
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
-		unsigned long attrs, int prot, dma_addr_t *handle)
+static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
+		gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -602,16 +597,16 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 }
 
 /**
- * iommu_dma_mmap - Map a buffer into provided user VMA
- * @pages: Array representing buffer from iommu_dma_alloc()
+ * __iommu_dma_mmap - Map a buffer into provided user VMA
+ * @pages: Array representing buffer from __iommu_dma_alloc()
  * @size: Size of buffer in bytes
  * @vma: VMA describing requested userspace mapping
  *
  * Maps the pages of the buffer in @pages into @vma. The caller is responsible
  * for verifying the correct size and protection of @vma beforehand.
  */
-
-int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
+static int __iommu_dma_mmap(struct page **pages, size_t size,
+		struct vm_area_struct *vma)
 {
 	unsigned long uaddr = vma->vm_start;
 	unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
@@ -626,6 +621,58 @@ int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
 	return ret;
 }
 
+static void iommu_dma_sync_single_for_cpu(struct device *dev,
+		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
+{
+	phys_addr_t phys;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
+	arch_sync_dma_for_cpu(dev, phys, size, dir);
+}
+
+static void iommu_dma_sync_single_for_device(struct device *dev,
+		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
+{
+	phys_addr_t phys;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
+	arch_sync_dma_for_device(dev, phys, size, dir);
+}
+
+static void iommu_dma_sync_sg_for_cpu(struct device *dev,
+		struct scatterlist *sgl, int nelems,
+		enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	for_each_sg(sgl, sg, nelems, i)
+		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
+}
+
+static void iommu_dma_sync_sg_for_device(struct device *dev,
+		struct scatterlist *sgl, int nelems,
+		enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	for_each_sg(sgl, sg, nelems, i)
+		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
+}
+
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 		size_t size, int prot, struct iommu_domain *domain)
 {
@@ -649,19 +696,44 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 	return iova + iova_off;
 }
 
-dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
+static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, int prot)
 {
 	return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot,
 			iommu_get_dma_domain(dev));
 }
 
-void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size,
-		enum dma_data_direction dir, unsigned long attrs)
+static void __iommu_dma_unmap_page(struct device *dev, dma_addr_t handle,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
 }
 
+static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
+		unsigned long offset, size_t size, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	phys_addr_t phys = page_to_phys(page) + offset;
+	bool coherent = dev_is_dma_coherent(dev);
+	dma_addr_t dma_handle;
+
+	dma_handle =__iommu_dma_map(dev, phys, size,
+			dma_info_to_prot(dir, coherent, attrs),
+			iommu_get_dma_domain(dev));
+	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+	    dma_handle != DMA_MAPPING_ERROR)
+		arch_sync_dma_for_device(dev, phys, size, dir);
+	return dma_handle;
+}
+
+static void iommu_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_single_for_cpu(dev, dma_handle, size, dir);
+	__iommu_dma_unmap(iommu_get_dma_domain(dev), dma_handle, size);
+}
+
 /*
  * Prepare a successfully-mapped scatterlist to give back to the caller.
  *
@@ -744,18 +816,22 @@ static void __invalidate_sg(struct scatterlist *sg, int nents)
  * impedance-matching, to be able to hand off a suitably-aligned list,
  * but still preserve the original offsets and sizes for the caller.
  */
-int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
-		int nents, int prot)
+static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
 	struct scatterlist *s, *prev = NULL;
+	int prot = dma_info_to_prot(dir, dev_is_dma_coherent(dev), attrs);
 	dma_addr_t iova;
 	size_t iova_len = 0;
 	unsigned long mask = dma_get_seg_boundary(dev);
 	int i;
 
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_sg_for_device(dev, sg, nents, dir);
+
 	/*
 	 * Work out how much IOVA space we need, and align the segments to
 	 * IOVA granules for the IOMMU driver to handle. With some clever
@@ -815,12 +891,16 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	return 0;
 }
 
-void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
-		enum dma_data_direction dir, unsigned long attrs)
+static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
 {
 	dma_addr_t start, end;
 	struct scatterlist *tmp;
 	int i;
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_sg_for_cpu(dev, sg, nents, dir);
+
 	/*
 	 * The scatterlist segments are mapped into a single
 	 * contiguous IOVA allocation, so this is incredibly easy.
@@ -835,7 +915,7 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
 	__iommu_dma_unmap(iommu_get_dma_domain(dev), start, end - start);
 }
 
-dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
+static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	return __iommu_dma_map(dev, phys, size,
@@ -843,12 +923,258 @@ dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
 			iommu_get_dma_domain(dev));
 }
 
-void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
+static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
 }
 
+static void *iommu_dma_alloc(struct device *dev, size_t size,
+		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
+{
+	bool coherent = dev_is_dma_coherent(dev);
+	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+	size_t iosize = size;
+	void *addr;
+
+	size = PAGE_ALIGN(size);
+	gfp |= __GFP_ZERO;
+
+	if (!gfpflags_allow_blocking(gfp)) {
+		struct page *page;
+		/*
+		 * In atomic context we can't remap anything, so we'll only
+		 * get the virtually contiguous buffer we need by way of a
+		 * physically contiguous allocation.
+		 */
+		if (coherent) {
+			page = alloc_pages(gfp, get_order(size));
+			addr = page ? page_address(page) : NULL;
+		} else {
+			addr = dma_alloc_from_pool(size, &page, gfp);
+		}
+		if (!addr)
+			return NULL;
+
+		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
+		if (*handle == DMA_MAPPING_ERROR) {
+			if (coherent)
+				__free_pages(page, get_order(size));
+			else
+				dma_free_from_pool(addr, size);
+			addr = NULL;
+		}
+	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+		struct page *page;
+
+		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
+					get_order(size), gfp & __GFP_NOWARN);
+		if (!page)
+			return NULL;
+
+		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
+		if (*handle == DMA_MAPPING_ERROR) {
+			dma_release_from_contiguous(dev, page,
+						    size >> PAGE_SHIFT);
+			return NULL;
+		}
+		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
+						   prot,
+						   __builtin_return_address(0));
+		if (addr) {
+			if (!coherent)
+				arch_dma_prep_coherent(page, iosize);
+			memset(addr, 0, size);
+		} else {
+			__iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
+			dma_release_from_contiguous(dev, page,
+						    size >> PAGE_SHIFT);
+		}
+	} else {
+		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+		struct page **pages;
+
+		pages = __iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
+					handle);
+		if (!pages)
+			return NULL;
+
+		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
+					      __builtin_return_address(0));
+		if (!addr)
+			__iommu_dma_free(dev, pages, iosize, handle);
+	}
+	return addr;
+}
+
+static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
+		dma_addr_t handle, unsigned long attrs)
+{
+	size_t iosize = size;
+
+	size = PAGE_ALIGN(size);
+	/*
+	 * @cpu_addr will be one of 4 things depending on how it was allocated:
+	 * - A remapped array of pages for contiguous allocations.
+	 * - A remapped array of pages from __iommu_dma_alloc(), for all
+	 *   non-atomic allocations.
+	 * - A non-cacheable alias from the atomic pool, for atomic
+	 *   allocations by non-coherent devices.
+	 * - A normal lowmem address, for atomic allocations by
+	 *   coherent devices.
+	 * Hence how dodgy the below logic looks...
+	 */
+	if (dma_in_atomic_pool(cpu_addr, size)) {
+		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
+		dma_free_from_pool(cpu_addr, size);
+	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		struct page *page = vmalloc_to_page(cpu_addr);
+
+		__iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
+		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
+		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
+	} else if (is_vmalloc_addr(cpu_addr)){
+		struct vm_struct *area = find_vm_area(cpu_addr);
+
+		if (WARN_ON(!area || !area->pages))
+			return;
+		__iommu_dma_free(dev, area->pages, iosize, &handle);
+		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
+	} else {
+		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
+		__free_pages(virt_to_page(cpu_addr), get_order(size));
+	}
+}
+
+static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
+			      unsigned long pfn, size_t size)
+{
+	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
+			       vma->vm_end - vma->vm_start,
+			       vma->vm_page_prot);
+}
+
+static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		unsigned long attrs)
+{
+	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned long off = vma->vm_pgoff;
+	struct vm_struct *area;
+	int ret;
+
+	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
+
+	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
+		return ret;
+
+	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
+		return -ENXIO;
+
+	if (!is_vmalloc_addr(cpu_addr)) {
+		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
+		return __iommu_dma_mmap_pfn(vma, pfn, size);
+	}
+
+	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		/*
+		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
+		 * hence in the vmalloc space.
+		 */
+		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
+		return __iommu_dma_mmap_pfn(vma, pfn, size);
+	}
+
+	area = find_vm_area(cpu_addr);
+	if (WARN_ON(!area || !area->pages))
+		return -ENXIO;
+
+	return __iommu_dma_mmap(area->pages, size, vma);
+}
+
+static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
+		size_t size)
+{
+	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+
+	if (!ret)
+		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+	return ret;
+}
+
+static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		unsigned long attrs)
+{
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	struct vm_struct *area = find_vm_area(cpu_addr);
+
+	if (!is_vmalloc_addr(cpu_addr)) {
+		struct page *page = virt_to_page(cpu_addr);
+		return __iommu_dma_get_sgtable_page(sgt, page, size);
+	}
+
+	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		/*
+		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
+		 * hence in the vmalloc space.
+		 */
+		struct page *page = vmalloc_to_page(cpu_addr);
+		return __iommu_dma_get_sgtable_page(sgt, page, size);
+	}
+
+	if (WARN_ON(!area || !area->pages))
+		return -ENXIO;
+
+	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
+					 GFP_KERNEL);
+}
+
+static const struct dma_map_ops iommu_dma_ops = {
+	.alloc			= iommu_dma_alloc,
+	.free			= iommu_dma_free,
+	.mmap			= iommu_dma_mmap,
+	.get_sgtable		= iommu_dma_get_sgtable,
+	.map_page		= iommu_dma_map_page,
+	.unmap_page		= iommu_dma_unmap_page,
+	.map_sg			= iommu_dma_map_sg,
+	.unmap_sg		= iommu_dma_unmap_sg,
+	.sync_single_for_cpu	= iommu_dma_sync_single_for_cpu,
+	.sync_single_for_device	= iommu_dma_sync_single_for_device,
+	.sync_sg_for_cpu	= iommu_dma_sync_sg_for_cpu,
+	.sync_sg_for_device	= iommu_dma_sync_sg_for_device,
+	.map_resource		= iommu_dma_map_resource,
+	.unmap_resource		= iommu_dma_unmap_resource,
+};
+
+/*
+ * The IOMMU core code allocates the default DMA domain, which the underlying
+ * IOMMU driver needs to support via the dma-iommu layer.
+ */
+void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size)
+{
+	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+
+	if (!domain)
+		goto out_err;
+
+	/*
+	 * The IOMMU core code allocates the default DMA domain, which the
+	 * underlying IOMMU driver needs to support via the dma-iommu layer.
+	 */
+	if (domain->type == IOMMU_DOMAIN_DMA) {
+		if (iommu_dma_init_domain(domain, dma_base, size, dev))
+			goto out_err;
+		dev->dma_ops = &iommu_dma_ops;
+	}
+
+	return;
+out_err:
+	 pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA ops\n",
+		 dev_name(dev));
+}
+
 static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 		phys_addr_t msi_addr, struct iommu_domain *domain)
 {
@@ -921,3 +1247,9 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
 		msg->address_lo += lower_32_bits(msi_page->iova);
 	}
 }
+
+static int iommu_dma_init(void)
+{
+	return iova_cache_get();
+}
+arch_initcall(iommu_dma_init);
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 3216447178a7..dadf4383f555 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -24,49 +24,13 @@
 #include <linux/iommu.h>
 #include <linux/msi.h>
 
-int iommu_dma_init(void);
-
 /* Domain management interface for IOMMU drivers */
 int iommu_get_dma_cookie(struct iommu_domain *domain);
 int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
 void iommu_put_dma_cookie(struct iommu_domain *domain);
 
 /* Setup call for arch DMA mapping code */
-int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
-		u64 size, struct device *dev);
-
-/* General helpers for DMA-API <-> IOMMU-API interaction */
-int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
-		     unsigned long attrs);
-
-/*
- * These implement the bulk of the relevant DMA mapping callbacks, but require
- * the arch code to take care of attributes and cache maintenance
- */
-struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
-		unsigned long attrs, int prot, dma_addr_t *handle);
-void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
-		dma_addr_t *handle);
-
-int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma);
-
-dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
-		unsigned long offset, size_t size, int prot);
-int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
-		int nents, int prot);
-
-/*
- * Arch code with no special attribute handling may use these
- * directly as DMA mapping callbacks for simplicity
- */
-void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size,
-		enum dma_data_direction dir, unsigned long attrs);
-void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
-		enum dma_data_direction dir, unsigned long attrs);
-dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
-		size_t size, enum dma_data_direction dir, unsigned long attrs);
-void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir, unsigned long attrs);
+void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size);
 
 /* The DMA API isn't _quite_ the whole story, though... */
 void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
@@ -75,12 +39,13 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 #else /* CONFIG_IOMMU_DMA */
 
 struct iommu_domain;
+struct iommu_ops;
 struct msi_msg;
 struct device;
 
-static inline int iommu_dma_init(void)
+static inline void iommu_setup_dma_ops(struct device *dev, u64 dma_base,
+		u64 size)
 {
-	return 0;
 }
 
 static inline int iommu_get_dma_cookie(struct iommu_domain *domain)
-- 
2.20.1



* [PATCH 08/26] iommu/dma: Move __iommu_dma_map
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 09/26] iommu/dma: Move domain lookup into __iommu_dma_{map,unmap} Christoph Hellwig
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

Moving this function up to its unmap counterpart helps to keep related
code together for the following changes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 46 +++++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 622123551bba..e33724497c7b 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -435,6 +435,29 @@ static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr,
 	iommu_dma_free_iova(cookie, dma_addr, size);
 }
 
+static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
+		size_t size, int prot, struct iommu_domain *domain)
+{
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	size_t iova_off = 0;
+	dma_addr_t iova;
+
+	if (cookie->type == IOMMU_DMA_IOVA_COOKIE) {
+		iova_off = iova_offset(&cookie->iovad, phys);
+		size = iova_align(&cookie->iovad, size + iova_off);
+	}
+
+	iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
+	if (!iova)
+		return DMA_MAPPING_ERROR;
+
+	if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
+		iommu_dma_free_iova(cookie, iova, size);
+		return DMA_MAPPING_ERROR;
+	}
+	return iova + iova_off;
+}
+
 static void __iommu_dma_free_pages(struct page **pages, int count)
 {
 	while (count--)
@@ -673,29 +696,6 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
 		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
 }
 
-static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
-		size_t size, int prot, struct iommu_domain *domain)
-{
-	struct iommu_dma_cookie *cookie = domain->iova_cookie;
-	size_t iova_off = 0;
-	dma_addr_t iova;
-
-	if (cookie->type == IOMMU_DMA_IOVA_COOKIE) {
-		iova_off = iova_offset(&cookie->iovad, phys);
-		size = iova_align(&cookie->iovad, size + iova_off);
-	}
-
-	iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
-	if (!iova)
-		return DMA_MAPPING_ERROR;
-
-	if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
-		iommu_dma_free_iova(cookie, iova, size);
-		return DMA_MAPPING_ERROR;
-	}
-	return iova + iova_off;
-}
-
 static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, int prot)
 {
-- 
2.20.1



* [PATCH 09/26] iommu/dma: Move domain lookup into __iommu_dma_{map,unmap}
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 08/26] iommu/dma: Move __iommu_dma_map Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 10/26] iommu/dma: Squash __iommu_dma_{map,unmap}_page helpers Christoph Hellwig
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

From: Robin Murphy <robin.murphy@arm.com>

Most of the callers don't care, and the couple that do, which already
have the domain to hand for other reasons, are in slow paths where the
(trivial) overhead of a repeated lookup is utterly immaterial.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 34 ++++++++++++++++------------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index e33724497c7b..4ebd08e3a83a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -419,9 +419,10 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
 				size >> iova_shift(iovad));
 }
 
-static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr,
+static void __iommu_dma_unmap(struct device *dev, dma_addr_t dma_addr,
 		size_t size)
 {
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
 	size_t iova_off = iova_offset(iovad, dma_addr);
@@ -436,8 +437,9 @@ static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr,
 }
 
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
-		size_t size, int prot, struct iommu_domain *domain)
+		size_t size, int prot)
 {
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	size_t iova_off = 0;
 	dma_addr_t iova;
@@ -536,7 +538,7 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
 static void __iommu_dma_free(struct device *dev, struct page **pages,
 		size_t size, dma_addr_t *handle)
 {
-	__iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size);
+	__iommu_dma_unmap(dev, *handle, size);
 	__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
 	*handle = DMA_MAPPING_ERROR;
 }
@@ -699,14 +701,13 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
 static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, int prot)
 {
-	return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot,
-			iommu_get_dma_domain(dev));
+	return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot);
 }
 
 static void __iommu_dma_unmap_page(struct device *dev, dma_addr_t handle,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
-	__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
+	__iommu_dma_unmap(dev, handle, size);
 }
 
 static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
@@ -715,11 +716,10 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 {
 	phys_addr_t phys = page_to_phys(page) + offset;
 	bool coherent = dev_is_dma_coherent(dev);
+	int prot = dma_info_to_prot(dir, coherent, attrs);
 	dma_addr_t dma_handle;
 
-	dma_handle =__iommu_dma_map(dev, phys, size,
-			dma_info_to_prot(dir, coherent, attrs),
-			iommu_get_dma_domain(dev));
+	dma_handle =__iommu_dma_map(dev, phys, size, prot);
 	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 	    dma_handle != DMA_MAPPING_ERROR)
 		arch_sync_dma_for_device(dev, phys, size, dir);
@@ -731,7 +731,7 @@ static void iommu_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
 {
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		iommu_dma_sync_single_for_cpu(dev, dma_handle, size, dir);
-	__iommu_dma_unmap(iommu_get_dma_domain(dev), dma_handle, size);
+	__iommu_dma_unmap(dev, dma_handle, size);
 }
 
 /*
@@ -912,21 +912,20 @@ static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
 		sg = tmp;
 	}
 	end = sg_dma_address(sg) + sg_dma_len(sg);
-	__iommu_dma_unmap(iommu_get_dma_domain(dev), start, end - start);
+	__iommu_dma_unmap(dev, start, end - start);
 }
 
 static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	return __iommu_dma_map(dev, phys, size,
-			dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO,
-			iommu_get_dma_domain(dev));
+			dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO);
 }
 
 static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
-	__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
+	__iommu_dma_unmap(dev, handle, size);
 }
 
 static void *iommu_dma_alloc(struct device *dev, size_t size,
@@ -1176,9 +1175,8 @@ void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size)
 }
 
 static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
-		phys_addr_t msi_addr, struct iommu_domain *domain)
+		phys_addr_t msi_addr, struct iommu_dma_cookie *cookie)
 {
-	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iommu_dma_msi_page *msi_page;
 	dma_addr_t iova;
 	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
@@ -1193,7 +1191,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	if (!msi_page)
 		return NULL;
 
-	iova = __iommu_dma_map(dev, msi_addr, size, prot, domain);
+	iova = __iommu_dma_map(dev, msi_addr, size, prot);
 	if (iova == DMA_MAPPING_ERROR)
 		goto out_free_page;
 
@@ -1228,7 +1226,7 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
 	 * of an MSI from within an IPI handler.
 	 */
 	spin_lock_irqsave(&cookie->msi_lock, flags);
-	msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
+	msi_page = iommu_dma_get_msi_page(dev, msi_addr, cookie);
 	spin_unlock_irqrestore(&cookie->msi_lock, flags);
 
 	if (WARN_ON(!msi_page)) {
-- 
2.20.1
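
As an illustration of the new internal convention (not part of the
patch), a hypothetical caller inside dma-iommu.c would now look like
this; note that no explicit iommu_get_dma_domain() call is needed at
the call site, since __iommu_dma_map() performs the lookup itself:

	/* Sketch only: map_single_page() is a made-up helper. */
	static dma_addr_t map_single_page(struct device *dev, struct page *page)
	{
		int prot = dma_info_to_prot(DMA_TO_DEVICE,
					    dev_is_dma_coherent(dev), 0);

		return __iommu_dma_map(dev, page_to_phys(page), PAGE_SIZE, prot);
	}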


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 10/26] iommu/dma: Squash __iommu_dma_{map,unmap}_page helpers
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 09/26] iommu/dma: Move domain lookup into __iommu_dma_{map,unmap} Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 11/26] iommu/dma: Factor out remapped pages lookup Christoph Hellwig
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

From: Robin Murphy <robin.murphy@arm.com>

The remaining internal callsites don't care about having prototypes
compatible with the relevant dma_map_ops callbacks, so the extra
level of indirection just wastes space and complicates things.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 25 +++++++------------------
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4ebd08e3a83a..b52c5d6be7b4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -698,18 +698,6 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
 		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
 }
 
-static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
-		unsigned long offset, size_t size, int prot)
-{
-	return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot);
-}
-
-static void __iommu_dma_unmap_page(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir, unsigned long attrs)
-{
-	__iommu_dma_unmap(dev, handle, size);
-}
-
 static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, enum dma_data_direction dir,
 		unsigned long attrs)
@@ -955,7 +943,8 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 		if (!addr)
 			return NULL;
 
-		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
+		*handle = __iommu_dma_map(dev, page_to_phys(page), iosize,
+					  ioprot);
 		if (*handle == DMA_MAPPING_ERROR) {
 			if (coherent)
 				__free_pages(page, get_order(size));
@@ -972,7 +961,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 		if (!page)
 			return NULL;
 
-		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
+		*handle = __iommu_dma_map(dev, page_to_phys(page), iosize, ioprot);
 		if (*handle == DMA_MAPPING_ERROR) {
 			dma_release_from_contiguous(dev, page,
 						    size >> PAGE_SHIFT);
@@ -986,7 +975,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 				arch_dma_prep_coherent(page, iosize);
 			memset(addr, 0, size);
 		} else {
-			__iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
+			__iommu_dma_unmap(dev, *handle, iosize);
 			dma_release_from_contiguous(dev, page,
 						    size >> PAGE_SHIFT);
 		}
@@ -1025,12 +1014,12 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	 * Hence how dodgy the below logic looks...
 	 */
 	if (dma_in_atomic_pool(cpu_addr, size)) {
-		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
+		__iommu_dma_unmap(dev, handle, iosize);
 		dma_free_from_pool(cpu_addr, size);
 	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
 		struct page *page = vmalloc_to_page(cpu_addr);
 
-		__iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
+		__iommu_dma_unmap(dev, handle, iosize);
 		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
 		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
 	} else if (is_vmalloc_addr(cpu_addr)){
@@ -1041,7 +1030,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		__iommu_dma_free(dev, area->pages, iosize, &handle);
 		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
 	} else {
-		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
+		__iommu_dma_unmap(dev, handle, iosize);
 		__free_pages(virt_to_page(cpu_addr), get_order(size));
 	}
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 11/26] iommu/dma: Factor out remapped pages lookup
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 10/26] iommu/dma: Squash __iommu_dma_{map,unmap}_page helpers Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 13:05   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 12/26] iommu/dma: Refactor the page array remapping allocator Christoph Hellwig
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

From: Robin Murphy <robin.murphy@arm.com>

Since we duplicate the find_vm_area() logic a few times in places where
we only care about the pages, factor out a helper to abstract it.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[hch: don't warn when not finding a region, as we'll rely on that later]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index b52c5d6be7b4..8e2d9733113e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -525,6 +525,15 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
 	return pages;
 }
 
+static struct page **__iommu_dma_get_pages(void *cpu_addr)
+{
+	struct vm_struct *area = find_vm_area(cpu_addr);
+
+	if (!area || !area->pages)
+		return NULL;
+	return area->pages;
+}
+
 /**
  * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
  * @dev: Device which owns this buffer
@@ -1023,11 +1032,11 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
 		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
 	} else if (is_vmalloc_addr(cpu_addr)){
-		struct vm_struct *area = find_vm_area(cpu_addr);
+		struct page **pages = __iommu_dma_get_pages(cpu_addr);
 
-		if (WARN_ON(!area || !area->pages))
+		if (WARN_ON(!pages))
 			return;
-		__iommu_dma_free(dev, area->pages, iosize, &handle);
+		__iommu_dma_free(dev, pages, iosize, &handle);
 		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
 	} else {
 		__iommu_dma_unmap(dev, handle, iosize);
@@ -1049,7 +1058,7 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 {
 	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	unsigned long off = vma->vm_pgoff;
-	struct vm_struct *area;
+	struct page **pages;
 	int ret;
 
 	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
@@ -1074,11 +1083,10 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		return __iommu_dma_mmap_pfn(vma, pfn, size);
 	}
 
-	area = find_vm_area(cpu_addr);
-	if (WARN_ON(!area || !area->pages))
+	pages = __iommu_dma_get_pages(cpu_addr);
+	if (WARN_ON_ONCE(!pages))
 		return -ENXIO;
-
-	return __iommu_dma_mmap(area->pages, size, vma);
+	return __iommu_dma_mmap(pages, size, vma);
 }
 
 static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
@@ -1096,7 +1104,7 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
 		unsigned long attrs)
 {
 	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	struct vm_struct *area = find_vm_area(cpu_addr);
+	struct page **pages;
 
 	if (!is_vmalloc_addr(cpu_addr)) {
 		struct page *page = virt_to_page(cpu_addr);
@@ -1112,10 +1120,10 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
 		return __iommu_dma_get_sgtable_page(sgt, page, size);
 	}
 
-	if (WARN_ON(!area || !area->pages))
+	pages = __iommu_dma_get_pages(cpu_addr);
+	if (WARN_ON_ONCE(!pages))
 		return -ENXIO;
-
-	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
+	return sg_alloc_table_from_pages(sgt, pages, count, 0, size,
 					 GFP_KERNEL);
 }
 
-- 
2.20.1
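
The helper intentionally returns NULL instead of warning when @cpu_addr
is not a remapped page array, since later patches in the series rely on
that to tell allocation types apart.  A minimal usage sketch (not taken
verbatim from the patch):

	struct page **pages = __iommu_dma_get_pages(cpu_addr);

	if (!pages) {
		/* not a page-array remap: lowmem or contiguous buffer */
		return -ENXIO;
	}
	/* otherwise operate on the individual pages */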


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 12/26] iommu/dma: Refactor the page array remapping allocator
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (10 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 11/26] iommu/dma: Factor out remapped pages lookup Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 13:10   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 13/26] iommu/dma: Remove __iommu_dma_free Christoph Hellwig
                   ` (14 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

Move the call to dma_common_pages_remap into __iommu_dma_alloc and
rename the combined function to iommu_dma_alloc_remap.  This creates a
self-contained helper for remapped pages allocation and mapping.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 54 +++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 28 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 8e2d9733113e..b8e46e89a60a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -535,9 +535,9 @@ static struct page **__iommu_dma_get_pages(void *cpu_addr)
 }
 
 /**
- * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
+ * iommu_dma_free - Free a buffer allocated by iommu_dma_alloc_remap()
  * @dev: Device which owns this buffer
- * @pages: Array of buffer pages as returned by __iommu_dma_alloc()
+ * @pages: Array of buffer pages as returned by __iommu_dma_alloc_remap()
  * @size: Size of buffer in bytes
  * @handle: DMA address of buffer
  *
@@ -553,33 +553,35 @@ static void __iommu_dma_free(struct device *dev, struct page **pages,
 }
 
 /**
- * __iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
+ * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
  * @dev: Device to allocate memory for. Must be a real device
  *	 attached to an iommu_dma_domain
  * @size: Size of buffer in bytes
+ * @dma_handle: Out argument for allocated DMA handle
  * @gfp: Allocation flags
  * @attrs: DMA attributes for this allocation
- * @prot: IOMMU mapping flags
- * @handle: Out argument for allocated DMA handle
  *
  * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
  * but an IOMMU which supports smaller pages might not map the whole thing.
  *
- * Return: Array of struct page pointers describing the buffer,
- *	   or NULL on failure.
+ * Return: Mapped virtual address, or NULL on failure.
  */
-static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle)
+static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
+	bool coherent = dev_is_dma_coherent(dev);
+	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+	pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
 	struct page **pages;
 	struct sg_table sgt;
 	dma_addr_t iova;
-	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
+	void *vaddr;
 
-	*handle = DMA_MAPPING_ERROR;
+	*dma_handle = DMA_MAPPING_ERROR;
 
 	min_size = alloc_sizes & -alloc_sizes;
 	if (min_size < PAGE_SIZE) {
@@ -605,7 +607,7 @@ static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
 	if (sg_alloc_table_from_pages(&sgt, pages, count, 0, size, GFP_KERNEL))
 		goto out_free_iova;
 
-	if (!(prot & IOMMU_CACHE)) {
+	if (!(ioprot & IOMMU_CACHE)) {
 		struct scatterlist *sg;
 		int i;
 
@@ -613,14 +615,21 @@ static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
 			arch_dma_prep_coherent(sg_page(sg), sg->length);
 	}
 
-	if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, prot)
+	if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, ioprot)
 			< size)
 		goto out_free_sg;
 
-	*handle = iova;
+	vaddr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
+			__builtin_return_address(0));
+	if (!vaddr)
+		goto out_unmap;
+
+	*dma_handle = iova;
 	sg_free_table(&sgt);
-	return pages;
+	return vaddr;
 
+out_unmap:
+	__iommu_dma_unmap(dev, iova, size);
 out_free_sg:
 	sg_free_table(&sgt);
 out_free_iova:
@@ -989,18 +998,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 						    size >> PAGE_SHIFT);
 		}
 	} else {
-		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-		struct page **pages;
-
-		pages = __iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
-					handle);
-		if (!pages)
-			return NULL;
-
-		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
-					      __builtin_return_address(0));
-		if (!addr)
-			__iommu_dma_free(dev, pages, iosize, handle);
+		addr = iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
 	}
 	return addr;
 }
@@ -1014,7 +1012,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	/*
 	 * @cpu_addr will be one of 4 things depending on how it was allocated:
 	 * - A remapped array of pages for contiguous allocations.
-	 * - A remapped array of pages from __iommu_dma_alloc(), for all
+	 * - A remapped array of pages from iommu_dma_alloc_remap(), for all
 	 *   non-atomic allocations.
 	 * - A non-cacheable alias from the atomic pool, for atomic
 	 *   allocations by non-coherent devices.
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 13/26] iommu/dma: Remove __iommu_dma_free
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (11 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 12/26] iommu/dma: Refactor the page array remapping allocator Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 13:18   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 14/26] iommu/dma: Refactor iommu_dma_free Christoph Hellwig
                   ` (13 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

We only have a single caller of this function left, so open code it there.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 21 ++-------------------
 1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index b8e46e89a60a..4632b9d301a1 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -534,24 +534,6 @@ static struct page **__iommu_dma_get_pages(void *cpu_addr)
 	return area->pages;
 }
 
-/**
- * iommu_dma_free - Free a buffer allocated by iommu_dma_alloc_remap()
- * @dev: Device which owns this buffer
- * @pages: Array of buffer pages as returned by __iommu_dma_alloc_remap()
- * @size: Size of buffer in bytes
- * @handle: DMA address of buffer
- *
- * Frees both the pages associated with the buffer, and the array
- * describing them
- */
-static void __iommu_dma_free(struct device *dev, struct page **pages,
-		size_t size, dma_addr_t *handle)
-{
-	__iommu_dma_unmap(dev, *handle, size);
-	__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
-	*handle = DMA_MAPPING_ERROR;
-}
-
 /**
  * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
  * @dev: Device to allocate memory for. Must be a real device
@@ -1034,7 +1016,8 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 
 		if (WARN_ON(!pages))
 			return;
-		__iommu_dma_free(dev, pages, iosize, &handle);
+		__iommu_dma_unmap(dev, handle, iosize);
+		__iommu_dma_free_pages(pages, size >> PAGE_SHIFT);
 		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
 	} else {
 		__iommu_dma_unmap(dev, handle, iosize);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 14/26] iommu/dma: Refactor iommu_dma_free
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (12 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 13/26] iommu/dma: Remove __iommu_dma_free Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 13:59   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 15/26] iommu/dma: Refactor iommu_dma_alloc Christoph Hellwig
                   ` (12 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

From: Robin Murphy <robin.murphy@arm.com>

The freeing logic was made particularly horrible by part of it being
opaque to the arch wrapper, which led to a lot of convoluted repetition
to ensure each path did everything in the right order. Now that it's
all private, we can pick apart and consolidate the logically-distinct
steps of freeing the IOMMU mapping, the underlying pages, and the CPU
remap (if necessary) into something much more manageable.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[various cosmetic changes to the code flow]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 75 ++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 40 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4632b9d301a1..9658c4cc3cfe 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -916,6 +916,41 @@ static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
 	__iommu_dma_unmap(dev, handle, size);
 }
 
+static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
+		dma_addr_t handle, unsigned long attrs)
+{
+	size_t alloc_size = PAGE_ALIGN(size);
+	int count = alloc_size >> PAGE_SHIFT;
+	struct page *page = NULL;
+
+	__iommu_dma_unmap(dev, handle, size);
+
+	/* Non-coherent atomic allocation? Easy */
+	if (dma_free_from_pool(cpu_addr, alloc_size))
+		return;
+
+	if (is_vmalloc_addr(cpu_addr)) {
+		/*
+		 * If the address is remapped, then it's either non-coherent
+		 * or highmem CMA, or an iommu_dma_alloc_remap() construction.
+		 */
+		struct page **pages = __iommu_dma_get_pages(cpu_addr);
+
+		if (pages)
+			__iommu_dma_free_pages(pages, count);
+		else
+			page = vmalloc_to_page(cpu_addr);
+
+		dma_common_free_remap(cpu_addr, alloc_size, VM_USERMAP);
+	} else {
+		/* Lowmem means a coherent atomic or CMA allocation */
+		page = virt_to_page(cpu_addr);
+	}
+
+	if (page && !dma_release_from_contiguous(dev, page, count))
+		__free_pages(page, get_order(alloc_size));
+}
+
 static void *iommu_dma_alloc(struct device *dev, size_t size,
 		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
 {
@@ -985,46 +1020,6 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	return addr;
 }
 
-static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
-		dma_addr_t handle, unsigned long attrs)
-{
-	size_t iosize = size;
-
-	size = PAGE_ALIGN(size);
-	/*
-	 * @cpu_addr will be one of 4 things depending on how it was allocated:
-	 * - A remapped array of pages for contiguous allocations.
-	 * - A remapped array of pages from iommu_dma_alloc_remap(), for all
-	 *   non-atomic allocations.
-	 * - A non-cacheable alias from the atomic pool, for atomic
-	 *   allocations by non-coherent devices.
-	 * - A normal lowmem address, for atomic allocations by
-	 *   coherent devices.
-	 * Hence how dodgy the below logic looks...
-	 */
-	if (dma_in_atomic_pool(cpu_addr, size)) {
-		__iommu_dma_unmap(dev, handle, iosize);
-		dma_free_from_pool(cpu_addr, size);
-	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		struct page *page = vmalloc_to_page(cpu_addr);
-
-		__iommu_dma_unmap(dev, handle, iosize);
-		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
-		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
-	} else if (is_vmalloc_addr(cpu_addr)){
-		struct page **pages = __iommu_dma_get_pages(cpu_addr);
-
-		if (WARN_ON(!pages))
-			return;
-		__iommu_dma_unmap(dev, handle, iosize);
-		__iommu_dma_free_pages(pages, size >> PAGE_SHIFT);
-		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
-	} else {
-		__iommu_dma_unmap(dev, handle, iosize);
-		__free_pages(virt_to_page(cpu_addr), get_order(size));
-	}
-}
-
 static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
 			      unsigned long pfn, size_t size)
 {
-- 
2.20.1
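
For reference, the freeing decisions made by the consolidated function
above can be summarised as follows (a descriptive sketch, mirroring the
in-code comments):

	/*
	 * atomic-pool alias           -> dma_free_from_pool()
	 * iommu_dma_alloc_remap() buf -> __iommu_dma_free_pages() +
	 *                                dma_common_free_remap()
	 * remapped CMA/highmem buf    -> vmalloc_to_page(), then CMA or
	 *                                page free + dma_common_free_remap()
	 * plain lowmem buf            -> virt_to_page(), then CMA or page free
	 *
	 * __iommu_dma_unmap() runs first and is common to all cases.
	 */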


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 15/26] iommu/dma: Refactor iommu_dma_alloc
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (13 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 14/26] iommu/dma: Refactor iommu_dma_free Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 16/26] iommu/dma: Don't remap CMA unnecessarily Christoph Hellwig
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

From: Robin Murphy <robin.murphy@arm.com>

Shuffle around the self-contained atomic and non-contiguous cases to
return early and get out of the way of the CMA case that we're about to
work on next.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[hch: slight changes to the code flow]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 60 +++++++++++++++++++--------------------
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9658c4cc3cfe..504dd27312bb 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -956,14 +956,19 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 {
 	bool coherent = dev_is_dma_coherent(dev);
 	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+	pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
 	size_t iosize = size;
+	struct page *page;
 	void *addr;
 
 	size = PAGE_ALIGN(size);
 	gfp |= __GFP_ZERO;
 
+	if (gfpflags_allow_blocking(gfp) &&
+	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
+		return iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
+
 	if (!gfpflags_allow_blocking(gfp)) {
-		struct page *page;
 		/*
 		 * In atomic context we can't remap anything, so we'll only
 		 * get the virtually contiguous buffer we need by way of a
@@ -985,39 +990,34 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 				__free_pages(page, get_order(size));
 			else
 				dma_free_from_pool(addr, size);
-			addr = NULL;
-		}
-	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-		struct page *page;
-
-		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-					get_order(size), gfp & __GFP_NOWARN);
-		if (!page)
 			return NULL;
-
-		*handle = __iommu_dma_map(dev, page_to_phys(page), iosize, ioprot);
-		if (*handle == DMA_MAPPING_ERROR) {
-			dma_release_from_contiguous(dev, page,
-						    size >> PAGE_SHIFT);
-			return NULL;
-		}
-		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
-						   prot,
-						   __builtin_return_address(0));
-		if (addr) {
-			if (!coherent)
-				arch_dma_prep_coherent(page, iosize);
-			memset(addr, 0, size);
-		} else {
-			__iommu_dma_unmap(dev, *handle, iosize);
-			dma_release_from_contiguous(dev, page,
-						    size >> PAGE_SHIFT);
 		}
-	} else {
-		addr = iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
+		return addr;
 	}
+
+	page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
+					 get_order(size), gfp & __GFP_NOWARN);
+	if (!page)
+		return NULL;
+
+	*handle = __iommu_dma_map(dev, page_to_phys(page), iosize, ioprot);
+	if (*handle == DMA_MAPPING_ERROR)
+		goto out_free_pages;
+
+	addr = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,
+			__builtin_return_address(0));
+	if (!addr)
+		goto out_unmap;
+
+	if (!coherent)
+		arch_dma_prep_coherent(page, iosize);
+	memset(addr, 0, size);
 	return addr;
+out_unmap:
+	__iommu_dma_unmap(dev, *handle, iosize);
+out_free_pages:
+	dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
+	return NULL;
 }
 
 static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 16/26] iommu/dma: Don't remap CMA unnecessarily
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (14 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 15/26] iommu/dma: Refactor iommu_dma_alloc Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 17/26] iommu/dma: Merge the CMA and alloc_pages allocation paths Christoph Hellwig
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

From: Robin Murphy <robin.murphy@arm.com>

Always remapping CMA allocations was largely a bodge to keep the freeing
logic manageable when it was split between here and an arch wrapper. Now
that it's all together and streamlined, we can relax that limitation.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 504dd27312bb..6f4febf5e1de 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -956,7 +956,6 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 {
 	bool coherent = dev_is_dma_coherent(dev);
 	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
-	pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
 	size_t iosize = size;
 	struct page *page;
 	void *addr;
@@ -1004,13 +1003,19 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	if (*handle == DMA_MAPPING_ERROR)
 		goto out_free_pages;
 
-	addr = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,
-			__builtin_return_address(0));
-	if (!addr)
-		goto out_unmap;
+	if (!coherent || PageHighMem(page)) {
+		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
 
-	if (!coherent)
-		arch_dma_prep_coherent(page, iosize);
+		addr = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,
+				__builtin_return_address(0));
+		if (!addr)
+			goto out_unmap;
+
+		if (!coherent)
+			arch_dma_prep_coherent(page, iosize);
+	} else {
+		addr = page_address(page);
+	}
 	memset(addr, 0, size);
 	return addr;
 out_unmap:
-- 
2.20.1
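
In other words (a sketch of the resulting policy, using the same
variables as the diff above): a kernel mapping is only created when the
buffer actually needs one, either because the device is non-coherent or
because the page is not addressable through the linear map:

	if (!coherent || PageHighMem(page))
		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
				prot, __builtin_return_address(0));
	else
		addr = page_address(page);	/* coherent lowmem */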


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 17/26] iommu/dma: Merge the CMA and alloc_pages allocation paths
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (15 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 16/26] iommu/dma: Don't remap CMA unnecessarily Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 18/26] iommu/dma: Split iommu_dma_free Christoph Hellwig
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

Instead of having separate code paths for the non-blocking alloc_pages
and CMA allocations, merge them into one.  There is a slight behavior
change here in that we try the page allocator if CMA fails.  This
matches what dma-direct and other iommu drivers do, and will be needed
to use the dma-iommu code on architectures without DMA remapping later
on.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 32 ++++++++++++--------------------
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 6f4febf5e1de..a1b8c232ad42 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -957,7 +957,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	bool coherent = dev_is_dma_coherent(dev);
 	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
 	size_t iosize = size;
-	struct page *page;
+	struct page *page = NULL;
 	void *addr;
 
 	size = PAGE_ALIGN(size);
@@ -967,35 +967,26 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
 		return iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
 
-	if (!gfpflags_allow_blocking(gfp)) {
-		/*
-		 * In atomic context we can't remap anything, so we'll only
-		 * get the virtually contiguous buffer we need by way of a
-		 * physically contiguous allocation.
-		 */
-		if (coherent) {
-			page = alloc_pages(gfp, get_order(size));
-			addr = page ? page_address(page) : NULL;
-		} else {
-			addr = dma_alloc_from_pool(size, &page, gfp);
-		}
+	if (!gfpflags_allow_blocking(gfp) && !coherent) {
+		addr = dma_alloc_from_pool(size, &page, gfp);
 		if (!addr)
 			return NULL;
 
 		*handle = __iommu_dma_map(dev, page_to_phys(page), iosize,
 					  ioprot);
 		if (*handle == DMA_MAPPING_ERROR) {
-			if (coherent)
-				__free_pages(page, get_order(size));
-			else
-				dma_free_from_pool(addr, size);
+			dma_free_from_pool(addr, size);
 			return NULL;
 		}
 		return addr;
 	}
 
-	page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-					 get_order(size), gfp & __GFP_NOWARN);
+	if (gfpflags_allow_blocking(gfp))
+		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
+						 get_order(size),
+						 gfp & __GFP_NOWARN);
+	if (!page)
+		page = alloc_pages(gfp, get_order(size));
 	if (!page)
 		return NULL;
 
@@ -1021,7 +1012,8 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 out_unmap:
 	__iommu_dma_unmap(dev, *handle, iosize);
 out_free_pages:
-	dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
+	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
+		__free_pages(page, get_order(size));
 	return NULL;
 }
 
-- 
2.20.1
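
The merged policy boils down to the following fallback pattern (a
sketch; count and order stand in for the size-derived values used in
the real code):

	struct page *page = NULL;

	if (gfpflags_allow_blocking(gfp))
		page = dma_alloc_from_contiguous(dev, count, order,
						 gfp & __GFP_NOWARN);
	if (!page)	/* the new behaviour: fall back when CMA fails */
		page = alloc_pages(gfp, order);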


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 18/26] iommu/dma: Split iommu_dma_free
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (16 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 17/26] iommu/dma: Merge the CMA and alloc_pages allocation paths Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 19/26] iommu/dma: Cleanup variable naming in iommu_dma_alloc Christoph Hellwig
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

From: Robin Murphy <robin.murphy@arm.com>

Most of it can double up to serve the failure cleanup path for
iommu_dma_alloc().

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index a1b8c232ad42..95a12e975994 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -916,15 +916,12 @@ static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
 	__iommu_dma_unmap(dev, handle, size);
 }
 
-static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
-		dma_addr_t handle, unsigned long attrs)
+static void __iommu_dma_free(struct device *dev, size_t size, void *cpu_addr)
 {
 	size_t alloc_size = PAGE_ALIGN(size);
 	int count = alloc_size >> PAGE_SHIFT;
 	struct page *page = NULL;
 
-	__iommu_dma_unmap(dev, handle, size);
-
 	/* Non-coherent atomic allocation? Easy */
 	if (dma_free_from_pool(cpu_addr, alloc_size))
 		return;
@@ -951,6 +948,13 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		__free_pages(page, get_order(alloc_size));
 }
 
+static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
+		dma_addr_t handle, unsigned long attrs)
+{
+	__iommu_dma_unmap(dev, handle, size);
+	__iommu_dma_free(dev, size, cpu_addr);
+}
+
 static void *iommu_dma_alloc(struct device *dev, size_t size,
 		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
 {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 19/26] iommu/dma: Cleanup variable naming in iommu_dma_alloc
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (17 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 18/26] iommu/dma: Split iommu_dma_free Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 14:11   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 20/26] iommu/dma: Refactor iommu_dma_alloc, part 2 Christoph Hellwig
                   ` (7 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

From: Robin Murphy <robin.murphy@arm.com>

Most importantly, clear up the size / iosize confusion.  Also rename addr
to cpu_addr to match the surrounding code and make the intention a little
clearer.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 45 +++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 95a12e975994..9b269f0792f3 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -960,64 +960,63 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 {
 	bool coherent = dev_is_dma_coherent(dev);
 	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
-	size_t iosize = size;
+	size_t alloc_size = PAGE_ALIGN(size);
 	struct page *page = NULL;
-	void *addr;
+	void *cpu_addr;
 
-	size = PAGE_ALIGN(size);
 	gfp |= __GFP_ZERO;
 
 	if (gfpflags_allow_blocking(gfp) &&
 	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
-		return iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
+		return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
 
 	if (!gfpflags_allow_blocking(gfp) && !coherent) {
-		addr = dma_alloc_from_pool(size, &page, gfp);
-		if (!addr)
+		cpu_addr = dma_alloc_from_pool(alloc_size, &page, gfp);
+		if (!cpu_addr)
 			return NULL;
 
-		*handle = __iommu_dma_map(dev, page_to_phys(page), iosize,
+		*handle = __iommu_dma_map(dev, page_to_phys(page), size,
 					  ioprot);
 		if (*handle == DMA_MAPPING_ERROR) {
-			dma_free_from_pool(addr, size);
+			dma_free_from_pool(cpu_addr, alloc_size);
 			return NULL;
 		}
-		return addr;
+		return cpu_addr;
 	}
 
 	if (gfpflags_allow_blocking(gfp))
-		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-						 get_order(size),
+		page = dma_alloc_from_contiguous(dev, alloc_size >> PAGE_SHIFT,
+						 get_order(alloc_size),
 						 gfp & __GFP_NOWARN);
 	if (!page)
-		page = alloc_pages(gfp, get_order(size));
+		page = alloc_pages(gfp, get_order(alloc_size));
 	if (!page)
 		return NULL;
 
-	*handle = __iommu_dma_map(dev, page_to_phys(page), iosize, ioprot);
+	*handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot);
 	if (*handle == DMA_MAPPING_ERROR)
 		goto out_free_pages;
 
 	if (!coherent || PageHighMem(page)) {
 		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
 
-		addr = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,
-				__builtin_return_address(0));
-		if (!addr)
+		cpu_addr = dma_common_contiguous_remap(page, alloc_size,
+				VM_USERMAP, prot, __builtin_return_address(0));
+		if (!cpu_addr)
 			goto out_unmap;
 
 		if (!coherent)
-			arch_dma_prep_coherent(page, iosize);
+			arch_dma_prep_coherent(page, size);
 	} else {
-		addr = page_address(page);
+		cpu_addr = page_address(page);
 	}
-	memset(addr, 0, size);
-	return addr;
+	memset(cpu_addr, 0, alloc_size);
+	return cpu_addr;
 out_unmap:
-	__iommu_dma_unmap(dev, *handle, iosize);
+	__iommu_dma_unmap(dev, *handle, size);
 out_free_pages:
-	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
-		__free_pages(page, get_order(size));
+	if (!dma_release_from_contiguous(dev, page, alloc_size >> PAGE_SHIFT))
+		__free_pages(page, get_order(alloc_size));
 	return NULL;
 }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 20/26] iommu/dma: Refactor iommu_dma_alloc, part 2
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (18 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 19/26] iommu/dma: Cleanup variable naming in iommu_dma_alloc Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 14:45   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 21/26] iommu/dma: Refactor iommu_dma_get_sgtable Christoph Hellwig
                   ` (6 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

From: Robin Murphy <robin.murphy@arm.com>

Apart from the iommu_dma_alloc_remap() case, which remains sufficiently
different that it's better off being self-contained, the rest of the
logic can now be consolidated into a single flow which separates the
logically-distinct steps of allocating pages, getting the CPU address,
and finally getting the IOMMU address.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[hch: split the page allocation into a new helper to simplify the flow]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9b269f0792f3..acdfe866cb29 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -955,35 +955,14 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	__iommu_dma_free(dev, size, cpu_addr);
 }
 
-static void *iommu_dma_alloc(struct device *dev, size_t size,
-		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
+static void *iommu_dma_alloc_pages(struct device *dev, size_t size,
+		struct page **pagep, gfp_t gfp, unsigned long attrs)
 {
 	bool coherent = dev_is_dma_coherent(dev);
-	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
 	size_t alloc_size = PAGE_ALIGN(size);
 	struct page *page = NULL;
 	void *cpu_addr;
 
-	gfp |= __GFP_ZERO;
-
-	if (gfpflags_allow_blocking(gfp) &&
-	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
-		return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
-
-	if (!gfpflags_allow_blocking(gfp) && !coherent) {
-		cpu_addr = dma_alloc_from_pool(alloc_size, &page, gfp);
-		if (!cpu_addr)
-			return NULL;
-
-		*handle = __iommu_dma_map(dev, page_to_phys(page), size,
-					  ioprot);
-		if (*handle == DMA_MAPPING_ERROR) {
-			dma_free_from_pool(cpu_addr, alloc_size);
-			return NULL;
-		}
-		return cpu_addr;
-	}
-
 	if (gfpflags_allow_blocking(gfp))
 		page = dma_alloc_from_contiguous(dev, alloc_size >> PAGE_SHIFT,
 						 get_order(alloc_size),
@@ -993,33 +972,59 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	if (!page)
 		return NULL;
 
-	*handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot);
-	if (*handle == DMA_MAPPING_ERROR)
-		goto out_free_pages;
-
 	if (!coherent || PageHighMem(page)) {
 		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
 
 		cpu_addr = dma_common_contiguous_remap(page, alloc_size,
 				VM_USERMAP, prot, __builtin_return_address(0));
 		if (!cpu_addr)
-			goto out_unmap;
+			goto out_free_pages;
 
 		if (!coherent)
 			arch_dma_prep_coherent(page, size);
 	} else {
 		cpu_addr = page_address(page);
 	}
+
+	*pagep = page;
 	memset(cpu_addr, 0, alloc_size);
 	return cpu_addr;
-out_unmap:
-	__iommu_dma_unmap(dev, *handle, size);
 out_free_pages:
 	if (!dma_release_from_contiguous(dev, page, alloc_size >> PAGE_SHIFT))
 		__free_pages(page, get_order(alloc_size));
 	return NULL;
 }
 
+static void *iommu_dma_alloc(struct device *dev, size_t size,
+		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
+{
+	bool coherent = dev_is_dma_coherent(dev);
+	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+	struct page *page = NULL;
+	void *cpu_addr;
+
+	gfp |= __GFP_ZERO;
+
+	if (gfpflags_allow_blocking(gfp) &&
+	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
+		return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
+
+	if (!gfpflags_allow_blocking(gfp) && !coherent)
+		cpu_addr = dma_alloc_from_pool(PAGE_ALIGN(size), &page, gfp);
+	else
+		cpu_addr = iommu_dma_alloc_pages(dev, size, &page, gfp, attrs);
+	if (!cpu_addr)
+		return NULL;
+
+	*handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot);
+	if (*handle == DMA_MAPPING_ERROR) {
+		__iommu_dma_free(dev, size, cpu_addr);
+		return NULL;
+	}
+
+	return cpu_addr;
+}
+
 static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
 			      unsigned long pfn, size_t size)
 {
-- 
2.20.1
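
The resulting top-level flow of iommu_dma_alloc() can be read as three
separate steps (a summary sketch, not additional code):

	/*
	 * 1. Get a CPU-visible buffer: dma_alloc_from_pool() for
	 *    non-blocking non-coherent callers, iommu_dma_alloc_pages()
	 *    for everything else.
	 * 2. Create the IOMMU mapping: __iommu_dma_map(dev,
	 *    page_to_phys(page), size, ioprot).
	 * 3. If step 2 fails, undo step 1 via __iommu_dma_free(), which
	 *    now handles every buffer type uniformly.
	 */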


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 21/26] iommu/dma: Refactor iommu_dma_get_sgtable
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (19 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 20/26] iommu/dma: Refactor iommu_dma_alloc, part 2 Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 14:08   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 22/26] iommu/dma: Refactor iommu_dma_mmap Christoph Hellwig
                   ` (5 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

Inline __iommu_dma_get_sgtable_page into the main function, and use the
fact that __iommu_dma_get_pages returns NULL for remapped contiguous
allocations to simplify the code flow a bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 45 +++++++++++++++------------------------
 1 file changed, 17 insertions(+), 28 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index acdfe866cb29..138b85e675c8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1070,42 +1070,31 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	return __iommu_dma_mmap(pages, size, vma);
 }
 
-static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
-		size_t size)
-{
-	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-
-	if (!ret)
-		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-	return ret;
-}
-
 static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs)
 {
-	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	struct page **pages;
+	struct page *page;
+	int ret;
 
-	if (!is_vmalloc_addr(cpu_addr)) {
-		struct page *page = virt_to_page(cpu_addr);
-		return __iommu_dma_get_sgtable_page(sgt, page, size);
-	}
+	if (is_vmalloc_addr(cpu_addr)) {
+		struct page **pages = __iommu_dma_get_pages(cpu_addr);
 
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		/*
-		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-		 * hence in the vmalloc space.
-		 */
-		struct page *page = vmalloc_to_page(cpu_addr);
-		return __iommu_dma_get_sgtable_page(sgt, page, size);
+		if (pages) {
+			return sg_alloc_table_from_pages(sgt, pages,
+					PAGE_ALIGN(size) >> PAGE_SHIFT,
+					0, size, GFP_KERNEL);
+		}
+
+		page = vmalloc_to_page(cpu_addr);
+	} else {
+		page = virt_to_page(cpu_addr);
 	}
 
-	pages = __iommu_dma_get_pages(cpu_addr);
-	if (WARN_ON_ONCE(!pages))
-		return -ENXIO;
-	return sg_alloc_table_from_pages(sgt, pages, count, 0, size,
-					 GFP_KERNEL);
+	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+	if (!ret)
+		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+	return ret;
 }
 
 static const struct dma_map_ops iommu_dma_ops = {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 22/26] iommu/dma: Refactor iommu_dma_mmap
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (20 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 21/26] iommu/dma: Refactor iommu_dma_get_sgtable Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 14:04   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 23/26] iommu/dma: Don't depend on CONFIG_DMA_DIRECT_REMAP Christoph Hellwig
                   ` (4 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

Inline __iommu_dma_mmap_pfn into the main function, and use the
fact that __iommu_dma_get_pages returns NULL for remapped contiguous
allocations to simplify the code flow a bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 36 +++++++++++-------------------------
 1 file changed, 11 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 138b85e675c8..8fc6098c1eeb 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1025,21 +1025,12 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	return cpu_addr;
 }
 
-static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
-			      unsigned long pfn, size_t size)
-{
-	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
-			       vma->vm_end - vma->vm_start,
-			       vma->vm_page_prot);
-}
-
 static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs)
 {
 	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	unsigned long off = vma->vm_pgoff;
-	struct page **pages;
+	unsigned long pfn, off = vma->vm_pgoff;
 	int ret;
 
 	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
@@ -1050,24 +1041,19 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
 		return -ENXIO;
 
-	if (!is_vmalloc_addr(cpu_addr)) {
-		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
-		return __iommu_dma_mmap_pfn(vma, pfn, size);
-	}
+	if (is_vmalloc_addr(cpu_addr)) {
+		struct page **pages = __iommu_dma_get_pages(cpu_addr);
 
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		/*
-		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-		 * hence in the vmalloc space.
-		 */
-		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
-		return __iommu_dma_mmap_pfn(vma, pfn, size);
+		if (pages)
+			return __iommu_dma_mmap(pages, size, vma);
+		pfn = vmalloc_to_pfn(cpu_addr);
+	} else {
+		pfn = page_to_pfn(virt_to_page(cpu_addr));
 	}
 
-	pages = __iommu_dma_get_pages(cpu_addr);
-	if (WARN_ON_ONCE(!pages))
-		return -ENXIO;
-	return __iommu_dma_mmap(pages, size, vma);
+	return remap_pfn_range(vma, vma->vm_start, pfn + off,
+			       vma->vm_end - vma->vm_start,
+			       vma->vm_page_prot);
 }
 
 static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 23/26] iommu/dma: Don't depend on CONFIG_DMA_DIRECT_REMAP
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (21 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 22/26] iommu/dma: Refactor iommu_dma_mmap Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 14:46   ` Robin Murphy
  2019-04-22 17:59 ` [PATCH 24/26] iommu/dma: Switch copyright boilerplace to SPDX Christoph Hellwig
                   ` (3 subsequent siblings)
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

For entirely DMA coherent architectures there is no requirement to ever
remap DMA coherent allocations.  Move all the remap and pool code under
IS_ENABLED() checks and drop the Kconfig dependency.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/Kconfig     |  1 -
 drivers/iommu/dma-iommu.c | 16 +++++++++-------
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index bdc14baf2ee5..6f07f3b21816 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -95,7 +95,6 @@ config IOMMU_DMA
 	select IOMMU_API
 	select IOMMU_IOVA
 	select NEED_SG_DMA_LENGTH
-	depends on DMA_DIRECT_REMAP
 
 config FSL_PAMU
 	bool "Freescale IOMMU support"
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 8fc6098c1eeb..278a9a960107 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -923,10 +923,11 @@ static void __iommu_dma_free(struct device *dev, size_t size, void *cpu_addr)
 	struct page *page = NULL;
 
 	/* Non-coherent atomic allocation? Easy */
-	if (dma_free_from_pool(cpu_addr, alloc_size))
+	if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
+	    dma_free_from_pool(cpu_addr, alloc_size))
 		return;
 
-	if (is_vmalloc_addr(cpu_addr)) {
+	if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
 		/*
 		 * If it the address is remapped, then it's either non-coherent
 		 * or highmem CMA, or an iommu_dma_alloc_remap() construction.
@@ -972,7 +973,7 @@ static void *iommu_dma_alloc_pages(struct device *dev, size_t size,
 	if (!page)
 		return NULL;
 
-	if (!coherent || PageHighMem(page)) {
+	if (IS_ENABLED(CONFIG_DMA_REMAP) && (!coherent || PageHighMem(page))) {
 		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
 
 		cpu_addr = dma_common_contiguous_remap(page, alloc_size,
@@ -1005,11 +1006,12 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 
 	gfp |= __GFP_ZERO;
 
-	if (gfpflags_allow_blocking(gfp) &&
+	if (IS_ENABLED(CONFIG_DMA_REMAP) && gfpflags_allow_blocking(gfp) &&
 	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
 		return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
 
-	if (!gfpflags_allow_blocking(gfp) && !coherent)
+	if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
+	    !gfpflags_allow_blocking(gfp) && !coherent)
 		cpu_addr = dma_alloc_from_pool(PAGE_ALIGN(size), &page, gfp);
 	else
 		cpu_addr = iommu_dma_alloc_pages(dev, size, &page, gfp, attrs);
@@ -1041,7 +1043,7 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
 		return -ENXIO;
 
-	if (is_vmalloc_addr(cpu_addr)) {
+	if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
 		struct page **pages = __iommu_dma_get_pages(cpu_addr);
 
 		if (pages)
@@ -1063,7 +1065,7 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
 	struct page *page;
 	int ret;
 
-	if (is_vmalloc_addr(cpu_addr)) {
+	if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
 		struct page **pages = __iommu_dma_get_pages(cpu_addr);
 
 		if (pages) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread
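
A minimal sketch of the IS_ENABLED() idiom this patch leans on (illustrative,
not from the patch; example_remap_free() is a hypothetical helper that is only
built when CONFIG_DMA_REMAP is set):

#include <linux/kconfig.h>
#include <linux/gfp.h>
#include <linux/mm.h>

void example_remap_free(void *cpu_addr, size_t size);	/* hypothetical */

static void example_free(void *cpu_addr, size_t size)
{
	/*
	 * IS_ENABLED() folds to a compile-time constant, so on
	 * CONFIG_DMA_REMAP=n kernels the dead branch, including the call
	 * to example_remap_free(), is dropped before linking, while the
	 * source is still parsed and type-checked on every configuration.
	 */
	if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
		example_remap_free(cpu_addr, size);
		return;
	}
	free_pages((unsigned long)cpu_addr, get_order(size));
}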

* [PATCH 24/26] iommu/dma: Switch copyright boilerplace to SPDX
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (22 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 23/26] iommu/dma: Don't depend on CONFIG_DMA_DIRECT_REMAP Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 25/26] arm64: switch copyright boilerplace to SPDX in dma-mapping.c Christoph Hellwig
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 13 +------------
 include/linux/dma-iommu.h | 13 +------------
 2 files changed, 2 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 278a9a960107..b6bc342a8163 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * A fairly generic DMA-API to IOMMU-API glue layer.
  *
@@ -5,18 +6,6 @@
  *
  * based in part on arch/arm/mm/dma-mapping.c:
  * Copyright (C) 2000-2004 Russell King
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
 #include <linux/acpi_iort.h>
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index dadf4383f555..3fc76918e9bf 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -1,17 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * Copyright (C) 2014-2015 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 #ifndef __DMA_IOMMU_H
 #define __DMA_IOMMU_H
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 25/26] arm64: switch copyright boilerplace to SPDX in dma-mapping.c
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (23 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 24/26] iommu/dma: Switch copyright boilerplace to SPDX Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-22 17:59 ` [PATCH 26/26] arm64: trim includes " Christoph Hellwig
  2019-04-29 15:03 ` implement generic dma_map_ops for IOMMUs v3 Robin Murphy
  26 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel, Mukesh Ojha

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
---
 arch/arm64/mm/dma-mapping.c | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index d1661f78eb4d..184ef9ccd69d 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -1,20 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
- * SWIOTLB-based DMA API implementation
- *
  * Copyright (C) 2012 ARM Ltd.
  * Author: Catalin Marinas <catalin.marinas@arm.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
 #include <linux/gfp.h>
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 26/26] arm64: trim includes in dma-mapping.c
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (24 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 25/26] arm64: switch copyright boilerplace to SPDX in dma-mapping.c Christoph Hellwig
@ 2019-04-22 17:59 ` Christoph Hellwig
  2019-04-29 15:00   ` Robin Murphy
  2019-04-29 15:03 ` implement generic dma_map_ops for IOMMUs v3 Robin Murphy
  26 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-22 17:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

With most of the previous functionality now elsewhere, a lot of the
headers included in this file are not needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/mm/dma-mapping.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 184ef9ccd69d..15bd768ceb7e 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -5,20 +5,9 @@
  */
 
 #include <linux/gfp.h>
-#include <linux/acpi.h>
-#include <linux/memblock.h>
 #include <linux/cache.h>
-#include <linux/export.h>
-#include <linux/slab.h>
-#include <linux/genalloc.h>
-#include <linux/dma-direct.h>
 #include <linux/dma-noncoherent.h>
-#include <linux/dma-contiguous.h>
 #include <linux/dma-iommu.h>
-#include <linux/vmalloc.h>
-#include <linux/swiotlb.h>
-#include <linux/pci.h>
-
 #include <asm/cacheflush.h>
 
 pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/26] arm64/iommu: improve mmap bounds checking
  2019-04-22 17:59 ` [PATCH 02/26] arm64/iommu: improve mmap bounds checking Christoph Hellwig
@ 2019-04-29 12:35   ` Robin Murphy
  2019-04-29 19:01     ` Christoph Hellwig
  0 siblings, 1 reply; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 12:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> The nr_pages checks should be done for all mmap requests, not just those
> using remap_pfn_range.

I think it probably makes sense now to just squash this with #22 one way 
or the other, but if you really really still want to keep it as a 
separate patch with a misleading commit message then I'm willing to keep 
my complaints to myself :)

Robin.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/arm64/mm/dma-mapping.c | 21 ++++++++-------------
>   1 file changed, 8 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 674860e3e478..604c638b2787 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -73,19 +73,9 @@ static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
>   static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
>   			      unsigned long pfn, size_t size)
>   {
> -	int ret = -ENXIO;
> -	unsigned long nr_vma_pages = vma_pages(vma);
> -	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	unsigned long off = vma->vm_pgoff;
> -
> -	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
> -		ret = remap_pfn_range(vma, vma->vm_start,
> -				      pfn + off,
> -				      vma->vm_end - vma->vm_start,
> -				      vma->vm_page_prot);
> -	}
> -
> -	return ret;
> +	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
> +			      vma->vm_end - vma->vm_start,
> +			      vma->vm_page_prot);
>   }
>   #endif /* CONFIG_IOMMU_DMA */
>   
> @@ -241,6 +231,8 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
>   			      void *cpu_addr, dma_addr_t dma_addr, size_t size,
>   			      unsigned long attrs)
>   {
> +	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +	unsigned long off = vma->vm_pgoff;
>   	struct vm_struct *area;
>   	int ret;
>   
> @@ -249,6 +241,9 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
>   	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
>   		return ret;
>   
> +	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
> +		return -ENXIO;
> +
>   	if (!is_vmalloc_addr(cpu_addr)) {
>   		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
>   		return __swiotlb_mmap_pfn(vma, pfn, size);
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread
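
A small worked illustration of the bound under discussion (made-up numbers and
a made-up helper): for a 16 KiB buffer nr_pages is 4, so a 2-page mapping at
vm_pgoff 3 is rejected because 2 > 4 - 3, while a 4-page mapping at offset 0
passes.

#include <linux/types.h>

/* Mirrors the check `off >= nr_pages || vma_pages(vma) > nr_pages - off'. */
static bool example_mmap_fits(unsigned long off, unsigned long nr_vma_pages,
			      unsigned long nr_pages)
{
	return off < nr_pages && nr_vma_pages <= nr_pages - off;
}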

* Re: [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code
  2019-04-22 17:59 ` [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code Christoph Hellwig
@ 2019-04-29 12:56   ` Robin Murphy
  2019-06-03 19:47     ` Jon Hunter
  0 siblings, 1 reply; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 12:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> There is nothing really arm64 specific in the iommu_dma_ops
> implementation, so move it to dma-iommu.c and keep a lot of symbols
> self-contained.  Note the implementation does depend on the
> DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
> DMA_IOMMU support depend on it, but this will be relaxed soon.

Nothing looks objectionable, and boot testing with this much of the 
series merged has my coherent and non-coherent IOMMU-backed devices 
appearing to still work OK, so:

Acked-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/arm64/mm/dma-mapping.c | 389 +-----------------------------------
>   drivers/iommu/Kconfig       |   1 +
>   drivers/iommu/dma-iommu.c   | 388 ++++++++++++++++++++++++++++++++---
>   include/linux/dma-iommu.h   |  43 +---
>   4 files changed, 369 insertions(+), 452 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 636fa7c64370..d1661f78eb4d 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -27,6 +27,7 @@
>   #include <linux/dma-direct.h>
>   #include <linux/dma-noncoherent.h>
>   #include <linux/dma-contiguous.h>
> +#include <linux/dma-iommu.h>
>   #include <linux/vmalloc.h>
>   #include <linux/swiotlb.h>
>   #include <linux/pci.h>
> @@ -58,27 +59,6 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
>   	__dma_flush_area(page_address(page), size);
>   }
>   
> -#ifdef CONFIG_IOMMU_DMA
> -static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
> -				      struct page *page, size_t size)
> -{
> -	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> -
> -	if (!ret)
> -		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
> -
> -	return ret;
> -}
> -
> -static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
> -			      unsigned long pfn, size_t size)
> -{
> -	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
> -			      vma->vm_end - vma->vm_start,
> -			      vma->vm_page_prot);
> -}
> -#endif /* CONFIG_IOMMU_DMA */
> -
>   static int __init arm64_dma_init(void)
>   {
>   	WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
> @@ -90,379 +70,18 @@ static int __init arm64_dma_init(void)
>   arch_initcall(arm64_dma_init);
>   
>   #ifdef CONFIG_IOMMU_DMA
> -#include <linux/dma-iommu.h>
> -#include <linux/platform_device.h>
> -#include <linux/amba/bus.h>
> -
> -static void *__iommu_alloc_attrs(struct device *dev, size_t size,
> -				 dma_addr_t *handle, gfp_t gfp,
> -				 unsigned long attrs)
> -{
> -	bool coherent = dev_is_dma_coherent(dev);
> -	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
> -	size_t iosize = size;
> -	void *addr;
> -
> -	if (WARN(!dev, "cannot create IOMMU mapping for unknown device\n"))
> -		return NULL;
> -
> -	size = PAGE_ALIGN(size);
> -
> -	/*
> -	 * Some drivers rely on this, and we probably don't want the
> -	 * possibility of stale kernel data being read by devices anyway.
> -	 */
> -	gfp |= __GFP_ZERO;
> -
> -	if (!gfpflags_allow_blocking(gfp)) {
> -		struct page *page;
> -		/*
> -		 * In atomic context we can't remap anything, so we'll only
> -		 * get the virtually contiguous buffer we need by way of a
> -		 * physically contiguous allocation.
> -		 */
> -		if (coherent) {
> -			page = alloc_pages(gfp, get_order(size));
> -			addr = page ? page_address(page) : NULL;
> -		} else {
> -			addr = dma_alloc_from_pool(size, &page, gfp);
> -		}
> -		if (!addr)
> -			return NULL;
> -
> -		*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
> -		if (*handle == DMA_MAPPING_ERROR) {
> -			if (coherent)
> -				__free_pages(page, get_order(size));
> -			else
> -				dma_free_from_pool(addr, size);
> -			addr = NULL;
> -		}
> -	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
> -		struct page *page;
> -
> -		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
> -					get_order(size), gfp & __GFP_NOWARN);
> -		if (!page)
> -			return NULL;
> -
> -		*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
> -		if (*handle == DMA_MAPPING_ERROR) {
> -			dma_release_from_contiguous(dev, page,
> -						    size >> PAGE_SHIFT);
> -			return NULL;
> -		}
> -		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
> -						   prot,
> -						   __builtin_return_address(0));
> -		if (addr) {
> -			if (!coherent)
> -				__dma_flush_area(page_to_virt(page), iosize);
> -			memset(addr, 0, size);
> -		} else {
> -			iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
> -			dma_release_from_contiguous(dev, page,
> -						    size >> PAGE_SHIFT);
> -		}
> -	} else {
> -		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
> -		struct page **pages;
> -
> -		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> -					handle);
> -		if (!pages)
> -			return NULL;
> -
> -		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> -					      __builtin_return_address(0));
> -		if (!addr)
> -			iommu_dma_free(dev, pages, iosize, handle);
> -	}
> -	return addr;
> -}
> -
> -static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
> -			       dma_addr_t handle, unsigned long attrs)
> -{
> -	size_t iosize = size;
> -
> -	size = PAGE_ALIGN(size);
> -	/*
> -	 * @cpu_addr will be one of 4 things depending on how it was allocated:
> -	 * - A remapped array of pages for contiguous allocations.
> -	 * - A remapped array of pages from iommu_dma_alloc(), for all
> -	 *   non-atomic allocations.
> -	 * - A non-cacheable alias from the atomic pool, for atomic
> -	 *   allocations by non-coherent devices.
> -	 * - A normal lowmem address, for atomic allocations by
> -	 *   coherent devices.
> -	 * Hence how dodgy the below logic looks...
> -	 */
> -	if (dma_in_atomic_pool(cpu_addr, size)) {
> -		iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
> -		dma_free_from_pool(cpu_addr, size);
> -	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		struct page *page = vmalloc_to_page(cpu_addr);
> -
> -		iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
> -		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
> -		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
> -	} else if (is_vmalloc_addr(cpu_addr)){
> -		struct vm_struct *area = find_vm_area(cpu_addr);
> -
> -		if (WARN_ON(!area || !area->pages))
> -			return;
> -		iommu_dma_free(dev, area->pages, iosize, &handle);
> -		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
> -	} else {
> -		iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
> -		__free_pages(virt_to_page(cpu_addr), get_order(size));
> -	}
> -}
> -
> -static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
> -			      void *cpu_addr, dma_addr_t dma_addr, size_t size,
> -			      unsigned long attrs)
> -{
> -	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	unsigned long off = vma->vm_pgoff;
> -	struct vm_struct *area;
> -	int ret;
> -
> -	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
> -
> -	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
> -		return ret;
> -
> -	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
> -		return -ENXIO;
> -
> -	if (!is_vmalloc_addr(cpu_addr)) {
> -		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
> -		return __swiotlb_mmap_pfn(vma, pfn, size);
> -	}
> -
> -	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		/*
> -		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
> -		 * hence in the vmalloc space.
> -		 */
> -		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
> -		return __swiotlb_mmap_pfn(vma, pfn, size);
> -	}
> -
> -	area = find_vm_area(cpu_addr);
> -	if (WARN_ON(!area || !area->pages))
> -		return -ENXIO;
> -
> -	return iommu_dma_mmap(area->pages, size, vma);
> -}
> -
> -static int __iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
> -			       void *cpu_addr, dma_addr_t dma_addr,
> -			       size_t size, unsigned long attrs)
> -{
> -	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	struct vm_struct *area = find_vm_area(cpu_addr);
> -
> -	if (!is_vmalloc_addr(cpu_addr)) {
> -		struct page *page = virt_to_page(cpu_addr);
> -		return __swiotlb_get_sgtable_page(sgt, page, size);
> -	}
> -
> -	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		/*
> -		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
> -		 * hence in the vmalloc space.
> -		 */
> -		struct page *page = vmalloc_to_page(cpu_addr);
> -		return __swiotlb_get_sgtable_page(sgt, page, size);
> -	}
> -
> -	if (WARN_ON(!area || !area->pages))
> -		return -ENXIO;
> -
> -	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
> -					 GFP_KERNEL);
> -}
> -
> -static void __iommu_sync_single_for_cpu(struct device *dev,
> -					dma_addr_t dev_addr, size_t size,
> -					enum dma_data_direction dir)
> -{
> -	phys_addr_t phys;
> -
> -	if (dev_is_dma_coherent(dev))
> -		return;
> -
> -	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dev_addr);
> -	arch_sync_dma_for_cpu(dev, phys, size, dir);
> -}
> -
> -static void __iommu_sync_single_for_device(struct device *dev,
> -					   dma_addr_t dev_addr, size_t size,
> -					   enum dma_data_direction dir)
> -{
> -	phys_addr_t phys;
> -
> -	if (dev_is_dma_coherent(dev))
> -		return;
> -
> -	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dev_addr);
> -	arch_sync_dma_for_device(dev, phys, size, dir);
> -}
> -
> -static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
> -				   unsigned long offset, size_t size,
> -				   enum dma_data_direction dir,
> -				   unsigned long attrs)
> -{
> -	bool coherent = dev_is_dma_coherent(dev);
> -	int prot = dma_info_to_prot(dir, coherent, attrs);
> -	dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
> -
> -	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> -	    dev_addr != DMA_MAPPING_ERROR)
> -		__dma_map_area(page_address(page) + offset, size, dir);
> -
> -	return dev_addr;
> -}
> -
> -static void __iommu_unmap_page(struct device *dev, dma_addr_t dev_addr,
> -			       size_t size, enum dma_data_direction dir,
> -			       unsigned long attrs)
> -{
> -	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		__iommu_sync_single_for_cpu(dev, dev_addr, size, dir);
> -
> -	iommu_dma_unmap_page(dev, dev_addr, size, dir, attrs);
> -}
> -
> -static void __iommu_sync_sg_for_cpu(struct device *dev,
> -				    struct scatterlist *sgl, int nelems,
> -				    enum dma_data_direction dir)
> -{
> -	struct scatterlist *sg;
> -	int i;
> -
> -	if (dev_is_dma_coherent(dev))
> -		return;
> -
> -	for_each_sg(sgl, sg, nelems, i)
> -		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
> -}
> -
> -static void __iommu_sync_sg_for_device(struct device *dev,
> -				       struct scatterlist *sgl, int nelems,
> -				       enum dma_data_direction dir)
> -{
> -	struct scatterlist *sg;
> -	int i;
> -
> -	if (dev_is_dma_coherent(dev))
> -		return;
> -
> -	for_each_sg(sgl, sg, nelems, i)
> -		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
> -}
> -
> -static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
> -				int nelems, enum dma_data_direction dir,
> -				unsigned long attrs)
> -{
> -	bool coherent = dev_is_dma_coherent(dev);
> -
> -	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		__iommu_sync_sg_for_device(dev, sgl, nelems, dir);
> -
> -	return iommu_dma_map_sg(dev, sgl, nelems,
> -				dma_info_to_prot(dir, coherent, attrs));
> -}
> -
> -static void __iommu_unmap_sg_attrs(struct device *dev,
> -				   struct scatterlist *sgl, int nelems,
> -				   enum dma_data_direction dir,
> -				   unsigned long attrs)
> -{
> -	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> -		__iommu_sync_sg_for_cpu(dev, sgl, nelems, dir);
> -
> -	iommu_dma_unmap_sg(dev, sgl, nelems, dir, attrs);
> -}
> -
> -static const struct dma_map_ops iommu_dma_ops = {
> -	.alloc = __iommu_alloc_attrs,
> -	.free = __iommu_free_attrs,
> -	.mmap = __iommu_mmap_attrs,
> -	.get_sgtable = __iommu_get_sgtable,
> -	.map_page = __iommu_map_page,
> -	.unmap_page = __iommu_unmap_page,
> -	.map_sg = __iommu_map_sg_attrs,
> -	.unmap_sg = __iommu_unmap_sg_attrs,
> -	.sync_single_for_cpu = __iommu_sync_single_for_cpu,
> -	.sync_single_for_device = __iommu_sync_single_for_device,
> -	.sync_sg_for_cpu = __iommu_sync_sg_for_cpu,
> -	.sync_sg_for_device = __iommu_sync_sg_for_device,
> -	.map_resource = iommu_dma_map_resource,
> -	.unmap_resource = iommu_dma_unmap_resource,
> -};
> -
> -static int __init __iommu_dma_init(void)
> -{
> -	return iommu_dma_init();
> -}
> -arch_initcall(__iommu_dma_init);
> -
> -static void __iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> -				  const struct iommu_ops *ops)
> -{
> -	struct iommu_domain *domain;
> -
> -	if (!ops)
> -		return;
> -
> -	/*
> -	 * The IOMMU core code allocates the default DMA domain, which the
> -	 * underlying IOMMU driver needs to support via the dma-iommu layer.
> -	 */
> -	domain = iommu_get_domain_for_dev(dev);
> -
> -	if (!domain)
> -		goto out_err;
> -
> -	if (domain->type == IOMMU_DOMAIN_DMA) {
> -		if (iommu_dma_init_domain(domain, dma_base, size, dev))
> -			goto out_err;
> -
> -		dev->dma_ops = &iommu_dma_ops;
> -	}
> -
> -	return;
> -
> -out_err:
> -	 pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA ops\n",
> -		 dev_name(dev));
> -}
> -
>   void arch_teardown_dma_ops(struct device *dev)
>   {
>   	dev->dma_ops = NULL;
>   }
> -
> -#else
> -
> -static void __iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> -				  const struct iommu_ops *iommu)
> -{ }
> -
> -#endif  /* CONFIG_IOMMU_DMA */
> +#endif
>   
>   void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
>   			const struct iommu_ops *iommu, bool coherent)
>   {
>   	dev->dma_coherent = coherent;
> -	__iommu_setup_dma_ops(dev, dma_base, size, iommu);
> +	if (iommu)
> +		iommu_setup_dma_ops(dev, dma_base, size);
>   
>   #ifdef CONFIG_XEN
>   	if (xen_initial_domain())
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 6f07f3b21816..bdc14baf2ee5 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -95,6 +95,7 @@ config IOMMU_DMA
>   	select IOMMU_API
>   	select IOMMU_IOVA
>   	select NEED_SG_DMA_LENGTH
> +	depends on DMA_DIRECT_REMAP
>   
>   config FSL_PAMU
>   	bool "Freescale IOMMU support"
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index f915cb7c46e6..622123551bba 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -21,6 +21,7 @@
>   
>   #include <linux/acpi_iort.h>
>   #include <linux/device.h>
> +#include <linux/dma-contiguous.h>
>   #include <linux/dma-iommu.h>
>   #include <linux/dma-noncoherent.h>
>   #include <linux/gfp.h>
> @@ -79,11 +80,6 @@ static struct iommu_dma_cookie *cookie_alloc(enum iommu_dma_cookie_type type)
>   	return cookie;
>   }
>   
> -int iommu_dma_init(void)
> -{
> -	return iova_cache_get();
> -}
> -
>   /**
>    * iommu_get_dma_cookie - Acquire DMA-API resources for a domain
>    * @domain: IOMMU domain to prepare for DMA-API usage
> @@ -285,7 +281,7 @@ static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
>    * to ensure it is an invalid IOVA. It is safe to reinitialise a domain, but
>    * any change which could make prior IOVAs invalid will fail.
>    */
> -int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
> +static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>   		u64 size, struct device *dev)
>   {
>   	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> @@ -336,7 +332,6 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>   
>   	return iova_reserve_iommu_regions(dev, domain);
>   }
> -EXPORT_SYMBOL(iommu_dma_init_domain);
>   
>   /**
>    * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
> @@ -347,7 +342,7 @@ EXPORT_SYMBOL(iommu_dma_init_domain);
>    *
>    * Return: corresponding IOMMU API page protection flags
>    */
> -int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
> +static int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
>   		     unsigned long attrs)
>   {
>   	int prot = coherent ? IOMMU_CACHE : 0;
> @@ -506,17 +501,17 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
>   }
>   
>   /**
> - * iommu_dma_free - Free a buffer allocated by iommu_dma_alloc()
> + * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
>    * @dev: Device which owns this buffer
> - * @pages: Array of buffer pages as returned by iommu_dma_alloc()
> + * @pages: Array of buffer pages as returned by __iommu_dma_alloc()
>    * @size: Size of buffer in bytes
>    * @handle: DMA address of buffer
>    *
>    * Frees both the pages associated with the buffer, and the array
>    * describing them
>    */
> -void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
> -		dma_addr_t *handle)
> +static void __iommu_dma_free(struct device *dev, struct page **pages,
> +		size_t size, dma_addr_t *handle)
>   {
>   	__iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size);
>   	__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
> @@ -524,7 +519,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
>   }
>   
>   /**
> - * iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
> + * __iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
>    * @dev: Device to allocate memory for. Must be a real device
>    *	 attached to an iommu_dma_domain
>    * @size: Size of buffer in bytes
> @@ -539,8 +534,8 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
>    * Return: Array of struct page pointers describing the buffer,
>    *	   or NULL on failure.
>    */
> -struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
> -		unsigned long attrs, int prot, dma_addr_t *handle)
> +static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
> +		gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle)
>   {
>   	struct iommu_domain *domain = iommu_get_dma_domain(dev);
>   	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> @@ -602,16 +597,16 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>   }
>   
>   /**
> - * iommu_dma_mmap - Map a buffer into provided user VMA
> - * @pages: Array representing buffer from iommu_dma_alloc()
> + * __iommu_dma_mmap - Map a buffer into provided user VMA
> + * @pages: Array representing buffer from __iommu_dma_alloc()
>    * @size: Size of buffer in bytes
>    * @vma: VMA describing requested userspace mapping
>    *
>    * Maps the pages of the buffer in @pages into @vma. The caller is responsible
>    * for verifying the correct size and protection of @vma beforehand.
>    */
> -
> -int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
> +static int __iommu_dma_mmap(struct page **pages, size_t size,
> +		struct vm_area_struct *vma)
>   {
>   	unsigned long uaddr = vma->vm_start;
>   	unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> @@ -626,6 +621,58 @@ int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
>   	return ret;
>   }
>   
> +static void iommu_dma_sync_single_for_cpu(struct device *dev,
> +		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
> +{
> +	phys_addr_t phys;
> +
> +	if (dev_is_dma_coherent(dev))
> +		return;
> +
> +	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
> +	arch_sync_dma_for_cpu(dev, phys, size, dir);
> +}
> +
> +static void iommu_dma_sync_single_for_device(struct device *dev,
> +		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
> +{
> +	phys_addr_t phys;
> +
> +	if (dev_is_dma_coherent(dev))
> +		return;
> +
> +	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
> +	arch_sync_dma_for_device(dev, phys, size, dir);
> +}
> +
> +static void iommu_dma_sync_sg_for_cpu(struct device *dev,
> +		struct scatterlist *sgl, int nelems,
> +		enum dma_data_direction dir)
> +{
> +	struct scatterlist *sg;
> +	int i;
> +
> +	if (dev_is_dma_coherent(dev))
> +		return;
> +
> +	for_each_sg(sgl, sg, nelems, i)
> +		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
> +}
> +
> +static void iommu_dma_sync_sg_for_device(struct device *dev,
> +		struct scatterlist *sgl, int nelems,
> +		enum dma_data_direction dir)
> +{
> +	struct scatterlist *sg;
> +	int i;
> +
> +	if (dev_is_dma_coherent(dev))
> +		return;
> +
> +	for_each_sg(sgl, sg, nelems, i)
> +		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
> +}
> +
>   static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
>   		size_t size, int prot, struct iommu_domain *domain)
>   {
> @@ -649,19 +696,44 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
>   	return iova + iova_off;
>   }
>   
> -dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
> +static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
>   		unsigned long offset, size_t size, int prot)
>   {
>   	return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot,
>   			iommu_get_dma_domain(dev));
>   }
>   
> -void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size,
> -		enum dma_data_direction dir, unsigned long attrs)
> +static void __iommu_dma_unmap_page(struct device *dev, dma_addr_t handle,
> +		size_t size, enum dma_data_direction dir, unsigned long attrs)
>   {
>   	__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
>   }
>   
> +static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
> +		unsigned long offset, size_t size, enum dma_data_direction dir,
> +		unsigned long attrs)
> +{
> +	phys_addr_t phys = page_to_phys(page) + offset;
> +	bool coherent = dev_is_dma_coherent(dev);
> +	dma_addr_t dma_handle;
> +
> +	dma_handle =__iommu_dma_map(dev, phys, size,
> +			dma_info_to_prot(dir, coherent, attrs),
> +			iommu_get_dma_domain(dev));
> +	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> +	    dma_handle != DMA_MAPPING_ERROR)
> +		arch_sync_dma_for_device(dev, phys, size, dir);
> +	return dma_handle;
> +}
> +
> +static void iommu_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
> +		size_t size, enum dma_data_direction dir, unsigned long attrs)
> +{
> +	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> +		iommu_dma_sync_single_for_cpu(dev, dma_handle, size, dir);
> +	__iommu_dma_unmap(iommu_get_dma_domain(dev), dma_handle, size);
> +}
> +
>   /*
>    * Prepare a successfully-mapped scatterlist to give back to the caller.
>    *
> @@ -744,18 +816,22 @@ static void __invalidate_sg(struct scatterlist *sg, int nents)
>    * impedance-matching, to be able to hand off a suitably-aligned list,
>    * but still preserve the original offsets and sizes for the caller.
>    */
> -int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
> -		int nents, int prot)
> +static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
> +		int nents, enum dma_data_direction dir, unsigned long attrs)
>   {
>   	struct iommu_domain *domain = iommu_get_dma_domain(dev);
>   	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>   	struct iova_domain *iovad = &cookie->iovad;
>   	struct scatterlist *s, *prev = NULL;
> +	int prot = dma_info_to_prot(dir, dev_is_dma_coherent(dev), attrs);
>   	dma_addr_t iova;
>   	size_t iova_len = 0;
>   	unsigned long mask = dma_get_seg_boundary(dev);
>   	int i;
>   
> +	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> +		iommu_dma_sync_sg_for_device(dev, sg, nents, dir);
> +
>   	/*
>   	 * Work out how much IOVA space we need, and align the segments to
>   	 * IOVA granules for the IOMMU driver to handle. With some clever
> @@ -815,12 +891,16 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>   	return 0;
>   }
>   
> -void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
> -		enum dma_data_direction dir, unsigned long attrs)
> +static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
> +		int nents, enum dma_data_direction dir, unsigned long attrs)
>   {
>   	dma_addr_t start, end;
>   	struct scatterlist *tmp;
>   	int i;
> +
> +	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
> +		iommu_dma_sync_sg_for_cpu(dev, sg, nents, dir);
> +
>   	/*
>   	 * The scatterlist segments are mapped into a single
>   	 * contiguous IOVA allocation, so this is incredibly easy.
> @@ -835,7 +915,7 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
>   	__iommu_dma_unmap(iommu_get_dma_domain(dev), start, end - start);
>   }
>   
> -dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
> +static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
>   		size_t size, enum dma_data_direction dir, unsigned long attrs)
>   {
>   	return __iommu_dma_map(dev, phys, size,
> @@ -843,12 +923,258 @@ dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
>   			iommu_get_dma_domain(dev));
>   }
>   
> -void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
> +static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
>   		size_t size, enum dma_data_direction dir, unsigned long attrs)
>   {
>   	__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
>   }
>   
> +static void *iommu_dma_alloc(struct device *dev, size_t size,
> +		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
> +{
> +	bool coherent = dev_is_dma_coherent(dev);
> +	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
> +	size_t iosize = size;
> +	void *addr;
> +
> +	size = PAGE_ALIGN(size);
> +	gfp |= __GFP_ZERO;
> +
> +	if (!gfpflags_allow_blocking(gfp)) {
> +		struct page *page;
> +		/*
> +		 * In atomic context we can't remap anything, so we'll only
> +		 * get the virtually contiguous buffer we need by way of a
> +		 * physically contiguous allocation.
> +		 */
> +		if (coherent) {
> +			page = alloc_pages(gfp, get_order(size));
> +			addr = page ? page_address(page) : NULL;
> +		} else {
> +			addr = dma_alloc_from_pool(size, &page, gfp);
> +		}
> +		if (!addr)
> +			return NULL;
> +
> +		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
> +		if (*handle == DMA_MAPPING_ERROR) {
> +			if (coherent)
> +				__free_pages(page, get_order(size));
> +			else
> +				dma_free_from_pool(addr, size);
> +			addr = NULL;
> +		}
> +	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> +		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
> +		struct page *page;
> +
> +		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
> +					get_order(size), gfp & __GFP_NOWARN);
> +		if (!page)
> +			return NULL;
> +
> +		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
> +		if (*handle == DMA_MAPPING_ERROR) {
> +			dma_release_from_contiguous(dev, page,
> +						    size >> PAGE_SHIFT);
> +			return NULL;
> +		}
> +		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
> +						   prot,
> +						   __builtin_return_address(0));
> +		if (addr) {
> +			if (!coherent)
> +				arch_dma_prep_coherent(page, iosize);
> +			memset(addr, 0, size);
> +		} else {
> +			__iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
> +			dma_release_from_contiguous(dev, page,
> +						    size >> PAGE_SHIFT);
> +		}
> +	} else {
> +		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
> +		struct page **pages;
> +
> +		pages = __iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> +					handle);
> +		if (!pages)
> +			return NULL;
> +
> +		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> +					      __builtin_return_address(0));
> +		if (!addr)
> +			__iommu_dma_free(dev, pages, iosize, handle);
> +	}
> +	return addr;
> +}
> +
> +static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
> +		dma_addr_t handle, unsigned long attrs)
> +{
> +	size_t iosize = size;
> +
> +	size = PAGE_ALIGN(size);
> +	/*
> +	 * @cpu_addr will be one of 4 things depending on how it was allocated:
> +	 * - A remapped array of pages for contiguous allocations.
> +	 * - A remapped array of pages from __iommu_dma_alloc(), for all
> +	 *   non-atomic allocations.
> +	 * - A non-cacheable alias from the atomic pool, for atomic
> +	 *   allocations by non-coherent devices.
> +	 * - A normal lowmem address, for atomic allocations by
> +	 *   coherent devices.
> +	 * Hence how dodgy the below logic looks...
> +	 */
> +	if (dma_in_atomic_pool(cpu_addr, size)) {
> +		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
> +		dma_free_from_pool(cpu_addr, size);
> +	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> +		struct page *page = vmalloc_to_page(cpu_addr);
> +
> +		__iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
> +		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
> +		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
> +	} else if (is_vmalloc_addr(cpu_addr)){
> +		struct vm_struct *area = find_vm_area(cpu_addr);
> +
> +		if (WARN_ON(!area || !area->pages))
> +			return;
> +		__iommu_dma_free(dev, area->pages, iosize, &handle);
> +		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
> +	} else {
> +		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
> +		__free_pages(virt_to_page(cpu_addr), get_order(size));
> +	}
> +}
> +
> +static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
> +			      unsigned long pfn, size_t size)
> +{
> +	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
> +			       vma->vm_end - vma->vm_start,
> +			       vma->vm_page_prot);
> +}
> +
> +static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
> +		void *cpu_addr, dma_addr_t dma_addr, size_t size,
> +		unsigned long attrs)
> +{
> +	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +	unsigned long off = vma->vm_pgoff;
> +	struct vm_struct *area;
> +	int ret;
> +
> +	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
> +
> +	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
> +		return ret;
> +
> +	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
> +		return -ENXIO;
> +
> +	if (!is_vmalloc_addr(cpu_addr)) {
> +		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
> +		return __iommu_dma_mmap_pfn(vma, pfn, size);
> +	}
> +
> +	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> +		/*
> +		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
> +		 * hence in the vmalloc space.
> +		 */
> +		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
> +		return __iommu_dma_mmap_pfn(vma, pfn, size);
> +	}
> +
> +	area = find_vm_area(cpu_addr);
> +	if (WARN_ON(!area || !area->pages))
> +		return -ENXIO;
> +
> +	return __iommu_dma_mmap(area->pages, size, vma);
> +}
> +
> +static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
> +		size_t size)
> +{
> +	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> +
> +	if (!ret)
> +		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
> +	return ret;
> +}
> +
> +static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
> +		void *cpu_addr, dma_addr_t dma_addr, size_t size,
> +		unsigned long attrs)
> +{
> +	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +	struct vm_struct *area = find_vm_area(cpu_addr);
> +
> +	if (!is_vmalloc_addr(cpu_addr)) {
> +		struct page *page = virt_to_page(cpu_addr);
> +		return __iommu_dma_get_sgtable_page(sgt, page, size);
> +	}
> +
> +	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> +		/*
> +		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
> +		 * hence in the vmalloc space.
> +		 */
> +		struct page *page = vmalloc_to_page(cpu_addr);
> +		return __iommu_dma_get_sgtable_page(sgt, page, size);
> +	}
> +
> +	if (WARN_ON(!area || !area->pages))
> +		return -ENXIO;
> +
> +	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
> +					 GFP_KERNEL);
> +}
> +
> +static const struct dma_map_ops iommu_dma_ops = {
> +	.alloc			= iommu_dma_alloc,
> +	.free			= iommu_dma_free,
> +	.mmap			= iommu_dma_mmap,
> +	.get_sgtable		= iommu_dma_get_sgtable,
> +	.map_page		= iommu_dma_map_page,
> +	.unmap_page		= iommu_dma_unmap_page,
> +	.map_sg			= iommu_dma_map_sg,
> +	.unmap_sg		= iommu_dma_unmap_sg,
> +	.sync_single_for_cpu	= iommu_dma_sync_single_for_cpu,
> +	.sync_single_for_device	= iommu_dma_sync_single_for_device,
> +	.sync_sg_for_cpu	= iommu_dma_sync_sg_for_cpu,
> +	.sync_sg_for_device	= iommu_dma_sync_sg_for_device,
> +	.map_resource		= iommu_dma_map_resource,
> +	.unmap_resource		= iommu_dma_unmap_resource,
> +};
> +
> +/*
> + * The IOMMU core code allocates the default DMA domain, which the underlying
> + * IOMMU driver needs to support via the dma-iommu layer.
> + */
> +void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size)
> +{
> +	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> +
> +	if (!domain)
> +		goto out_err;
> +
> +	/*
> +	 * The IOMMU core code allocates the default DMA domain, which the
> +	 * underlying IOMMU driver needs to support via the dma-iommu layer.
> +	 */
> +	if (domain->type == IOMMU_DOMAIN_DMA) {
> +		if (iommu_dma_init_domain(domain, dma_base, size, dev))
> +			goto out_err;
> +		dev->dma_ops = &iommu_dma_ops;
> +	}
> +
> +	return;
> +out_err:
> +	 pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA ops\n",
> +		 dev_name(dev));
> +}
> +
>   static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
>   		phys_addr_t msi_addr, struct iommu_domain *domain)
>   {
> @@ -921,3 +1247,9 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>   		msg->address_lo += lower_32_bits(msi_page->iova);
>   	}
>   }
> +
> +static int iommu_dma_init(void)
> +{
> +	return iova_cache_get();
> +}
> +arch_initcall(iommu_dma_init);
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 3216447178a7..dadf4383f555 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -24,49 +24,13 @@
>   #include <linux/iommu.h>
>   #include <linux/msi.h>
>   
> -int iommu_dma_init(void);
> -
>   /* Domain management interface for IOMMU drivers */
>   int iommu_get_dma_cookie(struct iommu_domain *domain);
>   int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
>   void iommu_put_dma_cookie(struct iommu_domain *domain);
>   
>   /* Setup call for arch DMA mapping code */
> -int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
> -		u64 size, struct device *dev);
> -
> -/* General helpers for DMA-API <-> IOMMU-API interaction */
> -int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
> -		     unsigned long attrs);
> -
> -/*
> - * These implement the bulk of the relevant DMA mapping callbacks, but require
> - * the arch code to take care of attributes and cache maintenance
> - */
> -struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
> -		unsigned long attrs, int prot, dma_addr_t *handle);
> -void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
> -		dma_addr_t *handle);
> -
> -int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma);
> -
> -dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
> -		unsigned long offset, size_t size, int prot);
> -int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
> -		int nents, int prot);
> -
> -/*
> - * Arch code with no special attribute handling may use these
> - * directly as DMA mapping callbacks for simplicity
> - */
> -void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size,
> -		enum dma_data_direction dir, unsigned long attrs);
> -void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
> -		enum dma_data_direction dir, unsigned long attrs);
> -dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
> -		size_t size, enum dma_data_direction dir, unsigned long attrs);
> -void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
> -		size_t size, enum dma_data_direction dir, unsigned long attrs);
> +void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size);
>   
>   /* The DMA API isn't _quite_ the whole story, though... */
>   void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
> @@ -75,12 +39,13 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
>   #else /* CONFIG_IOMMU_DMA */
>   
>   struct iommu_domain;
> +struct iommu_ops;
>   struct msi_msg;
>   struct device;
>   
> -static inline int iommu_dma_init(void)
> +static inline void iommu_setup_dma_ops(struct device *dev, u64 dma_base,
> +		u64 size)
>   {
> -	return 0;
>   }
>   
>   static inline int iommu_get_dma_cookie(struct iommu_domain *domain)
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread
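
To see what this buys future ports, here is a hedged sketch (hypothetical, not
part of the series) of how another architecture could now wire up the generic
ops: arch_setup_dma_ops() only records coherency and calls
iommu_setup_dma_ops(), which installs iommu_dma_ops for devices behind a
DMA-domain IOMMU.  This assumes the architecture selects
ARCH_HAS_SYNC_DMA_FOR_DEVICE/ARCH_HAS_SYNC_DMA_FOR_CPU so that
dev->dma_coherent exists.

#include <linux/device.h>
#include <linux/dma-iommu.h>
#include <linux/dma-mapping.h>

void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
			const struct iommu_ops *iommu, bool coherent)
{
	dev->dma_coherent = coherent;
	if (iommu)
		iommu_setup_dma_ops(dev, dma_base, size);
}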

* Re: [PATCH 11/26] iommu/dma: Factor out remapped pages lookup
  2019-04-22 17:59 ` [PATCH 11/26] iommu/dma: Factor out remapped pages lookup Christoph Hellwig
@ 2019-04-29 13:05   ` Robin Murphy
  2019-04-29 19:10     ` Christoph Hellwig
  0 siblings, 1 reply; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 13:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> From: Robin Murphy <robin.murphy@arm.com>
> 
> Since we duplicate the find_vm_area() logic a few times in places where
> we only care about the pages, factor out a helper to abstract it.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> [hch: don't warn when not finding a region, as we'll rely on that later]

Yeah, I did think about that and the things which it might make a little 
easier, but preserved it as-is for the sake of keeping my modifications 
quick and simple. TBH I'm now feeling more inclined to drop the WARNs 
entirely at this point, since it's not like there's ever been any 
general guarantee that freeing the wrong thing shouldn't just crash, but 
that's something we can easily come back to later if need be.

Robin.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 32 ++++++++++++++++++++------------
>   1 file changed, 20 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index b52c5d6be7b4..8e2d9733113e 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -525,6 +525,15 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
>   	return pages;
>   }
>   
> +static struct page **__iommu_dma_get_pages(void *cpu_addr)
> +{
> +	struct vm_struct *area = find_vm_area(cpu_addr);
> +
> +	if (!area || !area->pages)
> +		return NULL;
> +	return area->pages;
> +}
> +
>   /**
>    * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
>    * @dev: Device which owns this buffer
> @@ -1023,11 +1032,11 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
>   		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
>   		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
>   	} else if (is_vmalloc_addr(cpu_addr)){
> -		struct vm_struct *area = find_vm_area(cpu_addr);
> +		struct page **pages = __iommu_dma_get_pages(cpu_addr);
>   
> -		if (WARN_ON(!area || !area->pages))
> +		if (WARN_ON(!pages))
>   			return;
> -		__iommu_dma_free(dev, area->pages, iosize, &handle);
> +		__iommu_dma_free(dev, pages, iosize, &handle);
>   		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
>   	} else {
>   		__iommu_dma_unmap(dev, handle, iosize);
> @@ -1049,7 +1058,7 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   {
>   	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>   	unsigned long off = vma->vm_pgoff;
> -	struct vm_struct *area;
> +	struct page **pages;
>   	int ret;
>   
>   	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
> @@ -1074,11 +1083,10 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   		return __iommu_dma_mmap_pfn(vma, pfn, size);
>   	}
>   
> -	area = find_vm_area(cpu_addr);
> -	if (WARN_ON(!area || !area->pages))
> +	pages = __iommu_dma_get_pages(cpu_addr);
> +	if (WARN_ON_ONCE(!pages))
>   		return -ENXIO;
> -
> -	return __iommu_dma_mmap(area->pages, size, vma);
> +	return __iommu_dma_mmap(pages, size, vma);
>   }
>   
>   static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
> @@ -1096,7 +1104,7 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
>   		unsigned long attrs)
>   {
>   	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	struct vm_struct *area = find_vm_area(cpu_addr);
> +	struct page **pages;
>   
>   	if (!is_vmalloc_addr(cpu_addr)) {
>   		struct page *page = virt_to_page(cpu_addr);
> @@ -1112,10 +1120,10 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
>   		return __iommu_dma_get_sgtable_page(sgt, page, size);
>   	}
>   
> -	if (WARN_ON(!area || !area->pages))
> +	pages = __iommu_dma_get_pages(cpu_addr);
> +	if (WARN_ON_ONCE(!pages))
>   		return -ENXIO;
> -
> -	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
> +	return sg_alloc_table_from_pages(sgt, pages, count, 0, size,
>   					 GFP_KERNEL);
>   }
>   
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread
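
A brief illustration (not from the patch) of what the new helper relies on:
find_vm_area() resolves a vmalloc-space address back to its vm_struct, and
->pages is only populated for mappings built from a page array (as
dma_common_pages_remap() does), so a NULL result doubles as "this was not a
page-array remap".  example_first_page() is a made-up caller.

#include <linux/mm.h>
#include <linux/vmalloc.h>

static struct page *example_first_page(void *cpu_addr)
{
	struct vm_struct *area = find_vm_area(cpu_addr);

	if (!area || !area->pages)
		return NULL;	/* not backed by a tracked page array */
	return area->pages[0];
}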

* Re: [PATCH 12/26] iommu/dma: Refactor the page array remapping allocator
  2019-04-22 17:59 ` [PATCH 12/26] iommu/dma: Refactor the page array remapping allocator Christoph Hellwig
@ 2019-04-29 13:10   ` Robin Murphy
  0 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 13:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> Move the call to dma_common_pages_remap into __iommu_dma_alloc and
> rename it to iommu_dma_alloc_remap.  This creates a self-contained
> helper for remapped pages allocation and mapping.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 54 +++++++++++++++++++--------------------
>   1 file changed, 26 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 8e2d9733113e..b8e46e89a60a 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -535,9 +535,9 @@ static struct page **__iommu_dma_get_pages(void *cpu_addr)
>   }
>   
>   /**
> - * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
> + * iommu_dma_free - Free a buffer allocated by iommu_dma_alloc_remap()
>    * @dev: Device which owns this buffer
> - * @pages: Array of buffer pages as returned by __iommu_dma_alloc()
> + * @pages: Array of buffer pages as returned by __iommu_dma_alloc_remap()
>    * @size: Size of buffer in bytes
>    * @handle: DMA address of buffer
>    *
> @@ -553,33 +553,35 @@ static void __iommu_dma_free(struct device *dev, struct page **pages,
>   }
>   
>   /**
> - * __iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
> + * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
>    * @dev: Device to allocate memory for. Must be a real device
>    *	 attached to an iommu_dma_domain
>    * @size: Size of buffer in bytes
> + * @dma_handle: Out argument for allocated DMA handle
>    * @gfp: Allocation flags
>    * @attrs: DMA attributes for this allocation
> - * @prot: IOMMU mapping flags
> - * @handle: Out argument for allocated DMA handle
>    *
>    * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
>    * but an IOMMU which supports smaller pages might not map the whole thing.
>    *
> - * Return: Array of struct page pointers describing the buffer,
> - *	   or NULL on failure.
> + * Return: Mapped virtual address, or NULL on failure.
>    */
> -static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
> -		gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle)
> +static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
> +		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
>   {
>   	struct iommu_domain *domain = iommu_get_dma_domain(dev);
>   	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>   	struct iova_domain *iovad = &cookie->iovad;
> +	bool coherent = dev_is_dma_coherent(dev);
> +	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
> +	pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
> +	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
>   	struct page **pages;
>   	struct sg_table sgt;
>   	dma_addr_t iova;
> -	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
> +	void *vaddr;
>   
> -	*handle = DMA_MAPPING_ERROR;
> +	*dma_handle = DMA_MAPPING_ERROR;
>   
>   	min_size = alloc_sizes & -alloc_sizes;
>   	if (min_size < PAGE_SIZE) {
> @@ -605,7 +607,7 @@ static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
>   	if (sg_alloc_table_from_pages(&sgt, pages, count, 0, size, GFP_KERNEL))
>   		goto out_free_iova;
>   
> -	if (!(prot & IOMMU_CACHE)) {
> +	if (!(ioprot & IOMMU_CACHE)) {
>   		struct scatterlist *sg;
>   		int i;
>   
> @@ -613,14 +615,21 @@ static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
>   			arch_dma_prep_coherent(sg_page(sg), sg->length);
>   	}
>   
> -	if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, prot)
> +	if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, ioprot)
>   			< size)
>   		goto out_free_sg;
>   
> -	*handle = iova;
> +	vaddr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> +			__builtin_return_address(0));
> +	if (!vaddr)
> +		goto out_unmap;
> +
> +	*dma_handle = iova;
>   	sg_free_table(&sgt);
> -	return pages;
> +	return vaddr;
>   
> +out_unmap:
> +	__iommu_dma_unmap(dev, iova, size);
>   out_free_sg:
>   	sg_free_table(&sgt);
>   out_free_iova:
> @@ -989,18 +998,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>   						    size >> PAGE_SHIFT);
>   		}
>   	} else {
> -		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
> -		struct page **pages;
> -
> -		pages = __iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> -					handle);
> -		if (!pages)
> -			return NULL;
> -
> -		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> -					      __builtin_return_address(0));
> -		if (!addr)
> -			__iommu_dma_free(dev, pages, iosize, handle);
> +		addr = iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
>   	}
>   	return addr;
>   }
> @@ -1014,7 +1012,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
>   	/*
>   	 * @cpu_addr will be one of 4 things depending on how it was allocated:
>   	 * - A remapped array of pages for contiguous allocations.
> -	 * - A remapped array of pages from __iommu_dma_alloc(), for all
> +	 * - A remapped array of pages from iommu_dma_alloc_remap(), for all
>   	 *   non-atomic allocations.
>   	 * - A non-cacheable alias from the atomic pool, for atomic
>   	 *   allocations by non-coherent devices.
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 13/26] iommu/dma: Remove __iommu_dma_free
  2019-04-22 17:59 ` [PATCH 13/26] iommu/dma: Remove __iommu_dma_free Christoph Hellwig
@ 2019-04-29 13:18   ` Robin Murphy
  0 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 13:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> We only have a single caller of this function left, so open code it there.

Heh, I even caught myself out for a moment thinking this looked 
redundant with #18 now, but no :)

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 21 ++-------------------
>   1 file changed, 2 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index b8e46e89a60a..4632b9d301a1 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -534,24 +534,6 @@ static struct page **__iommu_dma_get_pages(void *cpu_addr)
>   	return area->pages;
>   }
>   
> -/**
> - * iommu_dma_free - Free a buffer allocated by iommu_dma_alloc_remap()
> - * @dev: Device which owns this buffer
> - * @pages: Array of buffer pages as returned by __iommu_dma_alloc_remap()
> - * @size: Size of buffer in bytes
> - * @handle: DMA address of buffer
> - *
> - * Frees both the pages associated with the buffer, and the array
> - * describing them
> - */
> -static void __iommu_dma_free(struct device *dev, struct page **pages,
> -		size_t size, dma_addr_t *handle)
> -{
> -	__iommu_dma_unmap(dev, *handle, size);
> -	__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
> -	*handle = DMA_MAPPING_ERROR;
> -}
> -
>   /**
>    * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
>    * @dev: Device to allocate memory for. Must be a real device
> @@ -1034,7 +1016,8 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
>   
>   		if (WARN_ON(!pages))
>   			return;
> -		__iommu_dma_free(dev, pages, iosize, &handle);
> +		__iommu_dma_unmap(dev, handle, iosize);
> +		__iommu_dma_free_pages(pages, size >> PAGE_SHIFT);
>   		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
>   	} else {
>   		__iommu_dma_unmap(dev, handle, iosize);
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 14/26] iommu/dma: Refactor iommu_dma_free
  2019-04-22 17:59 ` [PATCH 14/26] iommu/dma: Refactor iommu_dma_free Christoph Hellwig
@ 2019-04-29 13:59   ` Robin Murphy
  2019-04-29 19:03     ` Christoph Hellwig
  0 siblings, 1 reply; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 13:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> From: Robin Murphy <robin.murphy@arm.com>
> 
> The freeing logic was made particularly horrible by part of it being
> opaque to the arch wrapper, which led to a lot of convoluted repetition
> to ensure each path did everything in the right order. Now that it's
> all private, we can pick apart and consolidate the logically-distinct
> steps of freeing the IOMMU mapping, the underlying pages, and the CPU
> remap (if necessary) into something much more manageable.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> [various cosmetic changes to the code flow]

Hmm, I do still prefer my original flow with the dma_common_free_remap() 
call right out of the way at the end rather than being a special case in 
the middle of all the page-freeing (which is the kind of existing 
complexity I was trying to eliminate). I guess you've done this to avoid 
having "if (!dma_release_from_contiguous(...))..." twice like I ended up 
with, which is fair enough I suppose - once we manage to solve the new 
dma_{alloc,free}_contiguous() interface that may tip the balance so I 
can always revisit this then.
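
For reference, roughly the shape I had in mind (untested, reconstructed from
memory, reusing the same helpers as the patch below):

static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
		dma_addr_t handle, unsigned long attrs)
{
	size_t alloc_size = PAGE_ALIGN(size);
	int count = alloc_size >> PAGE_SHIFT;
	struct page **pages;
	struct page *page;

	__iommu_dma_unmap(dev, handle, size);

	/* Non-coherent atomic allocation? Easy */
	if (dma_free_from_pool(cpu_addr, alloc_size))
		return;

	if (!is_vmalloc_addr(cpu_addr)) {
		/* Lowmem means a coherent atomic or CMA allocation */
		page = virt_to_page(cpu_addr);
		if (!dma_release_from_contiguous(dev, page, count))
			__free_pages(page, get_order(alloc_size));
		return;
	}

	/*
	 * Remapped: non-coherent, highmem CMA, or an
	 * iommu_dma_alloc_remap() construction.
	 */
	pages = __iommu_dma_get_pages(cpu_addr);
	if (pages) {
		__iommu_dma_free_pages(pages, count);
	} else {
		page = vmalloc_to_page(cpu_addr);
		if (!dma_release_from_contiguous(dev, page, count))
			__free_pages(page, get_order(alloc_size));
	}

	/* CPU unmapping right out of the way at the end */
	dma_common_free_remap(cpu_addr, alloc_size, VM_USERMAP);
}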

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 75 ++++++++++++++++++---------------------
>   1 file changed, 35 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 4632b9d301a1..9658c4cc3cfe 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -916,6 +916,41 @@ static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
>   	__iommu_dma_unmap(dev, handle, size);
>   }
>   
> +static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
> +		dma_addr_t handle, unsigned long attrs)
> +{
> +	size_t alloc_size = PAGE_ALIGN(size);
> +	int count = alloc_size >> PAGE_SHIFT;
> +	struct page *page = NULL;
> +
> +	__iommu_dma_unmap(dev, handle, size);
> +
> +	/* Non-coherent atomic allocation? Easy */
> +	if (dma_free_from_pool(cpu_addr, alloc_size))
> +		return;
> +
> +	if (is_vmalloc_addr(cpu_addr)) {
> +		/*
> +		 * If it the address is remapped, then it's either non-coherent

s/If it/If/

My "easy; more involved; most complex" narrative definitely doesn't 
translate very well to a different order :)

Otherwise,

Reluctantly-Acked-by: Robin Murphy <robin.murphy@arm.com>

> +		 * or highmem CMA, or an iommu_dma_alloc_remap() construction.
> +		 */
> +		struct page **pages = __iommu_dma_get_pages(cpu_addr);
> +
> +		if (pages)
> +			__iommu_dma_free_pages(pages, count);
> +		else
> +			page = vmalloc_to_page(cpu_addr);
> +
> +		dma_common_free_remap(cpu_addr, alloc_size, VM_USERMAP);
> +	} else {
> +		/* Lowmem means a coherent atomic or CMA allocation */
> +		page = virt_to_page(cpu_addr);
> +	}
> +
> +	if (page && !dma_release_from_contiguous(dev, page, count))
> +		__free_pages(page, get_order(alloc_size));
> +}
> +
>   static void *iommu_dma_alloc(struct device *dev, size_t size,
>   		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
>   {
> @@ -985,46 +1020,6 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>   	return addr;
>   }
>   
> -static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
> -		dma_addr_t handle, unsigned long attrs)
> -{
> -	size_t iosize = size;
> -
> -	size = PAGE_ALIGN(size);
> -	/*
> -	 * @cpu_addr will be one of 4 things depending on how it was allocated:
> -	 * - A remapped array of pages for contiguous allocations.
> -	 * - A remapped array of pages from iommu_dma_alloc_remap(), for all
> -	 *   non-atomic allocations.
> -	 * - A non-cacheable alias from the atomic pool, for atomic
> -	 *   allocations by non-coherent devices.
> -	 * - A normal lowmem address, for atomic allocations by
> -	 *   coherent devices.
> -	 * Hence how dodgy the below logic looks...
> -	 */
> -	if (dma_in_atomic_pool(cpu_addr, size)) {
> -		__iommu_dma_unmap(dev, handle, iosize);
> -		dma_free_from_pool(cpu_addr, size);
> -	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		struct page *page = vmalloc_to_page(cpu_addr);
> -
> -		__iommu_dma_unmap(dev, handle, iosize);
> -		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
> -		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
> -	} else if (is_vmalloc_addr(cpu_addr)){
> -		struct page **pages = __iommu_dma_get_pages(cpu_addr);
> -
> -		if (WARN_ON(!pages))
> -			return;
> -		__iommu_dma_unmap(dev, handle, iosize);
> -		__iommu_dma_free_pages(pages, size >> PAGE_SHIFT);
> -		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
> -	} else {
> -		__iommu_dma_unmap(dev, handle, iosize);
> -		__free_pages(virt_to_page(cpu_addr), get_order(size));
> -	}
> -}
> -
>   static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
>   			      unsigned long pfn, size_t size)
>   {
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 22/26] iommu/dma: Refactor iommu_dma_mmap
  2019-04-22 17:59 ` [PATCH 22/26] iommu/dma: Refactor iommu_dma_mmap Christoph Hellwig
@ 2019-04-29 14:04   ` Robin Murphy
  0 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 14:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> Inline __iommu_dma_mmap_pfn into the main function, and use the
> fact that __iommu_dma_get_pages returns NULL for remapped contiguous
> allocations to simplify the code flow a bit.

...and later we can squash __iommu_dma_mmap() once the dust settles on 
vm_map_pages() - seems good to me.
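
i.e. assuming vm_map_pages() lands with its currently-proposed signature
(int vm_map_pages(struct vm_area_struct *vma, struct page **pages,
unsigned long num)), the tail of iommu_dma_mmap() could eventually shrink
to something like this untested sketch:

	if (is_vmalloc_addr(cpu_addr)) {
		struct page **pages = __iommu_dma_get_pages(cpu_addr);

		/* vm_map_pages() takes care of vma->vm_pgoff itself */
		if (pages)
			return vm_map_pages(vma, pages, nr_pages);
		pfn = vmalloc_to_pfn(cpu_addr);
	} else {
		pfn = page_to_pfn(virt_to_page(cpu_addr));
	}

	return remap_pfn_range(vma, vma->vm_start, pfn + off,
			       vma->vm_end - vma->vm_start,
			       vma->vm_page_prot);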

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 36 +++++++++++-------------------------
>   1 file changed, 11 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 138b85e675c8..8fc6098c1eeb 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1025,21 +1025,12 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>   	return cpu_addr;
>   }
>   
> -static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
> -			      unsigned long pfn, size_t size)
> -{
> -	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
> -			       vma->vm_end - vma->vm_start,
> -			       vma->vm_page_prot);
> -}
> -
>   static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   		void *cpu_addr, dma_addr_t dma_addr, size_t size,
>   		unsigned long attrs)
>   {
>   	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	unsigned long off = vma->vm_pgoff;
> -	struct page **pages;
> +	unsigned long pfn, off = vma->vm_pgoff;
>   	int ret;
>   
>   	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
> @@ -1050,24 +1041,19 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
>   		return -ENXIO;
>   
> -	if (!is_vmalloc_addr(cpu_addr)) {
> -		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
> -		return __iommu_dma_mmap_pfn(vma, pfn, size);
> -	}
> +	if (is_vmalloc_addr(cpu_addr)) {
> +		struct page **pages = __iommu_dma_get_pages(cpu_addr);
>   
> -	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		/*
> -		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
> -		 * hence in the vmalloc space.
> -		 */
> -		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
> -		return __iommu_dma_mmap_pfn(vma, pfn, size);
> +		if (pages)
> +			return __iommu_dma_mmap(pages, size, vma);
> +		pfn = vmalloc_to_pfn(cpu_addr);
> +	} else {
> +		pfn = page_to_pfn(virt_to_page(cpu_addr));
>   	}
>   
> -	pages = __iommu_dma_get_pages(cpu_addr);
> -	if (WARN_ON_ONCE(!pages))
> -		return -ENXIO;
> -	return __iommu_dma_mmap(pages, size, vma);
> +	return remap_pfn_range(vma, vma->vm_start, pfn + off,
> +			       vma->vm_end - vma->vm_start,
> +			       vma->vm_page_prot);
>   }
>   
>   static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 21/26] iommu/dma: Refactor iommu_dma_get_sgtable
  2019-04-22 17:59 ` [PATCH 21/26] iommu/dma: Refactor iommu_dma_get_sgtable Christoph Hellwig
@ 2019-04-29 14:08   ` Robin Murphy
  0 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 14:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> Inline __iommu_dma_get_sgtable_page into the main function, and use the
> fact that __iommu_dma_get_pages returns NULL for remapped contiguous
> allocations to simplify the code flow a bit.

Yeah, even I was a bit dubious about the readability of "if (page)... 
else if (pages)..." that my attempt ended up with, so I don't really 
have anything to complain about here.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 45 +++++++++++++++------------------------
>   1 file changed, 17 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index acdfe866cb29..138b85e675c8 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1070,42 +1070,31 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   	return __iommu_dma_mmap(pages, size, vma);
>   }
>   
> -static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
> -		size_t size)
> -{
> -	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> -
> -	if (!ret)
> -		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
> -	return ret;
> -}
> -
>   static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
>   		void *cpu_addr, dma_addr_t dma_addr, size_t size,
>   		unsigned long attrs)
>   {
> -	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	struct page **pages;
> +	struct page *page;
> +	int ret;
>   
> -	if (!is_vmalloc_addr(cpu_addr)) {
> -		struct page *page = virt_to_page(cpu_addr);
> -		return __iommu_dma_get_sgtable_page(sgt, page, size);
> -	}
> +	if (is_vmalloc_addr(cpu_addr)) {
> +		struct page **pages = __iommu_dma_get_pages(cpu_addr);
>   
> -	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		/*
> -		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
> -		 * hence in the vmalloc space.
> -		 */
> -		struct page *page = vmalloc_to_page(cpu_addr);
> -		return __iommu_dma_get_sgtable_page(sgt, page, size);
> +		if (pages) {
> +			return sg_alloc_table_from_pages(sgt, pages,
> +					PAGE_ALIGN(size) >> PAGE_SHIFT,
> +					0, size, GFP_KERNEL);
> +		}
> +
> +		page = vmalloc_to_page(cpu_addr);
> +	} else {
> +		page = virt_to_page(cpu_addr);
>   	}
>   
> -	pages = __iommu_dma_get_pages(cpu_addr);
> -	if (WARN_ON_ONCE(!pages))
> -		return -ENXIO;
> -	return sg_alloc_table_from_pages(sgt, pages, count, 0, size,
> -					 GFP_KERNEL);
> +	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> +	if (!ret)
> +		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
> +	return ret;
>   }
>   
>   static const struct dma_map_ops iommu_dma_ops = {
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 19/26] iommu/dma: Cleanup variable naming in iommu_dma_alloc
  2019-04-22 17:59 ` [PATCH 19/26] iommu/dma: Cleanup variable naming in iommu_dma_alloc Christoph Hellwig
@ 2019-04-29 14:11   ` Robin Murphy
  0 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 14:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> From: Robin Murphy <robin.murphy@arm.com>
> 
> Most importantly clear up the size / iosize confusion.  Also rename addr
> to cpu_addr to match the surrounding code and make the intention a little
> more clear.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> [hch: split from a larger patch]

I can't bring myself to actually ack "my" patch, but I am perfectly 
happy with the split :)

Robin.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 45 +++++++++++++++++++--------------------
>   1 file changed, 22 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 95a12e975994..9b269f0792f3 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -960,64 +960,63 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>   {
>   	bool coherent = dev_is_dma_coherent(dev);
>   	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
> -	size_t iosize = size;
> +	size_t alloc_size = PAGE_ALIGN(size);
>   	struct page *page = NULL;
> -	void *addr;
> +	void *cpu_addr;
>   
> -	size = PAGE_ALIGN(size);
>   	gfp |= __GFP_ZERO;
>   
>   	if (gfpflags_allow_blocking(gfp) &&
>   	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
> -		return iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
> +		return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
>   
>   	if (!gfpflags_allow_blocking(gfp) && !coherent) {
> -		addr = dma_alloc_from_pool(size, &page, gfp);
> -		if (!addr)
> +		cpu_addr = dma_alloc_from_pool(alloc_size, &page, gfp);
> +		if (!cpu_addr)
>   			return NULL;
>   
> -		*handle = __iommu_dma_map(dev, page_to_phys(page), iosize,
> +		*handle = __iommu_dma_map(dev, page_to_phys(page), size,
>   					  ioprot);
>   		if (*handle == DMA_MAPPING_ERROR) {
> -			dma_free_from_pool(addr, size);
> +			dma_free_from_pool(cpu_addr, alloc_size);
>   			return NULL;
>   		}
> -		return addr;
> +		return cpu_addr;
>   	}
>   
>   	if (gfpflags_allow_blocking(gfp))
> -		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
> -						 get_order(size),
> +		page = dma_alloc_from_contiguous(dev, alloc_size >> PAGE_SHIFT,
> +						 get_order(alloc_size),
>   						 gfp & __GFP_NOWARN);
>   	if (!page)
> -		page = alloc_pages(gfp, get_order(size));
> +		page = alloc_pages(gfp, get_order(alloc_size));
>   	if (!page)
>   		return NULL;
>   
> -	*handle = __iommu_dma_map(dev, page_to_phys(page), iosize, ioprot);
> +	*handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot);
>   	if (*handle == DMA_MAPPING_ERROR)
>   		goto out_free_pages;
>   
>   	if (!coherent || PageHighMem(page)) {
>   		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
>   
> -		addr = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,
> -				__builtin_return_address(0));
> -		if (!addr)
> +		cpu_addr = dma_common_contiguous_remap(page, alloc_size,
> +				VM_USERMAP, prot, __builtin_return_address(0));
> +		if (!cpu_addr)
>   			goto out_unmap;
>   
>   		if (!coherent)
> -			arch_dma_prep_coherent(page, iosize);
> +			arch_dma_prep_coherent(page, size);
>   	} else {
> -		addr = page_address(page);
> +		cpu_addr = page_address(page);
>   	}
> -	memset(addr, 0, size);
> -	return addr;
> +	memset(cpu_addr, 0, alloc_size);
> +	return cpu_addr;
>   out_unmap:
> -	__iommu_dma_unmap(dev, *handle, iosize);
> +	__iommu_dma_unmap(dev, *handle, size);
>   out_free_pages:
> -	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
> -		__free_pages(page, get_order(size));
> +	if (!dma_release_from_contiguous(dev, page, alloc_size >> PAGE_SHIFT))
> +		__free_pages(page, get_order(alloc_size));
>   	return NULL;
>   }
>   
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 20/26] iommu/dma: Refactor iommu_dma_alloc, part 2
  2019-04-22 17:59 ` [PATCH 20/26] iommu/dma: Refactor iommu_dma_alloc, part 2 Christoph Hellwig
@ 2019-04-29 14:45   ` Robin Murphy
  0 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 14:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> From: Robin Murphy <robin.murphy@arm.com>

Honestly I don't think there's anything left of my patch here...

> Apart from the iommu_dma_alloc_remap() case which remains sufficiently
> different that it's better off being self-contained, the rest of the
> logic can now be consolidated into a single flow which separates the
> logically-distinct steps of allocating pages, getting the CPU address,
> and finally getting the IOMMU address.

...and it certainly doesn't do that any more.

It's clear that we have fundamentally different ways of reading code, so 
I don't think it's productive to keep arguing personal preference - I 
still find the end result here a fair bit more tolerable than before, so 
if you update the commit message to reflect the actual change (at which 
point there's really nothing left of my authorship) I can live with it.

Robin.

> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> [hch: split the page allocation into a new helper to simplify the flow]
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++------------------
>   1 file changed, 35 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 9b269f0792f3..acdfe866cb29 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -955,35 +955,14 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
>   	__iommu_dma_free(dev, size, cpu_addr);
>   }
>   
> -static void *iommu_dma_alloc(struct device *dev, size_t size,
> -		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
> +static void *iommu_dma_alloc_pages(struct device *dev, size_t size,
> +		struct page **pagep, gfp_t gfp, unsigned long attrs)
>   {
>   	bool coherent = dev_is_dma_coherent(dev);
> -	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
>   	size_t alloc_size = PAGE_ALIGN(size);
>   	struct page *page = NULL;
>   	void *cpu_addr;
>   
> -	gfp |= __GFP_ZERO;
> -
> -	if (gfpflags_allow_blocking(gfp) &&
> -	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
> -		return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
> -
> -	if (!gfpflags_allow_blocking(gfp) && !coherent) {
> -		cpu_addr = dma_alloc_from_pool(alloc_size, &page, gfp);
> -		if (!cpu_addr)
> -			return NULL;
> -
> -		*handle = __iommu_dma_map(dev, page_to_phys(page), size,
> -					  ioprot);
> -		if (*handle == DMA_MAPPING_ERROR) {
> -			dma_free_from_pool(cpu_addr, alloc_size);
> -			return NULL;
> -		}
> -		return cpu_addr;
> -	}
> -
>   	if (gfpflags_allow_blocking(gfp))
>   		page = dma_alloc_from_contiguous(dev, alloc_size >> PAGE_SHIFT,
>   						 get_order(alloc_size),
> @@ -993,33 +972,59 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>   	if (!page)
>   		return NULL;
>   
> -	*handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot);
> -	if (*handle == DMA_MAPPING_ERROR)
> -		goto out_free_pages;
> -
>   	if (!coherent || PageHighMem(page)) {
>   		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
>   
>   		cpu_addr = dma_common_contiguous_remap(page, alloc_size,
>   				VM_USERMAP, prot, __builtin_return_address(0));
>   		if (!cpu_addr)
> -			goto out_unmap;
> +			goto out_free_pages;
>   
>   		if (!coherent)
>   			arch_dma_prep_coherent(page, size);
>   	} else {
>   		cpu_addr = page_address(page);
>   	}
> +
> +	*pagep = page;
>   	memset(cpu_addr, 0, alloc_size);
>   	return cpu_addr;
> -out_unmap:
> -	__iommu_dma_unmap(dev, *handle, size);
>   out_free_pages:
>   	if (!dma_release_from_contiguous(dev, page, alloc_size >> PAGE_SHIFT))
>   		__free_pages(page, get_order(alloc_size));
>   	return NULL;
>   }
>   
> +static void *iommu_dma_alloc(struct device *dev, size_t size,
> +		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
> +{
> +	bool coherent = dev_is_dma_coherent(dev);
> +	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
> +	struct page *page = NULL;
> +	void *cpu_addr;
> +
> +	gfp |= __GFP_ZERO;
> +
> +	if (gfpflags_allow_blocking(gfp) &&
> +	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
> +		return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
> +
> +	if (!gfpflags_allow_blocking(gfp) && !coherent)
> +		cpu_addr = dma_alloc_from_pool(PAGE_ALIGN(size), &page, gfp);
> +	else
> +		cpu_addr = iommu_dma_alloc_pages(dev, size, &page, gfp, attrs);
> +	if (!cpu_addr)
> +		return NULL;
> +
> +	*handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot);
> +	if (*handle == DMA_MAPPING_ERROR) {
> +		__iommu_dma_free(dev, size, cpu_addr);
> +		return NULL;
> +	}
> +
> +	return cpu_addr;
> +}
> +
>   static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
>   			      unsigned long pfn, size_t size)
>   {
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 23/26] iommu/dma: Don't depend on CONFIG_DMA_DIRECT_REMAP
  2019-04-22 17:59 ` [PATCH 23/26] iommu/dma: Don't depend on CONFIG_DMA_DIRECT_REMAP Christoph Hellwig
@ 2019-04-29 14:46   ` Robin Murphy
  0 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 14:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> For entirely dma coherent architectures there is no requirement to ever
> remap dma coherent allocations.  Move all the remap and pool code under
> IS_ENABLED() checks and drop the Kconfig dependency.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/Kconfig     |  1 -
>   drivers/iommu/dma-iommu.c | 16 +++++++++-------
>   2 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index bdc14baf2ee5..6f07f3b21816 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -95,7 +95,6 @@ config IOMMU_DMA
>   	select IOMMU_API
>   	select IOMMU_IOVA
>   	select NEED_SG_DMA_LENGTH
> -	depends on DMA_DIRECT_REMAP
>   
>   config FSL_PAMU
>   	bool "Freescale IOMMU support"
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 8fc6098c1eeb..278a9a960107 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -923,10 +923,11 @@ static void __iommu_dma_free(struct device *dev, size_t size, void *cpu_addr)
>   	struct page *page = NULL;
>   
>   	/* Non-coherent atomic allocation? Easy */
> -	if (dma_free_from_pool(cpu_addr, alloc_size))
> +	if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
> +	    dma_free_from_pool(cpu_addr, alloc_size))
>   		return;
>   
> -	if (is_vmalloc_addr(cpu_addr)) {
> +	if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
>   		/*
>   		 * If it the address is remapped, then it's either non-coherent
>   		 * or highmem CMA, or an iommu_dma_alloc_remap() construction.
> @@ -972,7 +973,7 @@ static void *iommu_dma_alloc_pages(struct device *dev, size_t size,
>   	if (!page)
>   		return NULL;
>   
> -	if (!coherent || PageHighMem(page)) {
> +	if (IS_ENABLED(CONFIG_DMA_REMAP) && (!coherent || PageHighMem(page))) {
>   		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
>   
>   		cpu_addr = dma_common_contiguous_remap(page, alloc_size,
> @@ -1005,11 +1006,12 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>   
>   	gfp |= __GFP_ZERO;
>   
> -	if (gfpflags_allow_blocking(gfp) &&
> +	if (IS_ENABLED(CONFIG_DMA_REMAP) && gfpflags_allow_blocking(gfp) &&
>   	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
>   		return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
>   
> -	if (!gfpflags_allow_blocking(gfp) && !coherent)
> +	if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
> +	    !gfpflags_allow_blocking(gfp) && !coherent)
>   		cpu_addr = dma_alloc_from_pool(PAGE_ALIGN(size), &page, gfp);
>   	else
>   		cpu_addr = iommu_dma_alloc_pages(dev, size, &page, gfp, attrs);
> @@ -1041,7 +1043,7 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   	if (off >= nr_pages || vma_pages(vma) > nr_pages - off)
>   		return -ENXIO;
>   
> -	if (is_vmalloc_addr(cpu_addr)) {
> +	if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
>   		struct page **pages = __iommu_dma_get_pages(cpu_addr);
>   
>   		if (pages)
> @@ -1063,7 +1065,7 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
>   	struct page *page;
>   	int ret;
>   
> -	if (is_vmalloc_addr(cpu_addr)) {
> +	if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
>   		struct page **pages = __iommu_dma_get_pages(cpu_addr);
>   
>   		if (pages) {
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 26/26] arm64: trim includes in dma-mapping.c
  2019-04-22 17:59 ` [PATCH 26/26] arm64: trim includes " Christoph Hellwig
@ 2019-04-29 15:00   ` Robin Murphy
  0 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 15:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> With most of the previous functionality now elsewhere a lot of the
> headers included in this file are not needed.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/arm64/mm/dma-mapping.c | 11 -----------
>   1 file changed, 11 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 184ef9ccd69d..15bd768ceb7e 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -5,20 +5,9 @@
>    */
>   
>   #include <linux/gfp.h>
> -#include <linux/acpi.h>
> -#include <linux/memblock.h>
>   #include <linux/cache.h>
> -#include <linux/export.h>
> -#include <linux/slab.h>
> -#include <linux/genalloc.h>
> -#include <linux/dma-direct.h>
>   #include <linux/dma-noncoherent.h>
> -#include <linux/dma-contiguous.h>
>   #include <linux/dma-iommu.h>
> -#include <linux/vmalloc.h>
> -#include <linux/swiotlb.h>
> -#include <linux/pci.h>
> -

Nit: please keep the blank line between linux/ and asm/ include blocks 
to match the predominant local style.
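
i.e. keep the trimmed block as:

#include <linux/gfp.h>
#include <linux/cache.h>
#include <linux/dma-noncoherent.h>
#include <linux/dma-iommu.h>

#include <asm/cacheflush.h>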

With that,

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

>   #include <asm/cacheflush.h>
>   
>   pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: implement generic dma_map_ops for IOMMUs v3
  2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
                   ` (25 preceding siblings ...)
  2019-04-22 17:59 ` [PATCH 26/26] arm64: trim includes " Christoph Hellwig
@ 2019-04-29 15:03 ` Robin Murphy
  26 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-29 15:03 UTC (permalink / raw)
  To: Christoph Hellwig, Catalin Marinas, Will Deacon
  Cc: Joerg Roedel, Tom Lendacky, iommu, linux-arm-kernel, linux-kernel

On 22/04/2019 18:59, Christoph Hellwig wrote:
> Hi Robin,
> 
> please take a look at this series, which implements a completely generic
> set of dma_map_ops for IOMMU drivers.  This is done by taking the
> existing arm64 code, moving it to drivers/iommu and then massaging it
> so that it can also work for architectures with DMA remapping.  This
> should help future ports to support IOMMUs more easily, and also allow
> to remove various custom IOMMU dma_map_ops implementations, like Tom
> was planning to for the AMD one.

Modulo a few nits, I think this looks pretty much good to go, and it 
would certainly be good to get as much as reasonable queued soon for the 
sake of progress elsewhere.

Catalin, Will, I think the arm64 parts are all OK but please take a look 
to confirm.

Thanks,
Robin.

> 
> A git tree is also available at:
> 
>      git://git.infradead.org/users/hch/misc.git dma-iommu-ops.3
> 
> Gitweb:
> 
>      http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-iommu-ops.3
> 
> Changes since v2:
>   - address various review comments and include patches from Robin
> 
> Changes since v1:
>   - only include other headers in dma-iommu.h if CONFIG_DMA_IOMMU is enabled
>   - keep using a scatterlist in iommu_dma_alloc
>   - split out mmap/sgtable fixes and move them early in the series
>   - updated a few commit logs
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/26] arm64/iommu: improve mmap bounds checking
  2019-04-29 12:35   ` Robin Murphy
@ 2019-04-29 19:01     ` Christoph Hellwig
  2019-04-30 11:38       ` Robin Murphy
  0 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-29 19:01 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Joerg Roedel, Catalin Marinas, Will Deacon,
	Tom Lendacky, iommu, linux-arm-kernel, linux-kernel

On Mon, Apr 29, 2019 at 01:35:46PM +0100, Robin Murphy wrote:
> On 22/04/2019 18:59, Christoph Hellwig wrote:
>> The nr_pages checks should be done for all mmap requests, not just those
>> using remap_pfn_range.
>
> I think it probably makes sense now to just squash this with #22 one way or 
> the other, but if you really really still want to keep it as a separate 
> patch with a misleading commit message then I'm willing to keep my 
> complaints to myself :)

Well, I split this out in response to your earlier comments, so if you
prefer it squashed back in I can do that.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 14/26] iommu/dma: Refactor iommu_dma_free
  2019-04-29 13:59   ` Robin Murphy
@ 2019-04-29 19:03     ` Christoph Hellwig
  2019-04-29 19:16       ` Christoph Hellwig
  0 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-29 19:03 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Joerg Roedel, Catalin Marinas, Will Deacon,
	Tom Lendacky, iommu, linux-arm-kernel, linux-kernel

On Mon, Apr 29, 2019 at 02:59:43PM +0100, Robin Murphy wrote:
> Hmm, I do still prefer my original flow with the dma_common_free_remap() 
> call right out of the way at the end rather than being a special case in 
> the middle of all the page-freeing (which is the kind of existing 
> complexity I was trying to eliminate). I guess you've done this to avoid 
> having "if (!dma_release_from_contiguous(...))..." twice like I ended up 
> with, which is fair enough I suppose - once we manage to solve the new 
> dma_{alloc,free}_contiguous() interface that may tip the balance so I can 
> always revisit this then.

Ok, I'll try to accommodate that with a minor rework.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/26] iommu/dma: Factor out remapped pages lookup
  2019-04-29 13:05   ` Robin Murphy
@ 2019-04-29 19:10     ` Christoph Hellwig
  0 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-29 19:10 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Joerg Roedel, Catalin Marinas, Will Deacon,
	Tom Lendacky, iommu, linux-arm-kernel, linux-kernel

On Mon, Apr 29, 2019 at 02:05:46PM +0100, Robin Murphy wrote:
> On 22/04/2019 18:59, Christoph Hellwig wrote:
>> From: Robin Murphy <robin.murphy@arm.com>
>>
>> Since we duplicate the find_vm_area() logic a few times in places where
>> we only care about the pages, factor out a helper to abstract it.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> [hch: don't warn when not finding a region, as we'll rely on that later]
>
> Yeah, I did think about that and the things which it might make a little 
> easier, but preserved it as-is for the sake of keeping my modifications 
> quick and simple. TBH I'm now feeling more inclined to drop the WARNs 
> entirely at this point, since it's not like there's ever been any general 
> guarantee that freeing the wrong thing shouldn't just crash, but that's 
> something we can easily come back to later if need be.

Ok, I've dropped the warnings.
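
With the WARN gone the lookup is about as simple as it gets, roughly:

static struct page **__iommu_dma_get_pages(void *cpu_addr)
{
	struct vm_struct *area = find_vm_area(cpu_addr);

	/* callers treat NULL as "not one of our remapped allocations" */
	if (!area || !area->pages)
		return NULL;
	return area->pages;
}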

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 14/26] iommu/dma: Refactor iommu_dma_free
  2019-04-29 19:03     ` Christoph Hellwig
@ 2019-04-29 19:16       ` Christoph Hellwig
  0 siblings, 0 replies; 48+ messages in thread
From: Christoph Hellwig @ 2019-04-29 19:16 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Joerg Roedel, Catalin Marinas, Will Deacon,
	Tom Lendacky, iommu, linux-arm-kernel, linux-kernel

On Mon, Apr 29, 2019 at 09:03:48PM +0200, Christoph Hellwig wrote:
> On Mon, Apr 29, 2019 at 02:59:43PM +0100, Robin Murphy wrote:
> > Hmm, I do still prefer my original flow with the dma_common_free_remap() 
> > call right out of the way at the end rather than being a special case in 
> > the middle of all the page-freeing (which is the kind of existing 
> > complexity I was trying to eliminate). I guess you've done this to avoid 
> > having "if (!dma_release_from_contiguous(...))..." twice like I ended up 
> > with, which is fair enough I suppose - once we manage to solve the new 
> > dma_{alloc,free}_contiguous() interface that may tip the balance so I can 
> > always revisit this then.
> 
> Ok, I'll try to accomodate that with a minor rework.

Does this look reasonable?

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 5b2a2bf44078..f884d22b1388 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -921,7 +921,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 {
 	size_t alloc_size = PAGE_ALIGN(size);
 	int count = alloc_size >> PAGE_SHIFT;
-	struct page *page = NULL;
+	struct page *page = NULL, **pages = NULL;
 
 	__iommu_dma_unmap(dev, handle, size);
 
@@ -934,19 +934,17 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		 * If it the address is remapped, then it's either non-coherent
 		 * or highmem CMA, or an iommu_dma_alloc_remap() construction.
 		 */
-		struct page **pages = __iommu_dma_get_pages(cpu_addr);
-
-		if (pages)
-			__iommu_dma_free_pages(pages, count);
-		else
+		pages = __iommu_dma_get_pages(cpu_addr);
+		if (!pages)
 			page = vmalloc_to_page(cpu_addr);
-
 		dma_common_free_remap(cpu_addr, alloc_size, VM_USERMAP);
 	} else {
 		/* Lowmem means a coherent atomic or CMA allocation */
 		page = virt_to_page(cpu_addr);
 	}
 
+	if (pages)
+		__iommu_dma_free_pages(pages, count);
 	if (page && !dma_release_from_contiguous(dev, page, count))
 		__free_pages(page, get_order(alloc_size));
 }

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/26] arm64/iommu: improve mmap bounds checking
  2019-04-29 19:01     ` Christoph Hellwig
@ 2019-04-30 11:38       ` Robin Murphy
  0 siblings, 0 replies; 48+ messages in thread
From: Robin Murphy @ 2019-04-30 11:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel

On 29/04/2019 20:01, Christoph Hellwig wrote:
> On Mon, Apr 29, 2019 at 01:35:46PM +0100, Robin Murphy wrote:
>> On 22/04/2019 18:59, Christoph Hellwig wrote:
>>> The nr_pages checks should be done for all mmap requests, not just those
>>> using remap_pfn_range.
>>
>> I think it probably makes sense now to just squash this with #22 one way or
>> the other, but if you really really still want to keep it as a separate
>> patch with a misleading commit message then I'm willing to keep my
>> complaints to myself :)
> 
> Well, I split this out in response to your earlier comments, so if you
> prefer it squasheѕ back in I can do that..

AFAICS I only ever suggested splitting the original "fix and refactor" 
commit into the fix (patch #1) and the refactor - I think we've just 
ended up adding more "refactor" on top in the evolution of the series :)

Robin.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code
  2019-04-29 12:56   ` Robin Murphy
@ 2019-06-03 19:47     ` Jon Hunter
  2019-06-04  6:05       ` Christoph Hellwig
  0 siblings, 1 reply; 48+ messages in thread
From: Jon Hunter @ 2019-06-03 19:47 UTC (permalink / raw)
  To: Robin Murphy, Christoph Hellwig
  Cc: Joerg Roedel, Catalin Marinas, Will Deacon, Tom Lendacky, iommu,
	linux-arm-kernel, linux-kernel, linux-tegra


On 29/04/2019 13:56, Robin Murphy wrote:
> On 22/04/2019 18:59, Christoph Hellwig wrote:
>> There is nothing really arm64 specific in the iommu_dma_ops
>> implementation, so move it to dma-iommu.c and keep a lot of symbols
>> self-contained.  Note the implementation does depend on the
>> DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
>> DMA_IOMMU support depend on it, but this will be relaxed soon.
> 
> Nothing looks objectionable, and boot testing with this much of the
> series merged has my coherent and non-coherent IOMMU-backed devices
> appearing to still work OK, so:
> 
> Acked-by: Robin Murphy <robin.murphy@arm.com>

Since next-20190529 one of our tests for MMC has started failing, where
the symptom is that the data written to the MMC does not match the
source. Bisecting this is pointing to this commit. Unfortunately, I am
not able to cleanly revert this on top of -next, but wanted to report
this in case you have any ideas.

Cheers
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code
  2019-06-03 19:47     ` Jon Hunter
@ 2019-06-04  6:05       ` Christoph Hellwig
  2019-06-04 11:35         ` Jon Hunter
  0 siblings, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2019-06-04  6:05 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Robin Murphy, Christoph Hellwig, Joerg Roedel, Catalin Marinas,
	Will Deacon, Tom Lendacky, iommu, linux-arm-kernel, linux-kernel,
	linux-tegra

On Mon, Jun 03, 2019 at 08:47:57PM +0100, Jon Hunter wrote:
> Since next-20190529 one of our tests for MMC has started failing, where
> the symptom is that the data written to the MMC does not match the
> source. Bisecting this is pointing to this commit. Unfortunately, I am
> not able to cleanly revert this on top of -next, but wanted to report
> this in case you have any ideas.

Does this fix your problem?

https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=generic-dma-ops&id=1b961423158caaae49d3900b7c9c37477bbfa9b3

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code
  2019-06-04  6:05       ` Christoph Hellwig
@ 2019-06-04 11:35         ` Jon Hunter
  0 siblings, 0 replies; 48+ messages in thread
From: Jon Hunter @ 2019-06-04 11:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Robin Murphy, Joerg Roedel, Catalin Marinas, Will Deacon,
	Tom Lendacky, iommu, linux-arm-kernel, linux-kernel, linux-tegra


On 04/06/2019 07:05, Christoph Hellwig wrote:
> On Mon, Jun 03, 2019 at 08:47:57PM +0100, Jon Hunter wrote:
>> Since next-20190529 one of our tests for MMC has started failing, where
>> the symptom is that the data written to the MMC does not match the
>> source. Bisecting this is pointing to this commit. Unfortunately, I am
>> not able to cleanly revert this on top of -next, but wanted to report
>> this in case you have any ideas.
> 
> Does this fix your problem?
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=generic-dma-ops&id=1b961423158caaae49d3900b7c9c37477bbfa9b3

Yes I can confirm with this patch on today's -next the issue is no
longer seen, and reverting this patch on top of today's -next causes the
problem to occur again. So yes this fixes my problem.

Thanks!
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2019-06-04 11:35 UTC | newest]

Thread overview: 48+ messages
2019-04-22 17:59 implement generic dma_map_ops for IOMMUs v3 Christoph Hellwig
2019-04-22 17:59 ` [PATCH 01/26] arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable Christoph Hellwig
2019-04-22 17:59 ` [PATCH 02/26] arm64/iommu: improve mmap bounds checking Christoph Hellwig
2019-04-29 12:35   ` Robin Murphy
2019-04-29 19:01     ` Christoph Hellwig
2019-04-30 11:38       ` Robin Murphy
2019-04-22 17:59 ` [PATCH 03/26] dma-mapping: add a Kconfig symbol to indicated arch_dma_prep_coherent presence Christoph Hellwig
2019-04-22 17:59 ` [PATCH 04/26] iommu/dma: Cleanup dma-iommu.h Christoph Hellwig
2019-04-22 17:59 ` [PATCH 05/26] iommu/dma: Remove the flush_page callback Christoph Hellwig
2019-04-22 17:59 ` [PATCH 06/26] iommu/dma: Use for_each_sg in iommu_dma_alloc Christoph Hellwig
2019-04-22 17:59 ` [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code Christoph Hellwig
2019-04-29 12:56   ` Robin Murphy
2019-06-03 19:47     ` Jon Hunter
2019-06-04  6:05       ` Christoph Hellwig
2019-06-04 11:35         ` Jon Hunter
2019-04-22 17:59 ` [PATCH 08/26] iommu/dma: Move __iommu_dma_map Christoph Hellwig
2019-04-22 17:59 ` [PATCH 09/26] iommu/dma: Move domain lookup into __iommu_dma_{map,unmap} Christoph Hellwig
2019-04-22 17:59 ` [PATCH 10/26] iommu/dma: Squash __iommu_dma_{map,unmap}_page helpers Christoph Hellwig
2019-04-22 17:59 ` [PATCH 11/26] iommu/dma: Factor out remapped pages lookup Christoph Hellwig
2019-04-29 13:05   ` Robin Murphy
2019-04-29 19:10     ` Christoph Hellwig
2019-04-22 17:59 ` [PATCH 12/26] iommu/dma: Refactor the page array remapping allocator Christoph Hellwig
2019-04-29 13:10   ` Robin Murphy
2019-04-22 17:59 ` [PATCH 13/26] iommu/dma: Remove __iommu_dma_free Christoph Hellwig
2019-04-29 13:18   ` Robin Murphy
2019-04-22 17:59 ` [PATCH 14/26] iommu/dma: Refactor iommu_dma_free Christoph Hellwig
2019-04-29 13:59   ` Robin Murphy
2019-04-29 19:03     ` Christoph Hellwig
2019-04-29 19:16       ` Christoph Hellwig
2019-04-22 17:59 ` [PATCH 15/26] iommu/dma: Refactor iommu_dma_alloc Christoph Hellwig
2019-04-22 17:59 ` [PATCH 16/26] iommu/dma: Don't remap CMA unnecessarily Christoph Hellwig
2019-04-22 17:59 ` [PATCH 17/26] iommu/dma: Merge the CMA and alloc_pages allocation paths Christoph Hellwig
2019-04-22 17:59 ` [PATCH 18/26] iommu/dma: Split iommu_dma_free Christoph Hellwig
2019-04-22 17:59 ` [PATCH 19/26] iommu/dma: Cleanup variable naming in iommu_dma_alloc Christoph Hellwig
2019-04-29 14:11   ` Robin Murphy
2019-04-22 17:59 ` [PATCH 20/26] iommu/dma: Refactor iommu_dma_alloc, part 2 Christoph Hellwig
2019-04-29 14:45   ` Robin Murphy
2019-04-22 17:59 ` [PATCH 21/26] iommu/dma: Refactor iommu_dma_get_sgtable Christoph Hellwig
2019-04-29 14:08   ` Robin Murphy
2019-04-22 17:59 ` [PATCH 22/26] iommu/dma: Refactor iommu_dma_mmap Christoph Hellwig
2019-04-29 14:04   ` Robin Murphy
2019-04-22 17:59 ` [PATCH 23/26] iommu/dma: Don't depend on CONFIG_DMA_DIRECT_REMAP Christoph Hellwig
2019-04-29 14:46   ` Robin Murphy
2019-04-22 17:59 ` [PATCH 24/26] iommu/dma: Switch copyright boilerplace to SPDX Christoph Hellwig
2019-04-22 17:59 ` [PATCH 25/26] arm64: switch copyright boilerplace to SPDX in dma-mapping.c Christoph Hellwig
2019-04-22 17:59 ` [PATCH 26/26] arm64: trim includes " Christoph Hellwig
2019-04-29 15:00   ` Robin Murphy
2019-04-29 15:03 ` implement generic dma_map_ops for IOMMUs v3 Robin Murphy
