linux-arm-kernel.lists.infradead.org archive mirror
* implement generic dma_map_ops for IOMMUs
@ 2019-01-14  9:41 Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 01/19] dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence Christoph Hellwig
                   ` (19 more replies)
  0 siblings, 20 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

Hi Robin,

please take a look at this series, which implements a completely generic
set of dma_map_ops for IOMMU drivers.  This is done by taking the
existing arm64 code, moving it to drivers/iommu, and then massaging it
so that it can also work for architectures with DMA remapping.  This
should help future ports support IOMMUs more easily, and also allow
removing the various custom IOMMU dma_map_ops implementations, like the
AMD one Tom was planning to convert.
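
For illustration, the end state this series converges on (sketched from
patch 5 of the series itself, so nothing beyond what the patches add)
has the architecture simply hand the device off to the common code:

void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
			const struct iommu_ops *iommu, bool coherent)
{
	dev->dma_coherent = coherent;
	/* the generic dma-iommu code installs iommu_dma_ops itself */
	if (iommu)
		iommu_setup_dma_ops(dev, dma_base, size, iommu);
}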

A git tree is also available at:

    git://git.infradead.org/users/hch/misc.git dma-iommu-ops

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-iommu-ops


* [PATCH 01/19] dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-02-01 14:22   ` Robin Murphy
  2019-01-14  9:41 ` [PATCH 02/19] dma-iommu: cleanup dma-iommu.h Christoph Hellwig
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

Add a Kconfig symbol that indicates an architecture provides an
arch_dma_prep_coherent implementation, and provide a stub otherwise.

This will allow the generic dma-iommu code to use it while still
allowing the code to be built for cache-coherent architectures.
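
To illustrate the pattern (a sketch using arm64 as the reference; the
hook shown is arm64's existing implementation, not new code in this
patch): an architecture selects the new symbol and provides the hook,
while every other architecture transparently gets the empty stub:

/* arch/<arch>/Kconfig: select ARCH_HAS_DMA_PREP_COHERENT */

/* arm64's existing hook, for reference: */
void arch_dma_prep_coherent(struct page *page, size_t size)
{
	__dma_flush_area(page_address(page), size);
}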

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/Kconfig              | 1 +
 arch/csky/Kconfig               | 1 +
 include/linux/dma-noncoherent.h | 6 ++++++
 kernel/dma/Kconfig              | 3 +++
 4 files changed, 11 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a4168d366127..ae3f581a9bcc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -13,6 +13,7 @@ config ARM64
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_DMA_COHERENT_TO_PFN
 	select ARCH_HAS_DMA_MMAP_PGPROT
+	select ARCH_HAS_DMA_PREP_COHERENT
 	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FAST_MULTIPLIER
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index 398113c845f5..8b84d4362ff6 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -1,5 +1,6 @@
 config CSKY
 	def_bool y
+	select ARCH_HAS_DMA_PREP_COHERENT
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select ARCH_USE_BUILTIN_BSWAP
diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h
index 69b36ed31a99..9741767e400f 100644
--- a/include/linux/dma-noncoherent.h
+++ b/include/linux/dma-noncoherent.h
@@ -72,6 +72,12 @@ static inline void arch_sync_dma_for_cpu_all(struct device *dev)
 }
 #endif /* CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL */
 
+#ifdef CONFIG_ARCH_HAS_DMA_PREP_COHERENT
 void arch_dma_prep_coherent(struct page *page, size_t size);
+#else
+static inline void arch_dma_prep_coherent(struct page *page, size_t size)
+{
+}
+#endif /* CONFIG_ARCH_HAS_DMA_PREP_COHERENT */
 
 #endif /* _LINUX_DMA_NONCOHERENT_H */
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index ca88b867e7fe..541128a32c5d 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -29,6 +29,9 @@ config ARCH_HAS_SYNC_DMA_FOR_CPU
 config ARCH_HAS_SYNC_DMA_FOR_CPU_ALL
 	bool
 
+config ARCH_HAS_DMA_PREP_COHERENT
+	bool
+
 config ARCH_HAS_DMA_COHERENT_TO_PFN
 	bool
 
-- 
2.20.1



* [PATCH 02/19] dma-iommu: cleanup dma-iommu.h
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 01/19] dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-02-01 14:47   ` Robin Murphy
  2019-01-14  9:41 ` [PATCH 03/19] dma-iommu: don't use a scatterlist in iommu_dma_alloc Christoph Hellwig
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

There is no need for a __KERNEL__ guard outside uapi headers.  Make sure
we pull in the includes unconditionally so users can rely on them, and
add a missing comment describing the #else cpp statement.  Last but not
least, include <linux/errno.h> instead of the asm version, which is
frowned upon.
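
For reference, the cleaned-up header boils down to the usual shape
(abbreviated sketch, not a verbatim excerpt):

#ifndef __DMA_IOMMU_H
#define __DMA_IOMMU_H

#include <linux/errno.h>
#include <linux/dma-mapping.h>
#include <linux/iommu.h>
#include <linux/msi.h>
#include <linux/types.h>

#ifdef CONFIG_IOMMU_DMA
/* real prototypes */
#else /* CONFIG_IOMMU_DMA */
/* inline stubs */
#endif /* CONFIG_IOMMU_DMA */
#endif /* __DMA_IOMMU_H */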

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/dma-iommu.h | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index e760dc5d1fa8..65aa888c2768 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -16,15 +16,13 @@
 #ifndef __DMA_IOMMU_H
 #define __DMA_IOMMU_H
 
-#ifdef __KERNEL__
-#include <linux/types.h>
-#include <asm/errno.h>
-
-#ifdef CONFIG_IOMMU_DMA
+#include <linux/errno.h>
 #include <linux/dma-mapping.h>
 #include <linux/iommu.h>
 #include <linux/msi.h>
+#include <linux/types.h>
 
+#ifdef CONFIG_IOMMU_DMA
 int iommu_dma_init(void);
 
 /* Domain management interface for IOMMU drivers */
@@ -74,11 +72,7 @@ void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
 void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
-#else
-
-struct iommu_domain;
-struct msi_msg;
-struct device;
+#else /* CONFIG_IOMMU_DMA */
 
 static inline int iommu_dma_init(void)
 {
@@ -108,5 +102,4 @@ static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_he
 }
 
 #endif	/* CONFIG_IOMMU_DMA */
-#endif	/* __KERNEL__ */
 #endif	/* __DMA_IOMMU_H */
-- 
2.20.1



* [PATCH 03/19] dma-iommu: don't use a scatterlist in iommu_dma_alloc
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 01/19] dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 02/19] dma-iommu: cleanup dma-iommu.h Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-02-01 15:24   ` Robin Murphy
  2019-01-14  9:41 ` [PATCH 04/19] dma-iommu: remove the flush_page callback Christoph Hellwig
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

Directly iterating over the pages makes the code a bit simpler and
prepares for the following changes.
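
In short, the transformation is (a sketch, relying on the fact that
this allocator always maps pages[] into one contiguous IOVA range):

/*
 *	iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, prot);
 *
 * becomes
 *
 *	for each i:
 *		iommu_map(domain, iova + i * PAGE_SIZE,
 *			  page_to_phys(pages[i]), PAGE_SIZE, prot);
 *
 * with iommu_unmap(domain, iova, mapped) unwinding any partial mapping
 * on failure, and kmap_atomic() standing in for the sg_mapping_iter
 * when the pages need flushing.
 */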

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 40 +++++++++++++++++----------------------
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d19f3d6b43c1..4f5546a103d8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -30,6 +30,7 @@
 #include <linux/mm.h>
 #include <linux/pci.h>
 #include <linux/scatterlist.h>
+#include <linux/highmem.h>
 #include <linux/vmalloc.h>
 
 struct iommu_dma_msi_page {
@@ -549,9 +550,9 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
 	struct page **pages;
-	struct sg_table sgt;
 	dma_addr_t iova;
-	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
+	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap, i;
+	size_t mapped = 0;
 
 	*handle = DMA_MAPPING_ERROR;
 
@@ -576,32 +577,25 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 	if (!iova)
 		goto out_free_pages;
 
-	if (sg_alloc_table_from_pages(&sgt, pages, count, 0, size, GFP_KERNEL))
-		goto out_free_iova;
+	for (i = 0; i < count; i++) {
+		phys_addr_t phys = page_to_phys(pages[i]);
 
-	if (!(prot & IOMMU_CACHE)) {
-		struct sg_mapping_iter miter;
-		/*
-		 * The CPU-centric flushing implied by SG_MITER_TO_SG isn't
-		 * sufficient here, so skip it by using the "wrong" direction.
-		 */
-		sg_miter_start(&miter, sgt.sgl, sgt.orig_nents, SG_MITER_FROM_SG);
-		while (sg_miter_next(&miter))
-			flush_page(dev, miter.addr, page_to_phys(miter.page));
-		sg_miter_stop(&miter);
-	}
+		if (!(prot & IOMMU_CACHE)) {
+			void *vaddr = kmap_atomic(pages[i]);
 
-	if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, prot)
-			< size)
-		goto out_free_sg;
+			flush_page(dev, vaddr, phys);
+			kunmap_atomic(vaddr);
+		}
+
+		if (iommu_map(domain, iova + mapped, phys, PAGE_SIZE, prot))
+			goto out_unmap;
+		mapped += PAGE_SIZE;
+	}
 
 	*handle = iova;
-	sg_free_table(&sgt);
 	return pages;
-
-out_free_sg:
-	sg_free_table(&sgt);
-out_free_iova:
+out_unmap:
+	iommu_unmap(domain, iova, mapped);
 	iommu_dma_free_iova(cookie, iova, size);
 out_free_pages:
 	__iommu_dma_free_pages(pages, count);
-- 
2.20.1



* [PATCH 04/19] dma-iommu: remove the flush_page callback
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (2 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 03/19] dma-iommu: don't use a scatterlist in iommu_dma_alloc Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-02-01 15:28   ` Robin Murphy
  2019-01-14  9:41 ` [PATCH 05/19] dma-iommu: move the arm64 wrappers to common code Christoph Hellwig
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

We now have an arch_dma_prep_coherent architecture hook that is used
by the generic DMA remap allocator, and we should use the same
interface for the dma-iommu code.
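
On arm64 the two are equivalent when called one page at a time, as
comparing the callback removed below with the existing hook shows
(both reproduced here for reference only):

/* removed below: */
static void flush_page(struct device *dev, const void *virt, phys_addr_t phys)
{
	__dma_flush_area(virt, PAGE_SIZE);
}

/* the hook arm64 already provides: */
void arch_dma_prep_coherent(struct page *page, size_t size)
{
	__dma_flush_area(page_address(page), size);
}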

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/mm/dma-mapping.c |  8 +-------
 drivers/iommu/dma-iommu.c   | 14 ++++----------
 include/linux/dma-iommu.h   |  3 +--
 3 files changed, 6 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index fb0908456a1f..75fe7273a1e4 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -104,12 +104,6 @@ arch_initcall(arm64_dma_init);
 #include <linux/platform_device.h>
 #include <linux/amba/bus.h>
 
-/* Thankfully, all cache ops are by VA so we can ignore phys here */
-static void flush_page(struct device *dev, const void *virt, phys_addr_t phys)
-{
-	__dma_flush_area(virt, PAGE_SIZE);
-}
-
 static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 				 dma_addr_t *handle, gfp_t gfp,
 				 unsigned long attrs)
@@ -186,7 +180,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 		struct page **pages;
 
 		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
-					handle, flush_page);
+					handle);
 		if (!pages)
 			return NULL;
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4f5546a103d8..d6a437385b26 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -22,6 +22,7 @@
 #include <linux/acpi_iort.h>
 #include <linux/device.h>
 #include <linux/dma-iommu.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/gfp.h>
 #include <linux/huge_mm.h>
 #include <linux/iommu.h>
@@ -533,8 +534,6 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * @attrs: DMA attributes for this allocation
  * @prot: IOMMU mapping flags
  * @handle: Out argument for allocated DMA handle
- * @flush_page: Arch callback which must ensure PAGE_SIZE bytes from the
- *		given VA/PA are visible to the given non-coherent device.
  *
  * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
  * but an IOMMU which supports smaller pages might not map the whole thing.
@@ -543,8 +542,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  *	   or NULL on failure.
  */
 struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
-		unsigned long attrs, int prot, dma_addr_t *handle,
-		void (*flush_page)(struct device *, const void *, phys_addr_t))
+		unsigned long attrs, int prot, dma_addr_t *handle)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -580,12 +578,8 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 	for (i = 0; i < count; i++) {
 		phys_addr_t phys = page_to_phys(pages[i]);
 
-		if (!(prot & IOMMU_CACHE)) {
-			void *vaddr = kmap_atomic(pages[i]);
-
-			flush_page(dev, vaddr, phys);
-			kunmap_atomic(vaddr);
-		}
+		if (!(prot & IOMMU_CACHE))
+			arch_dma_prep_coherent(pages[i], PAGE_SIZE);
 
 		if (iommu_map(domain, iova + mapped, phys, PAGE_SIZE, prot))
 			goto out_unmap;
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 65aa888c2768..59e606f78626 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -43,8 +43,7 @@ int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
  * the arch code to take care of attributes and cache maintenance
  */
 struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
-		unsigned long attrs, int prot, dma_addr_t *handle,
-		void (*flush_page)(struct device *, const void *, phys_addr_t));
+		unsigned long attrs, int prot, dma_addr_t *handle);
 void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 		dma_addr_t *handle);
 
-- 
2.20.1



* [PATCH 05/19] dma-iommu: move the arm64 wrappers to common code
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (3 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 04/19] dma-iommu: remove the flush_page callback Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 06/19] dma-iommu: fix and refactor iommu_dma_mmap Christoph Hellwig
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

There is nothing really arm64-specific in the iommu_dma_ops
implementation, so move it to dma-iommu.c and make a lot of the symbols
self-contained (static).  Note that the implementation does depend on
the DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
IOMMU_DMA support depend on it, but this will be relaxed soon.
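
Nothing changes in how IOMMU drivers opt in: a driver still acquires a
DMA cookie when allocating its default domain, roughly like this
(hypothetical driver code; my_domain and my_domain_alloc are made-up
names for illustration):

static struct iommu_domain *my_domain_alloc(unsigned int type)
{
	struct my_domain *dom = kzalloc(sizeof(*dom), GFP_KERNEL);

	if (!dom)
		return NULL;
	/* let the generic dma-iommu code manage IOVAs for DMA domains */
	if (type == IOMMU_DOMAIN_DMA &&
	    iommu_get_dma_cookie(&dom->domain)) {
		kfree(dom);
		return NULL;
	}
	return &dom->domain;
}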

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/mm/dma-mapping.c | 384 +-----------------------------------
 drivers/iommu/Kconfig       |   1 +
 drivers/iommu/dma-iommu.c   | 378 ++++++++++++++++++++++++++++++++---
 include/linux/dma-iommu.h   |  43 +---
 4 files changed, 359 insertions(+), 447 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 75fe7273a1e4..fffba9426ee4 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -27,6 +27,7 @@
 #include <linux/dma-direct.h>
 #include <linux/dma-noncoherent.h>
 #include <linux/dma-contiguous.h>
+#include <linux/dma-iommu.h>
 #include <linux/vmalloc.h>
 #include <linux/swiotlb.h>
 #include <linux/pci.h>
@@ -58,37 +59,6 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
 	__dma_flush_area(page_address(page), size);
 }
 
-#ifdef CONFIG_IOMMU_DMA
-static int __swiotlb_get_sgtable_page(struct sg_table *sgt,
-				      struct page *page, size_t size)
-{
-	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-
-	if (!ret)
-		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-
-	return ret;
-}
-
-static int __swiotlb_mmap_pfn(struct vm_area_struct *vma,
-			      unsigned long pfn, size_t size)
-{
-	int ret = -ENXIO;
-	unsigned long nr_vma_pages = vma_pages(vma);
-	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	unsigned long off = vma->vm_pgoff;
-
-	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
-		ret = remap_pfn_range(vma, vma->vm_start,
-				      pfn + off,
-				      vma->vm_end - vma->vm_start,
-				      vma->vm_page_prot);
-	}
-
-	return ret;
-}
-#endif /* CONFIG_IOMMU_DMA */
-
 static int __init arm64_dma_init(void)
 {
 	WARN_TAINT(ARCH_DMA_MINALIGN < cache_line_size(),
@@ -100,364 +70,18 @@ static int __init arm64_dma_init(void)
 arch_initcall(arm64_dma_init);
 
 #ifdef CONFIG_IOMMU_DMA
-#include <linux/dma-iommu.h>
-#include <linux/platform_device.h>
-#include <linux/amba/bus.h>
-
-static void *__iommu_alloc_attrs(struct device *dev, size_t size,
-				 dma_addr_t *handle, gfp_t gfp,
-				 unsigned long attrs)
-{
-	bool coherent = dev_is_dma_coherent(dev);
-	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
-	size_t iosize = size;
-	void *addr;
-
-	if (WARN(!dev, "cannot create IOMMU mapping for unknown device\n"))
-		return NULL;
-
-	size = PAGE_ALIGN(size);
-
-	/*
-	 * Some drivers rely on this, and we probably don't want the
-	 * possibility of stale kernel data being read by devices anyway.
-	 */
-	gfp |= __GFP_ZERO;
-
-	if (!gfpflags_allow_blocking(gfp)) {
-		struct page *page;
-		/*
-		 * In atomic context we can't remap anything, so we'll only
-		 * get the virtually contiguous buffer we need by way of a
-		 * physically contiguous allocation.
-		 */
-		if (coherent) {
-			page = alloc_pages(gfp, get_order(size));
-			addr = page ? page_address(page) : NULL;
-		} else {
-			addr = dma_alloc_from_pool(size, &page, gfp);
-		}
-		if (!addr)
-			return NULL;
-
-		*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-		if (*handle == DMA_MAPPING_ERROR) {
-			if (coherent)
-				__free_pages(page, get_order(size));
-			else
-				dma_free_from_pool(addr, size);
-			addr = NULL;
-		}
-	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-		struct page *page;
-
-		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-					get_order(size), gfp & __GFP_NOWARN);
-		if (!page)
-			return NULL;
-
-		*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-		if (*handle == DMA_MAPPING_ERROR) {
-			dma_release_from_contiguous(dev, page,
-						    size >> PAGE_SHIFT);
-			return NULL;
-		}
-		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
-						   prot,
-						   __builtin_return_address(0));
-		if (addr) {
-			if (!coherent)
-				__dma_flush_area(page_to_virt(page), iosize);
-			memset(addr, 0, size);
-		} else {
-			iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
-			dma_release_from_contiguous(dev, page,
-						    size >> PAGE_SHIFT);
-		}
-	} else {
-		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-		struct page **pages;
-
-		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
-					handle);
-		if (!pages)
-			return NULL;
-
-		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
-					      __builtin_return_address(0));
-		if (!addr)
-			iommu_dma_free(dev, pages, iosize, handle);
-	}
-	return addr;
-}
-
-static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
-			       dma_addr_t handle, unsigned long attrs)
-{
-	size_t iosize = size;
-
-	size = PAGE_ALIGN(size);
-	/*
-	 * @cpu_addr will be one of 4 things depending on how it was allocated:
-	 * - A remapped array of pages for contiguous allocations.
-	 * - A remapped array of pages from iommu_dma_alloc(), for all
-	 *   non-atomic allocations.
-	 * - A non-cacheable alias from the atomic pool, for atomic
-	 *   allocations by non-coherent devices.
-	 * - A normal lowmem address, for atomic allocations by
-	 *   coherent devices.
-	 * Hence how dodgy the below logic looks...
-	 */
-	if (dma_in_atomic_pool(cpu_addr, size)) {
-		iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
-		dma_free_from_pool(cpu_addr, size);
-	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		struct page *page = vmalloc_to_page(cpu_addr);
-
-		iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
-		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
-		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
-	} else if (is_vmalloc_addr(cpu_addr)){
-		struct vm_struct *area = find_vm_area(cpu_addr);
-
-		if (WARN_ON(!area || !area->pages))
-			return;
-		iommu_dma_free(dev, area->pages, iosize, &handle);
-		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
-	} else {
-		iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
-		__free_pages(virt_to_page(cpu_addr), get_order(size));
-	}
-}
-
-static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
-			      void *cpu_addr, dma_addr_t dma_addr, size_t size,
-			      unsigned long attrs)
-{
-	struct vm_struct *area;
-	int ret;
-
-	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
-
-	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
-		return ret;
-
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		/*
-		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-		 * hence in the vmalloc space.
-		 */
-		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
-		return __swiotlb_mmap_pfn(vma, pfn, size);
-	}
-
-	area = find_vm_area(cpu_addr);
-	if (WARN_ON(!area || !area->pages))
-		return -ENXIO;
-
-	return iommu_dma_mmap(area->pages, size, vma);
-}
-
-static int __iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
-			       void *cpu_addr, dma_addr_t dma_addr,
-			       size_t size, unsigned long attrs)
-{
-	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	struct vm_struct *area = find_vm_area(cpu_addr);
-
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		/*
-		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-		 * hence in the vmalloc space.
-		 */
-		struct page *page = vmalloc_to_page(cpu_addr);
-		return __swiotlb_get_sgtable_page(sgt, page, size);
-	}
-
-	if (WARN_ON(!area || !area->pages))
-		return -ENXIO;
-
-	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
-					 GFP_KERNEL);
-}
-
-static void __iommu_sync_single_for_cpu(struct device *dev,
-					dma_addr_t dev_addr, size_t size,
-					enum dma_data_direction dir)
-{
-	phys_addr_t phys;
-
-	if (dev_is_dma_coherent(dev))
-		return;
-
-	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dev_addr);
-	arch_sync_dma_for_cpu(dev, phys, size, dir);
-}
-
-static void __iommu_sync_single_for_device(struct device *dev,
-					   dma_addr_t dev_addr, size_t size,
-					   enum dma_data_direction dir)
-{
-	phys_addr_t phys;
-
-	if (dev_is_dma_coherent(dev))
-		return;
-
-	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dev_addr);
-	arch_sync_dma_for_device(dev, phys, size, dir);
-}
-
-static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
-				   unsigned long offset, size_t size,
-				   enum dma_data_direction dir,
-				   unsigned long attrs)
-{
-	bool coherent = dev_is_dma_coherent(dev);
-	int prot = dma_info_to_prot(dir, coherent, attrs);
-	dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
-
-	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
-	    dev_addr != DMA_MAPPING_ERROR)
-		__dma_map_area(page_address(page) + offset, size, dir);
-
-	return dev_addr;
-}
-
-static void __iommu_unmap_page(struct device *dev, dma_addr_t dev_addr,
-			       size_t size, enum dma_data_direction dir,
-			       unsigned long attrs)
-{
-	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		__iommu_sync_single_for_cpu(dev, dev_addr, size, dir);
-
-	iommu_dma_unmap_page(dev, dev_addr, size, dir, attrs);
-}
-
-static void __iommu_sync_sg_for_cpu(struct device *dev,
-				    struct scatterlist *sgl, int nelems,
-				    enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int i;
-
-	if (dev_is_dma_coherent(dev))
-		return;
-
-	for_each_sg(sgl, sg, nelems, i)
-		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
-}
-
-static void __iommu_sync_sg_for_device(struct device *dev,
-				       struct scatterlist *sgl, int nelems,
-				       enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int i;
-
-	if (dev_is_dma_coherent(dev))
-		return;
-
-	for_each_sg(sgl, sg, nelems, i)
-		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
-}
-
-static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
-				int nelems, enum dma_data_direction dir,
-				unsigned long attrs)
-{
-	bool coherent = dev_is_dma_coherent(dev);
-
-	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		__iommu_sync_sg_for_device(dev, sgl, nelems, dir);
-
-	return iommu_dma_map_sg(dev, sgl, nelems,
-				dma_info_to_prot(dir, coherent, attrs));
-}
-
-static void __iommu_unmap_sg_attrs(struct device *dev,
-				   struct scatterlist *sgl, int nelems,
-				   enum dma_data_direction dir,
-				   unsigned long attrs)
-{
-	if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-		__iommu_sync_sg_for_cpu(dev, sgl, nelems, dir);
-
-	iommu_dma_unmap_sg(dev, sgl, nelems, dir, attrs);
-}
-
-static const struct dma_map_ops iommu_dma_ops = {
-	.alloc = __iommu_alloc_attrs,
-	.free = __iommu_free_attrs,
-	.mmap = __iommu_mmap_attrs,
-	.get_sgtable = __iommu_get_sgtable,
-	.map_page = __iommu_map_page,
-	.unmap_page = __iommu_unmap_page,
-	.map_sg = __iommu_map_sg_attrs,
-	.unmap_sg = __iommu_unmap_sg_attrs,
-	.sync_single_for_cpu = __iommu_sync_single_for_cpu,
-	.sync_single_for_device = __iommu_sync_single_for_device,
-	.sync_sg_for_cpu = __iommu_sync_sg_for_cpu,
-	.sync_sg_for_device = __iommu_sync_sg_for_device,
-	.map_resource = iommu_dma_map_resource,
-	.unmap_resource = iommu_dma_unmap_resource,
-};
-
-static int __init __iommu_dma_init(void)
-{
-	return iommu_dma_init();
-}
-arch_initcall(__iommu_dma_init);
-
-static void __iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-				  const struct iommu_ops *ops)
-{
-	struct iommu_domain *domain;
-
-	if (!ops)
-		return;
-
-	/*
-	 * The IOMMU core code allocates the default DMA domain, which the
-	 * underlying IOMMU driver needs to support via the dma-iommu layer.
-	 */
-	domain = iommu_get_domain_for_dev(dev);
-
-	if (!domain)
-		goto out_err;
-
-	if (domain->type == IOMMU_DOMAIN_DMA) {
-		if (iommu_dma_init_domain(domain, dma_base, size, dev))
-			goto out_err;
-
-		dev->dma_ops = &iommu_dma_ops;
-	}
-
-	return;
-
-out_err:
-	 pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA ops\n",
-		 dev_name(dev));
-}
-
 void arch_teardown_dma_ops(struct device *dev)
 {
 	dev->dma_ops = NULL;
 }
-
-#else
-
-static void __iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-				  const struct iommu_ops *iommu)
-{ }
-
-#endif  /* CONFIG_IOMMU_DMA */
+#endif
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 			const struct iommu_ops *iommu, bool coherent)
 {
 	dev->dma_coherent = coherent;
-	__iommu_setup_dma_ops(dev, dma_base, size, iommu);
+	if (iommu)
+		iommu_setup_dma_ops(dev, dma_base, size, iommu);
 
 #ifdef CONFIG_XEN
 	if (xen_initial_domain()) {
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d9a25715650e..8b13fb7d0263 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,6 +94,7 @@ config IOMMU_DMA
 	select IOMMU_API
 	select IOMMU_IOVA
 	select NEED_SG_DMA_LENGTH
+	depends on DMA_DIRECT_REMAP
 
 config FSL_PAMU
 	bool "Freescale IOMMU support"
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d6a437385b26..e0ffe22775ac 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -21,6 +21,7 @@
 
 #include <linux/acpi_iort.h>
 #include <linux/device.h>
+#include <linux/dma-contiguous.h>
 #include <linux/dma-iommu.h>
 #include <linux/dma-noncoherent.h>
 #include <linux/gfp.h>
@@ -80,11 +81,6 @@ static struct iommu_dma_cookie *cookie_alloc(enum iommu_dma_cookie_type type)
 	return cookie;
 }
 
-int iommu_dma_init(void)
-{
-	return iova_cache_get();
-}
-
 /**
  * iommu_get_dma_cookie - Acquire DMA-API resources for a domain
  * @domain: IOMMU domain to prepare for DMA-API usage
@@ -286,7 +282,7 @@ static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
  * to ensure it is an invalid IOVA. It is safe to reinitialise a domain, but
  * any change which could make prior IOVAs invalid will fail.
  */
-int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
+static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 		u64 size, struct device *dev)
 {
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -338,7 +334,6 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 
 	return iova_reserve_iommu_regions(dev, domain);
 }
-EXPORT_SYMBOL(iommu_dma_init_domain);
 
 /**
  * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
@@ -349,7 +344,7 @@ EXPORT_SYMBOL(iommu_dma_init_domain);
  *
  * Return: corresponding IOMMU API page protection flags
  */
-int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
+static int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
 		     unsigned long attrs)
 {
 	int prot = coherent ? IOMMU_CACHE : 0;
@@ -508,17 +503,17 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
 }
 
 /**
- * iommu_dma_free - Free a buffer allocated by iommu_dma_alloc()
+ * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
  * @dev: Device which owns this buffer
- * @pages: Array of buffer pages as returned by iommu_dma_alloc()
+ * @pages: Array of buffer pages as returned by __iommu_dma_alloc()
  * @size: Size of buffer in bytes
  * @handle: DMA address of buffer
  *
  * Frees both the pages associated with the buffer, and the array
  * describing them
  */
-void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
-		dma_addr_t *handle)
+static void __iommu_dma_free(struct device *dev, struct page **pages,
+		size_t size, dma_addr_t *handle)
 {
 	__iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size);
 	__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
@@ -526,7 +521,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
 }
 
 /**
- * iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
+ * __iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
  * @dev: Device to allocate memory for. Must be a real device
  *	 attached to an iommu_dma_domain
  * @size: Size of buffer in bytes
@@ -541,8 +536,8 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
  * Return: Array of struct page pointers describing the buffer,
  *	   or NULL on failure.
  */
-struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
-		unsigned long attrs, int prot, dma_addr_t *handle)
+static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
+		gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -597,16 +592,16 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 }
 
 /**
- * iommu_dma_mmap - Map a buffer into provided user VMA
- * @pages: Array representing buffer from iommu_dma_alloc()
+ * __iommu_dma_mmap - Map a buffer into provided user VMA
+ * @pages: Array representing buffer from __iommu_dma_alloc()
  * @size: Size of buffer in bytes
  * @vma: VMA describing requested userspace mapping
  *
  * Maps the pages of the buffer in @pages into @vma. The caller is responsible
  * for verifying the correct size and protection of @vma beforehand.
  */
-
-int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
+static int __iommu_dma_mmap(struct page **pages, size_t size,
+		struct vm_area_struct *vma)
 {
 	unsigned long uaddr = vma->vm_start;
 	unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
@@ -621,6 +616,58 @@ int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma)
 	return ret;
 }
 
+static void iommu_dma_sync_single_for_cpu(struct device *dev,
+		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
+{
+	phys_addr_t phys;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
+	arch_sync_dma_for_cpu(dev, phys, size, dir);
+}
+
+static void iommu_dma_sync_single_for_device(struct device *dev,
+		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
+{
+	phys_addr_t phys;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
+	arch_sync_dma_for_device(dev, phys, size, dir);
+}
+
+static void iommu_dma_sync_sg_for_cpu(struct device *dev,
+		struct scatterlist *sgl, int nelems,
+		enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	for_each_sg(sgl, sg, nelems, i)
+		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
+}
+
+static void iommu_dma_sync_sg_for_device(struct device *dev,
+		struct scatterlist *sgl, int nelems,
+		enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	for_each_sg(sgl, sg, nelems, i)
+		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
+}
+
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 		size_t size, int prot, struct iommu_domain *domain)
 {
@@ -644,19 +691,44 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 	return iova + iova_off;
 }
 
-dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
+static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, int prot)
 {
 	return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot,
 			iommu_get_dma_domain(dev));
 }
 
-void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size,
-		enum dma_data_direction dir, unsigned long attrs)
+static void __iommu_dma_unmap_page(struct device *dev, dma_addr_t handle,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
 }
 
+static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
+		unsigned long offset, size_t size, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	phys_addr_t phys = page_to_phys(page) + offset;
+	bool coherent = dev_is_dma_coherent(dev);
+	dma_addr_t dma_handle;
+
+	dma_handle = __iommu_dma_map(dev, phys, size,
+			dma_info_to_prot(dir, coherent, attrs),
+			iommu_get_dma_domain(dev));
+	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+	    dma_handle != DMA_MAPPING_ERROR)
+		arch_sync_dma_for_device(dev, phys, size, dir);
+	return dma_handle;
+}
+
+static void iommu_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_single_for_cpu(dev, dma_handle, size, dir);
+	__iommu_dma_unmap(iommu_get_domain_for_dev(dev), dma_handle, size);
+}
+
 /*
  * Prepare a successfully-mapped scatterlist to give back to the caller.
  *
@@ -739,18 +811,22 @@ static void __invalidate_sg(struct scatterlist *sg, int nents)
  * impedance-matching, to be able to hand off a suitably-aligned list,
  * but still preserve the original offsets and sizes for the caller.
  */
-int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
-		int nents, int prot)
+static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
 	struct scatterlist *s, *prev = NULL;
+	int prot = dma_info_to_prot(dir, dev_is_dma_coherent(dev), attrs);
 	dma_addr_t iova;
 	size_t iova_len = 0;
 	unsigned long mask = dma_get_seg_boundary(dev);
 	int i;
 
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_sg_for_device(dev, sg, nents, dir);
+
 	/*
 	 * Work out how much IOVA space we need, and align the segments to
 	 * IOVA granules for the IOMMU driver to handle. With some clever
@@ -810,12 +886,16 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	return 0;
 }
 
-void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
-		enum dma_data_direction dir, unsigned long attrs)
+static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
 {
 	dma_addr_t start, end;
 	struct scatterlist *tmp;
 	int i;
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		iommu_dma_sync_sg_for_cpu(dev, sg, nents, dir);
+
 	/*
 	 * The scatterlist segments are mapped into a single
 	 * contiguous IOVA allocation, so this is incredibly easy.
@@ -830,7 +910,7 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
 	__iommu_dma_unmap(iommu_get_dma_domain(dev), start, end - start);
 }
 
-dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
+static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	return __iommu_dma_map(dev, phys, size,
@@ -838,12 +918,252 @@ dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
 			iommu_get_dma_domain(dev));
 }
 
-void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
+static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
 }
 
+static void *iommu_dma_alloc(struct device *dev, size_t size,
+		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
+{
+	bool coherent = dev_is_dma_coherent(dev);
+	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+	size_t iosize = size;
+	void *addr;
+
+	size = PAGE_ALIGN(size);
+
+	/*
+	 * Some drivers rely on this, and we probably don't want the
+	 * possibility of stale kernel data being read by devices anyway.
+	 */
+	gfp |= __GFP_ZERO;
+
+	if (!gfpflags_allow_blocking(gfp)) {
+		struct page *page;
+		/*
+		 * In atomic context we can't remap anything, so we'll only
+		 * get the virtually contiguous buffer we need by way of a
+		 * physically contiguous allocation.
+		 */
+		if (coherent) {
+			page = alloc_pages(gfp, get_order(size));
+			addr = page ? page_address(page) : NULL;
+		} else {
+			addr = dma_alloc_from_pool(size, &page, gfp);
+		}
+		if (!addr)
+			return NULL;
+
+		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
+		if (*handle == DMA_MAPPING_ERROR) {
+			if (coherent)
+				__free_pages(page, get_order(size));
+			else
+				dma_free_from_pool(addr, size);
+			addr = NULL;
+		}
+	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+		struct page *page;
+
+		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
+					get_order(size), gfp & __GFP_NOWARN);
+		if (!page)
+			return NULL;
+
+		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
+		if (*handle == DMA_MAPPING_ERROR) {
+			dma_release_from_contiguous(dev, page,
+						    size >> PAGE_SHIFT);
+			return NULL;
+		}
+		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
+						   prot,
+						   __builtin_return_address(0));
+		if (addr) {
+			if (!coherent)
+				arch_dma_prep_coherent(page, iosize);
+			memset(addr, 0, size);
+		} else {
+			__iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
+			dma_release_from_contiguous(dev, page,
+						    size >> PAGE_SHIFT);
+		}
+	} else {
+		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+		struct page **pages;
+
+		pages = __iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
+					handle);
+		if (!pages)
+			return NULL;
+
+		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
+					      __builtin_return_address(0));
+		if (!addr)
+			__iommu_dma_free(dev, pages, iosize, handle);
+	}
+	return addr;
+}
+
+static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
+		dma_addr_t handle, unsigned long attrs)
+{
+	size_t iosize = size;
+
+	size = PAGE_ALIGN(size);
+	/*
+	 * @cpu_addr will be one of 4 things depending on how it was allocated:
+	 * - A remapped array of pages for contiguous allocations.
+	 * - A remapped array of pages from __iommu_dma_alloc(), for all
+	 *   non-atomic allocations.
+	 * - A non-cacheable alias from the atomic pool, for atomic
+	 *   allocations by non-coherent devices.
+	 * - A normal lowmem address, for atomic allocations by
+	 *   coherent devices.
+	 * Hence how dodgy the below logic looks...
+	 */
+	if (dma_in_atomic_pool(cpu_addr, size)) {
+		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
+		dma_free_from_pool(cpu_addr, size);
+	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		struct page *page = vmalloc_to_page(cpu_addr);
+
+		__iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
+		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
+		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
+	} else if (is_vmalloc_addr(cpu_addr)) {
+		struct vm_struct *area = find_vm_area(cpu_addr);
+
+		if (WARN_ON(!area || !area->pages))
+			return;
+		__iommu_dma_free(dev, area->pages, iosize, &handle);
+		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
+	} else {
+		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
+		__free_pages(virt_to_page(cpu_addr), get_order(size));
+	}
+}
+
+static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
+			      unsigned long pfn, size_t size)
+{
+	int ret = -ENXIO;
+	unsigned long nr_vma_pages = vma_pages(vma);
+	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned long off = vma->vm_pgoff;
+
+	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
+		ret = remap_pfn_range(vma, vma->vm_start,
+				      pfn + off,
+				      vma->vm_end - vma->vm_start,
+				      vma->vm_page_prot);
+	}
+
+	return ret;
+}
+
+static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		unsigned long attrs)
+{
+	struct vm_struct *area;
+	int ret;
+
+	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
+
+	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
+		return ret;
+
+	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		/*
+		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
+		 * hence in the vmalloc space.
+		 */
+		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
+		return __iommu_dma_mmap_pfn(vma, pfn, size);
+	}
+
+	area = find_vm_area(cpu_addr);
+	if (WARN_ON(!area || !area->pages))
+		return -ENXIO;
+
+	return __iommu_dma_mmap(area->pages, size, vma);
+}
+
+static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
+		size_t size)
+{
+	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+
+	if (!ret)
+		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+	return ret;
+}
+
+static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		unsigned long attrs)
+{
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	struct vm_struct *area = find_vm_area(cpu_addr);
+
+	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		/*
+		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
+		 * hence in the vmalloc space.
+		 */
+		struct page *page = vmalloc_to_page(cpu_addr);
+		return __iommu_dma_get_sgtable_page(sgt, page, size);
+	}
+
+	if (WARN_ON(!area || !area->pages))
+		return -ENXIO;
+
+	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
+					 GFP_KERNEL);
+}
+
+static const struct dma_map_ops iommu_dma_ops = {
+	.alloc			= iommu_dma_alloc,
+	.free			= iommu_dma_free,
+	.mmap			= iommu_dma_mmap,
+	.get_sgtable		= iommu_dma_get_sgtable,
+	.map_page		= iommu_dma_map_page,
+	.unmap_page		= iommu_dma_unmap_page,
+	.map_sg			= iommu_dma_map_sg,
+	.unmap_sg		= iommu_dma_unmap_sg,
+	.sync_single_for_cpu	= iommu_dma_sync_single_for_cpu,
+	.sync_single_for_device	= iommu_dma_sync_single_for_device,
+	.sync_sg_for_cpu	= iommu_dma_sync_sg_for_cpu,
+	.sync_sg_for_device	= iommu_dma_sync_sg_for_device,
+	.map_resource		= iommu_dma_map_resource,
+	.unmap_resource		= iommu_dma_unmap_resource,
+};
+
+/*
+ * The IOMMU core code allocates the default DMA domain, which the underlying
+ * IOMMU driver needs to support via the dma-iommu layer.
+ */
+void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
+		const struct iommu_ops *ops)
+{
+	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+
+	if (!domain || domain->type != IOMMU_DOMAIN_DMA)
+		goto out_err;
+	if (iommu_dma_init_domain(domain, dma_base, size, dev))
+		goto out_err;
+
+	dev->dma_ops = &iommu_dma_ops;
+	return;
+out_err:
+	pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA ops\n",
+		dev_name(dev));
+}
+
 static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 		phys_addr_t msi_addr, struct iommu_domain *domain)
 {
@@ -916,3 +1236,5 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
 		msg->address_lo += lower_32_bits(msi_page->iova);
 	}
 }
+
+arch_initcall(iova_cache_get);
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 59e606f78626..5277aa8782bf 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -23,49 +23,14 @@
 #include <linux/types.h>
 
 #ifdef CONFIG_IOMMU_DMA
-int iommu_dma_init(void);
-
 /* Domain management interface for IOMMU drivers */
 int iommu_get_dma_cookie(struct iommu_domain *domain);
 int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
 void iommu_put_dma_cookie(struct iommu_domain *domain);
 
 /* Setup call for arch DMA mapping code */
-int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
-		u64 size, struct device *dev);
-
-/* General helpers for DMA-API <-> IOMMU-API interaction */
-int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
-		     unsigned long attrs);
-
-/*
- * These implement the bulk of the relevant DMA mapping callbacks, but require
- * the arch code to take care of attributes and cache maintenance
- */
-struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
-		unsigned long attrs, int prot, dma_addr_t *handle);
-void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
-		dma_addr_t *handle);
-
-int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma);
-
-dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
-		unsigned long offset, size_t size, int prot);
-int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
-		int nents, int prot);
-
-/*
- * Arch code with no special attribute handling may use these
- * directly as DMA mapping callbacks for simplicity
- */
-void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size,
-		enum dma_data_direction dir, unsigned long attrs);
-void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
-		enum dma_data_direction dir, unsigned long attrs);
-dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
-		size_t size, enum dma_data_direction dir, unsigned long attrs);
-void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir, unsigned long attrs);
+void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
+		const struct iommu_ops *ops);
 
 /* The DMA API isn't _quite_ the whole story, though... */
 void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
@@ -73,9 +38,9 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
 #else /* CONFIG_IOMMU_DMA */
 
-static inline int iommu_dma_init(void)
+static inline void iommu_setup_dma_ops(struct device *dev, u64 dma_base,
+		u64 size, const struct iommu_ops *ops)
 {
-	return 0;
 }
 
 static inline int iommu_get_dma_cookie(struct iommu_domain *domain)
-- 
2.20.1



* [PATCH 06/19] dma-iommu: fix and refactor iommu_dma_mmap
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (4 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 05/19] dma-iommu: move the arm64 wrappers to common code Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-02-05 15:02   ` Robin Murphy
  2019-01-14  9:41 ` [PATCH 07/19] dma-iommu: fix and refactor iommu_dma_get_sgtable Christoph Hellwig
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

The current iommu_dma_mmap code does not properly handle memory from the
page allocator that hasn't been remapped, which can happen in the rare
case of allocations for a coherent device that aren't allowed to block.

Fix this by replacing iommu_dma_mmap with a slightly tweaked copy of
dma_common_mmap with special handling for the remapped array of
pages allocated from __iommu_dma_alloc.
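
Summarizing the new dispatch (a comment-style sketch; the three cases
correspond to the allocation flavours enumerated in iommu_dma_free):

/*
 *	vmalloc addr, !DMA_ATTR_FORCE_CONTIGUOUS:
 *		iommu_dma_mmap_remap(), vm_insert_page() per page
 *	vmalloc addr, DMA_ATTR_FORCE_CONTIGUOUS:
 *		remap_pfn_range() on vmalloc_to_pfn(cpu_addr) + pgoff
 *	lowmem addr (atomic allocation for a coherent device):
 *		remap_pfn_range() on page_to_pfn(virt_to_page(cpu_addr)) + pgoff
 */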

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 59 +++++++++++++++------------------------
 1 file changed, 23 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index e0ffe22775ac..26f479d49103 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -592,23 +592,27 @@ static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
 }
 
 /**
- * __iommu_dma_mmap - Map a buffer into provided user VMA
- * @pages: Array representing buffer from __iommu_dma_alloc()
+ * iommu_dma_mmap_remap - Map a remapped page array into provided user VMA
+ * @cpu_addr: virtual address of the memory to be remapped
  * @size: Size of buffer in bytes
  * @vma: VMA describing requested userspace mapping
  *
- * Maps the pages of the buffer in @pages into @vma. The caller is responsible
+ * Maps the pages pointed to by @cpu_addr into @vma. The caller is responsible
  * for verifying the correct size and protection of @vma beforehand.
  */
-static int __iommu_dma_mmap(struct page **pages, size_t size,
+static int iommu_dma_mmap_remap(void *cpu_addr, size_t size,
 		struct vm_area_struct *vma)
 {
+	struct vm_struct *area = find_vm_area(cpu_addr);
 	unsigned long uaddr = vma->vm_start;
 	unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	int ret = -ENXIO;
 
+	if (WARN_ON(!area || !area->pages))
+		return -ENXIO;
+
 	for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
-		ret = vm_insert_page(vma, uaddr, pages[i]);
+		ret = vm_insert_page(vma, uaddr, area->pages[i]);
 		if (ret)
 			break;
 		uaddr += PAGE_SIZE;
@@ -1047,29 +1051,14 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	}
 }
 
-static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
-			      unsigned long pfn, size_t size)
-{
-	int ret = -ENXIO;
-	unsigned long nr_vma_pages = vma_pages(vma);
-	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	unsigned long off = vma->vm_pgoff;
-
-	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
-		ret = remap_pfn_range(vma, vma->vm_start,
-				      pfn + off,
-				      vma->vm_end - vma->vm_start,
-				      vma->vm_page_prot);
-	}
-
-	return ret;
-}
-
 static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs)
 {
-	struct vm_struct *area;
+	unsigned long user_count = vma_pages(vma);
+	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned long off = vma->vm_pgoff;
+	unsigned long pfn;
 	int ret;
 
 	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
@@ -1077,20 +1066,18 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
 		return ret;
 
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		/*
-		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-		 * hence in the vmalloc space.
-		 */
-		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
-		return __iommu_dma_mmap_pfn(vma, pfn, size);
-	}
-
-	area = find_vm_area(cpu_addr);
-	if (WARN_ON(!area || !area->pages))
+	if (off >= count || user_count > count - off)
 		return -ENXIO;
 
-	return __iommu_dma_mmap(area->pages, size, vma);
+	if (is_vmalloc_addr(cpu_addr)) {
+		if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
+			return iommu_dma_mmap_remap(cpu_addr, size, vma);
+		pfn = vmalloc_to_pfn(cpu_addr);
+	} else
+		pfn = page_to_pfn(virt_to_page(cpu_addr));
+
+	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
+			user_count << PAGE_SHIFT, vma->vm_page_prot);
 }
 
 static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
-- 
2.20.1



* [PATCH 07/19] dma-iommu: fix and refactor iommu_dma_get_sgtable
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (5 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 06/19] dma-iommu: fix and refactor iommu_dma_mmap Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 08/19] dma-iommu: move __iommu_dma_map Christoph Hellwig
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

The current iommu_dma_get_sgtable code does not properly handle memory
from the page allocator that hasn't been remapped, which can happen in
the rare case of allocations for a coherent device that aren't allowed
to block.

Fix this by replacing iommu_dma_get_sgtable with a slightly tweaked copy
of dma_common_get_sgtable with special handling for the remapped array
of pages allocated from __iommu_dma_alloc.
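
As with the mmap fix, the dispatch reduces to (sketch):

/*
 *	vmalloc addr, !DMA_ATTR_FORCE_CONTIGUOUS:
 *		iommu_dma_get_sgtable_remap(), one sg entry per page
 *	anything else (contiguous remap or plain lowmem):
 *		single-entry table via sg_alloc_table() + sg_set_page()
 */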

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 49 +++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 26f479d49103..8f3dc6ab3da1 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -620,6 +620,18 @@ static int iommu_dma_mmap_remap(void *cpu_addr, size_t size,
 	return ret;
 }
 
+static int iommu_dma_get_sgtable_remap(struct sg_table *sgt, void *cpu_addr,
+		size_t size)
+{
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	struct vm_struct *area = find_vm_area(cpu_addr);
+
+	if (WARN_ON(!area || !area->pages))
+		return -ENXIO;
+	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
+			GFP_KERNEL);
+}
+
 static void iommu_dma_sync_single_for_cpu(struct device *dev,
 		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
 {
@@ -1080,37 +1092,24 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 			user_count << PAGE_SHIFT, vma->vm_page_prot);
 }
 
-static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
-		size_t size)
-{
-	int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-
-	if (!ret)
-		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-	return ret;
-}
-
 static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs)
 {
-	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	struct vm_struct *area = find_vm_area(cpu_addr);
-
-	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		/*
-		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
-		 * hence in the vmalloc space.
-		 */
-		struct page *page = vmalloc_to_page(cpu_addr);
-		return __iommu_dma_get_sgtable_page(sgt, page, size);
-	}
+	struct page *page;
+	int ret;
 
-	if (WARN_ON(!area || !area->pages))
-		return -ENXIO;
+	if (is_vmalloc_addr(cpu_addr)) {
+		if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
+			return iommu_dma_get_sgtable_remap(sgt, cpu_addr, size);
+		page = vmalloc_to_page(cpu_addr);
+	} else
+		page = virt_to_page(cpu_addr);
 
-	return sg_alloc_table_from_pages(sgt, area->pages, count, 0, size,
-					 GFP_KERNEL);
+	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+	if (!ret)
+		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+	return ret;
 }
 
 static const struct dma_map_ops iommu_dma_ops = {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 08/19] dma-iommu: move __iommu_dma_map
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (6 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 07/19] dma-iommu: fix and refactor iommu_dma_get_sgtable Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 09/19] dma-iommu: refactor page array remap helpers Christoph Hellwig
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

Moving this function up next to its unmap counterpart helps keep related
code together for the following changes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 46 +++++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 8f3dc6ab3da1..0727c109bcab 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -437,6 +437,29 @@ static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr,
 	iommu_dma_free_iova(cookie, dma_addr, size);
 }
 
+static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
+		size_t size, int prot, struct iommu_domain *domain)
+{
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	size_t iova_off = 0;
+	dma_addr_t iova;
+
+	if (cookie->type == IOMMU_DMA_IOVA_COOKIE) {
+		iova_off = iova_offset(&cookie->iovad, phys);
+		size = iova_align(&cookie->iovad, size + iova_off);
+	}
+
+	iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
+	if (!iova)
+		return DMA_MAPPING_ERROR;
+
+	if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
+		iommu_dma_free_iova(cookie, iova, size);
+		return DMA_MAPPING_ERROR;
+	}
+	return iova + iova_off;
+}
+
 static void __iommu_dma_free_pages(struct page **pages, int count)
 {
 	while (count--)
@@ -684,29 +707,6 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
 		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
 }
 
-static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
-		size_t size, int prot, struct iommu_domain *domain)
-{
-	struct iommu_dma_cookie *cookie = domain->iova_cookie;
-	size_t iova_off = 0;
-	dma_addr_t iova;
-
-	if (cookie->type == IOMMU_DMA_IOVA_COOKIE) {
-		iova_off = iova_offset(&cookie->iovad, phys);
-		size = iova_align(&cookie->iovad, size + iova_off);
-	}
-
-	iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
-	if (!iova)
-		return DMA_MAPPING_ERROR;
-
-	if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
-		iommu_dma_free_iova(cookie, iova, size);
-		return DMA_MAPPING_ERROR;
-	}
-	return iova + iova_off;
-}
-
 static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, int prot)
 {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 09/19] dma-iommu: refactor page array remap helpers
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (7 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 08/19] dma-iommu: move __iommu_dma_map Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 10/19] dma-iommu: factor atomic pool allocations into helpers Christoph Hellwig
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

Move the calls to dma_common_pages_remap / dma_common_free_remap into
__iommu_dma_alloc / __iommu_dma_free and rename those functions to
better describe what they do.  This keeps the functionality that
allocates and remaps a non-contiguous array of pages nicely abstracted
out from the calling code.
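
As a rough sketch, the resulting calling convention is simply (names as
introduced in the patch below):

	void *vaddr = iommu_dma_alloc_remap(dev, size, &dma_handle, gfp, attrs);

	if (vaddr) {
		/* ... use the buffer ... */
		iommu_dma_free_remap(dev, size, vaddr, dma_handle);
	}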

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 73 ++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 39 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 0727c109bcab..95d30b96e5bd 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -526,51 +526,57 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
 }
 
 /**
- * iommu_dma_free - Free a buffer allocated by __iommu_dma_alloc()
+ * iommu_dma_free_remap - Free a buffer allocated by iommu_dma_alloc_remap
  * @dev: Device which owns this buffer
- * @pages: Array of buffer pages as returned by __iommu_dma_alloc()
  * @size: Size of buffer in bytes
+ * @cpu_addr: Virtual address of the buffer
  * @handle: DMA address of buffer
  *
  * Frees both the pages associated with the buffer, and the array
  * describing them
  */
-static void __iommu_dma_free(struct device *dev, struct page **pages,
-		size_t size, dma_addr_t *handle)
+static void iommu_dma_free_remap(struct device *dev, size_t size,
+		void *cpu_addr, dma_addr_t dma_handle)
 {
-	__iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size);
-	__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
-	*handle = DMA_MAPPING_ERROR;
+	struct vm_struct *area = find_vm_area(cpu_addr);
+
+	if (WARN_ON(!area || !area->pages))
+		return;
+	__iommu_dma_unmap(iommu_get_dma_domain(dev), dma_handle, size);
+	__iommu_dma_free_pages(area->pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
+	dma_common_free_remap(cpu_addr, PAGE_ALIGN(size), VM_USERMAP);
 }
 
 /**
- * __iommu_dma_alloc - Allocate and map a buffer contiguous in IOVA space
+ * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
  * @dev: Device to allocate memory for. Must be a real device
  *	 attached to an iommu_dma_domain
  * @size: Size of buffer in bytes
+ * @dma_handle: Out argument for allocated DMA handle
  * @gfp: Allocation flags
  * @attrs: DMA attributes for this allocation
- * @prot: IOMMU mapping flags
- * @handle: Out argument for allocated DMA handle
  *
  * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
  * but an IOMMU which supports smaller pages might not map the whole thing.
  *
- * Return: Array of struct page pointers describing the buffer,
- *	   or NULL on failure.
+ * Return: Mapped virtual address, or NULL on failure.
  */
-static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
-		gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle)
+static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
+	bool coherent = dev_is_dma_coherent(dev);
+	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+	pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap, i;
 	struct page **pages;
 	dma_addr_t iova;
-	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap, i;
 	size_t mapped = 0;
+	void *vaddr;
 
-	*handle = DMA_MAPPING_ERROR;
+	*dma_handle = DMA_MAPPING_ERROR;
 
 	min_size = alloc_sizes & -alloc_sizes;
 	if (min_size < PAGE_SIZE) {
@@ -596,16 +602,21 @@ static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
 	for (i = 0; i < count; i++) {
 		phys_addr_t phys = page_to_phys(pages[i]);
 
-		if (!(prot & IOMMU_CACHE))
+		if (!(ioprot & IOMMU_CACHE))
 			arch_dma_prep_coherent(pages[i], PAGE_SIZE);
 
-		if (iommu_map(domain, iova + mapped, phys, PAGE_SIZE, prot))
+		if (iommu_map(domain, iova + mapped, phys, PAGE_SIZE, ioprot))
 			goto out_unmap;
 		mapped += PAGE_SIZE;
 	}
 
-	*handle = iova;
-	return pages;
+	vaddr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
+			__builtin_return_address(0));
+	if (!vaddr)
+		goto out_unmap;
+
+	*dma_handle = iova;
+	return vaddr;
 out_unmap:
 	iommu_unmap(domain, iova, mapped);
 	iommu_dma_free_iova(cookie, iova, size);
@@ -1008,18 +1019,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 						    size >> PAGE_SHIFT);
 		}
 	} else {
-		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-		struct page **pages;
-
-		pages = __iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
-					handle);
-		if (!pages)
-			return NULL;
-
-		addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
-					      __builtin_return_address(0));
-		if (!addr)
-			__iommu_dma_free(dev, pages, iosize, handle);
+		addr = iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
 	}
 	return addr;
 }
@@ -1033,7 +1033,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	/*
 	 * @cpu_addr will be one of 4 things depending on how it was allocated:
 	 * - A remapped array of pages for contiguous allocations.
-	 * - A remapped array of pages from __iommu_dma_alloc(), for all
+	 * - A remapped array of pages from iommu_dma_alloc_remap(), for all
 	 *   non-atomic allocations.
 	 * - A non-cacheable alias from the atomic pool, for atomic
 	 *   allocations by non-coherent devices.
@@ -1051,12 +1051,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
 		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
 	} else if (is_vmalloc_addr(cpu_addr)){
-		struct vm_struct *area = find_vm_area(cpu_addr);
-
-		if (WARN_ON(!area || !area->pages))
-			return;
-		__iommu_dma_free(dev, area->pages, iosize, &handle);
-		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
+		iommu_dma_free_remap(dev, iosize, cpu_addr, handle);
 	} else {
 		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
 		__free_pages(virt_to_page(cpu_addr), get_order(size));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 10/19] dma-iommu: factor atomic pool allocations into helpers
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (8 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 09/19] dma-iommu: refactor page array remap helpers Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 11/19] dma-iommu: factor contiguous " Christoph Hellwig
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

This keeps the code together and will simplify compiling the code
out on architectures that are always dma coherent.
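
As a rough sketch of where this is heading, patch 16 of this series wraps
the helpers like so:

	#ifdef CONFIG_DMA_DIRECT_REMAP
	/*
	 * iommu_dma_alloc_pool(), iommu_dma_free_pool() and the other
	 * remapping helpers live here, and are compiled out entirely on
	 * always-coherent architectures.
	 */
	#endif /* CONFIG_DMA_DIRECT_REMAP */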

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 51 +++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 95d30b96e5bd..fdd283f45656 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -666,6 +666,35 @@ static int iommu_dma_get_sgtable_remap(struct sg_table *sgt, void *cpu_addr,
 			GFP_KERNEL);
 }
 
+static void iommu_dma_free_pool(struct device *dev, size_t size,
+		void *vaddr, dma_addr_t dma_handle)
+{
+	__iommu_dma_unmap(iommu_get_domain_for_dev(dev), dma_handle, size);
+	dma_free_from_pool(vaddr, PAGE_ALIGN(size));
+}
+
+static void *iommu_dma_alloc_pool(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+{
+	bool coherent = dev_is_dma_coherent(dev);
+	struct page *page;
+	void *vaddr;
+
+	vaddr = dma_alloc_from_pool(PAGE_ALIGN(size), &page, gfp);
+	if (!vaddr)
+		return NULL;
+
+	*dma_handle = __iommu_dma_map(dev, page_to_phys(page), size,
+			dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs),
+			iommu_get_domain_for_dev(dev));
+	if (*dma_handle == DMA_MAPPING_ERROR) {
+		dma_free_from_pool(vaddr, PAGE_ALIGN(size));
+		return NULL;
+	}
+
+	return vaddr;
+}
+
 static void iommu_dma_sync_single_for_cpu(struct device *dev,
 		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
 {
@@ -974,21 +1003,18 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 		 * get the virtually contiguous buffer we need by way of a
 		 * physically contiguous allocation.
 		 */
-		if (coherent) {
-			page = alloc_pages(gfp, get_order(size));
-			addr = page ? page_address(page) : NULL;
-		} else {
-			addr = dma_alloc_from_pool(size, &page, gfp);
-		}
-		if (!addr)
+		if (!coherent)
+			return iommu_dma_alloc_pool(dev, iosize, handle, gfp,
+					attrs);
+
+		page = alloc_pages(gfp, get_order(size));
+		if (!page)
 			return NULL;
 
+		addr = page_address(page);
 		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
 		if (*handle == DMA_MAPPING_ERROR) {
-			if (coherent)
-				__free_pages(page, get_order(size));
-			else
-				dma_free_from_pool(addr, size);
+			__free_pages(page, get_order(size));
 			addr = NULL;
 		}
 	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
@@ -1042,8 +1068,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	 * Hence how dodgy the below logic looks...
 	 */
 	if (dma_in_atomic_pool(cpu_addr, size)) {
-		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
-		dma_free_from_pool(cpu_addr, size);
+		iommu_dma_free_pool(dev, size, cpu_addr, handle);
 	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
 		struct page *page = vmalloc_to_page(cpu_addr);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 11/19] dma-iommu: factor contiguous allocations into helpers
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (9 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 10/19] dma-iommu: factor atomic pool allocations into helpers Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 12/19] dma-iommu: refactor iommu_dma_free Christoph Hellwig
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

This keeps the code together and will simplify using it in different
ways.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 110 ++++++++++++++++++++------------------
 1 file changed, 59 insertions(+), 51 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index fdd283f45656..73f76226ff5e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -460,6 +460,48 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 	return iova + iova_off;
 }
 
+static void iommu_dma_free_contiguous(struct device *dev, size_t size,
+		struct page *page, dma_addr_t dma_handle)
+{
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+	__iommu_dma_unmap(iommu_get_domain_for_dev(dev), dma_handle, size);
+	if (!dma_release_from_contiguous(dev, page, count))
+		__free_pages(page, get_order(size));
+}
+
+
+static void *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+{
+	bool coherent = dev_is_dma_coherent(dev);
+	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
+	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned int page_order = get_order(size);
+	struct page *page = NULL;
+
+	if (gfpflags_allow_blocking(gfp))
+		page = dma_alloc_from_contiguous(dev, count, page_order,
+						 gfp & __GFP_NOWARN);
+
+	if (page)
+		memset(page_address(page), 0, PAGE_ALIGN(size));
+	else
+		page = alloc_pages(gfp, page_order);
+	if (!page)
+		return NULL;
+
+	*dma_handle = __iommu_dma_map(dev, page_to_phys(page), size, ioprot,
+			iommu_get_dma_domain(dev));
+	if (*dma_handle == DMA_MAPPING_ERROR) {
+		if (!dma_release_from_contiguous(dev, page, count))
+			__free_pages(page, page_order);
+		return NULL;
+	}
+
+	return page_address(page);
+}
+
 static void __iommu_dma_free_pages(struct page **pages, int count)
 {
 	while (count--)
@@ -747,19 +789,6 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
 		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
 }
 
-static dma_addr_t __iommu_dma_map_page(struct device *dev, struct page *page,
-		unsigned long offset, size_t size, int prot)
-{
-	return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot,
-			iommu_get_dma_domain(dev));
-}
-
-static void __iommu_dma_unmap_page(struct device *dev, dma_addr_t handle,
-		size_t size, enum dma_data_direction dir, unsigned long attrs)
-{
-	__iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size);
-}
-
 static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, enum dma_data_direction dir,
 		unsigned long attrs)
@@ -984,7 +1013,6 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
 {
 	bool coherent = dev_is_dma_coherent(dev);
-	int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
 	size_t iosize = size;
 	void *addr;
 
@@ -997,7 +1025,6 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	gfp |= __GFP_ZERO;
 
 	if (!gfpflags_allow_blocking(gfp)) {
-		struct page *page;
 		/*
 		 * In atomic context we can't remap anything, so we'll only
 		 * get the virtually contiguous buffer we need by way of a
@@ -1006,44 +1033,27 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 		if (!coherent)
 			return iommu_dma_alloc_pool(dev, iosize, handle, gfp,
 					attrs);
-
-		page = alloc_pages(gfp, get_order(size));
-		if (!page)
-			return NULL;
-
-		addr = page_address(page);
-		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-		if (*handle == DMA_MAPPING_ERROR) {
-			__free_pages(page, get_order(size));
-			addr = NULL;
-		}
+		return iommu_dma_alloc_contiguous(dev, iosize, handle, gfp,
+				attrs);
 	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
 		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
 		struct page *page;
 
-		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-					get_order(size), gfp & __GFP_NOWARN);
-		if (!page)
+		addr = iommu_dma_alloc_contiguous(dev, iosize, handle, gfp,
+				attrs);
+		if (!addr)
 			return NULL;
+		page = virt_to_page(addr);
 
-		*handle = __iommu_dma_map_page(dev, page, 0, iosize, ioprot);
-		if (*handle == DMA_MAPPING_ERROR) {
-			dma_release_from_contiguous(dev, page,
-						    size >> PAGE_SHIFT);
+		addr = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,
+				__builtin_return_address(0));
+		if (!addr) {
+			iommu_dma_free_contiguous(dev, iosize, page, *handle);
 			return NULL;
 		}
-		addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
-						   prot,
-						   __builtin_return_address(0));
-		if (addr) {
-			if (!coherent)
-				arch_dma_prep_coherent(page, iosize);
-			memset(addr, 0, size);
-		} else {
-			__iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
-			dma_release_from_contiguous(dev, page,
-						    size >> PAGE_SHIFT);
-		}
+
+		if (!coherent)
+			arch_dma_prep_coherent(page, iosize);
 	} else {
 		addr = iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
 	}
@@ -1070,16 +1080,14 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	if (dma_in_atomic_pool(cpu_addr, size)) {
 		iommu_dma_free_pool(dev, size, cpu_addr, handle);
 	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		struct page *page = vmalloc_to_page(cpu_addr);
-
-		__iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
-		dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
+		iommu_dma_free_contiguous(dev, iosize,
+				vmalloc_to_page(cpu_addr), handle);
 		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
 	} else if (is_vmalloc_addr(cpu_addr)){
 		iommu_dma_free_remap(dev, iosize, cpu_addr, handle);
 	} else {
-		__iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
-		__free_pages(virt_to_page(cpu_addr), get_order(size));
+		iommu_dma_free_contiguous(dev, iosize, virt_to_page(cpu_addr),
+				handle);
 	}
 }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 12/19] dma-iommu: refactor iommu_dma_free
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (10 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 11/19] dma-iommu: factor contiguous " Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 13/19] dma-iommu: don't remap contiguous allocations for coherent devices Christoph Hellwig
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

Reorder the checks a bit so that a non-remapped allocation is the
fallthrough case, as this will ease making remapping conditional.
Also get rid of the confusing game with the size and iosize variables
and rename the handle argument to the more standard dma_handle.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 46 ++++++++++++++++++++-------------------
 1 file changed, 24 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 73f76226ff5e..c9788e0c1d5d 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1061,34 +1061,36 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 }
 
 static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
-		dma_addr_t handle, unsigned long attrs)
+		dma_addr_t dma_handle, unsigned long attrs)
 {
-	size_t iosize = size;
+	struct page *page;
 
-	size = PAGE_ALIGN(size);
 	/*
-	 * @cpu_addr will be one of 4 things depending on how it was allocated:
-	 * - A remapped array of pages for contiguous allocations.
-	 * - A remapped array of pages from iommu_dma_alloc_remap(), for all
-	 *   non-atomic allocations.
-	 * - A non-cacheable alias from the atomic pool, for atomic
-	 *   allocations by non-coherent devices.
-	 * - A normal lowmem address, for atomic allocations by
-	 *   coherent devices.
+	 * cpu_addr can be one of 4 things depending on how it was allocated:
+	 *
+	 *  (1) A non-cacheable alias from the atomic pool.
+	 *  (2) A remapped array of pages from iommu_dma_alloc_remap().
+	 *  (3) A remapped contiguous lowmem allocation.
+	 *  (4) A normal lowmem address.
+	 *
 	 * Hence how dodgy the below logic looks...
 	 */
-	if (dma_in_atomic_pool(cpu_addr, size)) {
-		iommu_dma_free_pool(dev, size, cpu_addr, handle);
-	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		iommu_dma_free_contiguous(dev, iosize,
-				vmalloc_to_page(cpu_addr), handle);
-		dma_common_free_remap(cpu_addr, size, VM_USERMAP);
-	} else if (is_vmalloc_addr(cpu_addr)){
-		iommu_dma_free_remap(dev, iosize, cpu_addr, handle);
-	} else {
-		iommu_dma_free_contiguous(dev, iosize, virt_to_page(cpu_addr),
-				handle);
+	if (dma_in_atomic_pool(cpu_addr, PAGE_ALIGN(size))) {
+		iommu_dma_free_pool(dev, size, cpu_addr, dma_handle);
+		return;
 	}
+
+	if (is_vmalloc_addr(cpu_addr)) {
+		if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS)) {
+			iommu_dma_free_remap(dev, size, cpu_addr, dma_handle);
+			return;
+		}
+		page = vmalloc_to_page(cpu_addr);
+		dma_common_free_remap(cpu_addr, PAGE_ALIGN(size), VM_USERMAP);
+	} else
+		page = virt_to_page(cpu_addr);
+
+	iommu_dma_free_contiguous(dev, size, page, dma_handle);
 }
 
 static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 13/19] dma-iommu: don't remap contiguous allocations for coherent devices
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (11 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 12/19] dma-iommu: refactor iommu_dma_free Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 14/19] dma-iommu: factor contiguous remapped allocations into helpers Christoph Hellwig
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

For coherent devices there is no need to remap to change pte attributes,
nor to get a virtually contiguous address, so just don't do it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index c9788e0c1d5d..710814a370b9 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1041,10 +1041,10 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 
 		addr = iommu_dma_alloc_contiguous(dev, iosize, handle, gfp,
 				attrs);
-		if (!addr)
-			return NULL;
-		page = virt_to_page(addr);
+		if (coherent || !addr)
+			return addr;
 
+		page = virt_to_page(addr);
 		addr = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,
 				__builtin_return_address(0));
 		if (!addr) {
@@ -1052,8 +1052,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 			return NULL;
 		}
 
-		if (!coherent)
-			arch_dma_prep_coherent(page, iosize);
+		arch_dma_prep_coherent(page, iosize);
 	} else {
 		addr = iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 14/19] dma-iommu: factor contiguous remapped allocations into helpers
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (12 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 13/19] dma-iommu: don't remap contiguous allocations for coherent devices Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 15/19] dma-iommu: refactor iommu_dma_alloc Christoph Hellwig
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

This moves the last remaining non-dispatch code out of iommu_dma_alloc,
preparing to refactor the allocation method selection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 48 +++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 710814a370b9..956cb218c6ba 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -667,6 +667,29 @@ static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
 	return NULL;
 }
 
+static void *iommu_dma_alloc_contiguous_remap(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+{
+	pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+	struct page *page;
+	void *addr;
+
+	addr = iommu_dma_alloc_contiguous(dev, size, dma_handle, gfp, attrs);
+	if (!addr)
+		return NULL;
+
+	page = virt_to_page(addr);
+	addr = dma_common_contiguous_remap(page, PAGE_ALIGN(size), VM_USERMAP,
+			prot, __builtin_return_address(0));
+	if (!addr)
+		goto out_free;
+	arch_dma_prep_coherent(page, size);
+	return addr;
+out_free:
+	iommu_dma_free_contiguous(dev, size, page, *dma_handle);
+	return NULL;
+}
+
 /**
  * iommu_dma_mmap_remap - Map a remapped page array into provided user VMA
  * @cpu_addr: virtual address of the memory to be remapped
@@ -1016,8 +1039,6 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	size_t iosize = size;
 	void *addr;
 
-	size = PAGE_ALIGN(size);
-
 	/*
 	 * Some drivers rely on this, and we probably don't want the
 	 * possibility of stale kernel data being read by devices anyway.
@@ -1036,23 +1057,12 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 		return iommu_dma_alloc_contiguous(dev, iosize, handle, gfp,
 				attrs);
 	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
-		struct page *page;
-
-		addr = iommu_dma_alloc_contiguous(dev, iosize, handle, gfp,
-				attrs);
-		if (coherent || !addr)
-			return addr;
-
-		page = virt_to_page(addr);
-		addr = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,
-				__builtin_return_address(0));
-		if (!addr) {
-			iommu_dma_free_contiguous(dev, iosize, page, *handle);
-			return NULL;
-		}
-
-		arch_dma_prep_coherent(page, iosize);
+		if (coherent)
+			addr = iommu_dma_alloc_contiguous(dev, iosize, handle,
+					gfp, attrs);
+		else
+			addr = iommu_dma_alloc_contiguous_remap(dev, iosize,
+					handle, gfp, attrs);
 	} else {
 		addr = iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 15/19] dma-iommu: refactor iommu_dma_alloc
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (13 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 14/19] dma-iommu: factor contiguous remapped allocations into helpers Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-14  9:41 ` [PATCH 16/19] dma-iommu: don't depend on CONFIG_DMA_DIRECT_REMAP Christoph Hellwig
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

Split all functionality related to non-coherent devices into a
separate helper, and make the decision flow more obvious.
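
Condensed from the hunk below, the resulting flow is:

	if (!dev_is_dma_coherent(dev))
		return iommu_dma_alloc_noncoherent(dev, size, dma_handle,
				gfp, attrs);
	if (gfpflags_allow_blocking(gfp) &&
	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
		return iommu_dma_alloc_remap(dev, size, dma_handle, gfp, attrs);
	return iommu_dma_alloc_contiguous(dev, size, dma_handle, gfp, attrs);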

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 51 +++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 956cb218c6ba..fd25c995bde4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -760,6 +760,22 @@ static void *iommu_dma_alloc_pool(struct device *dev, size_t size,
 	return vaddr;
 }
 
+static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+{
+	/*
+	 * In atomic context we can't remap anything, so we'll only get the
+	 * virtually contiguous buffer we need by way of a physically
+	 * contiguous allocation.
+	 */
+	if (!gfpflags_allow_blocking(gfp))
+		return iommu_dma_alloc_pool(dev, size, dma_handle, gfp, attrs);
+	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS)
+		return iommu_dma_alloc_contiguous_remap(dev, size, dma_handle,
+				gfp, attrs);
+	return iommu_dma_alloc_remap(dev, size, dma_handle, gfp, attrs);
+}
+
 static void iommu_dma_sync_single_for_cpu(struct device *dev,
 		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
 {
@@ -1033,40 +1049,23 @@ static void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
 }
 
 static void *iommu_dma_alloc(struct device *dev, size_t size,
-		dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
 {
-	bool coherent = dev_is_dma_coherent(dev);
-	size_t iosize = size;
-	void *addr;
-
 	/*
 	 * Some drivers rely on this, and we probably don't want the
 	 * possibility of stale kernel data being read by devices anyway.
 	 */
 	gfp |= __GFP_ZERO;
 
-	if (!gfpflags_allow_blocking(gfp)) {
-		/*
-		 * In atomic context we can't remap anything, so we'll only
-		 * get the virtually contiguous buffer we need by way of a
-		 * physically contiguous allocation.
-		 */
-		if (!coherent)
-			return iommu_dma_alloc_pool(dev, iosize, handle, gfp,
-					attrs);
-		return iommu_dma_alloc_contiguous(dev, iosize, handle, gfp,
+	if (!dev_is_dma_coherent(dev))
+		return iommu_dma_alloc_noncoherent(dev, size, dma_handle, gfp,
 				attrs);
-	} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-		if (coherent)
-			addr = iommu_dma_alloc_contiguous(dev, iosize, handle,
-					gfp, attrs);
-		else
-			addr = iommu_dma_alloc_contiguous_remap(dev, iosize,
-					handle, gfp, attrs);
-	} else {
-		addr = iommu_dma_alloc_remap(dev, iosize, handle, gfp, attrs);
-	}
-	return addr;
+
+	if (gfpflags_allow_blocking(gfp) &&
+	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
+		return iommu_dma_alloc_remap(dev, size, dma_handle, gfp, attrs);
+
+	return iommu_dma_alloc_contiguous(dev, size, dma_handle, gfp, attrs);
 }
 
 static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 16/19] dma-iommu: don't depend on CONFIG_DMA_DIRECT_REMAP
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (14 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 15/19] dma-iommu: refactor iommu_dma_alloc Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-02-06 11:55   ` Robin Murphy
  2019-01-14  9:41 ` [PATCH 17/19] dma-iommu: switch copyright boilerplate to SPDX Christoph Hellwig
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

For entirely dma coherent architectures there is no good reason to ever
remap dma coherent allocations.  Move all the remap and pool code under
CONFIG_DMA_DIRECT_REMAP ifdefs, and drop the Kconfig dependency.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/Kconfig     |  1 -
 drivers/iommu/dma-iommu.c | 10 ++++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 8b13fb7d0263..d9a25715650e 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,7 +94,6 @@ config IOMMU_DMA
 	select IOMMU_API
 	select IOMMU_IOVA
 	select NEED_SG_DMA_LENGTH
-	depends on DMA_DIRECT_REMAP
 
 config FSL_PAMU
 	bool "Freescale IOMMU support"
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index fd25c995bde4..e27909771d55 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -502,6 +502,7 @@ static void *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
 	return page_address(page);
 }
 
+#ifdef CONFIG_DMA_DIRECT_REMAP
 static void __iommu_dma_free_pages(struct page **pages, int count)
 {
 	while (count--)
@@ -775,6 +776,7 @@ static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
 				gfp, attrs);
 	return iommu_dma_alloc_remap(dev, size, dma_handle, gfp, attrs);
 }
+#endif /* CONFIG_DMA_DIRECT_REMAP */
 
 static void iommu_dma_sync_single_for_cpu(struct device *dev,
 		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
@@ -1057,6 +1059,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	 */
 	gfp |= __GFP_ZERO;
 
+#ifdef CONFIG_DMA_DIRECT_REMAP
 	if (!dev_is_dma_coherent(dev))
 		return iommu_dma_alloc_noncoherent(dev, size, dma_handle, gfp,
 				attrs);
@@ -1064,6 +1067,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
 	if (gfpflags_allow_blocking(gfp) &&
 	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
 		return iommu_dma_alloc_remap(dev, size, dma_handle, gfp, attrs);
+#endif
 
 	return iommu_dma_alloc_contiguous(dev, size, dma_handle, gfp, attrs);
 }
@@ -1083,6 +1087,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	 *
 	 * Hence how dodgy the below logic looks...
 	 */
+#ifdef CONFIG_DMA_DIRECT_REMAP
 	if (dma_in_atomic_pool(cpu_addr, PAGE_ALIGN(size))) {
 		iommu_dma_free_pool(dev, size, cpu_addr, dma_handle);
 		return;
@@ -1096,6 +1101,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		page = vmalloc_to_page(cpu_addr);
 		dma_common_free_remap(cpu_addr, PAGE_ALIGN(size), VM_USERMAP);
 	} else
+#endif
 		page = virt_to_page(cpu_addr);
 
 	iommu_dma_free_contiguous(dev, size, page, dma_handle);
@@ -1119,11 +1125,13 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	if (off >= count || user_count > count - off)
 		return -ENXIO;
 
+#ifdef CONFIG_DMA_DIRECT_REMAP
 	if (is_vmalloc_addr(cpu_addr)) {
 		if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
 			return iommu_dma_mmap_remap(cpu_addr, size, vma);
 		pfn = vmalloc_to_pfn(cpu_addr);
 	} else
+#endif
 		pfn = page_to_pfn(virt_to_page(cpu_addr));
 
 	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
@@ -1137,11 +1145,13 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
 	struct page *page;
 	int ret;
 
+#ifdef CONFIG_DMA_DIRECT_REMAP
 	if (is_vmalloc_addr(cpu_addr)) {
 		if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
 			return iommu_dma_get_sgtable_remap(sgt, cpu_addr, size);
 		page = vmalloc_to_page(cpu_addr);
 	} else
+#endif
 		page = virt_to_page(cpu_addr);
 
 	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 17/19] dma-iommu: switch copyright boilerplate to SPDX
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (15 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 16/19] dma-iommu: don't depend on CONFIG_DMA_DIRECT_REMAP Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-02-06 11:57   ` Robin Murphy
  2019-01-14  9:41 ` [PATCH 18/19] arm64: switch copyright boilerplate to SPDX in dma-mapping.c Christoph Hellwig
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/iommu/dma-iommu.c | 13 +------------
 include/linux/dma-iommu.h | 13 +------------
 2 files changed, 2 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index e27909771d55..1b76121df94e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * A fairly generic DMA-API to IOMMU-API glue layer.
  *
@@ -5,18 +6,6 @@
  *
  * based in part on arch/arm/mm/dma-mapping.c:
  * Copyright (C) 2000-2004 Russell King
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
 #include <linux/acpi_iort.h>
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 5277aa8782bf..bfe9f19b1171 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -1,17 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * Copyright (C) 2014-2015 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 #ifndef __DMA_IOMMU_H
 #define __DMA_IOMMU_H
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 18/19] arm64: switch copyright boilerplate to SPDX in dma-mapping.c
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (16 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 17/19] dma-iommu: switch copyright boilerplate to SPDX Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-02-06 12:19   ` Robin Murphy
  2019-01-14  9:41 ` [PATCH 19/19] arm64: trim includes " Christoph Hellwig
  2019-01-28  7:53 ` implement generic dma_map_ops for IOMMUs Christoph Hellwig
  19 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/mm/dma-mapping.c | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index fffba9426ee4..bdfb4e985a69 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -1,20 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
- * SWIOTLB-based DMA API implementation
- *
  * Copyright (C) 2012 ARM Ltd.
  * Author: Catalin Marinas <catalin.marinas@arm.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
 #include <linux/gfp.h>
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 19/19] arm64: trim includes in dma-mapping.c
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (17 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 18/19] arm64: switch copyright boilerplate to SPDX in dma-mapping.c Christoph Hellwig
@ 2019-01-14  9:41 ` Christoph Hellwig
  2019-01-28  7:53 ` implement generic dma_map_ops for IOMMUs Christoph Hellwig
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-14  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

With most of the previous functionality now elsewhere, a lot of the
headers included in this file are not needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/arm64/mm/dma-mapping.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index bdfb4e985a69..b6e910d1533b 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -5,20 +5,9 @@
  */
 
 #include <linux/gfp.h>
-#include <linux/acpi.h>
-#include <linux/memblock.h>
 #include <linux/cache.h>
-#include <linux/export.h>
-#include <linux/slab.h>
-#include <linux/genalloc.h>
-#include <linux/dma-direct.h>
 #include <linux/dma-noncoherent.h>
-#include <linux/dma-contiguous.h>
 #include <linux/dma-iommu.h>
-#include <linux/vmalloc.h>
-#include <linux/swiotlb.h>
-#include <linux/pci.h>
-
 #include <asm/cacheflush.h>
 
 pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: implement generic dma_map_ops for IOMMUs
  2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
                   ` (18 preceding siblings ...)
  2019-01-14  9:41 ` [PATCH 19/19] arm64: trim includes " Christoph Hellwig
@ 2019-01-28  7:53 ` Christoph Hellwig
  19 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-01-28  7:53 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Will Deacon, linux-kernel, iommu,
	linux-arm-kernel

Any chance to get a review on this one?

On Mon, Jan 14, 2019 at 10:41:40AM +0100, Christoph Hellwig wrote:
> Hi Robin,
> 
> please take a look at this series, which implements a completely generic
> set of dma_map_ops for IOMMU drivers.  This is done by taking the
> existing arm64 code, moving it to drivers/iommu and then massaging it
> so that it can also work for architectures with DMA remapping.  This
> should help future ports to support IOMMUs more easily, and also allow
> to remove various custom IOMMU dma_map_ops implementations, like Tom
> was planning to for the AMD one.
> 
> A git tree is also available at:
> 
>     git://git.infradead.org/users/hch/misc.git dma-iommu-ops
> 
> Gitweb:
> 
>     http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-iommu-ops
---end quoted text---

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 01/19] dma-mapping: add a Kconfig symbol to indicated arch_dma_prep_coherent presence
  2019-01-14  9:41 ` [PATCH 01/19] dma-mapping: add a Kconfig symbol to indicated arch_dma_prep_coherent presence Christoph Hellwig
@ 2019-02-01 14:22   ` Robin Murphy
  2019-02-01 16:12     ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Robin Murphy @ 2019-02-01 14:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 14/01/2019 09:41, Christoph Hellwig wrote:
> Add a Kconfig symbol that indicates an architecture provides a
> arch_dma_prep_coherent implementation, and provide a stub otherwise.
> 
> This will allow the generic dma-iommu code to use it while still allowing
> it to be built for cache coherent architectures.

I've been pondering this for a while now, and I still can't really come 
up with a case where arch_dma_prep_coherent() would need to behave 
differently from arch_sync_dma_for_device(..., DMA_BIDIRECTIONAL). I 
wonder if we could just save ourselves this little bit of complexity by 
using that instead...
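
A minimal sketch of that idea, assuming the arch implementations can
tolerate a NULL device (which would need checking):

	static inline void arch_dma_prep_coherent(struct page *page, size_t size)
	{
		arch_sync_dma_for_device(NULL, page_to_phys(page), size,
				DMA_BIDIRECTIONAL);
	}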

Robin.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/arm64/Kconfig              | 1 +
>   arch/csky/Kconfig               | 1 +
>   include/linux/dma-noncoherent.h | 6 ++++++
>   kernel/dma/Kconfig              | 3 +++
>   4 files changed, 11 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index a4168d366127..ae3f581a9bcc 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -13,6 +13,7 @@ config ARM64
>   	select ARCH_HAS_DEVMEM_IS_ALLOWED
>   	select ARCH_HAS_DMA_COHERENT_TO_PFN
>   	select ARCH_HAS_DMA_MMAP_PGPROT
> +	select ARCH_HAS_DMA_PREP_COHERENT
>   	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
>   	select ARCH_HAS_ELF_RANDOMIZE
>   	select ARCH_HAS_FAST_MULTIPLIER
> diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
> index 398113c845f5..8b84d4362ff6 100644
> --- a/arch/csky/Kconfig
> +++ b/arch/csky/Kconfig
> @@ -1,5 +1,6 @@
>   config CSKY
>   	def_bool y
> +	select ARCH_HAS_DMA_PREP_COHERENT
>   	select ARCH_HAS_SYNC_DMA_FOR_CPU
>   	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
>   	select ARCH_USE_BUILTIN_BSWAP
> diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h
> index 69b36ed31a99..9741767e400f 100644
> --- a/include/linux/dma-noncoherent.h
> +++ b/include/linux/dma-noncoherent.h
> @@ -72,6 +72,12 @@ static inline void arch_sync_dma_for_cpu_all(struct device *dev)
>   }
>   #endif /* CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL */
>   
> +#ifdef CONFIG_ARCH_HAS_DMA_PREP_COHERENT
>   void arch_dma_prep_coherent(struct page *page, size_t size);
> +#else
> +static inline void arch_dma_prep_coherent(struct page *page, size_t size)
> +{
> +}
> +#endif /* CONFIG_ARCH_HAS_DMA_PREP_COHERENT */
>   
>   #endif /* _LINUX_DMA_NONCOHERENT_H */
> diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> index ca88b867e7fe..541128a32c5d 100644
> --- a/kernel/dma/Kconfig
> +++ b/kernel/dma/Kconfig
> @@ -29,6 +29,9 @@ config ARCH_HAS_SYNC_DMA_FOR_CPU
>   config ARCH_HAS_SYNC_DMA_FOR_CPU_ALL
>   	bool
>   
> +config ARCH_HAS_DMA_PREP_COHERENT
> +	bool
> +
>   config ARCH_HAS_DMA_COHERENT_TO_PFN
>   	bool
>   
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 02/19] dma-iommu: cleanup dma-iommu.h
  2019-01-14  9:41 ` [PATCH 02/19] dma-iommu: cleanup dma-iommu.h Christoph Hellwig
@ 2019-02-01 14:47   ` Robin Murphy
  2019-02-01 16:13     ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Robin Murphy @ 2019-02-01 14:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 14/01/2019 09:41, Christoph Hellwig wrote:
> No need for a __KERNEL__ guard outside uapi, make sure we pull in the
> includes unconditionally so users can rely on it, and add a missing
> comment describing the #else cpp statement.  Last but not least include
> <linux/errno.h> instead of the asm version, which is frowned upon.

I think the __KERNEL__ and asm/errno.h slip-ups are things I 
cargo-culted from the arch code as a fresh-faced noob yet to learn the 
finer details, so ack for those parts. The forward-declarations, though, 
were a deliberate effort to minimise header dependencies and compilation 
bloat for includers who absolutely wouldn't care, and specifically to 
try to avoid setting transitive include expectations since they always 
seem to end up breaking someone's config somewhere down the line. 
Admittedly this little backwater is hardly comparable to the likes of 
the sched.h business, but I'm still somewhat on the fence about that 
change :/

Robin.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   include/linux/dma-iommu.h | 15 ++++-----------
>   1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index e760dc5d1fa8..65aa888c2768 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -16,15 +16,13 @@
>   #ifndef __DMA_IOMMU_H
>   #define __DMA_IOMMU_H
>   
> -#ifdef __KERNEL__
> -#include <linux/types.h>
> -#include <asm/errno.h>
> -
> -#ifdef CONFIG_IOMMU_DMA
> +#include <linux/errno.h>
>   #include <linux/dma-mapping.h>
>   #include <linux/iommu.h>
>   #include <linux/msi.h>
> +#include <linux/types.h>
>   
> +#ifdef CONFIG_IOMMU_DMA
>   int iommu_dma_init(void);
>   
>   /* Domain management interface for IOMMU drivers */
> @@ -74,11 +72,7 @@ void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
>   void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
>   void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
>   
> -#else
> -
> -struct iommu_domain;
> -struct msi_msg;
> -struct device;
> +#else /* CONFIG_IOMMU_DMA */
>   
>   static inline int iommu_dma_init(void)
>   {
> @@ -108,5 +102,4 @@ static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_he
>   }
>   
>   #endif	/* CONFIG_IOMMU_DMA */
> -#endif	/* __KERNEL__ */
>   #endif	/* __DMA_IOMMU_H */
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/19] dma-iommu: don't use a scatterlist in iommu_dma_alloc
  2019-01-14  9:41 ` [PATCH 03/19] dma-iommu: don't use a scatterlist in iommu_dma_alloc Christoph Hellwig
@ 2019-02-01 15:24   ` Robin Murphy
  2019-02-01 16:16     ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Robin Murphy @ 2019-02-01 15:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 14/01/2019 09:41, Christoph Hellwig wrote:
> Directly iterating over the pages makes the code a bit simpler and
> prepares for the following changes.

It also defeats the whole purpose of __iommu_dma_alloc_pages(), so I'm 
not really buying the simplification angle - you've *seen* that code, 
right? ;)

If you want simple, get rid of the pages array entirely. However, as 
I've touched on previously, it's all there for a reason, because making 
the individual iommu_map() calls as large as possible gives significant 
performance/power benefits in many cases which I'm not too keen to 
regress. In fact I still have the spark of an idea to sort the filled 
pages array for optimal physical layout; I've just never had the free 
time to play with it. FWIW, since iommu_map_sg() was new and promising 
at the time, using sg_alloc_table_from_pages() actually *was* the 
simplification over copying arch/arm's __iommu_create_mapping() logic.

Robin.

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 40 +++++++++++++++++----------------------
>   1 file changed, 17 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index d19f3d6b43c1..4f5546a103d8 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -30,6 +30,7 @@
>   #include <linux/mm.h>
>   #include <linux/pci.h>
>   #include <linux/scatterlist.h>
> +#include <linux/highmem.h>
>   #include <linux/vmalloc.h>
>   
>   struct iommu_dma_msi_page {
> @@ -549,9 +550,9 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>   	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>   	struct iova_domain *iovad = &cookie->iovad;
>   	struct page **pages;
> -	struct sg_table sgt;
>   	dma_addr_t iova;
> -	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
> +	unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap, i;
> +	size_t mapped = 0;
>   
>   	*handle = DMA_MAPPING_ERROR;
>   
> @@ -576,32 +577,25 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>   	if (!iova)
>   		goto out_free_pages;
>   
> -	if (sg_alloc_table_from_pages(&sgt, pages, count, 0, size, GFP_KERNEL))
> -		goto out_free_iova;
> +	for (i = 0; i < count; i++) {
> +		phys_addr_t phys = page_to_phys(pages[i]);
>   
> -	if (!(prot & IOMMU_CACHE)) {
> -		struct sg_mapping_iter miter;
> -		/*
> -		 * The CPU-centric flushing implied by SG_MITER_TO_SG isn't
> -		 * sufficient here, so skip it by using the "wrong" direction.
> -		 */
> -		sg_miter_start(&miter, sgt.sgl, sgt.orig_nents, SG_MITER_FROM_SG);
> -		while (sg_miter_next(&miter))
> -			flush_page(dev, miter.addr, page_to_phys(miter.page));
> -		sg_miter_stop(&miter);
> -	}
> +		if (!(prot & IOMMU_CACHE)) {
> +			void *vaddr = kmap_atomic(pages[i]);
>   
> -	if (iommu_map_sg(domain, iova, sgt.sgl, sgt.orig_nents, prot)
> -			< size)
> -		goto out_free_sg;
> +			flush_page(dev, vaddr, phys);
> +			kunmap_atomic(vaddr);
> +		}
> +
> +		if (iommu_map(domain, iova + mapped, phys, PAGE_SIZE, prot))
> +			goto out_unmap;
> +		mapped += PAGE_SIZE;
> +	}
>   
>   	*handle = iova;
> -	sg_free_table(&sgt);
>   	return pages;
> -
> -out_free_sg:
> -	sg_free_table(&sgt);
> -out_free_iova:
> +out_unmap:
> +	iommu_unmap(domain, iova, mapped);
>   	iommu_dma_free_iova(cookie, iova, size);
>   out_free_pages:
>   	__iommu_dma_free_pages(pages, count);
> 


* Re: [PATCH 04/19] dma-iommu: remove the flush_page callback
  2019-01-14  9:41 ` [PATCH 04/19] dma-iommu: remove the flush_page callback Christoph Hellwig
@ 2019-02-01 15:28   ` Robin Murphy
  0 siblings, 0 replies; 38+ messages in thread
From: Robin Murphy @ 2019-02-01 15:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 14/01/2019 09:41, Christoph Hellwig wrote:
> We now have an arch_dma_prep_coherent architecture hook that is used
> for the generic DMA remap allocator, and we should use the same
> interface for the dma-iommu code.

Agreed - I'd definitely ack a version of this change which didn't depend 
on patch #3 ;)

Robin.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/arm64/mm/dma-mapping.c |  8 +-------
>   drivers/iommu/dma-iommu.c   | 14 ++++----------
>   include/linux/dma-iommu.h   |  3 +--
>   3 files changed, 6 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index fb0908456a1f..75fe7273a1e4 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -104,12 +104,6 @@ arch_initcall(arm64_dma_init);
>   #include <linux/platform_device.h>
>   #include <linux/amba/bus.h>
>   
> -/* Thankfully, all cache ops are by VA so we can ignore phys here */
> -static void flush_page(struct device *dev, const void *virt, phys_addr_t phys)
> -{
> -	__dma_flush_area(virt, PAGE_SIZE);
> -}
> -
>   static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>   				 dma_addr_t *handle, gfp_t gfp,
>   				 unsigned long attrs)
> @@ -186,7 +180,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>   		struct page **pages;
>   
>   		pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> -					handle, flush_page);
> +					handle);
>   		if (!pages)
>   			return NULL;
>   
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 4f5546a103d8..d6a437385b26 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -22,6 +22,7 @@
>   #include <linux/acpi_iort.h>
>   #include <linux/device.h>
>   #include <linux/dma-iommu.h>
> +#include <linux/dma-noncoherent.h>
>   #include <linux/gfp.h>
>   #include <linux/huge_mm.h>
>   #include <linux/iommu.h>
> @@ -533,8 +534,6 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
>    * @attrs: DMA attributes for this allocation
>    * @prot: IOMMU mapping flags
>    * @handle: Out argument for allocated DMA handle
> - * @flush_page: Arch callback which must ensure PAGE_SIZE bytes from the
> - *		given VA/PA are visible to the given non-coherent device.
>    *
>    * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
>    * but an IOMMU which supports smaller pages might not map the whole thing.
> @@ -543,8 +542,7 @@ void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
>    *	   or NULL on failure.
>    */
>   struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
> -		unsigned long attrs, int prot, dma_addr_t *handle,
> -		void (*flush_page)(struct device *, const void *, phys_addr_t))
> +		unsigned long attrs, int prot, dma_addr_t *handle)
>   {
>   	struct iommu_domain *domain = iommu_get_dma_domain(dev);
>   	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> @@ -580,12 +578,8 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>   	for (i = 0; i < count; i++) {
>   		phys_addr_t phys = page_to_phys(pages[i]);
>   
> -		if (!(prot & IOMMU_CACHE)) {
> -			void *vaddr = kmap_atomic(pages[i]);
> -
> -			flush_page(dev, vaddr, phys);
> -			kunmap_atomic(vaddr);
> -		}
> +		if (!(prot & IOMMU_CACHE))
> +			arch_dma_prep_coherent(pages[i], PAGE_SIZE);
>   
>   		if (iommu_map(domain, iova + mapped, phys, PAGE_SIZE, prot))
>   			goto out_unmap;
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 65aa888c2768..59e606f78626 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -43,8 +43,7 @@ int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
>    * the arch code to take care of attributes and cache maintenance
>    */
>   struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
> -		unsigned long attrs, int prot, dma_addr_t *handle,
> -		void (*flush_page)(struct device *, const void *, phys_addr_t));
> +		unsigned long attrs, int prot, dma_addr_t *handle);
>   void iommu_dma_free(struct device *dev, struct page **pages, size_t size,
>   		dma_addr_t *handle);
>   
> 


* Re: [PATCH 01/19] dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence
  2019-02-01 14:22   ` Robin Murphy
@ 2019-02-01 16:12     ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-02-01 16:12 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, Christoph Hellwig, linux-arm-kernel

On Fri, Feb 01, 2019 at 02:22:46PM +0000, Robin Murphy wrote:
> On 14/01/2019 09:41, Christoph Hellwig wrote:
>> Add a Kconfig symbol that indicates an architecture provides an
>> arch_dma_prep_coherent implementation, and provide a stub otherwise.
>>
>> This will allow the generic dma-iommu code to use it while still
>> allowing it to be built for cache coherent architectures.
>
> I've been pondering this for a while now, and I still can't really come up 
> with a case where arch_dma_prep_coherent() would need to behave differently 
> from arch_sync_dma_for_device(..., DMA_BIDIRECTIONAL). I wonder if we could 
> just save ourselves this little bit of complexity by using that instead...

A lot of architectures do really weird stuff in the dma sync routines.
So my plan would be to consolidate a lot more logic in there first,
and then maybe as a next step we could look into using
arch_sync_dma_for_device eventually.
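
For illustration, the equivalence Robin is pondering might look like the
hypothetical per-architecture implementation below. This is only a
sketch, not part of the series, and it assumes the architecture's
arch_sync_dma_for_device() ignores its (here unused) device argument:

	void arch_dma_prep_coherent(struct page *page, size_t size)
	{
		/*
		 * Writeback + invalidate of the freshly allocated buffer,
		 * i.e. the same cache maintenance as a bidirectional sync.
		 */
		arch_sync_dma_for_device(NULL, page_to_phys(page), size,
					 DMA_BIDIRECTIONAL);
	}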


* Re: [PATCH 02/19] dma-iommu: cleanup dma-iommu.h
  2019-02-01 14:47   ` Robin Murphy
@ 2019-02-01 16:13     ` Christoph Hellwig
  2019-02-06 15:08       ` Robin Murphy
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-02-01 16:13 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, Christoph Hellwig, linux-arm-kernel

On Fri, Feb 01, 2019 at 02:47:17PM +0000, Robin Murphy wrote:
> On 14/01/2019 09:41, Christoph Hellwig wrote:
>> No need for a __KERNEL__ guard outside uapi; make sure we pull in the
>> includes unconditionally so users can rely on them, and add a missing
>> comment describing the #else cpp statement.  Last but not least, include
>> <linux/errno.h> instead of the asm version, which is frowned upon.
>
> I think the __KERNEL__ and asm/errno.h slip-ups are things I cargo-culted 
> from the arch code as a fresh-faced noob yet to learn the finer details, so 
> ack for those parts. The forward-declarations, though, were a deliberate 
> effort to minimise header dependencies and compilation bloat for includers 
> who absolutely wouldn't care, and specifically to try to avoid setting 
> transitive include expectations since they always seem to end up breaking 
> someone's config somewhere down the line. Admittedly this little backwater 
> is hardly comparable to the likes of the sched.h business, but I'm still 
> somewhat on the fence about that change :/

As far as I can tell almost all users of linux/dma-iommu.h require
CONFIG_IOMMU_DMA to be enabled anyway.


* Re: [PATCH 03/19] dma-iommu: don't use a scatterlist in iommu_dma_alloc
  2019-02-01 15:24   ` Robin Murphy
@ 2019-02-01 16:16     ` Christoph Hellwig
  2019-02-06 15:28       ` Robin Murphy
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2019-02-01 16:16 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, Christoph Hellwig, linux-arm-kernel

On Fri, Feb 01, 2019 at 03:24:45PM +0000, Robin Murphy wrote:
> On 14/01/2019 09:41, Christoph Hellwig wrote:
>> Directly iterating over the pages makes the code a bit simpler and
>> prepares for the following changes.
>
> It also defeats the whole purpose of __iommu_dma_alloc_pages(), so I'm not 
> really buying the simplification angle - you've *seen* that code, right? ;)

How does it defeat the purpose of __iommu_dma_alloc_pages?



* Re: [PATCH 06/19] dma-iommu: fix and refactor iommu_dma_mmap
  2019-01-14  9:41 ` [PATCH 06/19] dma-iommu: fix and refactor iommu_dma_mmap Christoph Hellwig
@ 2019-02-05 15:02   ` Robin Murphy
  2019-02-11 16:03     ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Robin Murphy @ 2019-02-05 15:02 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 14/01/2019 09:41, Christoph Hellwig wrote:
> The current iommu_dma_mmap code does not properly handle memory from the
> page allocator that hasn't been remapped, which can happen in the rare
> case of allocations for a coherent device that aren't allowed to block.
> 
> Fix this by replacing iommu_dma_mmap with a slightly tweaked copy of
> dma_common_mmap with special handling for the remapped array of
> pages allocated from __iommu_dma_alloc.

If there's an actual bugfix here, can we make that before all of the 
other code movement? If it's at all related to other reports of weird 
mmap behaviour it might warrant backporting, and either way I'm finding 
it needlessly tough to follow what's going on in this patch :(

Robin.

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/dma-iommu.c | 59 +++++++++++++++------------------------
>   1 file changed, 23 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index e0ffe22775ac..26f479d49103 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -592,23 +592,27 @@ static struct page **__iommu_dma_alloc(struct device *dev, size_t size,
>   }
>   
>   /**
> - * __iommu_dma_mmap - Map a buffer into provided user VMA
> - * @pages: Array representing buffer from __iommu_dma_alloc()
> + * iommu_dma_mmap_remap - Map a remapped page array into provided user VMA
> + * @cpu_addr: virtual address of the memory to be remapped
>    * @size: Size of buffer in bytes
>    * @vma: VMA describing requested userspace mapping
>    *
> - * Maps the pages of the buffer in @pages into @vma. The caller is responsible
> + * Maps the pages pointed to by @cpu_addr into @vma. The caller is responsible
>    * for verifying the correct size and protection of @vma beforehand.
>    */
> -static int __iommu_dma_mmap(struct page **pages, size_t size,
> +static int iommu_dma_mmap_remap(void *cpu_addr, size_t size,
>   		struct vm_area_struct *vma)
>   {
> +	struct vm_struct *area = find_vm_area(cpu_addr);
>   	unsigned long uaddr = vma->vm_start;
>   	unsigned int i, count = PAGE_ALIGN(size) >> PAGE_SHIFT;
>   	int ret = -ENXIO;
>   
> +	if (WARN_ON(!area || !area->pages))
> +		return -ENXIO;
> +
>   	for (i = vma->vm_pgoff; i < count && uaddr < vma->vm_end; i++) {
> -		ret = vm_insert_page(vma, uaddr, pages[i]);
> +		ret = vm_insert_page(vma, uaddr, area->pages[i]);
>   		if (ret)
>   			break;
>   		uaddr += PAGE_SIZE;
> @@ -1047,29 +1051,14 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
>   	}
>   }
>   
> -static int __iommu_dma_mmap_pfn(struct vm_area_struct *vma,
> -			      unsigned long pfn, size_t size)
> -{
> -	int ret = -ENXIO;
> -	unsigned long nr_vma_pages = vma_pages(vma);
> -	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	unsigned long off = vma->vm_pgoff;
> -
> -	if (off < nr_pages && nr_vma_pages <= (nr_pages - off)) {
> -		ret = remap_pfn_range(vma, vma->vm_start,
> -				      pfn + off,
> -				      vma->vm_end - vma->vm_start,
> -				      vma->vm_page_prot);
> -	}
> -
> -	return ret;
> -}
> -
>   static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   		void *cpu_addr, dma_addr_t dma_addr, size_t size,
>   		unsigned long attrs)
>   {
> -	struct vm_struct *area;
> +	unsigned long user_count = vma_pages(vma);
> +	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +	unsigned long off = vma->vm_pgoff;
> +	unsigned long pfn;
>   	int ret;
>   
>   	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
> @@ -1077,20 +1066,18 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
>   		return ret;
>   
> -	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> -		/*
> -		 * DMA_ATTR_FORCE_CONTIGUOUS allocations are always remapped,
> -		 * hence in the vmalloc space.
> -		 */
> -		unsigned long pfn = vmalloc_to_pfn(cpu_addr);
> -		return __iommu_dma_mmap_pfn(vma, pfn, size);
> -	}
> -
> -	area = find_vm_area(cpu_addr);
> -	if (WARN_ON(!area || !area->pages))
> +	if (off >= count || user_count > count - off)
>   		return -ENXIO;
>   
> -	return __iommu_dma_mmap(area->pages, size, vma);
> +	if (is_vmalloc_addr(cpu_addr)) {
> +		if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
> +			return iommu_dma_mmap_remap(cpu_addr, size, vma);
> +		pfn = vmalloc_to_pfn(cpu_addr);
> +	} else
> +		pfn = page_to_pfn(virt_to_page(cpu_addr));
> +
> +	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
> +			user_count << PAGE_SHIFT, vma->vm_page_prot);
>   }
>   
>   static int __iommu_dma_get_sgtable_page(struct sg_table *sgt, struct page *page,
> 


* Re: [PATCH 16/19] dma-iommu: don't depend on CONFIG_DMA_DIRECT_REMAP
  2019-01-14  9:41 ` [PATCH 16/19] dma-iommu: don't depend on CONFIG_DMA_DIRECT_REMAP Christoph Hellwig
@ 2019-02-06 11:55   ` Robin Murphy
  2019-02-11 16:39     ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Robin Murphy @ 2019-02-06 11:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 14/01/2019 09:41, Christoph Hellwig wrote:
> For entirely dma coherent architectures there is no good reason to ever
> remap dma coherent allocations.

Yes there is, namely assembling large buffers without the need for 
massive CMA areas and compaction overhead under memory fragmentation. 
That has always been a distinct concern from the DMA_DIRECT_REMAP cases; 
they've just been able to share a fair few code paths.

>  Move all the remap and pool code under
> CONFIG_DMA_DIRECT_REMAP ifdefs, and drop the Kconfig dependency.

As far as I'm concerned that splits things the wrong way. Logically, 
iommu_dma_alloc() should always have done its own vmap() instead of just 
returning the bare pages array, but that was tricky to resolve with the 
design of having the caller handle everything to do with coherency 
(forcing the caller to unpick that mapping just to remap it yet again in 
the noncoherent case didn't seem sensible).

Robin.
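
As a rough sketch, the vmap() Robin describes would mean
iommu_dma_alloc() ending with something like this (hypothetical shape
only, assuming the pages array stays internal to the allocator instead
of being returned to the caller):

	void *vaddr;

	/* remap the page array and hand back a kernel VA instead */
	vaddr = vmap(pages, count, VM_MAP, PAGE_KERNEL);
	if (!vaddr)
		goto out_unmap;
	*handle = iova;
	return vaddr;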

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/iommu/Kconfig     |  1 -
>   drivers/iommu/dma-iommu.c | 10 ++++++++++
>   2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 8b13fb7d0263..d9a25715650e 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -94,7 +94,6 @@ config IOMMU_DMA
>   	select IOMMU_API
>   	select IOMMU_IOVA
>   	select NEED_SG_DMA_LENGTH
> -	depends on DMA_DIRECT_REMAP
>   
>   config FSL_PAMU
>   	bool "Freescale IOMMU support"
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index fd25c995bde4..e27909771d55 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -502,6 +502,7 @@ static void *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
>   	return page_address(page);
>   }
>   
> +#ifdef CONFIG_DMA_DIRECT_REMAP
>   static void __iommu_dma_free_pages(struct page **pages, int count)
>   {
>   	while (count--)
> @@ -775,6 +776,7 @@ static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
>   				gfp, attrs);
>   	return iommu_dma_alloc_remap(dev, size, dma_handle, gfp, attrs);
>   }
> +#endif /* CONFIG_DMA_DIRECT_REMAP */
>   
>   static void iommu_dma_sync_single_for_cpu(struct device *dev,
>   		dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
> @@ -1057,6 +1059,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>   	 */
>   	gfp |= __GFP_ZERO;
>   
> +#ifdef CONFIG_DMA_DIRECT_REMAP
>   	if (!dev_is_dma_coherent(dev))
>   		return iommu_dma_alloc_noncoherent(dev, size, dma_handle, gfp,
>   				attrs);
> @@ -1064,6 +1067,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>   	if (gfpflags_allow_blocking(gfp) &&
>   	    !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
>   		return iommu_dma_alloc_remap(dev, size, dma_handle, gfp, attrs);
> +#endif
>   
>   	return iommu_dma_alloc_contiguous(dev, size, dma_handle, gfp, attrs);
>   }
> @@ -1083,6 +1087,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
>   	 *
>   	 * Hence how dodgy the below logic looks...
>   	 */
> +#ifdef CONFIG_DMA_DIRECT_REMAP
>   	if (dma_in_atomic_pool(cpu_addr, PAGE_ALIGN(size))) {
>   		iommu_dma_free_pool(dev, size, cpu_addr, dma_handle);
>   		return;
> @@ -1096,6 +1101,7 @@ static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr,
>   		page = vmalloc_to_page(cpu_addr);
>   		dma_common_free_remap(cpu_addr, PAGE_ALIGN(size), VM_USERMAP);
>   	} else
> +#endif
>   		page = virt_to_page(cpu_addr);
>   
>   	iommu_dma_free_contiguous(dev, size, page, dma_handle);
> @@ -1119,11 +1125,13 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>   	if (off >= count || user_count > count - off)
>   		return -ENXIO;
>   
> +#ifdef CONFIG_DMA_DIRECT_REMAP
>   	if (is_vmalloc_addr(cpu_addr)) {
>   		if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
>   			return iommu_dma_mmap_remap(cpu_addr, size, vma);
>   		pfn = vmalloc_to_pfn(cpu_addr);
>   	} else
> +#endif
>   		pfn = page_to_pfn(virt_to_page(cpu_addr));
>   
>   	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
> @@ -1137,11 +1145,13 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
>   	struct page *page;
>   	int ret;
>   
> +#ifdef CONFIG_DMA_DIRECT_REMAP
>   	if (is_vmalloc_addr(cpu_addr)) {
>   		if (!(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
>   			return iommu_dma_get_sgtable_remap(sgt, cpu_addr, size);
>   		page = vmalloc_to_page(cpu_addr);
>   	} else
> +#endif
>   		page = virt_to_page(cpu_addr);
>   
>   	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> 


* Re: [PATCH 17/19] dma-iommu: switch copyright boilerplate to SPDX
  2019-01-14  9:41 ` [PATCH 17/19] dma-iommu: switch copyright boilerplate to SPDX Christoph Hellwig
@ 2019-02-06 11:57   ` Robin Murphy
  0 siblings, 0 replies; 38+ messages in thread
From: Robin Murphy @ 2019-02-06 11:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 14/01/2019 09:41, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Acked-by: Robin Murphy <robin.murphy@arm.com>

> ---
>   drivers/iommu/dma-iommu.c | 13 +------------
>   include/linux/dma-iommu.h | 13 +------------
>   2 files changed, 2 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index e27909771d55..1b76121df94e 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1,3 +1,4 @@
> +// SPDX-License-Identifier: GPL-2.0
>   /*
>    * A fairly generic DMA-API to IOMMU-API glue layer.
>    *
> @@ -5,18 +6,6 @@
>    *
>    * based in part on arch/arm/mm/dma-mapping.c:
>    * Copyright (C) 2000-2004 Russell King
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License version 2 as
> - * published by the Free Software Foundation.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>    */
>   
>   #include <linux/acpi_iort.h>
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 5277aa8782bf..bfe9f19b1171 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -1,17 +1,6 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
>   /*
>    * Copyright (C) 2014-2015 ARM Ltd.
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License version 2 as
> - * published by the Free Software Foundation.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>    */
>   #ifndef __DMA_IOMMU_H
>   #define __DMA_IOMMU_H
> 


* Re: [PATCH 18/19] arm64: switch copyright boilerplate to SPDX in dma-mapping.c
  2019-01-14  9:41 ` [PATCH 18/19] arm64: switch copyright boilerplate to SPDX in dma-mapping.c Christoph Hellwig
@ 2019-02-06 12:19   ` Robin Murphy
  0 siblings, 0 replies; 38+ messages in thread
From: Robin Murphy @ 2019-02-06 12:19 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 14/01/2019 09:41, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Acked-by: Robin Murphy <robin.murphy@arm.com>

> ---
>   arch/arm64/mm/dma-mapping.c | 15 +--------------
>   1 file changed, 1 insertion(+), 14 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index fffba9426ee4..bdfb4e985a69 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -1,20 +1,7 @@
> +// SPDX-License-Identifier: GPL-2.0
>   /*
> - * SWIOTLB-based DMA API implementation
> - *
>    * Copyright (C) 2012 ARM Ltd.
>    * Author: Catalin Marinas <catalin.marinas@arm.com>
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License version 2 as
> - * published by the Free Software Foundation.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>    */
>   
>   #include <linux/gfp.h>
> 


* Re: [PATCH 02/19] dma-iommu: cleanup dma-iommu.h
  2019-02-01 16:13     ` Christoph Hellwig
@ 2019-02-06 15:08       ` Robin Murphy
  2019-02-11 15:59         ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Robin Murphy @ 2019-02-06 15:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 01/02/2019 16:13, Christoph Hellwig wrote:
> On Fri, Feb 01, 2019 at 02:47:17PM +0000, Robin Murphy wrote:
>> On 14/01/2019 09:41, Christoph Hellwig wrote:
>>> No need for a __KERNEL__ guard outside uapi; make sure we pull in the
>>> includes unconditionally so users can rely on them, and add a missing
>>> comment describing the #else cpp statement.  Last but not least, include
>>> <linux/errno.h> instead of the asm version, which is frowned upon.
>>
>> I think the __KERNEL__ and asm/errno.h slip-ups are things I cargo-culted
>> from the arch code as a fresh-faced noob yet to learn the finer details, so
>> ack for those parts. The forward-declarations, though, were a deliberate
>> effort to minimise header dependencies and compilation bloat for includers
>> who absolutely wouldn't care, and specifically to try to avoid setting
>> transitive include expectations since they always seem to end up breaking
>> someone's config somewhere down the line. Admittedly this little backwater
>> is hardly comparable to the likes of the sched.h business, but I'm still
>> somewhat on the fence about that change :/
> 
> As far as I can tell almost all users of linux/dma-iommu.h require
> CONFIG_IOMMU_DMA to be enabled anyway.

Other than dma-iommu.c itself, none of them *require* it - only 
arch/arm64 selects it (the one from MTK_IOMMU is just bogus), and a lot 
of the drivers also build for at least one other architecture (and/or 
arm64 with !IOMMU_API).

Either way, I have no vehement objection to the change; I just don't see 
any positive value in it.

Robin.


* Re: [PATCH 03/19] dma-iommu: don't use a scatterlist in iommu_dma_alloc
  2019-02-01 16:16     ` Christoph Hellwig
@ 2019-02-06 15:28       ` Robin Murphy
  2019-02-11 16:00         ` Christoph Hellwig
  0 siblings, 1 reply; 38+ messages in thread
From: Robin Murphy @ 2019-02-06 15:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, linux-arm-kernel

On 01/02/2019 16:16, Christoph Hellwig wrote:
> On Fri, Feb 01, 2019 at 03:24:45PM +0000, Robin Murphy wrote:
>> On 14/01/2019 09:41, Christoph Hellwig wrote:
>>> Directly iterating over the pages makes the code a bit simpler and
>>> prepares for the following changes.
>>
>> It also defeats the whole purpose of __iommu_dma_alloc_pages(), so I'm not
>> really buying the simplification angle - you've *seen* that code, right? ;)
> 
> How does it defeat the purpose of __iommu_dma_alloc_pages?

Because if iommu_map() only gets called at PAGE_SIZE granularity, then 
the IOMMU PTEs will be created at PAGE_SIZE (or smaller) granularity, so 
any effort to get higher-order allocations matching larger IOMMU block 
sizes is wasted, and we may as well have just done this:

	for (i = 0; i < count; i++) {
		struct page *page = alloc_page(gfp);
		...
		iommu_map(..., page_to_phys(page), PAGE_SIZE, ...);
	}

Really, it's a shame we have to split huge pages for the CPU remap, 
since in the common case the CPU MMU will have a matching block size, 
but IIRC there was something in vmap() or thereabouts that explicitly 
chokes on them.

Robin.
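
By contrast, the large-chunk mapping Robin is defending would look
roughly like the sketch below, assuming __iommu_dma_alloc_pages() has
filled the array with higher-order, physically contiguous runs:

	size_t mapped = 0;
	unsigned int i = 0;

	while (i < count) {
		phys_addr_t phys = page_to_phys(pages[i]);
		size_t len = PAGE_SIZE;

		/* coalesce a physically contiguous run into one mapping */
		while (i + 1 < count &&
		       page_to_phys(pages[i + 1]) == phys + len) {
			len += PAGE_SIZE;
			i++;
		}
		if (iommu_map(domain, iova + mapped, phys, len, prot))
			goto out_unmap;
		mapped += len;
		i++;
	}

This way the IOMMU driver can install block entries wherever a run
happens to line up with one of its supported page sizes.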


* Re: [PATCH 02/19] dma-iommu: cleanup dma-iommu.h
  2019-02-06 15:08       ` Robin Murphy
@ 2019-02-11 15:59         ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-02-11 15:59 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, Christoph Hellwig, linux-arm-kernel

On Wed, Feb 06, 2019 at 03:08:26PM +0000, Robin Murphy wrote:
> Other than dma-iommu.c itself, none of them *require* it - only arch/arm64 
> selects it (the one from MTK_IOMMU is just bogus), and a lot of the drivers 
> also build for at least one other architecture (and/or arm64 with 
> !IOMMU_API).
>
> Either way, I have no vehement objection to the change, I just don't see 
> any positive value in it.

I've moved the ifdef back down below the includes.


* Re: [PATCH 03/19] dma-iommu: don't use a scatterlist in iommu_dma_alloc
  2019-02-06 15:28       ` Robin Murphy
@ 2019-02-11 16:00         ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-02-11 16:00 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, Christoph Hellwig, linux-arm-kernel

On Wed, Feb 06, 2019 at 03:28:28PM +0000, Robin Murphy wrote:
> Because if iommu_map() only gets called at PAGE_SIZE granularity, then the 
> IOMMU PTEs will be created at PAGE_SIZE (or smaller) granularity, so any 
> effort to get higher-order allocations matching larger IOMMU block sizes is 
> wasted, and we may as well have just done this:
>
> 	for (i = 0; i < count; i++) {
> 		struct page *page = alloc_page(gfp);
> 		...
> 		iommu_map(..., page_to_phys(page), PAGE_SIZE, ...);
> 	}

True.  I've dropped this patch.

> Really, it's a shame we have to split huge pages for the CPU remap, since 
> in the common case the CPU MMU will have a matching block size, but IIRC 
> there was something in vmap() or thereabouts that explicitly chokes on 
> them.

That just needs a volunteer to fix the implementation, as there is no
fundamental reason not to remap large pages.


* Re: [PATCH 06/19] dma-iommu: fix and refactor iommu_dma_mmap
  2019-02-05 15:02   ` Robin Murphy
@ 2019-02-11 16:03     ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-02-11 16:03 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, Christoph Hellwig, linux-arm-kernel

On Tue, Feb 05, 2019 at 03:02:23PM +0000, Robin Murphy wrote:
> On 14/01/2019 09:41, Christoph Hellwig wrote:
>> The current iommu_dma_mmap code does not properly handle memory from the
>> page allocator that hasn't been remapped, which can happen in the rare
>> case of allocations for a coherent device that aren't allowed to block.
>>
>> Fix this by replacing iommu_dma_mmap with a slightly tweaked copy of
>> dma_common_mmap with special handling for the remapped array of
>> pages allocated from __iommu_dma_alloc.
>
> If there's an actual bugfix here, can we make that before all of the other 
> code movement? If it's at all related to other reports of weird mmap 
> behaviour it might warrant backporting, and either way I'm finding it 
> needlessly tough to follow what's going on in this patch :(

The bug fix is to handle non-vmalloc pages.  I'll see if I can do
a smaller and more bandaid-y fix first.
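
One minimal shape such a fix could take, as a sketch only against the
pre-series code and reusing the existing __iommu_dma_mmap_pfn() helper,
is to special-case non-vmalloc addresses before the find_vm_area()
lookup:

	/*
	 * Memory straight from the page allocator is not in the vmalloc
	 * area, so remap its linear-map address directly.
	 */
	if (!is_vmalloc_addr(cpu_addr)) {
		unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));

		return __iommu_dma_mmap_pfn(vma, pfn, size);
	}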


* Re: [PATCH 16/19] dma-iommu: don't depend on CONFIG_DMA_DIRECT_REMAP
  2019-02-06 11:55   ` Robin Murphy
@ 2019-02-11 16:39     ` Christoph Hellwig
  0 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2019-02-11 16:39 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Tom Lendacky, Catalin Marinas, Joerg Roedel, Will Deacon,
	linux-kernel, iommu, Christoph Hellwig, linux-arm-kernel

On Wed, Feb 06, 2019 at 11:55:49AM +0000, Robin Murphy wrote:
> On 14/01/2019 09:41, Christoph Hellwig wrote:
>> For entirely dma coherent architectures there is no good reason to ever
>> remap dma coherent allocations.
>
> Yes there is, namely assembling large buffers without the need for massive 
> CMA areas and compaction overhead under memory fragmentation. That has 
> always been a distinct concern from the DMA_DIRECT_REMAP cases; they've 
> just been able to share a fair few code paths.

Well, I guess I need to reword this - there is no _requirement_ to
remap.  And x86 has been happy to not remap so far and I see absolutely
no reason to force anyone to remap.

>>  Move all the remap and pool code under
>> CONFIG_DMA_DIRECT_REMAP ifdefs, and drop the Kconfig dependency.
>
> As far as I'm concerned that splits things the wrong way. Logically, 
> iommu_dma_alloc() should always have done its own vmap() instead of just 
> returning the bare pages array, but that was tricky to resolve with the 
> design of having the caller handle everything to do with coherency (forcing 
> the caller to unpick that mapping just to remap it yet again in the 
> noncoherent case didn't seem sensible).

I can't parse this.  In the old code base before this series,
iommu_dma_alloc is a relatively low-level helper that allocates and maps
pages.  And that one should have done the remapping, and in fact does
so since patch ("dma-iommu: refactor page array remap helpers").  It just
happens that the function is now called iommu_dma_alloc_remap.

The new iommu_dma_alloc is the high level entry point that handles
every possible case of different allocations, including those where
we do not have a virtual mapping.


end of thread

Thread overview: 38+ messages
2019-01-14  9:41 implement generic dma_map_ops for IOMMUs Christoph Hellwig
2019-01-14  9:41 ` [PATCH 01/19] dma-mapping: add a Kconfig symbol to indicate arch_dma_prep_coherent presence Christoph Hellwig
2019-02-01 14:22   ` Robin Murphy
2019-02-01 16:12     ` Christoph Hellwig
2019-01-14  9:41 ` [PATCH 02/19] dma-iommu: cleanup dma-iommu.h Christoph Hellwig
2019-02-01 14:47   ` Robin Murphy
2019-02-01 16:13     ` Christoph Hellwig
2019-02-06 15:08       ` Robin Murphy
2019-02-11 15:59         ` Christoph Hellwig
2019-01-14  9:41 ` [PATCH 03/19] dma-iommu: don't use a scatterlist in iommu_dma_alloc Christoph Hellwig
2019-02-01 15:24   ` Robin Murphy
2019-02-01 16:16     ` Christoph Hellwig
2019-02-06 15:28       ` Robin Murphy
2019-02-11 16:00         ` Christoph Hellwig
2019-01-14  9:41 ` [PATCH 04/19] dma-iommu: remove the flush_page callback Christoph Hellwig
2019-02-01 15:28   ` Robin Murphy
2019-01-14  9:41 ` [PATCH 05/19] dma-iommu: move the arm64 wrappers to common code Christoph Hellwig
2019-01-14  9:41 ` [PATCH 06/19] dma-iommu: fix and refactor iommu_dma_mmap Christoph Hellwig
2019-02-05 15:02   ` Robin Murphy
2019-02-11 16:03     ` Christoph Hellwig
2019-01-14  9:41 ` [PATCH 07/19] dma-iommu: fix and refactor iommu_dma_get_sgtable Christoph Hellwig
2019-01-14  9:41 ` [PATCH 08/19] dma-iommu: move __iommu_dma_map Christoph Hellwig
2019-01-14  9:41 ` [PATCH 09/19] dma-iommu: refactor page array remap helpers Christoph Hellwig
2019-01-14  9:41 ` [PATCH 10/19] dma-iommu: factor atomic pool allocations into helpers Christoph Hellwig
2019-01-14  9:41 ` [PATCH 11/19] dma-iommu: factor contiguous " Christoph Hellwig
2019-01-14  9:41 ` [PATCH 12/19] dma-iommu: refactor iommu_dma_free Christoph Hellwig
2019-01-14  9:41 ` [PATCH 13/19] dma-iommu: don't remap contiguous allocations for coherent devices Christoph Hellwig
2019-01-14  9:41 ` [PATCH 14/19] dma-iommu: factor contiguous remapped allocations into helpers Christoph Hellwig
2019-01-14  9:41 ` [PATCH 15/19] dma-iommu: refactor iommu_dma_alloc Christoph Hellwig
2019-01-14  9:41 ` [PATCH 16/19] dma-iommu: don't depend on CONFIG_DMA_DIRECT_REMAP Christoph Hellwig
2019-02-06 11:55   ` Robin Murphy
2019-02-11 16:39     ` Christoph Hellwig
2019-01-14  9:41 ` [PATCH 17/19] dma-iommu: switch copyright boilerplate to SPDX Christoph Hellwig
2019-02-06 11:57   ` Robin Murphy
2019-01-14  9:41 ` [PATCH 18/19] arm64: switch copyright boilerplate to SPDX in dma-mapping.c Christoph Hellwig
2019-02-06 12:19   ` Robin Murphy
2019-01-14  9:41 ` [PATCH 19/19] arm64: trim includes " Christoph Hellwig
2019-01-28  7:53 ` implement generic dma_map_ops for IOMMUs Christoph Hellwig
