linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* merge dma_direct_ops and dma_noncoherent_ops v3
@ 2018-09-14  9:58 Christoph Hellwig
  2018-09-14  9:58 ` [PATCH 1/6] dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration Christoph Hellwig
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-14  9:58 UTC (permalink / raw)
  To: iommu
  Cc: Marek Szyprowski, Robin Murphy, Paul Burton, Greg Kroah-Hartman,
	linux-mips, linux-kernel

While most architectures are either always or never dma coherent for a
given build, the arm, arm64, mips and soon arc architectures can have
different dma coherent settings on a per-device basis.  Additionally
some mips builds can decide at boot time if dma is coherent or not.

I've started to look into handling noncoherent dma in swiotlb, and
moving the dma-iommu ops into common code [1], and for that we need a
generic way to check if a given device is coherent or not.  Moving
this flag into struct device also simplifies the conditionally coherent
architecture implementations.

These patches are also available in a git tree given that they have
a few previous posted dependencies:

    git://git.infradead.org/users/hch/misc.git dma-direct-noncoherent-merge

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-direct-noncoherent-merge

Changes since v2:
 - return bool from dev_is_dma_coherent

Changes since v1:
 - rebased to the latest Linus' tree which includes coherent dma support
   for arc
 - a couple tidyups suggested by Paul Burton

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/6] dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration
  2018-09-14  9:58 merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
@ 2018-09-14  9:58 ` Christoph Hellwig
  2018-09-14  9:58 ` [PATCH 2/6] MIPS: don't select DMA_MAYBE_COHERENT from DMA_PERDEV_COHERENT Christoph Hellwig
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-14  9:58 UTC (permalink / raw)
  To: iommu
  Cc: Marek Szyprowski, Robin Murphy, Paul Burton, Greg Kroah-Hartman,
	linux-mips, linux-kernel

The patch adding the infrastructure failed to actually add the symbol
declaration, oops..

Fixes: faef87723a ("dma-noncoherent: add a arch_sync_dma_for_cpu_all hook")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/Kconfig | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 9bd54304446f..1b1d63b3634b 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -23,6 +23,9 @@ config ARCH_HAS_SYNC_DMA_FOR_CPU
 	bool
 	select NEED_DMA_MAP_STATE
 
+config ARCH_HAS_SYNC_DMA_FOR_CPU_ALL
+	bool
+
 config DMA_DIRECT_OPS
 	bool
 	depends on HAS_DMA
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/6] MIPS: don't select DMA_MAYBE_COHERENT from DMA_PERDEV_COHERENT
  2018-09-14  9:58 merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
  2018-09-14  9:58 ` [PATCH 1/6] dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration Christoph Hellwig
@ 2018-09-14  9:58 ` Christoph Hellwig
  2018-09-14  9:58 ` [PATCH 3/6] dma-mapping: move the dma_coherent flag to struct device Christoph Hellwig
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-14  9:58 UTC (permalink / raw)
  To: iommu
  Cc: Marek Szyprowski, Robin Murphy, Paul Burton, Greg Kroah-Hartman,
	linux-mips, linux-kernel

While both option select a form of conditional dma coherence they don't
actually share any code in the implementation, so untangle them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Burton <paul.burton@mips.com>
---
 arch/mips/Kconfig        |  2 +-
 arch/mips/kernel/setup.c |  2 +-
 arch/mips/mm/c-r4k.c     | 17 ++++++++---------
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 35511999156a..0b25180028b8 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1111,7 +1111,7 @@ config DMA_MAYBE_COHERENT
 
 config DMA_PERDEV_COHERENT
 	bool
-	select DMA_MAYBE_COHERENT
+	select DMA_NONCOHERENT
 
 config DMA_NONCOHERENT
 	bool
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index c71d1eb7da59..6d840a44fa36 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -1067,7 +1067,7 @@ static int __init debugfs_mips(void)
 arch_initcall(debugfs_mips);
 #endif
 
-#if defined(CONFIG_DMA_MAYBE_COHERENT) && !defined(CONFIG_DMA_PERDEV_COHERENT)
+#ifdef CONFIG_DMA_MAYBE_COHERENT
 /* User defined DMA coherency from command line. */
 enum coherent_io_user_state coherentio = IO_COHERENCE_DEFAULT;
 EXPORT_SYMBOL_GPL(coherentio);
diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
index a9ef057c79fe..05bd77727fb9 100644
--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -1955,22 +1955,21 @@ void r4k_cache_init(void)
 	__flush_icache_user_range	= r4k_flush_icache_user_range;
 	__local_flush_icache_user_range	= local_r4k_flush_icache_user_range;
 
-#if defined(CONFIG_DMA_NONCOHERENT) || defined(CONFIG_DMA_MAYBE_COHERENT)
-# if defined(CONFIG_DMA_PERDEV_COHERENT)
-	if (0) {
-# else
-	if ((coherentio == IO_COHERENCE_ENABLED) ||
-	    ((coherentio == IO_COHERENCE_DEFAULT) && hw_coherentio)) {
-# endif
+#ifdef CONFIG_DMA_NONCOHERENT
+#ifdef CONFIG_DMA_MAYBE_COHERENT
+	if (coherentio == IO_COHERENCE_ENABLED ||
+	    (coherentio == IO_COHERENCE_DEFAULT && hw_coherentio)) {
 		_dma_cache_wback_inv	= (void *)cache_noop;
 		_dma_cache_wback	= (void *)cache_noop;
 		_dma_cache_inv		= (void *)cache_noop;
-	} else {
+	} else
+#endif /* CONFIG_DMA_MAYBE_COHERENT */
+	{
 		_dma_cache_wback_inv	= r4k_dma_cache_wback_inv;
 		_dma_cache_wback	= r4k_dma_cache_wback_inv;
 		_dma_cache_inv		= r4k_dma_cache_inv;
 	}
-#endif
+#endif /* CONFIG_DMA_NONCOHERENT */
 
 	build_clear_page();
 	build_copy_page();
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/6] dma-mapping: move the dma_coherent flag to struct device
  2018-09-14  9:58 merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
  2018-09-14  9:58 ` [PATCH 1/6] dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration Christoph Hellwig
  2018-09-14  9:58 ` [PATCH 2/6] MIPS: don't select DMA_MAYBE_COHERENT from DMA_PERDEV_COHERENT Christoph Hellwig
@ 2018-09-14  9:58 ` Christoph Hellwig
  2018-09-14  9:58 ` [PATCH 4/6] dma-mapping: merge direct and noncoherent ops Christoph Hellwig
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-14  9:58 UTC (permalink / raw)
  To: iommu
  Cc: Marek Szyprowski, Robin Murphy, Paul Burton, Greg Kroah-Hartman,
	linux-mips, linux-kernel

Various architectures support both coherent and non-coherent dma on a
per-device basis.  Move the dma_noncoherent flag from the mips archdata
field to struct device proper to prepare the infrastructure for reuse on
other architectures.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Burton <paul.burton@mips.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/mips/Kconfig                     |  1 +
 arch/mips/include/asm/Kbuild          |  1 +
 arch/mips/include/asm/device.h        | 19 ----------------
 arch/mips/include/asm/dma-coherence.h |  6 +++++
 arch/mips/include/asm/dma-mapping.h   |  2 +-
 arch/mips/mm/dma-noncoherent.c        | 32 +++++----------------------
 include/linux/device.h                |  7 ++++++
 include/linux/dma-noncoherent.h       | 16 ++++++++++++++
 kernel/dma/Kconfig                    |  3 +++
 9 files changed, 41 insertions(+), 46 deletions(-)
 delete mode 100644 arch/mips/include/asm/device.h

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 0b25180028b8..54c52bd0d9d3 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1106,6 +1106,7 @@ config ARCH_SUPPORTS_UPROBES
 	bool
 
 config DMA_MAYBE_COHERENT
+	select ARCH_HAS_DMA_COHERENCE_H
 	select DMA_NONCOHERENT
 	bool
 
diff --git a/arch/mips/include/asm/Kbuild b/arch/mips/include/asm/Kbuild
index 58351e48421e..9a81e72119da 100644
--- a/arch/mips/include/asm/Kbuild
+++ b/arch/mips/include/asm/Kbuild
@@ -1,6 +1,7 @@
 # MIPS headers
 generic-(CONFIG_GENERIC_CSUM) += checksum.h
 generic-y += current.h
+generic-y += device.h
 generic-y += dma-contiguous.h
 generic-y += emergency-restart.h
 generic-y += export.h
diff --git a/arch/mips/include/asm/device.h b/arch/mips/include/asm/device.h
deleted file mode 100644
index 6aa796f1081a..000000000000
--- a/arch/mips/include/asm/device.h
+++ /dev/null
@@ -1,19 +0,0 @@
-/*
- * Arch specific extensions to struct device
- *
- * This file is released under the GPLv2
- */
-#ifndef _ASM_MIPS_DEVICE_H
-#define _ASM_MIPS_DEVICE_H
-
-struct dev_archdata {
-#ifdef CONFIG_DMA_PERDEV_COHERENT
-	/* Non-zero if DMA is coherent with CPU caches */
-	bool dma_coherent;
-#endif
-};
-
-struct pdev_archdata {
-};
-
-#endif /* _ASM_MIPS_DEVICE_H*/
diff --git a/arch/mips/include/asm/dma-coherence.h b/arch/mips/include/asm/dma-coherence.h
index 8eda48748ed5..5eaa1fcc878a 100644
--- a/arch/mips/include/asm/dma-coherence.h
+++ b/arch/mips/include/asm/dma-coherence.h
@@ -20,6 +20,12 @@ enum coherent_io_user_state {
 #elif defined(CONFIG_DMA_MAYBE_COHERENT)
 extern enum coherent_io_user_state coherentio;
 extern int hw_coherentio;
+
+static inline bool dev_is_dma_coherent(struct device *dev)
+{
+	return coherentio == IO_COHERENCE_ENABLED ||
+		(coherentio == IO_COHERENCE_DEFAULT && hw_coherentio);
+}
 #else
 #ifdef CONFIG_DMA_NONCOHERENT
 #define coherentio	IO_COHERENCE_DISABLED
diff --git a/arch/mips/include/asm/dma-mapping.h b/arch/mips/include/asm/dma-mapping.h
index e81c4e97ff1a..40d825c779de 100644
--- a/arch/mips/include/asm/dma-mapping.h
+++ b/arch/mips/include/asm/dma-mapping.h
@@ -25,7 +25,7 @@ static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base,
 				      bool coherent)
 {
 #ifdef CONFIG_DMA_PERDEV_COHERENT
-	dev->archdata.dma_coherent = coherent;
+	dev->dma_coherent = coherent;
 #endif
 }
 
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index 2aca1236af36..d408ac51f56c 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -14,26 +14,6 @@
 #include <asm/dma-coherence.h>
 #include <asm/io.h>
 
-#ifdef CONFIG_DMA_PERDEV_COHERENT
-static inline int dev_is_coherent(struct device *dev)
-{
-	return dev->archdata.dma_coherent;
-}
-#else
-static inline int dev_is_coherent(struct device *dev)
-{
-	switch (coherentio) {
-	default:
-	case IO_COHERENCE_DEFAULT:
-		return hw_coherentio;
-	case IO_COHERENCE_ENABLED:
-		return 1;
-	case IO_COHERENCE_DISABLED:
-		return 0;
-	}
-}
-#endif /* CONFIG_DMA_PERDEV_COHERENT */
-
 /*
  * The affected CPUs below in 'cpu_needs_post_dma_flush()' can speculatively
  * fill random cachelines with stale data at any time, requiring an extra
@@ -49,7 +29,7 @@ static inline int dev_is_coherent(struct device *dev)
  */
 static inline bool cpu_needs_post_dma_flush(struct device *dev)
 {
-	if (dev_is_coherent(dev))
+	if (dev_is_dma_coherent(dev))
 		return false;
 
 	switch (boot_cpu_type()) {
@@ -76,7 +56,7 @@ void *arch_dma_alloc(struct device *dev, size_t size,
 	if (!ret)
 		return NULL;
 
-	if (!dev_is_coherent(dev) && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
+	if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
 		dma_cache_wback_inv((unsigned long) ret, size);
 		ret = (void *)UNCAC_ADDR(ret);
 	}
@@ -87,7 +67,7 @@ void *arch_dma_alloc(struct device *dev, size_t size,
 void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		dma_addr_t dma_addr, unsigned long attrs)
 {
-	if (!(attrs & DMA_ATTR_NON_CONSISTENT) && !dev_is_coherent(dev))
+	if (!(attrs & DMA_ATTR_NON_CONSISTENT) && !dev_is_dma_coherent(dev))
 		cpu_addr = (void *)CAC_ADDR((unsigned long)cpu_addr);
 	dma_direct_free(dev, size, cpu_addr, dma_addr, attrs);
 }
@@ -103,7 +83,7 @@ int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	unsigned long pfn;
 	int ret = -ENXIO;
 
-	if (!dev_is_coherent(dev))
+	if (!dev_is_dma_coherent(dev))
 		addr = CAC_ADDR(addr);
 
 	pfn = page_to_pfn(virt_to_page((void *)addr));
@@ -187,7 +167,7 @@ static inline void dma_sync_phys(phys_addr_t paddr, size_t size,
 void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
 		size_t size, enum dma_data_direction dir)
 {
-	if (!dev_is_coherent(dev))
+	if (!dev_is_dma_coherent(dev))
 		dma_sync_phys(paddr, size, dir);
 }
 
@@ -203,6 +183,6 @@ void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
 {
 	BUG_ON(direction == DMA_NONE);
 
-	if (!dev_is_coherent(dev))
+	if (!dev_is_dma_coherent(dev))
 		dma_sync_virt(vaddr, size, direction);
 }
diff --git a/include/linux/device.h b/include/linux/device.h
index 8f882549edee..983506789402 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -927,6 +927,8 @@ struct dev_links_info {
  * @offline:	Set after successful invocation of bus type's .offline().
  * @of_node_reused: Set if the device-tree node is shared with an ancestor
  *              device.
+ * @dma_coherent: this particular device is dma coherent, even if the
+ *		architecture supports non-coherent devices.
  *
  * At the lowest level, every device in a Linux system is represented by an
  * instance of struct device. The device structure contains the information
@@ -1016,6 +1018,11 @@ struct device {
 	bool			offline_disabled:1;
 	bool			offline:1;
 	bool			of_node_reused:1;
+#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
+    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
+    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
+	bool			dma_coherent:1;
+#endif
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)
diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h
index a0aa00cc909d..ce9732506ef4 100644
--- a/include/linux/dma-noncoherent.h
+++ b/include/linux/dma-noncoherent.h
@@ -4,6 +4,22 @@
 
 #include <linux/dma-mapping.h>
 
+#ifdef CONFIG_ARCH_HAS_DMA_COHERENCE_H
+#include <asm/dma-coherence.h>
+#elif defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
+	defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
+	defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
+static inline bool dev_is_dma_coherent(struct device *dev)
+{
+	return dev->dma_coherent;
+}
+#else
+static inline bool dev_is_dma_coherent(struct device *dev)
+{
+	return true;
+}
+#endif /* CONFIG_ARCH_HAS_DMA_COHERENCE_H */
+
 void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t gfp, unsigned long attrs);
 void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 1b1d63b3634b..79476749f196 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -13,6 +13,9 @@ config NEED_DMA_MAP_STATE
 config ARCH_DMA_ADDR_T_64BIT
 	def_bool 64BIT || PHYS_ADDR_T_64BIT
 
+config ARCH_HAS_DMA_COHERENCE_H
+	bool
+
 config HAVE_GENERIC_DMA_COHERENT
 	bool
 
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/6] dma-mapping: merge direct and noncoherent ops
  2018-09-14  9:58 merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2018-09-14  9:58 ` [PATCH 3/6] dma-mapping: move the dma_coherent flag to struct device Christoph Hellwig
@ 2018-09-14  9:58 ` Christoph Hellwig
  2018-10-31 14:24   ` Maciej W. Rozycki
  2018-09-14  9:58 ` [PATCH 5/6] dma-mapping: consolidate the dma mmap implementations Christoph Hellwig
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-14  9:58 UTC (permalink / raw)
  To: iommu
  Cc: Marek Szyprowski, Robin Murphy, Paul Burton, Greg Kroah-Hartman,
	linux-mips, linux-kernel

All the cache maintainance is already stubbed out when not enabled,
but merging the two allows us to nicely handle the case where
cache maintainance is required for some devices, but not others.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts
---
 arch/arc/Kconfig                     |   2 +-
 arch/arc/mm/dma.c                    |  16 ++--
 arch/arm/mm/dma-mapping-nommu.c      |   5 +-
 arch/c6x/Kconfig                     |   2 +-
 arch/hexagon/Kconfig                 |   2 +-
 arch/m68k/Kconfig                    |   2 +-
 arch/microblaze/Kconfig              |   2 +-
 arch/mips/Kconfig                    |   1 -
 arch/mips/include/asm/dma-mapping.h  |   2 -
 arch/mips/jazz/jazzdma.c             |   6 +-
 arch/mips/mm/dma-noncoherent.c       |  29 ++-----
 arch/nds32/Kconfig                   |   2 +-
 arch/nios2/Kconfig                   |   2 +-
 arch/openrisc/Kconfig                |   2 +-
 arch/parisc/Kconfig                  |   2 +-
 arch/parisc/kernel/setup.c           |   2 +-
 arch/sh/Kconfig                      |   3 +-
 arch/sparc/Kconfig                   |   2 +-
 arch/sparc/include/asm/dma-mapping.h |   4 +-
 arch/x86/kernel/amd_gart_64.c        |   6 +-
 arch/xtensa/Kconfig                  |   2 +-
 include/asm-generic/dma-mapping.h    |   9 --
 include/linux/dma-direct.h           |   4 +
 include/linux/dma-mapping.h          |   1 -
 include/linux/dma-noncoherent.h      |   5 --
 kernel/dma/Kconfig                   |   9 +-
 kernel/dma/Makefile                  |   1 -
 kernel/dma/direct.c                  | 121 +++++++++++++++++++++++++--
 kernel/dma/noncoherent.c             | 106 -----------------------
 29 files changed, 160 insertions(+), 192 deletions(-)
 delete mode 100644 kernel/dma/noncoherent.c

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index b4441b0764d7..ca03694d518a 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -17,7 +17,7 @@ config ARC
 	select BUILDTIME_EXTABLE_SORT
 	select CLONE_BACKWARDS
 	select COMMON_CLK
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	select DMA_NONCOHERENT_MMAP
 	select GENERIC_ATOMIC64 if !ISA_ARCV2 || !(ARC_HAS_LL64 && ARC_HAS_LLSC)
 	select GENERIC_CLOCKEVENTS
diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index c75d5c3470e3..535ed4a068ef 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -167,7 +167,7 @@ void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
 }
 
 /*
- * Plug in coherent or noncoherent dma ops
+ * Plug in direct dma map ops.
  */
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 			const struct iommu_ops *iommu, bool coherent)
@@ -175,13 +175,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 	/*
 	 * IOC hardware snoops all DMA traffic keeping the caches consistent
 	 * with memory - eliding need for any explicit cache maintenance of
-	 * DMA buffers - so we can use dma_direct cache ops.
+	 * DMA buffers.
 	 */
-	if (is_isa_arcv2() && ioc_enable && coherent) {
-		set_dma_ops(dev, &dma_direct_ops);
-		dev_info(dev, "use dma_direct_ops cache ops\n");
-	} else {
-		set_dma_ops(dev, &dma_noncoherent_ops);
-		dev_info(dev, "use dma_noncoherent_ops cache ops\n");
-	}
+	if (is_isa_arcv2() && ioc_enable && coherent)
+		dev->dma_coherent = true;
+
+	dev_info(dev, "use %sncoherent DMA ops\n",
+		 dev->dma_coherent ? "" : "non");
 }
diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
index aa7aba302e76..0ad156f9985b 100644
--- a/arch/arm/mm/dma-mapping-nommu.c
+++ b/arch/arm/mm/dma-mapping-nommu.c
@@ -47,7 +47,8 @@ static void *arm_nommu_dma_alloc(struct device *dev, size_t size,
 	 */
 
 	if (attrs & DMA_ATTR_NON_CONSISTENT)
-		return dma_direct_alloc(dev, size, dma_handle, gfp, attrs);
+		return dma_direct_alloc_pages(dev, size, dma_handle, gfp,
+				attrs);
 
 	ret = dma_alloc_from_global_coherent(size, dma_handle);
 
@@ -70,7 +71,7 @@ static void arm_nommu_dma_free(struct device *dev, size_t size,
 			       unsigned long attrs)
 {
 	if (attrs & DMA_ATTR_NON_CONSISTENT) {
-		dma_direct_free(dev, size, cpu_addr, dma_addr, attrs);
+		dma_direct_free_pages(dev, size, cpu_addr, dma_addr, attrs);
 	} else {
 		int ret = dma_release_from_global_coherent(get_order(size),
 							   cpu_addr);
diff --git a/arch/c6x/Kconfig b/arch/c6x/Kconfig
index a641b0bf1611..f65a084607fd 100644
--- a/arch/c6x/Kconfig
+++ b/arch/c6x/Kconfig
@@ -9,7 +9,7 @@ config C6X
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select CLKDEV_LOOKUP
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	select GENERIC_ATOMIC64
 	select GENERIC_IRQ_SHOW
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 6cee842a9b44..3ef46522e89f 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -30,7 +30,7 @@ config HEXAGON
 	select GENERIC_CLOCKEVENTS_BROADCAST
 	select MODULES_USE_ELF_RELA
 	select GENERIC_CPU_DEVICES
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	---help---
 	  Qualcomm Hexagon is a processor architecture designed for high
 	  performance and low power across a wide variety of applications.
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 070553791e97..c7b2a8d60a41 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -26,7 +26,7 @@ config M68K
 	select MODULES_USE_ELF_RELA
 	select OLD_SIGSUSPEND3
 	select OLD_SIGACTION
-	select DMA_NONCOHERENT_OPS if HAS_DMA
+	select DMA_DIRECT_OPS if HAS_DMA
 	select HAVE_MEMBLOCK
 	select ARCH_DISCARD_MEMBLOCK
 	select NO_BOOTMEM
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index ace5c5bf1836..0f48ab6a8070 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -11,7 +11,7 @@ config MICROBLAZE
 	select TIMER_OF
 	select CLONE_BACKWARDS3
 	select COMMON_CLK
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	select DMA_NONCOHERENT_MMAP
 	select GENERIC_ATOMIC64
 	select GENERIC_CLOCKEVENTS
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 54c52bd0d9d3..96da6e3396e1 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1121,7 +1121,6 @@ config DMA_NONCOHERENT
 	select NEED_DMA_MAP_STATE
 	select DMA_NONCOHERENT_MMAP
 	select DMA_NONCOHERENT_CACHE_SYNC
-	select DMA_NONCOHERENT_OPS
 
 config SYS_HAS_EARLY_PRINTK
 	bool
diff --git a/arch/mips/include/asm/dma-mapping.h b/arch/mips/include/asm/dma-mapping.h
index 40d825c779de..b4c477eb46ce 100644
--- a/arch/mips/include/asm/dma-mapping.h
+++ b/arch/mips/include/asm/dma-mapping.h
@@ -12,8 +12,6 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 	return &jazz_dma_ops;
 #elif defined(CONFIG_SWIOTLB)
 	return &swiotlb_dma_ops;
-#elif defined(CONFIG_DMA_NONCOHERENT_OPS)
-	return &dma_noncoherent_ops;
 #else
 	return &dma_direct_ops;
 #endif
diff --git a/arch/mips/jazz/jazzdma.c b/arch/mips/jazz/jazzdma.c
index d31bc2f01208..bb49dfa1a9a3 100644
--- a/arch/mips/jazz/jazzdma.c
+++ b/arch/mips/jazz/jazzdma.c
@@ -564,13 +564,13 @@ static void *jazz_dma_alloc(struct device *dev, size_t size,
 {
 	void *ret;
 
-	ret = dma_direct_alloc(dev, size, dma_handle, gfp, attrs);
+	ret = dma_direct_alloc_pages(dev, size, dma_handle, gfp, attrs);
 	if (!ret)
 		return NULL;
 
 	*dma_handle = vdma_alloc(virt_to_phys(ret), size);
 	if (*dma_handle == VDMA_ERROR) {
-		dma_direct_free(dev, size, ret, *dma_handle, attrs);
+		dma_direct_free_pages(dev, size, ret, *dma_handle, attrs);
 		return NULL;
 	}
 
@@ -587,7 +587,7 @@ static void jazz_dma_free(struct device *dev, size_t size, void *vaddr,
 	vdma_free(dma_handle);
 	if (!(attrs & DMA_ATTR_NON_CONSISTENT))
 		vaddr = (void *)CAC_ADDR((unsigned long)vaddr);
-	return dma_direct_free(dev, size, vaddr, dma_handle, attrs);
+	dma_direct_free_pages(dev, size, vaddr, dma_handle, attrs);
 }
 
 static dma_addr_t jazz_dma_map_page(struct device *dev, struct page *page,
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index d408ac51f56c..b01b9a3e424f 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -29,9 +29,6 @@
  */
 static inline bool cpu_needs_post_dma_flush(struct device *dev)
 {
-	if (dev_is_dma_coherent(dev))
-		return false;
-
 	switch (boot_cpu_type()) {
 	case CPU_R10000:
 	case CPU_R12000:
@@ -52,11 +49,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
 {
 	void *ret;
 
-	ret = dma_direct_alloc(dev, size, dma_handle, gfp, attrs);
-	if (!ret)
-		return NULL;
-
-	if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
+	ret = dma_direct_alloc_pages(dev, size, dma_handle, gfp, attrs);
+	if (!ret && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
 		dma_cache_wback_inv((unsigned long) ret, size);
 		ret = (void *)UNCAC_ADDR(ret);
 	}
@@ -67,9 +61,9 @@ void *arch_dma_alloc(struct device *dev, size_t size,
 void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		dma_addr_t dma_addr, unsigned long attrs)
 {
-	if (!(attrs & DMA_ATTR_NON_CONSISTENT) && !dev_is_dma_coherent(dev))
+	if (!(attrs & DMA_ATTR_NON_CONSISTENT))
 		cpu_addr = (void *)CAC_ADDR((unsigned long)cpu_addr);
-	dma_direct_free(dev, size, cpu_addr, dma_addr, attrs);
+	dma_direct_free_pages(dev, size, cpu_addr, dma_addr, attrs);
 }
 
 int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
@@ -78,16 +72,11 @@ int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 {
 	unsigned long user_count = vma_pages(vma);
 	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	unsigned long addr = (unsigned long)cpu_addr;
+	unsigned long addr = CAC_ADDR((unsigned long)cpu_addr);
 	unsigned long off = vma->vm_pgoff;
-	unsigned long pfn;
+	unsigned long pfn = page_to_pfn(virt_to_page((void *)addr));
 	int ret = -ENXIO;
 
-	if (!dev_is_dma_coherent(dev))
-		addr = CAC_ADDR(addr);
-
-	pfn = page_to_pfn(virt_to_page((void *)addr));
-
 	if (attrs & DMA_ATTR_WRITE_COMBINE)
 		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
 	else
@@ -167,8 +156,7 @@ static inline void dma_sync_phys(phys_addr_t paddr, size_t size,
 void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
 		size_t size, enum dma_data_direction dir)
 {
-	if (!dev_is_dma_coherent(dev))
-		dma_sync_phys(paddr, size, dir);
+	dma_sync_phys(paddr, size, dir);
 }
 
 void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
@@ -183,6 +171,5 @@ void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
 {
 	BUG_ON(direction == DMA_NONE);
 
-	if (!dev_is_dma_coherent(dev))
-		dma_sync_virt(vaddr, size, direction);
+	dma_sync_virt(vaddr, size, direction);
 }
diff --git a/arch/nds32/Kconfig b/arch/nds32/Kconfig
index 7068f341133d..56992330026a 100644
--- a/arch/nds32/Kconfig
+++ b/arch/nds32/Kconfig
@@ -11,7 +11,7 @@ config NDS32
 	select CLKSRC_MMIO
 	select CLONE_BACKWARDS
 	select COMMON_CLK
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	select GENERIC_ATOMIC64
 	select GENERIC_CPU_DEVICES
 	select GENERIC_CLOCKEVENTS
diff --git a/arch/nios2/Kconfig b/arch/nios2/Kconfig
index f4ad1138e6b9..03965692fbfe 100644
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -4,7 +4,7 @@ config NIOS2
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select ARCH_NO_SWAP
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	select TIMER_OF
 	select GENERIC_ATOMIC64
 	select GENERIC_CLOCKEVENTS
diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index e0081e734827..a655ae280637 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -7,7 +7,7 @@
 config OPENRISC
 	def_bool y
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	select OF
 	select OF_EARLY_FLATTREE
 	select IRQ_DOMAIN
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 8e6d83f79e72..f1cd12afd943 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -186,7 +186,7 @@ config PA11
 	depends on PA7000 || PA7100LC || PA7200 || PA7300LC
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	select DMA_NONCOHERENT_CACHE_SYNC
 
 config PREFETCH
diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c
index 4e87c35c22b7..755e89ec828a 100644
--- a/arch/parisc/kernel/setup.c
+++ b/arch/parisc/kernel/setup.c
@@ -102,7 +102,7 @@ void __init dma_ops_init(void)
 	case pcxl: /* falls through */
 	case pcxs:
 	case pcxt:
-		hppa_dma_ops = &dma_noncoherent_ops;
+		hppa_dma_ops = &dma_direct_ops;
 		break;
 	default:
 		break;
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 1fb7b6d72baf..475d786a65b0 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -7,6 +7,7 @@ config SUPERH
 	select ARCH_NO_COHERENT_DMA_MMAP if !MMU
 	select HAVE_PATA_PLATFORM
 	select CLKDEV_LOOKUP
+	select DMA_DIRECT_OPS
 	select HAVE_IDE if HAS_IOPORT_MAP
 	select HAVE_MEMBLOCK
 	select HAVE_MEMBLOCK_NODE_MAP
@@ -158,13 +159,11 @@ config SWAP_IO_SPACE
 	bool
 
 config DMA_COHERENT
-	select DMA_DIRECT_OPS
 	bool
 
 config DMA_NONCOHERENT
 	def_bool !DMA_COHERENT
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
-	select DMA_NONCOHERENT_OPS
 
 config PGTABLE_LEVELS
 	default 3 if X2TLB
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index e6f2a38d2e61..7e2aa59fcc29 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -51,7 +51,7 @@ config SPARC
 config SPARC32
 	def_bool !64BIT
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	select GENERIC_ATOMIC64
 	select CLZ_TAB
 	select HAVE_UID16
diff --git a/arch/sparc/include/asm/dma-mapping.h b/arch/sparc/include/asm/dma-mapping.h
index e17566376934..b0bb2fcaf1c9 100644
--- a/arch/sparc/include/asm/dma-mapping.h
+++ b/arch/sparc/include/asm/dma-mapping.h
@@ -14,11 +14,11 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
 #ifdef CONFIG_SPARC_LEON
 	if (sparc_cpu_model == sparc_leon)
-		return &dma_noncoherent_ops;
+		return &dma_direct_ops;
 #endif
 #if defined(CONFIG_SPARC32) && defined(CONFIG_PCI)
 	if (bus == &pci_bus_type)
-		return &dma_noncoherent_ops;
+		return &dma_direct_ops;
 #endif
 	return dma_ops;
 }
diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index f299d8a479bb..3f9d1b4019bb 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -482,7 +482,7 @@ gart_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_addr,
 {
 	void *vaddr;
 
-	vaddr = dma_direct_alloc(dev, size, dma_addr, flag, attrs);
+	vaddr = dma_direct_alloc_pages(dev, size, dma_addr, flag, attrs);
 	if (!vaddr ||
 	    !force_iommu || dev->coherent_dma_mask <= DMA_BIT_MASK(24))
 		return vaddr;
@@ -494,7 +494,7 @@ gart_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_addr,
 		goto out_free;
 	return vaddr;
 out_free:
-	dma_direct_free(dev, size, vaddr, *dma_addr, attrs);
+	dma_direct_free_pages(dev, size, vaddr, *dma_addr, attrs);
 	return NULL;
 }
 
@@ -504,7 +504,7 @@ gart_free_coherent(struct device *dev, size_t size, void *vaddr,
 		   dma_addr_t dma_addr, unsigned long attrs)
 {
 	gart_unmap_page(dev, dma_addr, size, DMA_BIDIRECTIONAL, 0);
-	dma_direct_free(dev, size, vaddr, dma_addr, attrs);
+	dma_direct_free_pages(dev, size, vaddr, dma_addr, attrs);
 }
 
 static int gart_mapping_error(struct device *dev, dma_addr_t dma_addr)
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index 04d038f3b6fa..516694937b7a 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -12,7 +12,7 @@ config XTENSA
 	select BUILDTIME_EXTABLE_SORT
 	select CLONE_BACKWARDS
 	select COMMON_CLK
-	select DMA_NONCOHERENT_OPS
+	select DMA_DIRECT_OPS
 	select GENERIC_ATOMIC64
 	select GENERIC_CLOCKEVENTS
 	select GENERIC_IRQ_SHOW
diff --git a/include/asm-generic/dma-mapping.h b/include/asm-generic/dma-mapping.h
index ad2868263867..880a292d792f 100644
--- a/include/asm-generic/dma-mapping.h
+++ b/include/asm-generic/dma-mapping.h
@@ -4,16 +4,7 @@
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
-	/*
-	 * Use the non-coherent ops if available.  If an architecture wants a
-	 * more fine-grained selection of operations it will have to implement
-	 * get_arch_dma_ops itself or use the per-device dma_ops.
-	 */
-#ifdef CONFIG_DMA_NONCOHERENT_OPS
-	return &dma_noncoherent_ops;
-#else
 	return &dma_direct_ops;
-#endif
 }
 
 #endif /* _ASM_GENERIC_DMA_MAPPING_H */
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 8d9f33febde5..86a59ba5a7f3 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -59,6 +59,10 @@ void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t gfp, unsigned long attrs);
 void dma_direct_free(struct device *dev, size_t size, void *cpu_addr,
 		dma_addr_t dma_addr, unsigned long attrs);
+void *dma_direct_alloc_pages(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs);
+void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,
+		dma_addr_t dma_addr, unsigned long attrs);
 dma_addr_t dma_direct_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, enum dma_data_direction dir,
 		unsigned long attrs);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index eafd6f318e78..8f2001181cd1 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -136,7 +136,6 @@ struct dma_map_ops {
 };
 
 extern const struct dma_map_ops dma_direct_ops;
-extern const struct dma_map_ops dma_noncoherent_ops;
 extern const struct dma_map_ops dma_virt_ops;
 
 #define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h
index ce9732506ef4..3f503025a0cd 100644
--- a/include/linux/dma-noncoherent.h
+++ b/include/linux/dma-noncoherent.h
@@ -24,14 +24,9 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t gfp, unsigned long attrs);
 void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		dma_addr_t dma_addr, unsigned long attrs);
-
-#ifdef CONFIG_DMA_NONCOHERENT_MMAP
 int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs);
-#else
-#define arch_dma_mmap NULL
-#endif /* CONFIG_DMA_NONCOHERENT_MMAP */
 
 #ifdef CONFIG_DMA_NONCOHERENT_CACHE_SYNC
 void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 79476749f196..5617c9a76208 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -33,18 +33,13 @@ config DMA_DIRECT_OPS
 	bool
 	depends on HAS_DMA
 
-config DMA_NONCOHERENT_OPS
-	bool
-	depends on HAS_DMA
-	select DMA_DIRECT_OPS
-
 config DMA_NONCOHERENT_MMAP
 	bool
-	depends on DMA_NONCOHERENT_OPS
+	depends on DMA_DIRECT_OPS
 
 config DMA_NONCOHERENT_CACHE_SYNC
 	bool
-	depends on DMA_NONCOHERENT_OPS
+	depends on DMA_DIRECT_OPS
 
 config DMA_VIRT_OPS
 	bool
diff --git a/kernel/dma/Makefile b/kernel/dma/Makefile
index 6de44e4eb454..7d581e4eea4a 100644
--- a/kernel/dma/Makefile
+++ b/kernel/dma/Makefile
@@ -4,7 +4,6 @@ obj-$(CONFIG_HAS_DMA)			+= mapping.o
 obj-$(CONFIG_DMA_CMA)			+= contiguous.o
 obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += coherent.o
 obj-$(CONFIG_DMA_DIRECT_OPS)		+= direct.o
-obj-$(CONFIG_DMA_NONCOHERENT_OPS)	+= noncoherent.o
 obj-$(CONFIG_DMA_VIRT_OPS)		+= virt.o
 obj-$(CONFIG_DMA_API_DEBUG)		+= debug.o
 obj-$(CONFIG_SWIOTLB)			+= swiotlb.o
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index de87b0282e74..09e85f6aa4ba 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -1,13 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * DMA operations that map physical memory directly without using an IOMMU or
- * flushing caches.
+ * Copyright (C) 2018 Christoph Hellwig.
+ *
+ * DMA operations that map physical memory directly without using an IOMMU.
  */
 #include <linux/export.h>
 #include <linux/mm.h>
 #include <linux/dma-direct.h>
 #include <linux/scatterlist.h>
 #include <linux/dma-contiguous.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/pfn.h>
 #include <linux/set_memory.h>
 
@@ -58,8 +60,8 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
 	return addr + size - 1 <= dev->coherent_dma_mask;
 }
 
-void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
-		gfp_t gfp, unsigned long attrs)
+void *dma_direct_alloc_pages(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
 {
 	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	int page_order = get_order(size);
@@ -124,7 +126,7 @@ void *dma_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
  * NOTE: this function must never look at the dma_addr argument, because we want
  * to be able to use it as a helper for iommu implementations as well.
  */
-void dma_direct_free(struct device *dev, size_t size, void *cpu_addr,
+void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,
 		dma_addr_t dma_addr, unsigned long attrs)
 {
 	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
@@ -136,14 +138,106 @@ void dma_direct_free(struct device *dev, size_t size, void *cpu_addr,
 		free_pages((unsigned long)cpu_addr, page_order);
 }
 
+void *dma_direct_alloc(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+{
+	if (!dev_is_dma_coherent(dev))
+		return arch_dma_alloc(dev, size, dma_handle, gfp, attrs);
+	return dma_direct_alloc_pages(dev, size, dma_handle, gfp, attrs);
+}
+
+void dma_direct_free(struct device *dev, size_t size,
+		void *cpu_addr, dma_addr_t dma_addr, unsigned long attrs)
+{
+	if (!dev_is_dma_coherent(dev))
+		arch_dma_free(dev, size, cpu_addr, dma_addr, attrs);
+	else
+		dma_direct_free_pages(dev, size, cpu_addr, dma_addr, attrs);
+}
+
+static int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		unsigned long attrs)
+{
+	if (!dev_is_dma_coherent(dev) &&
+	    IS_ENABLED(CONFIG_DMA_NONCOHERENT_MMAP))
+		return arch_dma_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
+	return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
+}
+
+static void dma_direct_sync_single_for_device(struct device *dev,
+		dma_addr_t addr, size_t size, enum dma_data_direction dir)
+{
+	if (dev_is_dma_coherent(dev))
+		return;
+	arch_sync_dma_for_device(dev, dma_to_phys(dev, addr), size, dir);
+}
+
+static void dma_direct_sync_sg_for_device(struct device *dev,
+		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	for_each_sg(sgl, sg, nents, i)
+		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
+}
+
+#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
+    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
+static void dma_direct_sync_single_for_cpu(struct device *dev,
+		dma_addr_t addr, size_t size, enum dma_data_direction dir)
+{
+	if (dev_is_dma_coherent(dev))
+		return;
+	arch_sync_dma_for_cpu(dev, dma_to_phys(dev, addr), size, dir);
+	arch_sync_dma_for_cpu_all(dev);
+}
+
+static void dma_direct_sync_sg_for_cpu(struct device *dev,
+		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (dev_is_dma_coherent(dev))
+		return;
+
+	for_each_sg(sgl, sg, nents, i)
+		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
+	arch_sync_dma_for_cpu_all(dev);
+}
+
+static void dma_direct_unmap_page(struct device *dev, dma_addr_t addr,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_direct_sync_single_for_cpu(dev, addr, size, dir);
+}
+
+static void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
+		int nents, enum dma_data_direction dir, unsigned long attrs)
+{
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_direct_sync_sg_for_cpu(dev, sgl, nents, dir);
+}
+#endif
+
 dma_addr_t dma_direct_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, enum dma_data_direction dir,
 		unsigned long attrs)
 {
-	dma_addr_t dma_addr = phys_to_dma(dev, page_to_phys(page)) + offset;
+	phys_addr_t phys = page_to_phys(page) + offset;
+	dma_addr_t dma_addr = phys_to_dma(dev, phys);
 
 	if (!check_addr(dev, dma_addr, size, __func__))
 		return DIRECT_MAPPING_ERROR;
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_direct_sync_single_for_device(dev, dma_addr, size, dir);
 	return dma_addr;
 }
 
@@ -162,6 +256,8 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
 		sg_dma_len(sg) = sg->length;
 	}
 
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		dma_direct_sync_sg_for_device(dev, sgl, nents, dir);
 	return nents;
 }
 
@@ -197,9 +293,22 @@ int dma_direct_mapping_error(struct device *dev, dma_addr_t dma_addr)
 const struct dma_map_ops dma_direct_ops = {
 	.alloc			= dma_direct_alloc,
 	.free			= dma_direct_free,
+	.mmap			= dma_direct_mmap,
 	.map_page		= dma_direct_map_page,
 	.map_sg			= dma_direct_map_sg,
+#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE)
+	.sync_single_for_device	= dma_direct_sync_single_for_device,
+	.sync_sg_for_device	= dma_direct_sync_sg_for_device,
+#endif
+#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
+    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
+	.sync_single_for_cpu	= dma_direct_sync_single_for_cpu,
+	.sync_sg_for_cpu	= dma_direct_sync_sg_for_cpu,
+	.unmap_page		= dma_direct_unmap_page,
+	.unmap_sg		= dma_direct_unmap_sg,
+#endif
 	.dma_supported		= dma_direct_supported,
 	.mapping_error		= dma_direct_mapping_error,
+	.cache_sync		= arch_dma_cache_sync,
 };
 EXPORT_SYMBOL(dma_direct_ops);
diff --git a/kernel/dma/noncoherent.c b/kernel/dma/noncoherent.c
deleted file mode 100644
index 031fe235d958..000000000000
--- a/kernel/dma/noncoherent.c
+++ /dev/null
@@ -1,106 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2018 Christoph Hellwig.
- *
- * DMA operations that map physical memory directly without providing cache
- * coherence.
- */
-#include <linux/export.h>
-#include <linux/mm.h>
-#include <linux/dma-direct.h>
-#include <linux/dma-noncoherent.h>
-#include <linux/scatterlist.h>
-
-static void dma_noncoherent_sync_single_for_device(struct device *dev,
-		dma_addr_t addr, size_t size, enum dma_data_direction dir)
-{
-	arch_sync_dma_for_device(dev, dma_to_phys(dev, addr), size, dir);
-}
-
-static void dma_noncoherent_sync_sg_for_device(struct device *dev,
-		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int i;
-
-	for_each_sg(sgl, sg, nents, i)
-		arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, dir);
-}
-
-static dma_addr_t dma_noncoherent_map_page(struct device *dev, struct page *page,
-		unsigned long offset, size_t size, enum dma_data_direction dir,
-		unsigned long attrs)
-{
-	dma_addr_t addr;
-
-	addr = dma_direct_map_page(dev, page, offset, size, dir, attrs);
-	if (!dma_mapping_error(dev, addr) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		arch_sync_dma_for_device(dev, page_to_phys(page) + offset,
-				size, dir);
-	return addr;
-}
-
-static int dma_noncoherent_map_sg(struct device *dev, struct scatterlist *sgl,
-		int nents, enum dma_data_direction dir, unsigned long attrs)
-{
-	nents = dma_direct_map_sg(dev, sgl, nents, dir, attrs);
-	if (nents > 0 && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		dma_noncoherent_sync_sg_for_device(dev, sgl, nents, dir);
-	return nents;
-}
-
-#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
-    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
-static void dma_noncoherent_sync_single_for_cpu(struct device *dev,
-		dma_addr_t addr, size_t size, enum dma_data_direction dir)
-{
-	arch_sync_dma_for_cpu(dev, dma_to_phys(dev, addr), size, dir);
-	arch_sync_dma_for_cpu_all(dev);
-}
-
-static void dma_noncoherent_sync_sg_for_cpu(struct device *dev,
-		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int i;
-
-	for_each_sg(sgl, sg, nents, i)
-		arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir);
-	arch_sync_dma_for_cpu_all(dev);
-}
-
-static void dma_noncoherent_unmap_page(struct device *dev, dma_addr_t addr,
-		size_t size, enum dma_data_direction dir, unsigned long attrs)
-{
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		dma_noncoherent_sync_single_for_cpu(dev, addr, size, dir);
-}
-
-static void dma_noncoherent_unmap_sg(struct device *dev, struct scatterlist *sgl,
-		int nents, enum dma_data_direction dir, unsigned long attrs)
-{
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		dma_noncoherent_sync_sg_for_cpu(dev, sgl, nents, dir);
-}
-#endif
-
-const struct dma_map_ops dma_noncoherent_ops = {
-	.alloc			= arch_dma_alloc,
-	.free			= arch_dma_free,
-	.mmap			= arch_dma_mmap,
-	.sync_single_for_device	= dma_noncoherent_sync_single_for_device,
-	.sync_sg_for_device	= dma_noncoherent_sync_sg_for_device,
-	.map_page		= dma_noncoherent_map_page,
-	.map_sg			= dma_noncoherent_map_sg,
-#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
-    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
-	.sync_single_for_cpu	= dma_noncoherent_sync_single_for_cpu,
-	.sync_sg_for_cpu	= dma_noncoherent_sync_sg_for_cpu,
-	.unmap_page		= dma_noncoherent_unmap_page,
-	.unmap_sg		= dma_noncoherent_unmap_sg,
-#endif
-	.dma_supported		= dma_direct_supported,
-	.mapping_error		= dma_direct_mapping_error,
-	.cache_sync		= arch_dma_cache_sync,
-};
-EXPORT_SYMBOL(dma_noncoherent_ops);
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/6] dma-mapping: consolidate the dma mmap implementations
  2018-09-14  9:58 merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2018-09-14  9:58 ` [PATCH 4/6] dma-mapping: merge direct and noncoherent ops Christoph Hellwig
@ 2018-09-14  9:58 ` Christoph Hellwig
  2018-09-14  9:58 ` [PATCH 6/6] dma-mapping: support non-coherent devices in dma_common_get_sgtable Christoph Hellwig
  2018-09-20  7:02 ` merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
  6 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-14  9:58 UTC (permalink / raw)
  To: iommu
  Cc: Marek Szyprowski, Robin Murphy, Paul Burton, Greg Kroah-Hartman,
	linux-mips, linux-kernel

The only functional differences (modulo a few missing fixes in the arch
code) is that architectures without coherent caches need a hook to
convert a virtual or dma address into a pfn, given that we don't have
the kernel linear mapping available for the otherwise easy virt_to_page
call.  As a side effect we can support mmap of the per-device coherent
area even on architectures not providing the callback, and we make
previous dangerous default methods dma_common_mmap actually save for
non-coherent architectures by rejecting it without the right helper.

In addition to that we need a hook so that some architectures can
override the protection bits when mmaping a dma coherent allocations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts
---
 arch/arc/Kconfig                      |  2 +-
 arch/arc/mm/dma.c                     | 25 +++------------------
 arch/arm/mm/dma-mapping-nommu.c       |  2 +-
 arch/microblaze/Kconfig               |  2 +-
 arch/microblaze/include/asm/pgtable.h |  2 --
 arch/microblaze/kernel/dma.c          | 22 ------------------
 arch/microblaze/mm/consistent.c       |  3 ++-
 arch/mips/Kconfig                     |  3 ++-
 arch/mips/jazz/jazzdma.c              |  1 -
 arch/mips/mm/dma-noncoherent.c        | 32 ++++++++-------------------
 drivers/xen/swiotlb-xen.c             |  2 +-
 include/linux/dma-mapping.h           |  5 +++--
 include/linux/dma-noncoherent.h       | 10 +++++++--
 kernel/dma/Kconfig                    | 10 +++++----
 kernel/dma/direct.c                   | 11 ---------
 kernel/dma/mapping.c                  | 32 ++++++++++++++++++---------
 16 files changed, 58 insertions(+), 106 deletions(-)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index ca03694d518a..3d9bdecfa52d 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -9,6 +9,7 @@
 config ARC
 	def_bool y
 	select ARC_TIMERS
+	select ARCH_HAS_DMA_COHERENT_TO_PFN
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
@@ -18,7 +19,6 @@ config ARC
 	select CLONE_BACKWARDS
 	select COMMON_CLK
 	select DMA_DIRECT_OPS
-	select DMA_NONCOHERENT_MMAP
 	select GENERIC_ATOMIC64 if !ISA_ARCV2 || !(ARC_HAS_LL64 && ARC_HAS_LLSC)
 	select GENERIC_CLOCKEVENTS
 	select GENERIC_FIND_FIRST_BIT
diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index 535ed4a068ef..db203ff69ccf 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -84,29 +84,10 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 	__free_pages(page, get_order(size));
 }
 
-int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
-		void *cpu_addr, dma_addr_t dma_addr, size_t size,
-		unsigned long attrs)
+long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,
+		dma_addr_t dma_addr)
 {
-	unsigned long user_count = vma_pages(vma);
-	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	unsigned long pfn = __phys_to_pfn(dma_addr);
-	unsigned long off = vma->vm_pgoff;
-	int ret = -ENXIO;
-
-	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
-
-	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
-		return ret;
-
-	if (off < count && user_count <= (count - off)) {
-		ret = remap_pfn_range(vma, vma->vm_start,
-				      pfn + off,
-				      user_count << PAGE_SHIFT,
-				      vma->vm_page_prot);
-	}
-
-	return ret;
+	return __phys_to_pfn(dma_addr);
 }
 
 /*
diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
index 0ad156f9985b..712416ecd8e6 100644
--- a/arch/arm/mm/dma-mapping-nommu.c
+++ b/arch/arm/mm/dma-mapping-nommu.c
@@ -91,7 +91,7 @@ static int arm_nommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	if (dma_mmap_from_global_coherent(vma, cpu_addr, size, &ret))
 		return ret;
 
-	return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
+	return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
 }
 
 
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index 0f48ab6a8070..164a4857737a 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -1,6 +1,7 @@
 config MICROBLAZE
 	def_bool y
 	select ARCH_NO_SWAP
+	select ARCH_HAS_DMA_COHERENT_TO_PFN if MMU
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
@@ -12,7 +13,6 @@ config MICROBLAZE
 	select CLONE_BACKWARDS3
 	select COMMON_CLK
 	select DMA_DIRECT_OPS
-	select DMA_NONCOHERENT_MMAP
 	select GENERIC_ATOMIC64
 	select GENERIC_CLOCKEVENTS
 	select GENERIC_CPU_DEVICES
diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h
index 7b650ab14fa0..f64ebb9c9a41 100644
--- a/arch/microblaze/include/asm/pgtable.h
+++ b/arch/microblaze/include/asm/pgtable.h
@@ -553,8 +553,6 @@ void __init *early_get_page(void);
 
 extern unsigned long ioremap_bot, ioremap_base;
 
-unsigned long consistent_virt_to_pfn(void *vaddr);
-
 void setup_memory(void);
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index 71032cf64669..a89c2d4ed5ff 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -42,25 +42,3 @@ void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
 {
 	__dma_sync(dev, paddr, size, dir);
 }
-
-int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
-		void *cpu_addr, dma_addr_t handle, size_t size,
-		unsigned long attrs)
-{
-#ifdef CONFIG_MMU
-	unsigned long user_count = vma_pages(vma);
-	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	unsigned long off = vma->vm_pgoff;
-	unsigned long pfn;
-
-	if (off >= count || user_count > (count - off))
-		return -ENXIO;
-
-	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
-	pfn = consistent_virt_to_pfn(cpu_addr);
-	return remap_pfn_range(vma, vma->vm_start, pfn + off,
-			       vma->vm_end - vma->vm_start, vma->vm_page_prot);
-#else
-	return -ENXIO;
-#endif
-}
diff --git a/arch/microblaze/mm/consistent.c b/arch/microblaze/mm/consistent.c
index c9a278ac795a..d801cc5f5b95 100644
--- a/arch/microblaze/mm/consistent.c
+++ b/arch/microblaze/mm/consistent.c
@@ -165,7 +165,8 @@ static pte_t *consistent_virt_to_pte(void *vaddr)
 	return pte_offset_kernel(pmd_offset(pgd_offset_k(addr), addr), addr);
 }
 
-unsigned long consistent_virt_to_pfn(void *vaddr)
+long arch_dma_coherent_to_pfn(struct device *dev, void *vaddr,
+		dma_addr_t dma_addr)
 {
 	pte_t *ptep = consistent_virt_to_pte(vaddr);
 
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 96da6e3396e1..77c022e56e6e 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1116,10 +1116,11 @@ config DMA_PERDEV_COHERENT
 
 config DMA_NONCOHERENT
 	bool
+	select ARCH_HAS_DMA_MMAP_PGPROT
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select NEED_DMA_MAP_STATE
-	select DMA_NONCOHERENT_MMAP
+	select ARCH_HAS_DMA_COHERENT_TO_PFN
 	select DMA_NONCOHERENT_CACHE_SYNC
 
 config SYS_HAS_EARLY_PRINTK
diff --git a/arch/mips/jazz/jazzdma.c b/arch/mips/jazz/jazzdma.c
index bb49dfa1a9a3..0a0aaf39fd16 100644
--- a/arch/mips/jazz/jazzdma.c
+++ b/arch/mips/jazz/jazzdma.c
@@ -682,7 +682,6 @@ static int jazz_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 const struct dma_map_ops jazz_dma_ops = {
 	.alloc			= jazz_dma_alloc,
 	.free			= jazz_dma_free,
-	.mmap			= arch_dma_mmap,
 	.map_page		= jazz_dma_map_page,
 	.unmap_page		= jazz_dma_unmap_page,
 	.map_sg			= jazz_dma_map_sg,
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index b01b9a3e424f..e6c9485cadcf 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -66,33 +66,19 @@ void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
 	dma_direct_free_pages(dev, size, cpu_addr, dma_addr, attrs);
 }
 
-int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
-		void *cpu_addr, dma_addr_t dma_addr, size_t size,
-		unsigned long attrs)
+long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,
+		dma_addr_t dma_addr)
 {
-	unsigned long user_count = vma_pages(vma);
-	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	unsigned long addr = CAC_ADDR((unsigned long)cpu_addr);
-	unsigned long off = vma->vm_pgoff;
-	unsigned long pfn = page_to_pfn(virt_to_page((void *)addr));
-	int ret = -ENXIO;
+	return page_to_pfn(virt_to_page((void *)addr));
+}
 
+pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
+		unsigned long attrs)
+{
 	if (attrs & DMA_ATTR_WRITE_COMBINE)
-		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
-	else
-		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
-
-	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
-		return ret;
-
-	if (off < count && user_count <= (count - off)) {
-		ret = remap_pfn_range(vma, vma->vm_start,
-				      pfn + off,
-				      user_count << PAGE_SHIFT,
-				      vma->vm_page_prot);
-	}
-
-	return ret;
+		return pgprot_writecombine(prot);
+	return pgprot_noncached(prot);
 }
 
 static inline void dma_sync_virt(void *addr, size_t size,
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index a6f9ba85dc4b..470757ddddea 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -662,7 +662,7 @@ xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		return xen_get_dma_ops(dev)->mmap(dev, vma, cpu_addr,
 						    dma_addr, size, attrs);
 #endif
-	return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
+	return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
 }
 
 /*
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 8f2001181cd1..c3378d4e0d57 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -444,7 +444,8 @@ dma_cache_sync(struct device *dev, void *vaddr, size_t size,
 }
 
 extern int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
-			   void *cpu_addr, dma_addr_t dma_addr, size_t size);
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		unsigned long attrs);
 
 void *dma_common_contiguous_remap(struct page *page, size_t size,
 			unsigned long vm_flags,
@@ -476,7 +477,7 @@ dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void *cpu_addr,
 	BUG_ON(!ops);
 	if (ops->mmap)
 		return ops->mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
-	return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
+	return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
 }
 
 #define dma_mmap_coherent(d, v, c, h, s) dma_mmap_attrs(d, v, c, h, s, 0)
diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h
index 3f503025a0cd..9051b055beec 100644
--- a/include/linux/dma-noncoherent.h
+++ b/include/linux/dma-noncoherent.h
@@ -24,9 +24,15 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t gfp, unsigned long attrs);
 void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
 		dma_addr_t dma_addr, unsigned long attrs);
-int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
-		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+long arch_dma_coherent_to_pfn(struct device *dev, void *cpu_addr,
+		dma_addr_t dma_addr);
+
+#ifdef CONFIG_ARCH_HAS_DMA_MMAP_PGPROT
+pgprot_t arch_dma_mmap_pgprot(struct device *dev, pgprot_t prot,
 		unsigned long attrs);
+#else
+# define arch_dma_mmap_pgprot(dev, prot, attrs)	pgprot_noncached(prot)
+#endif
 
 #ifdef CONFIG_DMA_NONCOHERENT_CACHE_SYNC
 void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 5617c9a76208..645c7a2ecde8 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -29,13 +29,15 @@ config ARCH_HAS_SYNC_DMA_FOR_CPU
 config ARCH_HAS_SYNC_DMA_FOR_CPU_ALL
 	bool
 
-config DMA_DIRECT_OPS
+config ARCH_HAS_DMA_COHERENT_TO_PFN
 	bool
-	depends on HAS_DMA
 
-config DMA_NONCOHERENT_MMAP
+config ARCH_HAS_DMA_MMAP_PGPROT
 	bool
-	depends on DMA_DIRECT_OPS
+
+config DMA_DIRECT_OPS
+	bool
+	depends on HAS_DMA
 
 config DMA_NONCOHERENT_CACHE_SYNC
 	bool
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 09e85f6aa4ba..c954f0a6dc62 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -155,16 +155,6 @@ void dma_direct_free(struct device *dev, size_t size,
 		dma_direct_free_pages(dev, size, cpu_addr, dma_addr, attrs);
 }
 
-static int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
-		void *cpu_addr, dma_addr_t dma_addr, size_t size,
-		unsigned long attrs)
-{
-	if (!dev_is_dma_coherent(dev) &&
-	    IS_ENABLED(CONFIG_DMA_NONCOHERENT_MMAP))
-		return arch_dma_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
-	return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
-}
-
 static void dma_direct_sync_single_for_device(struct device *dev,
 		dma_addr_t addr, size_t size, enum dma_data_direction dir)
 {
@@ -293,7 +283,6 @@ int dma_direct_mapping_error(struct device *dev, dma_addr_t dma_addr)
 const struct dma_map_ops dma_direct_ops = {
 	.alloc			= dma_direct_alloc,
 	.free			= dma_direct_free,
-	.mmap			= dma_direct_mmap,
 	.map_page		= dma_direct_map_page,
 	.map_sg			= dma_direct_map_sg,
 #if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE)
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 3540cb399bd2..42fd73aca305 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -7,7 +7,7 @@
  */
 
 #include <linux/acpi.h>
-#include <linux/dma-mapping.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/export.h>
 #include <linux/gfp.h>
 #include <linux/of_device.h>
@@ -220,27 +220,37 @@ EXPORT_SYMBOL(dma_common_get_sgtable);
  * Create userspace mapping for the DMA-coherent memory.
  */
 int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
-		    void *cpu_addr, dma_addr_t dma_addr, size_t size)
+		void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		unsigned long attrs)
 {
-	int ret = -ENXIO;
 #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP
 	unsigned long user_count = vma_pages(vma);
 	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	unsigned long off = vma->vm_pgoff;
+	unsigned long pfn;
+	int ret = -ENXIO;
 
-	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+	vma->vm_page_prot = arch_dma_mmap_pgprot(dev, vma->vm_page_prot, attrs);
 
 	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
 		return ret;
 
-	if (off < count && user_count <= (count - off))
-		ret = remap_pfn_range(vma, vma->vm_start,
-				      page_to_pfn(virt_to_page(cpu_addr)) + off,
-				      user_count << PAGE_SHIFT,
-				      vma->vm_page_prot);
-#endif	/* !CONFIG_ARCH_NO_COHERENT_DMA_MMAP */
+	if (off >= count || user_count > count - off)
+		return -ENXIO;
+
+	if (!dev_is_dma_coherent(dev)) {
+		if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_COHERENT_TO_PFN))
+			return -ENXIO;
+		pfn = arch_dma_coherent_to_pfn(dev, cpu_addr, dma_addr);
+	} else {
+		pfn = page_to_pfn(virt_to_page(cpu_addr));
+	}
 
-	return ret;
+	return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
+			user_count << PAGE_SHIFT, vma->vm_page_prot);
+#else
+	return -ENXIO;
+#endif /* !CONFIG_ARCH_NO_COHERENT_DMA_MMAP */
 }
 EXPORT_SYMBOL(dma_common_mmap);
 
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 6/6] dma-mapping: support non-coherent devices in dma_common_get_sgtable
  2018-09-14  9:58 merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2018-09-14  9:58 ` [PATCH 5/6] dma-mapping: consolidate the dma mmap implementations Christoph Hellwig
@ 2018-09-14  9:58 ` Christoph Hellwig
  2018-09-20  7:02 ` merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
  6 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-14  9:58 UTC (permalink / raw)
  To: iommu
  Cc: Marek Szyprowski, Robin Murphy, Paul Burton, Greg Kroah-Hartman,
	linux-mips, linux-kernel

We can use the arch_dma_coherent_to_pfn hook to provide a ->get_sgtable
implementation.  Note that this isn't an endorsement of this interface
(which is a horrible bad idea), but it is required to move arm64 over
to the generic code without a loss of functionality.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/xen/swiotlb-xen.c   |  2 +-
 include/linux/dma-mapping.h |  7 ++++---
 kernel/dma/mapping.c        | 23 ++++++++++++++++-------
 3 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 470757ddddea..28819a0e61d0 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -689,7 +689,7 @@ xen_swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
 							   handle, size, attrs);
 	}
 #endif
-	return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size);
+	return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
 }
 
 static int xen_swiotlb_mapping_error(struct device *dev, dma_addr_t dma_addr)
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index c3378d4e0d57..bd81e74cca7b 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -483,8 +483,8 @@ dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void *cpu_addr,
 #define dma_mmap_coherent(d, v, c, h, s) dma_mmap_attrs(d, v, c, h, s, 0)
 
 int
-dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
-		       void *cpu_addr, dma_addr_t dma_addr, size_t size);
+dma_common_get_sgtable(struct device *dev, struct sg_table *sgt, void *cpu_addr,
+		dma_addr_t dma_addr, size_t size, unsigned long attrs);
 
 static inline int
 dma_get_sgtable_attrs(struct device *dev, struct sg_table *sgt, void *cpu_addr,
@@ -496,7 +496,8 @@ dma_get_sgtable_attrs(struct device *dev, struct sg_table *sgt, void *cpu_addr,
 	if (ops->get_sgtable)
 		return ops->get_sgtable(dev, sgt, cpu_addr, dma_addr, size,
 					attrs);
-	return dma_common_get_sgtable(dev, sgt, cpu_addr, dma_addr, size);
+	return dma_common_get_sgtable(dev, sgt, cpu_addr, dma_addr, size,
+			attrs);
 }
 
 #define dma_get_sgtable(d, t, v, h, s) dma_get_sgtable_attrs(d, t, v, h, s, 0)
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 42fd73aca305..58dec7a92b7b 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -202,17 +202,26 @@ EXPORT_SYMBOL(dmam_release_declared_memory);
  * Create scatter-list for the already allocated DMA buffer.
  */
 int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
-		 void *cpu_addr, dma_addr_t handle, size_t size)
+		 void *cpu_addr, dma_addr_t dma_addr, size_t size,
+		 unsigned long attrs)
 {
-	struct page *page = virt_to_page(cpu_addr);
+	struct page *page;
 	int ret;
 
-	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
-	if (unlikely(ret))
-		return ret;
+	if (!dev_is_dma_coherent(dev)) {
+		if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_COHERENT_TO_PFN))
+			return -ENXIO;
 
-	sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
-	return 0;
+		page = pfn_to_page(arch_dma_coherent_to_pfn(dev, cpu_addr,
+				dma_addr));
+	} else {
+		page = virt_to_page(cpu_addr);
+	}
+
+	ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+	if (!ret)
+		sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+	return ret;
 }
 EXPORT_SYMBOL(dma_common_get_sgtable);
 
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: merge dma_direct_ops and dma_noncoherent_ops v3
  2018-09-14  9:58 merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2018-09-14  9:58 ` [PATCH 6/6] dma-mapping: support non-coherent devices in dma_common_get_sgtable Christoph Hellwig
@ 2018-09-20  7:02 ` Christoph Hellwig
  6 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2018-09-20  7:02 UTC (permalink / raw)
  To: iommu
  Cc: linux-mips, Greg Kroah-Hartman, linux-kernel, Paul Burton, Robin Murphy

I've pulled this into the dma-mapping for-next tree now to make
progress.  Any further review comments should be dealt with in
incremental patches.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/6] dma-mapping: merge direct and noncoherent ops
  2018-09-14  9:58 ` [PATCH 4/6] dma-mapping: merge direct and noncoherent ops Christoph Hellwig
@ 2018-10-31 14:24   ` Maciej W. Rozycki
  2018-10-31 16:31     ` Maciej W. Rozycki
  0 siblings, 1 reply; 17+ messages in thread
From: Maciej W. Rozycki @ 2018-10-31 14:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: iommu, Marek Szyprowski, Robin Murphy, Paul Burton,
	Greg Kroah-Hartman, linux-mips, linux-kernel

Hi Christoph,

On Fri, 14 Sep 2018, Christoph Hellwig wrote:

> All the cache maintainance is already stubbed out when not enabled,
> but merging the two allows us to nicely handle the case where
> cache maintainance is required for some devices, but not others.

 FYI, you commit bc3ec75de545 ("dma-mapping: merge direct and noncoherent 
ops") has caused:

fddi0: DMA command request failed!
fddi0: Adapter open failed!

with the `defxx' driver on my R4400SC TURBOchannel DECstation (but not the 
R3400 one) and consequently the interface does not work anymore.  Both are 
non-coherent cache systems, however the R3400 implements the write-through 
policy while the policy of the R4400SC is write-back (it also has 1MiB of 
secondary cache), which I suspect is the reason of the difference.

 I'll be experimenting with this commit to figure out what the real cause 
is, but it'll take a while as my resources are limited, so I'm sending 
this early heads-up in case you or someone else has an idea what might be 
going wrong here.

  Maciej

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/6] dma-mapping: merge direct and noncoherent ops
  2018-10-31 14:24   ` Maciej W. Rozycki
@ 2018-10-31 16:31     ` Maciej W. Rozycki
  2018-10-31 20:32       ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Maciej W. Rozycki @ 2018-10-31 16:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: iommu, Marek Szyprowski, Robin Murphy, Paul Burton,
	Greg Kroah-Hartman, linux-mips, linux-kernel

On Wed, 31 Oct 2018, Maciej W. Rozycki wrote:

> > All the cache maintainance is already stubbed out when not enabled,
> > but merging the two allows us to nicely handle the case where
> > cache maintainance is required for some devices, but not others.
> 
>  FYI, you commit bc3ec75de545 ("dma-mapping: merge direct and noncoherent 
> ops") has caused:
> 
> fddi0: DMA command request failed!
> fddi0: Adapter open failed!
> 
> with the `defxx' driver on my R4400SC TURBOchannel DECstation (but not the 
> R3400 one) and consequently the interface does not work anymore.  Both are 
> non-coherent cache systems, however the R3400 implements the write-through 
> policy while the policy of the R4400SC is write-back (it also has 1MiB of 
> secondary cache), which I suspect is the reason of the difference.

 Doh!  It would have helped if I actually had the right adapter installed 
in the R3400 box.  I've got a spare one, which I have now plugged in there 
and the R3400 actually shows the same issue as the R4400SC does.  So this 
has nothing to do WRT write-through vs write-back.

 Hmm, in `dfx_hw_dma_cmd_req' the driver polls the consumer block (which 
is write-only by the adapter and read-only by the host, except for the 
initialisation time before adapter's DMA engines have been started, and 
write-through vs write-back indeed does not matter for this kind of use) 
and there's no DMA synchronisation whatsoever in that.

 However the block has been allocated with `dma_zalloc_coherent', which 
means no synchronisation is supposed to be required.  For non-coherent 
cache systems that essentially means an uncached memory allocation.

  Maciej

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/6] dma-mapping: merge direct and noncoherent ops
  2018-10-31 16:31     ` Maciej W. Rozycki
@ 2018-10-31 20:32       ` Christoph Hellwig
  2018-10-31 20:50         ` Maciej W. Rozycki
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-10-31 20:32 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Christoph Hellwig, iommu, Marek Szyprowski, Robin Murphy,
	Paul Burton, Greg Kroah-Hartman, linux-mips, linux-kernel

Can you send me your .config?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/6] dma-mapping: merge direct and noncoherent ops
  2018-10-31 20:32       ` Christoph Hellwig
@ 2018-10-31 20:50         ` Maciej W. Rozycki
  2018-11-01  5:13           ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Maciej W. Rozycki @ 2018-10-31 20:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: iommu, Marek Szyprowski, Robin Murphy, Paul Burton,
	Greg Kroah-Hartman, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 295 bytes --]

On Wed, 31 Oct 2018, Christoph Hellwig wrote:

> Can you send me your .config?

 Sure, attached.

 This is an updated 4.19 defconfig I was going to submit as I tripped over 
this problem, so just rename it to .config and use `make olddefconfig' to 
regenerate a full version.

 Thanks,

  Maciej

[-- Attachment #2: Type: text/plain, Size: 5848 bytes --]

CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_EXPERT=y
# CONFIG_SGETMASK_SYSCALL is not set
# CONFIG_SYSFS_SYSCALL is not set
CONFIG_BPF_SYSCALL=y
# CONFIG_COMPAT_BRK is not set
CONFIG_SLAB=y
CONFIG_MACH_DECSTATION=y
CONFIG_CPU_R3000=y
CONFIG_TC=y
# CONFIG_SUSPEND is not set
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_SRCVERSION_ALL=y
# CONFIG_LBDAF is not set
CONFIG_PARTITION_ADVANCED=y
CONFIG_OSF_PARTITION=y
# CONFIG_EFI_PARTITION is not set
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_NET_KEY=m
CONFIG_NET_KEY_MIGRATE=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_MODE_TRANSPORT=m
CONFIG_INET_XFRM_MODE_TUNNEL=m
CONFIG_INET_XFRM_MODE_BEET=m
CONFIG_TCP_MD5SIG=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
CONFIG_IPV6_MIP6=m
CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION=m
CONFIG_IPV6_MULTIPLE_TABLES=y
CONFIG_IPV6_SUBTREES=y
CONFIG_NETWORK_SECMARK=y
CONFIG_IP_SCTP=m
CONFIG_VLAN_8021Q=m
CONFIG_DECNET=m
CONFIG_DECNET_ROUTER=y
# CONFIG_WIRELESS is not set
# CONFIG_UEVENT_HELPER is not set
# CONFIG_FW_LOADER is not set
# CONFIG_ALLOW_DEV_COREDUMP is not set
CONFIG_MTD=m
CONFIG_MTD_BLOCK=m
CONFIG_MTD_BLOCK_RO=m
CONFIG_MTD_MS02NV=m
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_RAM=m
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_BLK_DEV_SR=m
CONFIG_CHR_DEV_SG=m
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_ISCSI_TCP=m
CONFIG_NETDEVICES=y
# CONFIG_NET_VENDOR_ALACRITECH is not set
# CONFIG_NET_VENDOR_AMAZON is not set
CONFIG_DECLANCE=y
# CONFIG_NET_VENDOR_AQUANTIA is not set
# CONFIG_NET_VENDOR_ARC is not set
# CONFIG_NET_VENDOR_AURORA is not set
# CONFIG_NET_VENDOR_BROADCOM is not set
# CONFIG_NET_VENDOR_CADENCE is not set
# CONFIG_NET_VENDOR_CAVIUM is not set
# CONFIG_NET_VENDOR_CORTINA is not set
# CONFIG_NET_VENDOR_EZCHIP is not set
# CONFIG_NET_VENDOR_HUAWEI is not set
# CONFIG_NET_VENDOR_INTEL is not set
# CONFIG_NET_VENDOR_MARVELL is not set
# CONFIG_NET_VENDOR_MICREL is not set
# CONFIG_NET_VENDOR_MICROSEMI is not set
# CONFIG_NET_VENDOR_NATSEMI is not set
# CONFIG_NET_VENDOR_NETRONOME is not set
# CONFIG_NET_VENDOR_NI is not set
# CONFIG_NET_VENDOR_QUALCOMM is not set
# CONFIG_NET_VENDOR_RENESAS is not set
# CONFIG_NET_VENDOR_ROCKER is not set
# CONFIG_NET_VENDOR_SAMSUNG is not set
# CONFIG_NET_VENDOR_SEEQ is not set
# CONFIG_NET_VENDOR_SOLARFLARE is not set
# CONFIG_NET_VENDOR_SMSC is not set
# CONFIG_NET_VENDOR_SOCIONEXT is not set
# CONFIG_NET_VENDOR_STMICRO is not set
# CONFIG_NET_VENDOR_SYNOPSYS is not set
# CONFIG_NET_VENDOR_VIA is not set
# CONFIG_NET_VENDOR_WIZNET is not set
# CONFIG_NET_VENDOR_XILINX is not set
CONFIG_FDDI=y
CONFIG_DEFZA=y
CONFIG_DEFXX=y
# CONFIG_WLAN is not set
# CONFIG_KEYBOARD_ATKBD is not set
CONFIG_KEYBOARD_LKKBD=y
# CONFIG_MOUSE_PS2 is not set
CONFIG_MOUSE_VSXXXAA=y
# CONFIG_HW_RANDOM is not set
# CONFIG_HWMON is not set
CONFIG_FB=y
CONFIG_FB_TGA=y
CONFIG_FB_PMAG_AA=y
CONFIG_FB_PMAG_BA=y
CONFIG_FB_PMAGB_B=y
# CONFIG_VGA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE_COLUMNS=160
CONFIG_DUMMY_CONSOLE_ROWS=64
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_VGA16 is not set
# CONFIG_LOGO_LINUX_CLUT224 is not set
# CONFIG_HID is not set
# CONFIG_USB_SUPPORT is not set
CONFIG_RTC_CLASS=y
CONFIG_RTC_INTF_DEV_UIE_EMUL=y
CONFIG_RTC_DRV_CMOS=y
# CONFIG_MIPS_PLATFORM_DEVICES is not set
# CONFIG_IOMMU_SUPPORT is not set
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
# CONFIG_MANDATORY_FILE_LOCKING is not set
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_CHILDREN=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_CONFIGFS_FS=y
CONFIG_UFS_FS=y
CONFIG_UFS_FS_WRITE=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_SWAP=y
CONFIG_ROOT_NFS=y
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
# CONFIG_RPCSEC_GSS_KRB5 is not set
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_ASCII=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_UTF8=m
CONFIG_CRYPTO_RSA=m
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_CCM=m
CONFIG_CRYPTO_GCM=m
CONFIG_CRYPTO_CHACHA20POLY1305=m
CONFIG_CRYPTO_CTS=m
CONFIG_CRYPTO_LRW=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_XTS=m
CONFIG_CRYPTO_KEYWRAP=m
CONFIG_CRYPTO_CMAC=m
CONFIG_CRYPTO_XCBC=m
CONFIG_CRYPTO_VMAC=m
CONFIG_CRYPTO_CRC32=m
CONFIG_CRYPTO_CRCT10DIF=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_RMD128=m
CONFIG_CRYPTO_RMD160=m
CONFIG_CRYPTO_RMD256=m
CONFIG_CRYPTO_RMD320=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_FCRYPT=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_SALSA20=m
CONFIG_CRYPTO_SEED=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_LZO=m
CONFIG_CRYPTO_842=m
CONFIG_CRYPTO_LZ4=m
CONFIG_CRYPTO_LZ4HC=m
CONFIG_CRYPTO_ANSI_CPRNG=m
CONFIG_CRYPTO_DRBG_HASH=y
CONFIG_CRYPTO_DRBG_CTR=y
# CONFIG_CRYPTO_HW is not set
CONFIG_FRAME_WARN=2048
CONFIG_MAGIC_SYSRQ=y
# CONFIG_FTRACE is not set

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/6] dma-mapping: merge direct and noncoherent ops
  2018-10-31 20:50         ` Maciej W. Rozycki
@ 2018-11-01  5:13           ` Christoph Hellwig
  2018-11-01  7:54             ` [PATCH] MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation Maciej W. Rozycki
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-11-01  5:13 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Christoph Hellwig, iommu, Marek Szyprowski, Robin Murphy,
	Paul Burton, Greg Kroah-Hartman, linux-mips, linux-kernel

On Wed, Oct 31, 2018 at 08:50:53PM +0000, Maciej W. Rozycki wrote:
> On Wed, 31 Oct 2018, Christoph Hellwig wrote:
> 
> > Can you send me your .config?
> 
>  Sure, attached.
> 
>  This is an updated 4.19 defconfig I was going to submit as I tripped over 
> this problem, so just rename it to .config and use `make olddefconfig' to 
> regenerate a full version.

Fails to compile for me with:

cc1: error: ‘-march=r3000’ requires ‘-mfp32’

using the x86_64 corss gcc 8 from Debian testing.

Either way the config looks like we have all the required bits for
non-coherent dma support.  The only guess is that somehow
dma_cache_wback_inv aren't hooked up to the right functions for some
reason, but I can't really think off why.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation
  2018-11-01  5:13           ` Christoph Hellwig
@ 2018-11-01  7:54             ` Maciej W. Rozycki
  2018-11-01  8:33               ` Christoph Hellwig
  2018-11-05 18:10               ` Paul Burton
  0 siblings, 2 replies; 17+ messages in thread
From: Maciej W. Rozycki @ 2018-11-01  7:54 UTC (permalink / raw)
  To: Christoph Hellwig, Paul Burton
  Cc: iommu, Marek Szyprowski, Robin Murphy, Greg Kroah-Hartman,
	linux-mips, linux-kernel, stable

Fix a MIPS `dma_alloc_coherent' regression from commit bc3ec75de545 
("dma-mapping: merge direct and noncoherent ops") that causes a cached 
allocation to be returned on noncoherent cache systems.

This is due to an inverted check now used in the MIPS implementation of 
`arch_dma_alloc' on the result from `dma_direct_alloc_pages' before 
doing the cached-to-uncached mapping of the allocation address obtained.  
The mapping has to be done for a non-NULL rather than NULL result, 
because a NULL result means the allocation has failed.

Invert the check for correct operation then.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Fixes: bc3ec75de545 ("dma-mapping: merge direct and noncoherent ops")
Cc: stable@vger.kernel.org # 4.19+
---
On Thu, 1 Nov 2018, Christoph Hellwig wrote:

> Fails to compile for me with:
> 
> cc1: error: ‘-march=r3000’ requires ‘-mfp32’
> 
> using the x86_64 corss gcc 8 from Debian testing.

 Hmm, that seems related to the FPXX ABI, which the R3000 does not support 
and which they may have configured as the default for GCC (`-mfpxx').  
That's not relevant to the kernel, so we probably just ought to force 
`-mfp32' and `-mfp64' for 32-bit and 64-bit builds respectively these 
days.

> Either way the config looks like we have all the required bits for
> non-coherent dma support.  The only guess is that somehow
> dma_cache_wback_inv aren't hooked up to the right functions for some
> reason, but I can't really think off why.

 Well, `dma_cache_wback_inv' isn't actually called and with the 64-bit 
configuration I switched to the address returned is in the XKPHYS cached 
noncoherent space, as proved with a simple diagnostic patch applied to 
`dma_direct_alloc' showing the results of `dev_is_dma_coherent' and the 
actual allocation:

tc1: DEFTA at MMIO addr = 0x1e900000, IRQ = 20, Hardware addr = 08-00-2b-a3-a3-29
defxx tc1: dma_direct_alloc: coherent: 0
defxx tc1: dma_direct_alloc: returned: 9800000003db8000
tc1: registered as fddi0

(the value of 3 in bits 61:59 of the virtual address denotes the cached 
noncoherent attribute).

 The cause is commit bc3ec75de545 ("dma-mapping: merge direct and 
noncoherent ops") reversed the interpretation of the `dma_direct_alloc*' 
result in `arch_dma_alloc'.  I guess this change was unlucky not to have 
this part of the API actually verified at run-time by anyone anywhere.

 Fixed thus, with debug output now as expected:

defxx tc1: dma_direct_alloc: coherent: 0
defxx tc1: dma_direct_alloc: returned: 9000000003db8000

showing the address returned in the XKPHYS uncached space (the value of 2 
in bits 61:59 of the virtual address denotes the uncached attribute), and 
the network interface working properly.

 Please apply, and backport as required.

  Maciej
---
 arch/mips/mm/dma-noncoherent.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

linux-mips-arch-dma-alloc-err.patch
Index: linux-20181028-4maxp64-defconfig/arch/mips/mm/dma-noncoherent.c
===================================================================
--- linux-20181028-4maxp64-defconfig.orig/arch/mips/mm/dma-noncoherent.c
+++ linux-20181028-4maxp64-defconfig/arch/mips/mm/dma-noncoherent.c
@@ -50,7 +50,7 @@ void *arch_dma_alloc(struct device *dev,
 	void *ret;
 
 	ret = dma_direct_alloc_pages(dev, size, dma_handle, gfp, attrs);
-	if (!ret && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
+	if (ret && !(attrs & DMA_ATTR_NON_CONSISTENT)) {
 		dma_cache_wback_inv((unsigned long) ret, size);
 		ret = (void *)UNCAC_ADDR(ret);
 	}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation
  2018-11-01  7:54             ` [PATCH] MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation Maciej W. Rozycki
@ 2018-11-01  8:33               ` Christoph Hellwig
  2018-11-01 14:30                 ` Maciej W. Rozycki
  2018-11-05 18:10               ` Paul Burton
  1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-11-01  8:33 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Christoph Hellwig, Paul Burton, iommu, Marek Szyprowski,
	Robin Murphy, Greg Kroah-Hartman, linux-mips, linux-kernel,
	stable

On Thu, Nov 01, 2018 at 07:54:24AM +0000, Maciej W. Rozycki wrote:
> Fix a MIPS `dma_alloc_coherent' regression from commit bc3ec75de545 
> ("dma-mapping: merge direct and noncoherent ops") that causes a cached 
> allocation to be returned on noncoherent cache systems.
> 
> This is due to an inverted check now used in the MIPS implementation of 
> `arch_dma_alloc' on the result from `dma_direct_alloc_pages' before 
> doing the cached-to-uncached mapping of the allocation address obtained.  
> The mapping has to be done for a non-NULL rather than NULL result, 
> because a NULL result means the allocation has failed.
> 
> Invert the check for correct operation then.
> 
> Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
> Fixes: bc3ec75de545 ("dma-mapping: merge direct and noncoherent ops")
> Cc: stable@vger.kernel.org # 4.19+

Oops, yes this looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation
  2018-11-01  8:33               ` Christoph Hellwig
@ 2018-11-01 14:30                 ` Maciej W. Rozycki
  0 siblings, 0 replies; 17+ messages in thread
From: Maciej W. Rozycki @ 2018-11-01 14:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Paul Burton, iommu, Marek Szyprowski, Robin Murphy,
	Greg Kroah-Hartman, linux-mips, linux-kernel, stable

On Thu, 1 Nov 2018, Christoph Hellwig wrote:

> Oops, yes this looks good:

 BTW, for anyone missing hardware suitable for serious DMA testing I can 
recommend getting a pair of DEFPA cards, the ubiquitous PCI version of 
this board, cheaply available, which is the same except for a different 
host bus bridge ASIC, developed later.  They can be wired back to back 
similarly to Ethernet adapters (or in a loop if you have 2 or more dual 
attachment versions), no other hardware is required save for patch cords.
Version 3 (DEFPA-xC) boards support universal PCI signalling, older ones 
are 5V-only.

 These devices are a stellar example of fine engineering[1].  Our `defxx' 
driver, which I believe has been adapted from the DEC OSF/1 one referred 
in the said document, has some latency and other issues that I plan to 
address sometime, once I have sorted higher-priority issues, however 
hardware itself is excellent.

References:

[1] Chran-Ham Chang et al., "High-performance TCP/IP and UDP/IP Networking 
    in DEC OSF/1 for Alpha AXP", Digital Technical Journal, vol. 5, no. 1 
    (Winter 1993), "Network Adapter Characteristics", p. 7
    <ftp://ftp.linux-mips.org/pub/linux/mips/people/macro/DEC/DTJ/DTJ904/DTJ904PF.PDF>

  Maciej

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation
  2018-11-01  7:54             ` [PATCH] MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation Maciej W. Rozycki
  2018-11-01  8:33               ` Christoph Hellwig
@ 2018-11-05 18:10               ` Paul Burton
  1 sibling, 0 replies; 17+ messages in thread
From: Paul Burton @ 2018-11-05 18:10 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Christoph Hellwig, iommu, Marek Szyprowski, Robin Murphy,
	Greg Kroah-Hartman, linux-mips, linux-kernel, stable

Hi Maciej,

On Thu, Nov 01, 2018 at 07:54:24AM +0000, Maciej W. Rozycki wrote:
> Fix a MIPS `dma_alloc_coherent' regression from commit bc3ec75de545 
> ("dma-mapping: merge direct and noncoherent ops") that causes a cached 
> allocation to be returned on noncoherent cache systems.
> 
> This is due to an inverted check now used in the MIPS implementation of 
> `arch_dma_alloc' on the result from `dma_direct_alloc_pages' before 
> doing the cached-to-uncached mapping of the allocation address obtained.  
> The mapping has to be done for a non-NULL rather than NULL result, 
> because a NULL result means the allocation has failed.
> 
> Invert the check for correct operation then.
> 
> Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
> Fixes: bc3ec75de545 ("dma-mapping: merge direct and noncoherent ops")
> Cc: stable@vger.kernel.org # 4.19+
> ---
>  arch/mips/mm/dma-noncoherent.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, nice catch! Applied to mips-fixes.

Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-11-05 18:22 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-14  9:58 merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig
2018-09-14  9:58 ` [PATCH 1/6] dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration Christoph Hellwig
2018-09-14  9:58 ` [PATCH 2/6] MIPS: don't select DMA_MAYBE_COHERENT from DMA_PERDEV_COHERENT Christoph Hellwig
2018-09-14  9:58 ` [PATCH 3/6] dma-mapping: move the dma_coherent flag to struct device Christoph Hellwig
2018-09-14  9:58 ` [PATCH 4/6] dma-mapping: merge direct and noncoherent ops Christoph Hellwig
2018-10-31 14:24   ` Maciej W. Rozycki
2018-10-31 16:31     ` Maciej W. Rozycki
2018-10-31 20:32       ` Christoph Hellwig
2018-10-31 20:50         ` Maciej W. Rozycki
2018-11-01  5:13           ` Christoph Hellwig
2018-11-01  7:54             ` [PATCH] MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation Maciej W. Rozycki
2018-11-01  8:33               ` Christoph Hellwig
2018-11-01 14:30                 ` Maciej W. Rozycki
2018-11-05 18:10               ` Paul Burton
2018-09-14  9:58 ` [PATCH 5/6] dma-mapping: consolidate the dma mmap implementations Christoph Hellwig
2018-09-14  9:58 ` [PATCH 6/6] dma-mapping: support non-coherent devices in dma_common_get_sgtable Christoph Hellwig
2018-09-20  7:02 ` merge dma_direct_ops and dma_noncoherent_ops v3 Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).