* common non-cache coherent direct dma mapping ops v2
@ 2018-05-22 12:04 Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 01/25] hexagon: remove the sync_single_for_cpu DMA operation Christoph Hellwig
                   ` (18 more replies)
  0 siblings, 19 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

Hi all,

this series continues consolidating the dma-mapping code, with a focus
on architectures that do not (always) provide cache coherence for DMA.
Three architectures (arm, mips and powerpc) are still left to be
converted later due to the complexity of their dma ops selection.

The dma-noncoherent ops call into the dma-direct ops for the actual
translation of streaming mappings and allow the architecture to provide
any cache flushing required for cpu to device and/or device to cpu
ownership transfers.  The dma coherent allocator is for now still left
entirely to architecture supplied implementations due to the amount of
variation.  Hopefully we can do some consolidation for them later on
as well.
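
As a rough illustration of how this fits together, the generic map_page
path ends up looking more or less like the following (simplified sketch
only, error handling omitted - see the generic dma-noncoherent code for
the real thing):

static dma_addr_t dma_noncoherent_map_page(struct device *dev,
		struct page *page, unsigned long offset, size_t size,
		enum dma_data_direction dir, unsigned long attrs)
{
	dma_addr_t addr;

	/* dma-direct does the address translation ... */
	addr = dma_direct_map_page(dev, page, offset, size, dir, attrs);

	/* ... and the architecture only supplies the cache maintenance */
	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		arch_sync_dma_for_device(dev, page_to_phys(page) + offset,
					 size, dir);
	return addr;
}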

A lot of architectures are currently doing very questionable things
in their dma mapping routines, which are documented in the changelogs
for each patch.  Please review them very carefully and correct me on
any incorrect assumptions.

Because this series sits on top of three previously submitted series,
a git tree might be useful to actually test it.  It is provided here:

    git://git.infradead.org/users/hch/misc.git generic-dma-noncoherent.2

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/generic-dma-noncoherent.2

Changes since v1:
 - dropped the already merged generic, arc and c6x patches
   (with the map_single offset fix from Alexey Brodkin)
 - split various changes into smaller patches
 - dropped the arm-nommu changes for now
 - fixed a typo

Changes since RFC:
 - fix a typo accidentally disabling the device to cpu transfer sync
 - fixed a few compile failures

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 01/25] hexagon: remove the sync_single_for_cpu DMA operation
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 02/25] hexagon: implement the sync_sg_for_device " Christoph Hellwig
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

hexagon does all the required cache maintenance at dma map time, and none
at unmap time.  It thus has to implement sync_single_for_device to match
the map case for buffer reuse, but there is no point in doing another
invalidation in the sync_single_for_cpu case, which in terms of cache
maintenance is equivalent to the unmap case.
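
For illustration, the buffer reuse case that sync_single_for_device still
has to cover looks roughly like this on the driver side (hypothetical
example, not from the hexagon tree, error checking omitted):

/* hypothetical driver code reusing a streaming DMA_TO_DEVICE buffer */
static void reuse_tx_buffer(struct device *dev, void *vaddr, size_t size,
			    const void *new_data)
{
	dma_addr_t buf;

	buf = dma_map_single(dev, vaddr, size, DMA_TO_DEVICE);
	/* ... hand buf to the device, wait for it to finish ... */

	dma_sync_single_for_cpu(dev, buf, size, DMA_TO_DEVICE);
	/* nothing to do here on hexagon, just like the unmap case */
	memcpy(vaddr, new_data, size);
	dma_sync_single_for_device(dev, buf, size, DMA_TO_DEVICE);
	/* must clean the cache again, exactly like the original map did */

	/* ... hand buf to the device again ... */
	dma_unmap_single(dev, buf, size, DMA_TO_DEVICE);
}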

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/hexagon/kernel/dma.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index 77459df34e2e..d2b717f352f4 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -181,13 +181,6 @@ static dma_addr_t hexagon_map_page(struct device *dev, struct page *page,
 	return bus;
 }
 
-static void hexagon_sync_single_for_cpu(struct device *dev,
-					dma_addr_t dma_handle, size_t size,
-					enum dma_data_direction dir)
-{
-	dma_sync(dma_addr_to_virt(dma_handle), size, dir);
-}
-
 static void hexagon_sync_single_for_device(struct device *dev,
 					dma_addr_t dma_handle, size_t size,
 					enum dma_data_direction dir)
@@ -205,7 +198,6 @@ const struct dma_map_ops hexagon_dma_ops = {
 	.free		= hexagon_free_coherent,
 	.map_sg		= hexagon_map_sg,
 	.map_page	= hexagon_map_page,
-	.sync_single_for_cpu = hexagon_sync_single_for_cpu,
 	.sync_single_for_device = hexagon_sync_single_for_device,
 	.mapping_error	= hexagon_mapping_error,
 };
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 02/25] hexagon: implement the sync_sg_for_device DMA operation
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 01/25] hexagon: remove the sync_single_for_cpu DMA operation Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
       [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

This method needs to provide the equivalent of sync_single_for_device
for each S/G list element, but was missing.
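
For context, a driver reaches this callback when it reuses a mapped
scatterlist (hypothetical fragment):

	/* hypothetical driver code reusing a mapped scatterlist */
	count = dma_map_sg(dev, sgl, nents, DMA_TO_DEVICE);
	/* ... device consumes the buffers, CPU refills them ... */
	dma_sync_sg_for_device(dev, sgl, nents, DMA_TO_DEVICE);
	/*
	 * Without this patch the writeback above was silently skipped,
	 * as a missing ->sync_sg_for_device op is treated as a no-op.
	 */
	/* ... device consumes the buffers again ... */
	dma_unmap_sg(dev, sgl, nents, DMA_TO_DEVICE);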

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/hexagon/kernel/dma.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index d2b717f352f4..9e46556a227d 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -188,6 +188,18 @@ static void hexagon_sync_single_for_device(struct device *dev,
 	dma_sync(dma_addr_to_virt(dma_handle), size, dir);
 }
 
+static void hexagon_sync_sg_for_device(struct device *dev,
+		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nents, i)
+		hexagon_sync_single_for_device(dev, sg_dma_address(sg),
+				sg->length, dir);
+}
+
+
 static int hexagon_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
 	return dma_addr == HEXAGON_MAPPING_ERROR;
@@ -199,6 +211,7 @@ const struct dma_map_ops hexagon_dma_ops = {
 	.map_sg		= hexagon_map_sg,
 	.map_page	= hexagon_map_page,
 	.sync_single_for_device = hexagon_sync_single_for_device,
+	.sync_sg_for_device = hexagon_sync_sg_for_device,
 	.mapping_error	= hexagon_mapping_error,
 };
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 03/25] hexagon: use generic dma_noncoherent_ops
       [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
@ 2018-05-22 12:04   ` Christoph Hellwig
  2018-05-22 12:04   ` [PATCH 04/25] m68k: " Christoph Hellwig
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw, Michal Simek, Vincent Chen,
	linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ,
	linux-parisc-u79uwXL29TY76Z2rM5mHXA,
	linux-sh-u79uwXL29TY76Z2rM5mHXA,
	linux-hexagon-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b,
	openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b, Greentime Hu,
	linux-alpha-u79uwXL29TY76Z2rM5mHXA,
	sparclinux-u79uwXL29TY76Z2rM5mHXA,
	nios2-dev-g9ZBwUv/Ih/yUk5EbOjzuce+I+R0W71w,
	linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Switch to the generic noncoherent direct mapping implementation.

Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 arch/hexagon/Kconfig                   |   2 +
 arch/hexagon/include/asm/Kbuild        |   1 +
 arch/hexagon/include/asm/dma-mapping.h |  40 -------
 arch/hexagon/kernel/dma.c              | 148 ++-----------------------
 4 files changed, 11 insertions(+), 180 deletions(-)
 delete mode 100644 arch/hexagon/include/asm/dma-mapping.h

diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 37adb2003033..bcbdcb32935c 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -4,6 +4,7 @@ comment "Linux Kernel Configuration for Hexagon"
 
 config HEXAGON
 	def_bool y
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select HAVE_OPROFILE
 	# Other pending projects/to-do items.
 	# select HAVE_REGS_AND_STACK_ACCESS_API
@@ -28,6 +29,7 @@ config HEXAGON
 	select GENERIC_CLOCKEVENTS_BROADCAST
 	select MODULES_USE_ELF_RELA
 	select GENERIC_CPU_DEVICES
+	select DMA_NONCOHERENT_OPS
 	---help---
 	  Qualcomm Hexagon is a processor architecture designed for high
 	  performance and low power across a wide variety of applications.
diff --git a/arch/hexagon/include/asm/Kbuild b/arch/hexagon/include/asm/Kbuild
index e9743f689fb8..843a8086e980 100644
--- a/arch/hexagon/include/asm/Kbuild
+++ b/arch/hexagon/include/asm/Kbuild
@@ -5,6 +5,7 @@ generic-y += bugs.h
 generic-y += current.h
 generic-y += device.h
 generic-y += div64.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += extable.h
 generic-y += fb.h
diff --git a/arch/hexagon/include/asm/dma-mapping.h b/arch/hexagon/include/asm/dma-mapping.h
deleted file mode 100644
index 263f6acbfb0f..000000000000
--- a/arch/hexagon/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,40 +0,0 @@
-/*
- * DMA operations for the Hexagon architecture
- *
- * Copyright (c) 2010-2011, The Linux Foundation. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 and
- * only version 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
- * 02110-1301, USA.
- */
-
-#ifndef _ASM_DMA_MAPPING_H
-#define _ASM_DMA_MAPPING_H
-
-#include <linux/types.h>
-#include <linux/cache.h>
-#include <linux/mm.h>
-#include <linux/scatterlist.h>
-#include <linux/dma-debug.h>
-#include <asm/io.h>
-
-struct device;
-
-extern const struct dma_map_ops *dma_ops;
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-	return dma_ops;
-}
-
-#endif
diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index 9e46556a227d..ffc4ae8e126f 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -18,32 +18,19 @@
  * 02110-1301, USA.
  */
 
-#include <linux/dma-mapping.h>
-#include <linux/dma-direct.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/bootmem.h>
 #include <linux/genalloc.h>
-#include <asm/dma-mapping.h>
 #include <linux/module.h>
 #include <asm/page.h>
 
-#define HEXAGON_MAPPING_ERROR	0
-
-const struct dma_map_ops *dma_ops;
-EXPORT_SYMBOL(dma_ops);
-
-static inline void *dma_addr_to_virt(dma_addr_t dma_addr)
-{
-	return phys_to_virt((unsigned long) dma_addr);
-}
-
 static struct gen_pool *coherent_pool;
 
 
 /* Allocates from a pool of uncached memory that was reserved at boot time */
 
-static void *hexagon_dma_alloc_coherent(struct device *dev, size_t size,
-				 dma_addr_t *dma_addr, gfp_t flag,
-				 unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_addr,
+		gfp_t flag, unsigned long attrs)
 {
 	void *ret;
 
@@ -75,58 +62,17 @@ static void *hexagon_dma_alloc_coherent(struct device *dev, size_t size,
 	return ret;
 }
 
-static void hexagon_free_coherent(struct device *dev, size_t size, void *vaddr,
-				  dma_addr_t dma_addr, unsigned long attrs)
+void arch_dma_free(struct device *dev, size_t size, void *vaddr,
+		dma_addr_t dma_addr, unsigned long attrs)
 {
 	gen_pool_free(coherent_pool, (unsigned long) vaddr, size);
 }
 
-static int check_addr(const char *name, struct device *hwdev,
-		      dma_addr_t bus, size_t size)
-{
-	if (hwdev && hwdev->dma_mask && !dma_capable(hwdev, bus, size)) {
-		if (*hwdev->dma_mask >= DMA_BIT_MASK(32))
-			printk(KERN_ERR
-				"%s: overflow %Lx+%zu of device mask %Lx\n",
-				name, (long long)bus, size,
-				(long long)*hwdev->dma_mask);
-		return 0;
-	}
-	return 1;
-}
-
-static int hexagon_map_sg(struct device *hwdev, struct scatterlist *sg,
-			  int nents, enum dma_data_direction dir,
-			  unsigned long attrs)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	struct scatterlist *s;
-	int i;
-
-	WARN_ON(nents == 0 || sg[0].length == 0);
-
-	for_each_sg(sg, s, nents, i) {
-		s->dma_address = sg_phys(s);
-		if (!check_addr("map_sg", hwdev, s->dma_address, s->length))
-			return 0;
-
-		s->dma_length = s->length;
+	void *addr = phys_to_virt(paddr);
 
-		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-			continue;
-
-		flush_dcache_range(dma_addr_to_virt(s->dma_address),
-				   dma_addr_to_virt(s->dma_address + s->length));
-	}
-
-	return nents;
-}
-
-/*
- * address is virtual
- */
-static inline void dma_sync(void *addr, size_t size,
-			    enum dma_data_direction dir)
-{
 	switch (dir) {
 	case DMA_TO_DEVICE:
 		hexagon_clean_dcache_range((unsigned long) addr,
@@ -144,81 +90,3 @@ static inline void dma_sync(void *addr, size_t size,
 		BUG();
 	}
 }
-
-/**
- * hexagon_map_page() - maps an address for device DMA
- * @dev:	pointer to DMA device
- * @page:	pointer to page struct of DMA memory
- * @offset:	offset within page
- * @size:	size of memory to map
- * @dir:	transfer direction
- * @attrs:	pointer to DMA attrs (not used)
- *
- * Called to map a memory address to a DMA address prior
- * to accesses to/from device.
- *
- * We don't particularly have many hoops to jump through
- * so far.  Straight translation between phys and virtual.
- *
- * DMA is not cache coherent so sync is necessary; this
- * seems to be a convenient place to do it.
- *
- */
-static dma_addr_t hexagon_map_page(struct device *dev, struct page *page,
-				   unsigned long offset, size_t size,
-				   enum dma_data_direction dir,
-				   unsigned long attrs)
-{
-	dma_addr_t bus = page_to_phys(page) + offset;
-	WARN_ON(size == 0);
-
-	if (!check_addr("map_single", dev, bus, size))
-		return HEXAGON_MAPPING_ERROR;
-
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		dma_sync(dma_addr_to_virt(bus), size, dir);
-
-	return bus;
-}
-
-static void hexagon_sync_single_for_device(struct device *dev,
-					dma_addr_t dma_handle, size_t size,
-					enum dma_data_direction dir)
-{
-	dma_sync(dma_addr_to_virt(dma_handle), size, dir);
-}
-
-static void hexagon_sync_sg_for_device(struct device *dev,
-		struct scatterlist *sgl, int nents, enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int i;
-
-	for_each_sg(sgl, sg, nents, i)
-		hexagon_sync_single_for_device(dev, sg_dma_address(sg),
-				sg->length, dir);
-}
-
-
-static int hexagon_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-	return dma_addr == HEXAGON_MAPPING_ERROR;
-}
-
-const struct dma_map_ops hexagon_dma_ops = {
-	.alloc		= hexagon_dma_alloc_coherent,
-	.free		= hexagon_free_coherent,
-	.map_sg		= hexagon_map_sg,
-	.map_page	= hexagon_map_page,
-	.sync_single_for_device = hexagon_sync_single_for_device,
-	.sync_sg_for_device = hexagon_sync_sg_for_device,
-	.mapping_error	= hexagon_mapping_error,
-};
-
-void __init hexagon_dma_init(void)
-{
-	if (dma_ops)
-		return;
-
-	dma_ops = &hexagon_dma_ops;
-}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 04/25] m68k: use generic dma_noncoherent_ops
       [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
  2018-05-22 12:04   ` [PATCH 03/25] hexagon: use generic dma_noncoherent_ops Christoph Hellwig
@ 2018-05-22 12:04   ` Christoph Hellwig
  2018-05-22 12:04   ` [PATCH 07/25] nds32: remove the broken kmap code in nds32_dma_map_sg Christoph Hellwig
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw, Michal Simek, Vincent Chen,
	linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ,
	linux-parisc-u79uwXL29TY76Z2rM5mHXA,
	linux-sh-u79uwXL29TY76Z2rM5mHXA,
	linux-hexagon-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b,
	openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b, Greentime Hu,
	linux-alpha-u79uwXL29TY76Z2rM5mHXA,
	sparclinux-u79uwXL29TY76Z2rM5mHXA,
	nios2-dev-g9ZBwUv/Ih/yUk5EbOjzuce+I+R0W71w,
	linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Switch to the generic noncoherent direct mapping implementation.

Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 arch/m68k/Kconfig                   |  2 +
 arch/m68k/include/asm/Kbuild        |  1 +
 arch/m68k/include/asm/dma-mapping.h | 12 -----
 arch/m68k/kernel/dma.c              | 68 ++++-------------------------
 4 files changed, 11 insertions(+), 72 deletions(-)
 delete mode 100644 arch/m68k/include/asm/dma-mapping.h

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 785612b576f7..3f61327da2d5 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -2,6 +2,7 @@
 config M68K
 	bool
 	default y
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE if HAS_DMA
 	select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
 	select ARCH_NO_COHERENT_DMA_MMAP if !MMU
 	select HAVE_IDE
@@ -24,6 +25,7 @@ config M68K
 	select MODULES_USE_ELF_RELA
 	select OLD_SIGSUSPEND3
 	select OLD_SIGACTION
+	select DMA_NONCOHERENT_OPS if HAS_DMA
 
 config CPU_BIG_ENDIAN
 	def_bool y
diff --git a/arch/m68k/include/asm/Kbuild b/arch/m68k/include/asm/Kbuild
index 88a9d27df1ac..a853c00f1374 100644
--- a/arch/m68k/include/asm/Kbuild
+++ b/arch/m68k/include/asm/Kbuild
@@ -1,5 +1,6 @@
 generic-y += barrier.h
 generic-y += device.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
diff --git a/arch/m68k/include/asm/dma-mapping.h b/arch/m68k/include/asm/dma-mapping.h
deleted file mode 100644
index e3722ed04fbb..000000000000
--- a/arch/m68k/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,12 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _M68K_DMA_MAPPING_H
-#define _M68K_DMA_MAPPING_H
-
-extern const struct dma_map_ops m68k_dma_ops;
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-        return &m68k_dma_ops;
-}
-
-#endif  /* _M68K_DMA_MAPPING_H */
diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
index c01b9b8f97bf..3d561c577d35 100644
--- a/arch/m68k/kernel/dma.c
+++ b/arch/m68k/kernel/dma.c
@@ -6,7 +6,7 @@
 
 #undef DEBUG
 
-#include <linux/dma-mapping.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/device.h>
 #include <linux/kernel.h>
 #include <linux/scatterlist.h>
@@ -18,7 +18,7 @@
 
 #if defined(CONFIG_MMU) && !defined(CONFIG_COLDFIRE)
 
-static void *m68k_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
 		gfp_t flag, unsigned long attrs)
 {
 	struct page *page, **map;
@@ -61,7 +61,7 @@ static void *m68k_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
 	return addr;
 }
 
-static void m68k_dma_free(struct device *dev, size_t size, void *addr,
+void arch_dma_free(struct device *dev, size_t size, void *addr,
 		dma_addr_t handle, unsigned long attrs)
 {
 	pr_debug("dma_free_coherent: %p, %x\n", addr, handle);
@@ -72,8 +72,8 @@ static void m68k_dma_free(struct device *dev, size_t size, void *addr,
 
 #include <asm/cacheflush.h>
 
-static void *m68k_dma_alloc(struct device *dev, size_t size,
-		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+		gfp_t gfp, unsigned long attrs)
 {
 	void *ret;
 
@@ -88,7 +88,7 @@ static void *m68k_dma_alloc(struct device *dev, size_t size,
 	return ret;
 }
 
-static void m68k_dma_free(struct device *dev, size_t size, void *vaddr,
+void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, unsigned long attrs)
 {
 	free_pages((unsigned long)vaddr, get_order(size));
@@ -96,8 +96,8 @@ static void m68k_dma_free(struct device *dev, size_t size, void *vaddr,
 
 #endif /* CONFIG_MMU && !CONFIG_COLDFIRE */
 
-static void m68k_dma_sync_single_for_device(struct device *dev,
-		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t handle,
+		size_t size, enum dma_data_direction dir)
 {
 	switch (dir) {
 	case DMA_BIDIRECTIONAL:
@@ -113,55 +113,3 @@ static void m68k_dma_sync_single_for_device(struct device *dev,
 		break;
 	}
 }
-
-static void m68k_dma_sync_sg_for_device(struct device *dev,
-		struct scatterlist *sglist, int nents, enum dma_data_direction dir)
-{
-	int i;
-	struct scatterlist *sg;
-
-	for_each_sg(sglist, sg, nents, i) {
-		dma_sync_single_for_device(dev, sg->dma_address, sg->length,
-					   dir);
-	}
-}
-
-static dma_addr_t m68k_dma_map_page(struct device *dev, struct page *page,
-		unsigned long offset, size_t size, enum dma_data_direction dir,
-		unsigned long attrs)
-{
-	dma_addr_t handle = page_to_phys(page) + offset;
-
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		dma_sync_single_for_device(dev, handle, size, dir);
-
-	return handle;
-}
-
-static int m68k_dma_map_sg(struct device *dev, struct scatterlist *sglist,
-		int nents, enum dma_data_direction dir, unsigned long attrs)
-{
-	int i;
-	struct scatterlist *sg;
-
-	for_each_sg(sglist, sg, nents, i) {
-		sg->dma_address = sg_phys(sg);
-
-		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-			continue;
-
-		dma_sync_single_for_device(dev, sg->dma_address, sg->length,
-					   dir);
-	}
-	return nents;
-}
-
-const struct dma_map_ops m68k_dma_ops = {
-	.alloc			= m68k_dma_alloc,
-	.free			= m68k_dma_free,
-	.map_page		= m68k_dma_map_page,
-	.map_sg			= m68k_dma_map_sg,
-	.sync_single_for_device	= m68k_dma_sync_single_for_device,
-	.sync_sg_for_device	= m68k_dma_sync_sg_for_device,
-};
-EXPORT_SYMBOL(m68k_dma_ops);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 05/25] microblaze: use generic dma_noncoherent_ops
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (2 preceding siblings ...)
       [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 06/25] microblaze: remove the consistent_sync and consistent_sync_page Christoph Hellwig
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

Switch to the generic noncoherent direct mapping implementation.

This removes the direction-based optimizations in
sync_{single,sg}_for_{cpu,device} which were marked untested and
do not match the usually very well tested {un,}map_{single,sg}
implementations.
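
To make the asymmetry explicit (condensed from the code deleted below):

	/* the removed sync_single_for_device only acted on one direction: */
	if (direction == DMA_TO_DEVICE)
		__dma_sync(dma_handle, size, direction);

	/* while map_page always synced, whatever the direction: */
	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		__dma_sync(page_to_phys(page) + offset, size, direction);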

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/microblaze/Kconfig                   |   4 +
 arch/microblaze/include/asm/Kbuild        |   1 +
 arch/microblaze/include/asm/dma-mapping.h |  28 -----
 arch/microblaze/include/asm/pgtable.h     |   2 -
 arch/microblaze/kernel/dma.c              | 144 ++--------------------
 arch/microblaze/mm/consistent.c           |   9 +-
 6 files changed, 22 insertions(+), 166 deletions(-)
 delete mode 100644 arch/microblaze/include/asm/dma-mapping.h

diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index d14782100088..848e31a86ba5 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -1,6 +1,8 @@
 config MICROBLAZE
 	def_bool y
 	select ARCH_HAS_GCOV_PROFILE_ALL
+	select ARCH_HAS_SYNC_DMA_FOR_CPU
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_NO_COHERENT_DMA_MMAP if !MMU
 	select ARCH_WANT_IPC_PARSE_VERSION
@@ -8,6 +10,8 @@ config MICROBLAZE
 	select TIMER_OF
 	select CLONE_BACKWARDS3
 	select COMMON_CLK
+	select DMA_NONCOHERENT_OPS
+	select DMA_NONCOHERENT_MMAP
 	select GENERIC_ATOMIC64
 	select GENERIC_CLOCKEVENTS
 	select GENERIC_CPU_DEVICES
diff --git a/arch/microblaze/include/asm/Kbuild b/arch/microblaze/include/asm/Kbuild
index 3c80a5a308ed..8d3e71f43a3e 100644
--- a/arch/microblaze/include/asm/Kbuild
+++ b/arch/microblaze/include/asm/Kbuild
@@ -4,6 +4,7 @@ generic-y += bug.h
 generic-y += bugs.h
 generic-y += device.h
 generic-y += div64.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
diff --git a/arch/microblaze/include/asm/dma-mapping.h b/arch/microblaze/include/asm/dma-mapping.h
deleted file mode 100644
index add50c1373bf..000000000000
--- a/arch/microblaze/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,28 +0,0 @@
-/*
- * Implements the generic device dma API for microblaze and the pci
- *
- * Copyright (C) 2009-2010 Michal Simek <monstr@monstr.eu>
- * Copyright (C) 2009-2010 PetaLogix
- *
- * This file is subject to the terms and conditions of the GNU General
- * Public License. See the file COPYING in the main directory of this
- * archive for more details.
- *
- * This file is base on powerpc and x86 dma-mapping.h versions
- * Copyright (C) 2004 IBM
- */
-
-#ifndef _ASM_MICROBLAZE_DMA_MAPPING_H
-#define _ASM_MICROBLAZE_DMA_MAPPING_H
-
-/*
- * Available generic sets of operations
- */
-extern const struct dma_map_ops dma_nommu_ops;
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-	return &dma_nommu_ops;
-}
-
-#endif	/* _ASM_MICROBLAZE_DMA_MAPPING_H */
diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h
index db8b1fa83452..8a2e654b709f 100644
--- a/arch/microblaze/include/asm/pgtable.h
+++ b/arch/microblaze/include/asm/pgtable.h
@@ -553,8 +553,6 @@ void __init *early_get_page(void);
 
 extern unsigned long ioremap_bot, ioremap_base;
 
-void *consistent_alloc(gfp_t gfp, size_t size, dma_addr_t *dma_handle);
-void consistent_free(size_t size, void *vaddr);
 void consistent_sync(void *vaddr, size_t size, int direction);
 void consistent_sync_page(struct page *page, unsigned long offset,
 	size_t size, int direction);
diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index 3145e7dc8ab1..71032cf64669 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -8,29 +8,15 @@
  */
 
 #include <linux/device.h>
-#include <linux/dma-mapping.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/gfp.h>
 #include <linux/dma-debug.h>
 #include <linux/export.h>
 #include <linux/bug.h>
 #include <asm/cacheflush.h>
 
-static void *dma_nommu_alloc_coherent(struct device *dev, size_t size,
-				       dma_addr_t *dma_handle, gfp_t flag,
-				       unsigned long attrs)
-{
-	return consistent_alloc(flag, size, dma_handle);
-}
-
-static void dma_nommu_free_coherent(struct device *dev, size_t size,
-				     void *vaddr, dma_addr_t dma_handle,
-				     unsigned long attrs)
-{
-	consistent_free(size, vaddr);
-}
-
-static inline void __dma_sync(unsigned long paddr,
-			      size_t size, enum dma_data_direction direction)
+static void __dma_sync(struct device *dev, phys_addr_t paddr, size_t size,
+		enum dma_data_direction direction)
 {
 	switch (direction) {
 	case DMA_TO_DEVICE:
@@ -45,113 +31,21 @@ static inline void __dma_sync(unsigned long paddr,
 	}
 }
 
-static int dma_nommu_map_sg(struct device *dev, struct scatterlist *sgl,
-			     int nents, enum dma_data_direction direction,
-			     unsigned long attrs)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	struct scatterlist *sg;
-	int i;
-
-	/* FIXME this part of code is untested */
-	for_each_sg(sgl, sg, nents, i) {
-		sg->dma_address = sg_phys(sg);
-
-		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-			continue;
-
-		__dma_sync(sg_phys(sg), sg->length, direction);
-	}
-
-	return nents;
-}
-
-static inline dma_addr_t dma_nommu_map_page(struct device *dev,
-					     struct page *page,
-					     unsigned long offset,
-					     size_t size,
-					     enum dma_data_direction direction,
-					     unsigned long attrs)
-{
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		__dma_sync(page_to_phys(page) + offset, size, direction);
-	return page_to_phys(page) + offset;
+	__dma_sync(dev, paddr, size, dir);
 }
 
-static inline void dma_nommu_unmap_page(struct device *dev,
-					 dma_addr_t dma_address,
-					 size_t size,
-					 enum dma_data_direction direction,
-					 unsigned long attrs)
+void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-/* There is not necessary to do cache cleanup
- *
- * phys_to_virt is here because in __dma_sync_page is __virt_to_phys and
- * dma_address is physical address
- */
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		__dma_sync(dma_address, size, direction);
+	__dma_sync(dev, paddr, size, dir);
 }
 
-static inline void
-dma_nommu_sync_single_for_cpu(struct device *dev,
-			       dma_addr_t dma_handle, size_t size,
-			       enum dma_data_direction direction)
-{
-	/*
-	 * It's pointless to flush the cache as the memory segment
-	 * is given to the CPU
-	 */
-
-	if (direction == DMA_FROM_DEVICE)
-		__dma_sync(dma_handle, size, direction);
-}
-
-static inline void
-dma_nommu_sync_single_for_device(struct device *dev,
-				  dma_addr_t dma_handle, size_t size,
-				  enum dma_data_direction direction)
-{
-	/*
-	 * It's pointless to invalidate the cache if the device isn't
-	 * supposed to write to the relevant region
-	 */
-
-	if (direction == DMA_TO_DEVICE)
-		__dma_sync(dma_handle, size, direction);
-}
-
-static inline void
-dma_nommu_sync_sg_for_cpu(struct device *dev,
-			   struct scatterlist *sgl, int nents,
-			   enum dma_data_direction direction)
-{
-	struct scatterlist *sg;
-	int i;
-
-	/* FIXME this part of code is untested */
-	if (direction == DMA_FROM_DEVICE)
-		for_each_sg(sgl, sg, nents, i)
-			__dma_sync(sg->dma_address, sg->length, direction);
-}
-
-static inline void
-dma_nommu_sync_sg_for_device(struct device *dev,
-			      struct scatterlist *sgl, int nents,
-			      enum dma_data_direction direction)
-{
-	struct scatterlist *sg;
-	int i;
-
-	/* FIXME this part of code is untested */
-	if (direction == DMA_TO_DEVICE)
-		for_each_sg(sgl, sg, nents, i)
-			__dma_sync(sg->dma_address, sg->length, direction);
-}
-
-static
-int dma_nommu_mmap_coherent(struct device *dev, struct vm_area_struct *vma,
-			     void *cpu_addr, dma_addr_t handle, size_t size,
-			     unsigned long attrs)
+int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
+		void *cpu_addr, dma_addr_t handle, size_t size,
+		unsigned long attrs)
 {
 #ifdef CONFIG_MMU
 	unsigned long user_count = vma_pages(vma);
@@ -170,17 +64,3 @@ int dma_nommu_mmap_coherent(struct device *dev, struct vm_area_struct *vma,
 	return -ENXIO;
 #endif
 }
-
-const struct dma_map_ops dma_nommu_ops = {
-	.alloc			= dma_nommu_alloc_coherent,
-	.free			= dma_nommu_free_coherent,
-	.mmap			= dma_nommu_mmap_coherent,
-	.map_sg			= dma_nommu_map_sg,
-	.map_page		= dma_nommu_map_page,
-	.unmap_page		= dma_nommu_unmap_page,
-	.sync_single_for_cpu	= dma_nommu_sync_single_for_cpu,
-	.sync_single_for_device	= dma_nommu_sync_single_for_device,
-	.sync_sg_for_cpu	= dma_nommu_sync_sg_for_cpu,
-	.sync_sg_for_device	= dma_nommu_sync_sg_for_device,
-};
-EXPORT_SYMBOL(dma_nommu_ops);
diff --git a/arch/microblaze/mm/consistent.c b/arch/microblaze/mm/consistent.c
index b06c3a7faf20..b9a9c8c3397b 100644
--- a/arch/microblaze/mm/consistent.c
+++ b/arch/microblaze/mm/consistent.c
@@ -33,6 +33,7 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 #include <linux/gfp.h>
+#include <linux/dma-noncoherent.h>
 
 #include <asm/pgalloc.h>
 #include <linux/io.h>
@@ -59,7 +60,8 @@
  * uncached region.  This will no doubt cause big problems if memory allocated
  * here is not also freed properly. -- JW
  */
-void *consistent_alloc(gfp_t gfp, size_t size, dma_addr_t *dma_handle)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+		gfp_t gfp, unsigned long attrs)
 {
 	unsigned long order, vaddr;
 	void *ret;
@@ -154,7 +156,6 @@ void *consistent_alloc(gfp_t gfp, size_t size, dma_addr_t *dma_handle)
 
 	return ret;
 }
-EXPORT_SYMBOL(consistent_alloc);
 
 #ifdef CONFIG_MMU
 static pte_t *consistent_virt_to_pte(void *vaddr)
@@ -178,7 +179,8 @@ unsigned long consistent_virt_to_pfn(void *vaddr)
 /*
  * free page(s) as defined by the above mapping.
  */
-void consistent_free(size_t size, void *vaddr)
+void arch_dma_free(struct device *dev, size_t size, void *vaddr,
+		dma_addr_t dma_addr, unsigned long attrs)
 {
 	struct page *page;
 
@@ -218,7 +220,6 @@ void consistent_free(size_t size, void *vaddr)
 	flush_tlb_all();
 #endif
 }
-EXPORT_SYMBOL(consistent_free);
 
 /*
  * make an area consistent.
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 06/25] microblaze: remove the consistent_sync and consistent_sync_page
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 05/25] microblaze: use generic dma_noncoherent_ops Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 09/25] nds32: implement the unmap_sg DMA operation Christoph Hellwig
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, linux-xtensa, Michal Simek, Vincent Chen,
	linux-c6x-dev, linux-parisc, linux-sh, linux-hexagon,
	linux-kernel, linux-m68k, openrisc, Greentime Hu, linux-alpha,
	sparclinux, nios2-dev, linux-snps-arc, linux-arm-kernel

Both are now unused.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/microblaze/include/asm/pgtable.h |  3 --
 arch/microblaze/mm/consistent.c       | 45 ---------------------------
 2 files changed, 48 deletions(-)

diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h
index 8a2e654b709f..7b650ab14fa0 100644
--- a/arch/microblaze/include/asm/pgtable.h
+++ b/arch/microblaze/include/asm/pgtable.h
@@ -553,9 +553,6 @@ void __init *early_get_page(void);
 
 extern unsigned long ioremap_bot, ioremap_base;
 
-void consistent_sync(void *vaddr, size_t size, int direction);
-void consistent_sync_page(struct page *page, unsigned long offset,
-	size_t size, int direction);
 unsigned long consistent_virt_to_pfn(void *vaddr);
 
 void setup_memory(void);
diff --git a/arch/microblaze/mm/consistent.c b/arch/microblaze/mm/consistent.c
index b9a9c8c3397b..c9a278ac795a 100644
--- a/arch/microblaze/mm/consistent.c
+++ b/arch/microblaze/mm/consistent.c
@@ -220,48 +220,3 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 	flush_tlb_all();
 #endif
 }
-
-/*
- * make an area consistent.
- */
-void consistent_sync(void *vaddr, size_t size, int direction)
-{
-	unsigned long start;
-	unsigned long end;
-
-	start = (unsigned long)vaddr;
-
-	/* Convert start address back down to unshadowed memory region */
-#ifdef CONFIG_XILINX_UNCACHED_SHADOW
-	start &= ~UNCACHED_SHADOW_MASK;
-#endif
-	end = start + size;
-
-	switch (direction) {
-	case PCI_DMA_NONE:
-		BUG();
-	case PCI_DMA_FROMDEVICE:	/* invalidate only */
-		invalidate_dcache_range(start, end);
-		break;
-	case PCI_DMA_TODEVICE:		/* writeback only */
-		flush_dcache_range(start, end);
-		break;
-	case PCI_DMA_BIDIRECTIONAL:	/* writeback and invalidate */
-		flush_dcache_range(start, end);
-		break;
-	}
-}
-EXPORT_SYMBOL(consistent_sync);
-
-/*
- * consistent_sync_page makes memory consistent. identical
- * to consistent_sync, but takes a struct page instead of a
- * virtual address
- */
-void consistent_sync_page(struct page *page, unsigned long offset,
-	size_t size, int direction)
-{
-	unsigned long start = (unsigned long)page_address(page) + offset;
-	consistent_sync((void *)start, size, direction);
-}
-EXPORT_SYMBOL(consistent_sync_page);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 07/25] nds32: remove the broken kmap code in nds32_dma_map_sg
       [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
  2018-05-22 12:04   ` [PATCH 03/25] hexagon: use generic dma_noncoherent_ops Christoph Hellwig
  2018-05-22 12:04   ` [PATCH 04/25] m68k: " Christoph Hellwig
@ 2018-05-22 12:04   ` Christoph Hellwig
  2018-05-22 12:04   ` [PATCH 08/25] nds32: consolidate DMA cache maintainance routines Christoph Hellwig
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw, Michal Simek, Vincent Chen,
	linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ,
	linux-parisc-u79uwXL29TY76Z2rM5mHXA,
	linux-sh-u79uwXL29TY76Z2rM5mHXA,
	linux-hexagon-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b,
	openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b, Greentime Hu,
	linux-alpha-u79uwXL29TY76Z2rM5mHXA,
	sparclinux-u79uwXL29TY76Z2rM5mHXA,
	nios2-dev-g9ZBwUv/Ih/yUk5EbOjzuce+I+R0W71w,
	linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

nds32_dma_map_sg is the only one of the various DMA operations that tries
to deal with highmem (the single page variants and the SG sync routines
lack any highmem handling, and SG unmap is entirely unimplemented), and
it does so without taking into account S/G list items that are bigger
than a page, which are legal and can happen frequently.

Remove this code for now - if highmem support on nds32 becomes a real
thing it needs to be added back as a loop over pages in the newly
consolidated code that deals with all operations.
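
If that ever happens, one possible shape for it is a per-segment loop
along these lines (untested sketch only, written against the current
consistent_sync interface; the helper name is made up):

/* untested sketch - hypothetical helper, not part of this series */
static void nds32_sync_sg_segment(struct scatterlist *sg,
				  enum dma_data_direction dir, int master)
{
	struct page *page = sg_page(sg) + (sg->offset >> PAGE_SHIFT);
	unsigned int offset = sg->offset & ~PAGE_MASK;
	size_t left = sg->length;

	while (left) {
		size_t len = min_t(size_t, left, PAGE_SIZE - offset);
		void *virt = kmap_atomic(page);

		consistent_sync(virt + offset, len, dir, master);
		kunmap_atomic(virt);

		page = nth_page(page, 1);
		offset = 0;
		left -= len;
	}
}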

Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 arch/nds32/kernel/dma.c | 21 +++------------------
 1 file changed, 3 insertions(+), 18 deletions(-)

diff --git a/arch/nds32/kernel/dma.c b/arch/nds32/kernel/dma.c
index d291800fc621..e1bf7206e015 100644
--- a/arch/nds32/kernel/dma.c
+++ b/arch/nds32/kernel/dma.c
@@ -393,24 +393,9 @@ static int nds32_dma_map_sg(struct device *dev, struct scatterlist *sg,
 	int i;
 
 	for (i = 0; i < nents; i++, sg++) {
-		void *virt;
-		unsigned long pfn;
-		struct page *page = sg_page(sg);
-
-		sg->dma_address = sg_phys(sg);
-		pfn = page_to_pfn(page) + sg->offset / PAGE_SIZE;
-		page = pfn_to_page(pfn);
-		if (PageHighMem(page)) {
-			virt = kmap_atomic(page);
-			consistent_sync(virt, sg->length, dir, FOR_CPU);
-			kunmap_atomic(virt);
-		} else {
-			if (sg->offset > PAGE_SIZE)
-				panic("sg->offset:%08x > PAGE_SIZE\n",
-				      sg->offset);
-			virt = page_address(page) + sg->offset;
-			consistent_sync(virt, sg->length, dir, FOR_CPU);
-		}
+		char *virt =
+		    page_address((struct page *)sg->page_link) + sg->offset;
+		consistent_sync(virt, sg->length, dir, FOR_CPU);
 	}
 	return nents;
 }
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 08/25] nds32: consolidate DMA cache maintainance routines
       [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
                     ` (2 preceding siblings ...)
  2018-05-22 12:04   ` [PATCH 07/25] nds32: remove the broken kmap code in nds32_dma_map_sg Christoph Hellwig
@ 2018-05-22 12:04   ` Christoph Hellwig
  2018-05-22 12:04   ` [PATCH 10/25] nds32: use generic dma_noncoherent_ops Christoph Hellwig
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw, Michal Simek, Vincent Chen,
	linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ,
	linux-parisc-u79uwXL29TY76Z2rM5mHXA,
	linux-sh-u79uwXL29TY76Z2rM5mHXA,
	linux-hexagon-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b,
	openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b, Greentime Hu,
	linux-alpha-u79uwXL29TY76Z2rM5mHXA,
	sparclinux-u79uwXL29TY76Z2rM5mHXA,
	nios2-dev-g9ZBwUv/Ih/yUk5EbOjzuce+I+R0W71w,
	linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Make sure all other DMA methods call nds32_dma_sync_single_for_{device,cpu}
to perform cache maintenance, and remove the consistent_sync helper that
implemented both with entirely separate code selected by an argument.

Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 arch/nds32/kernel/dma.c | 140 +++++++++++++++++-----------------------
 1 file changed, 61 insertions(+), 79 deletions(-)

diff --git a/arch/nds32/kernel/dma.c b/arch/nds32/kernel/dma.c
index e1bf7206e015..4e6fb4ffd3f7 100644
--- a/arch/nds32/kernel/dma.c
+++ b/arch/nds32/kernel/dma.c
@@ -22,11 +22,6 @@
 static pte_t *consistent_pte;
 static DEFINE_RAW_SPINLOCK(consistent_lock);
 
-enum master_type {
-	FOR_CPU = 0,
-	FOR_DEVICE = 1,
-};
-
 /*
  * VM region handling support.
  *
@@ -333,15 +328,53 @@ static int __init consistent_init(void)
 }
 
 core_initcall(consistent_init);
-static void consistent_sync(void *vaddr, size_t size, int direction, int master_type);
+
+static void
+nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
+				 size_t size, enum dma_data_direction dir)
+{
+	unsigned long start = (unsigned long)phys_to_virt(handle);
+
+	switch (dir) {
+	case DMA_FROM_DEVICE:
+		break;
+	case DMA_TO_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		cpu_dma_wb_range(start, start + size);
+		break;
+	default:
+		BUG();
+	}
+}
+
+static void
+nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
+			      size_t size, enum dma_data_direction dir)
+{
+	unsigned long start = (unsigned long)phys_to_virt(handle);
+
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		break;
+	case DMA_FROM_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		cpu_dma_inval_range(start, start + size);
+		break;
+	default:
+		BUG();
+	}
+}
+
 static dma_addr_t nds32_dma_map_page(struct device *dev, struct page *page,
 				     unsigned long offset, size_t size,
 				     enum dma_data_direction dir,
 				     unsigned long attrs)
 {
+	dma_addr_t dma_addr = page_to_phys(page) + offset;
+
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		consistent_sync((void *)(page_address(page) + offset), size, dir, FOR_DEVICE);
-	return page_to_phys(page) + offset;
+		nds32_dma_sync_single_for_device(dev, dma_addr, size, dir);
+	return dma_addr;
 }
 
 static void nds32_dma_unmap_page(struct device *dev, dma_addr_t handle,
@@ -349,75 +382,19 @@ static void nds32_dma_unmap_page(struct device *dev, dma_addr_t handle,
 				 unsigned long attrs)
 {
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		consistent_sync(phys_to_virt(handle), size, dir, FOR_CPU);
+		nds32_dma_sync_single_for_cpu(dev, handle, size, dir);
 }
 
-/*
- * Make an area consistent for devices.
- */
-static void consistent_sync(void *vaddr, size_t size, int direction, int master_type)
-{
-	unsigned long start = (unsigned long)vaddr;
-	unsigned long end = start + size;
-
-	if (master_type == FOR_CPU) {
-		switch (direction) {
-		case DMA_TO_DEVICE:
-			break;
-		case DMA_FROM_DEVICE:
-		case DMA_BIDIRECTIONAL:
-			cpu_dma_inval_range(start, end);
-			break;
-		default:
-			BUG();
-		}
-	} else {
-		/* FOR_DEVICE */
-		switch (direction) {
-		case DMA_FROM_DEVICE:
-			break;
-		case DMA_TO_DEVICE:
-		case DMA_BIDIRECTIONAL:
-			cpu_dma_wb_range(start, end);
-			break;
-		default:
-			BUG();
-		}
-	}
-}
-
-static int nds32_dma_map_sg(struct device *dev, struct scatterlist *sg,
-			    int nents, enum dma_data_direction dir,
-			    unsigned long attrs)
+static void
+nds32_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
+			     int nents, enum dma_data_direction dir)
 {
 	int i;
 
 	for (i = 0; i < nents; i++, sg++) {
-		char *virt =
-		    page_address((struct page *)sg->page_link) + sg->offset;
-		consistent_sync(virt, sg->length, dir, FOR_CPU);
+		nds32_dma_sync_single_for_device(dev, sg_dma_address(sg),
+				sg->length, dir);
 	}
-	return nents;
-}
-
-static void nds32_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
-			       int nhwentries, enum dma_data_direction dir,
-			       unsigned long attrs)
-{
-}
-
-static void
-nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
-			      size_t size, enum dma_data_direction dir)
-{
-	consistent_sync((void *)phys_to_virt(handle), size, dir, FOR_CPU);
-}
-
-static void
-nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
-				 size_t size, enum dma_data_direction dir)
-{
-	consistent_sync((void *)phys_to_virt(handle), size, dir, FOR_DEVICE);
 }
 
 static void
@@ -427,23 +404,28 @@ nds32_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, int nents,
 	int i;
 
 	for (i = 0; i < nents; i++, sg++) {
-		char *virt =
-		    page_address((struct page *)sg->page_link) + sg->offset;
-		consistent_sync(virt, sg->length, dir, FOR_CPU);
+		nds32_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
+				sg->length, dir);
 	}
 }
 
-static void
-nds32_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
-			     int nents, enum dma_data_direction dir)
+static int nds32_dma_map_sg(struct device *dev, struct scatterlist *sg,
+			    int nents, enum dma_data_direction dir,
+			    unsigned long attrs)
 {
 	int i;
 
 	for (i = 0; i < nents; i++, sg++) {
-		char *virt =
-		    page_address((struct page *)sg->page_link) + sg->offset;
-		consistent_sync(virt, sg->length, dir, FOR_DEVICE);
+		nds32_dma_sync_single_for_device(dev, sg_dma_address(sg),
+				sg->length, dir);
 	}
+	return nents;
+}
+
+static void nds32_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
+			       int nhwentries, enum dma_data_direction dir,
+			       unsigned long attrs)
+{
 }
 
 struct dma_map_ops nds32_dma_ops = {
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 09/25] nds32: implement the unmap_sg DMA operation
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 06/25] microblaze: remove the consistent_sync and consistent_sync_page Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 13/25] openrisc: remove the no-op unmap_page and unmap_sg DMA operations Christoph Hellwig
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

This matches the implementation of the more commonly used unmap_single
routines and the sync_sg_for_cpu method, which should provide equivalent
cache maintenance.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/nds32/kernel/dma.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/nds32/kernel/dma.c b/arch/nds32/kernel/dma.c
index 4e6fb4ffd3f7..43d7fd432bb6 100644
--- a/arch/nds32/kernel/dma.c
+++ b/arch/nds32/kernel/dma.c
@@ -426,6 +426,12 @@ static void nds32_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
 			       int nhwentries, enum dma_data_direction dir,
 			       unsigned long attrs)
 {
+	int i;
+
+	for (i = 0; i < nhwentries; i++, sg++) {
+		nds32_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
+				sg->length, dir);
+	}
 }
 
 struct dma_map_ops nds32_dma_ops = {
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 10/25] nds32: use generic dma_noncoherent_ops
       [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
                     ` (3 preceding siblings ...)
  2018-05-22 12:04   ` [PATCH 08/25] nds32: consolidate DMA cache maintainance routines Christoph Hellwig
@ 2018-05-22 12:04   ` Christoph Hellwig
  2018-05-22 12:04   ` [PATCH 11/25] nios2: " Christoph Hellwig
  2018-05-22 12:04   ` [PATCH 12/25] openrisc: remove the sync_single_for_cpu DMA operation Christoph Hellwig
  6 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw, Michal Simek, Vincent Chen,
	linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ,
	linux-parisc-u79uwXL29TY76Z2rM5mHXA,
	linux-sh-u79uwXL29TY76Z2rM5mHXA,
	linux-hexagon-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b,
	openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b, Greentime Hu,
	linux-alpha-u79uwXL29TY76Z2rM5mHXA,
	sparclinux-u79uwXL29TY76Z2rM5mHXA,
	nios2-dev-g9ZBwUv/Ih/yUk5EbOjzuce+I+R0W71w,
	linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Switch to the generic noncoherent direct mapping implementation.

Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 arch/nds32/Kconfig                   |   3 +
 arch/nds32/include/asm/Kbuild        |   1 +
 arch/nds32/include/asm/dma-mapping.h |  14 ----
 arch/nds32/kernel/dma.c              | 113 +++------------------------
 4 files changed, 15 insertions(+), 116 deletions(-)
 delete mode 100644 arch/nds32/include/asm/dma-mapping.h

diff --git a/arch/nds32/Kconfig b/arch/nds32/Kconfig
index 249f38d3388f..67d0ac0a989c 100644
--- a/arch/nds32/Kconfig
+++ b/arch/nds32/Kconfig
@@ -5,10 +5,13 @@
 
 config NDS32
         def_bool y
+	select ARCH_HAS_SYNC_DMA_FOR_CPU
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select ARCH_WANT_FRAME_POINTERS if FTRACE
 	select CLKSRC_MMIO
 	select CLONE_BACKWARDS
 	select COMMON_CLK
+	select DMA_NONCOHERENT_OPS
 	select GENERIC_ATOMIC64
 	select GENERIC_CPU_DEVICES
 	select GENERIC_CLOCKEVENTS
diff --git a/arch/nds32/include/asm/Kbuild b/arch/nds32/include/asm/Kbuild
index 06bdf8167f5a..b3e951f805f8 100644
--- a/arch/nds32/include/asm/Kbuild
+++ b/arch/nds32/include/asm/Kbuild
@@ -13,6 +13,7 @@ generic-y += cputime.h
 generic-y += device.h
 generic-y += div64.h
 generic-y += dma.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += errno.h
 generic-y += exec.h
diff --git a/arch/nds32/include/asm/dma-mapping.h b/arch/nds32/include/asm/dma-mapping.h
deleted file mode 100644
index 2dd47d245c25..000000000000
--- a/arch/nds32/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,14 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-// Copyright (C) 2005-2017 Andes Technology Corporation
-
-#ifndef ASMNDS32_DMA_MAPPING_H
-#define ASMNDS32_DMA_MAPPING_H
-
-extern struct dma_map_ops nds32_dma_ops;
-
-static inline struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-	return &nds32_dma_ops;
-}
-
-#endif
diff --git a/arch/nds32/kernel/dma.c b/arch/nds32/kernel/dma.c
index 43d7fd432bb6..2b3167a1f259 100644
--- a/arch/nds32/kernel/dma.c
+++ b/arch/nds32/kernel/dma.c
@@ -3,17 +3,14 @@
 
 #include <linux/types.h>
 #include <linux/mm.h>
-#include <linux/export.h>
 #include <linux/string.h>
-#include <linux/scatterlist.h>
-#include <linux/dma-mapping.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/io.h>
 #include <linux/cache.h>
 #include <linux/highmem.h>
 #include <linux/slab.h>
 #include <asm/cacheflush.h>
 #include <asm/tlbflush.h>
-#include <asm/dma-mapping.h>
 #include <asm/proc-fns.h>
 
 /*
@@ -119,10 +116,8 @@ static struct arch_vm_region *vm_region_find(struct arch_vm_region *head,
 	return c;
 }
 
-/* FIXME: attrs is not used. */
-static void *nds32_dma_alloc_coherent(struct device *dev, size_t size,
-				      dma_addr_t * handle, gfp_t gfp,
-				      unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
+		gfp_t gfp, unsigned long attrs)
 {
 	struct page *page;
 	struct arch_vm_region *c;
@@ -227,8 +222,8 @@ static void *nds32_dma_alloc_coherent(struct device *dev, size_t size,
 	return NULL;
 }
 
-static void nds32_dma_free(struct device *dev, size_t size, void *cpu_addr,
-			   dma_addr_t handle, unsigned long attrs)
+void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
+		dma_addr_t handle, unsigned long attrs)
 {
 	struct arch_vm_region *c;
 	unsigned long flags, addr;
@@ -329,11 +324,10 @@ static int __init consistent_init(void)
 
 core_initcall(consistent_init);
 
-static void
-nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
-				 size_t size, enum dma_data_direction dir)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	unsigned long start = (unsigned long)phys_to_virt(handle);
+	unsigned long start = (unsigned long)phys_to_virt(paddr);
 
 	switch (dir) {
 	case DMA_FROM_DEVICE:
@@ -347,11 +341,10 @@ nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
 	}
 }
 
-static void
-nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
-			      size_t size, enum dma_data_direction dir)
+void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	unsigned long start = (unsigned long)phys_to_virt(handle);
+	unsigned long start = (unsigned long)phys_to_virt(paddr);
 
 	switch (dir) {
 	case DMA_TO_DEVICE:
@@ -364,87 +357,3 @@ nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
 		BUG();
 	}
 }
-
-static dma_addr_t nds32_dma_map_page(struct device *dev, struct page *page,
-				     unsigned long offset, size_t size,
-				     enum dma_data_direction dir,
-				     unsigned long attrs)
-{
-	dma_addr_t dma_addr = page_to_phys(page) + offset;
-
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		nds32_dma_sync_single_for_device(dev, handle, size, dir);
-	return dma_addr;
-}
-
-static void nds32_dma_unmap_page(struct device *dev, dma_addr_t handle,
-				 size_t size, enum dma_data_direction dir,
-				 unsigned long attrs)
-{
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		nds32_dma_sync_single_for_cpu(dev, handle, size, dir);
-}
-
-static void
-nds32_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
-			     int nents, enum dma_data_direction dir)
-{
-	int i;
-
-	for (i = 0; i < nents; i++, sg++) {
-		nds32_dma_sync_single_for_device(dev, sg_dma_address(sg),
-				sg->length, dir);
-	}
-}
-
-static void
-nds32_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, int nents,
-			  enum dma_data_direction dir)
-{
-	int i;
-
-	for (i = 0; i < nents; i++, sg++) {
-		nds32_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
-				sg->length, dir);
-	}
-}
-
-static int nds32_dma_map_sg(struct device *dev, struct scatterlist *sg,
-			    int nents, enum dma_data_direction dir,
-			    unsigned long attrs)
-{
-	int i;
-
-	for (i = 0; i < nents; i++, sg++) {
-		nds32_dma_sync_single_for_device(dev, sg_dma_address(sg),
-				sg->length, dir);
-	}
-	return nents;
-}
-
-static void nds32_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
-			       int nhwentries, enum dma_data_direction dir,
-			       unsigned long attrs)
-{
-	int i;
-
-	for (i = 0; i < nhwentries; i++, sg++) {
-		nds32_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
-				sg->length, dir);
-	}
-}
-
-struct dma_map_ops nds32_dma_ops = {
-	.alloc = nds32_dma_alloc_coherent,
-	.free = nds32_dma_free,
-	.map_page = nds32_dma_map_page,
-	.unmap_page = nds32_dma_unmap_page,
-	.map_sg = nds32_dma_map_sg,
-	.unmap_sg = nds32_dma_unmap_sg,
-	.sync_single_for_device = nds32_dma_sync_single_for_device,
-	.sync_single_for_cpu = nds32_dma_sync_single_for_cpu,
-	.sync_sg_for_cpu = nds32_dma_sync_sg_for_cpu,
-	.sync_sg_for_device = nds32_dma_sync_sg_for_device,
-};
-
-EXPORT_SYMBOL(nds32_dma_ops);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 11/25] nios2: use generic dma_noncoherent_ops
       [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
                     ` (4 preceding siblings ...)
  2018-05-22 12:04   ` [PATCH 10/25] nds32: use generic dma_noncoherent_ops Christoph Hellwig
@ 2018-05-22 12:04   ` Christoph Hellwig
  2018-05-22 12:04   ` [PATCH 12/25] openrisc: remove the sync_single_for_cpu DMA operation Christoph Hellwig
  6 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw, Michal Simek, Vincent Chen,
	linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ,
	linux-parisc-u79uwXL29TY76Z2rM5mHXA,
	linux-sh-u79uwXL29TY76Z2rM5mHXA,
	linux-hexagon-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b,
	openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b, Greentime Hu,
	linux-alpha-u79uwXL29TY76Z2rM5mHXA,
	sparclinux-u79uwXL29TY76Z2rM5mHXA,
	nios2-dev-g9ZBwUv/Ih/yUk5EbOjzuce+I+R0W71w,
	linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Switch to the generic noncoherent direct mapping implementation.
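
For orientation, an editorial sketch (not part of this patch): an
architecture selecting DMA_NONCOHERENT_OPS supplies hooks along the lines
below, and the common code builds the dma_map_ops on top of them.  The
signatures match the conversions in this series; which sync hooks are
mandatory depends on the ARCH_HAS_SYNC_DMA_FOR_CPU/DEVICE selections.

/* Arch interface consumed by the generic noncoherent ops (sketch only). */
#include <linux/dma-noncoherent.h>

/* coherent allocator, still left to the architecture for now */
void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
		gfp_t gfp, unsigned long attrs);
void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
		dma_addr_t dma_handle, unsigned long attrs);

/* cpu -> device ownership transfer (ARCH_HAS_SYNC_DMA_FOR_DEVICE) */
void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
		size_t size, enum dma_data_direction dir);

/* device -> cpu ownership transfer (ARCH_HAS_SYNC_DMA_FOR_CPU) */
void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
		size_t size, enum dma_data_direction dir);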

Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 arch/nios2/Kconfig                   |   3 +
 arch/nios2/include/asm/Kbuild        |   1 +
 arch/nios2/include/asm/dma-mapping.h |  20 ----
 arch/nios2/mm/dma-mapping.c          | 139 +++------------------------
 4 files changed, 17 insertions(+), 146 deletions(-)
 delete mode 100644 arch/nios2/include/asm/dma-mapping.h

diff --git a/arch/nios2/Kconfig b/arch/nios2/Kconfig
index 3d4ec88f1db1..92035042cf62 100644
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -1,6 +1,9 @@
 # SPDX-License-Identifier: GPL-2.0
 config NIOS2
 	def_bool y
+	select ARCH_HAS_SYNC_DMA_FOR_CPU
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
+	select DMA_NONCOHERENT_OPS
 	select TIMER_OF
 	select GENERIC_ATOMIC64
 	select GENERIC_CLOCKEVENTS
diff --git a/arch/nios2/include/asm/Kbuild b/arch/nios2/include/asm/Kbuild
index d232da2cbb38..24f6ee1ee69b 100644
--- a/arch/nios2/include/asm/Kbuild
+++ b/arch/nios2/include/asm/Kbuild
@@ -8,6 +8,7 @@ generic-y += current.h
 generic-y += device.h
 generic-y += div64.h
 generic-y += dma.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
diff --git a/arch/nios2/include/asm/dma-mapping.h b/arch/nios2/include/asm/dma-mapping.h
deleted file mode 100644
index 6ceb92251da0..000000000000
--- a/arch/nios2/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,20 +0,0 @@
-/*
- * Copyright (C) 2011 Tobias Klauser <tklauser-93Khv+1bN0NyDzI6CaY1VQ@public.gmane.org>
- * Copyright (C) 2009 Wind River Systems Inc
- *
- * This file is subject to the terms and conditions of the GNU General
- * Public License.  See the file COPYING in the main directory of this
- * archive for more details.
- */
-
-#ifndef _ASM_NIOS2_DMA_MAPPING_H
-#define _ASM_NIOS2_DMA_MAPPING_H
-
-extern const struct dma_map_ops nios2_dma_ops;
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-	return &nios2_dma_ops;
-}
-
-#endif /* _ASM_NIOS2_DMA_MAPPING_H */
diff --git a/arch/nios2/mm/dma-mapping.c b/arch/nios2/mm/dma-mapping.c
index 4be815519dd4..4af9e5b5ba1c 100644
--- a/arch/nios2/mm/dma-mapping.c
+++ b/arch/nios2/mm/dma-mapping.c
@@ -12,18 +12,18 @@
 
 #include <linux/types.h>
 #include <linux/mm.h>
-#include <linux/export.h>
 #include <linux/string.h>
-#include <linux/scatterlist.h>
 #include <linux/dma-mapping.h>
 #include <linux/io.h>
 #include <linux/cache.h>
 #include <asm/cacheflush.h>
 
-static inline void __dma_sync_for_device(void *vaddr, size_t size,
-			      enum dma_data_direction direction)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	switch (direction) {
+	void *vaddr = phys_to_virt(paddr);
+
+	switch (dir) {
 	case DMA_FROM_DEVICE:
 		invalidate_dcache_range((unsigned long)vaddr,
 			(unsigned long)(vaddr + size));
@@ -42,10 +42,12 @@ static inline void __dma_sync_for_device(void *vaddr, size_t size,
 	}
 }
 
-static inline void __dma_sync_for_cpu(void *vaddr, size_t size,
-			      enum dma_data_direction direction)
+void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	switch (direction) {
+	void *vaddr = phys_to_virt(paddr);
+
+	switch (dir) {
 	case DMA_BIDIRECTIONAL:
 	case DMA_FROM_DEVICE:
 		invalidate_dcache_range((unsigned long)vaddr,
@@ -58,8 +60,8 @@ static inline void __dma_sync_for_cpu(void *vaddr, size_t size,
 	}
 }
 
-static void *nios2_dma_alloc(struct device *dev, size_t size,
-		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+		gfp_t gfp, unsigned long attrs)
 {
 	void *ret;
 
@@ -80,125 +82,10 @@ static void *nios2_dma_alloc(struct device *dev, size_t size,
 	return ret;
 }
 
-static void nios2_dma_free(struct device *dev, size_t size, void *vaddr,
+void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, unsigned long attrs)
 {
 	unsigned long addr = (unsigned long) CAC_ADDR((unsigned long) vaddr);
 
 	free_pages(addr, get_order(size));
 }
-
-static int nios2_dma_map_sg(struct device *dev, struct scatterlist *sg,
-		int nents, enum dma_data_direction direction,
-		unsigned long attrs)
-{
-	int i;
-
-	for_each_sg(sg, sg, nents, i) {
-		void *addr = sg_virt(sg);
-
-		if (!addr)
-			continue;
-
-		sg->dma_address = sg_phys(sg);
-
-		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-			continue;
-
-		__dma_sync_for_device(addr, sg->length, direction);
-	}
-
-	return nents;
-}
-
-static dma_addr_t nios2_dma_map_page(struct device *dev, struct page *page,
-			unsigned long offset, size_t size,
-			enum dma_data_direction direction,
-			unsigned long attrs)
-{
-	void *addr = page_address(page) + offset;
-
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		__dma_sync_for_device(addr, size, direction);
-
-	return page_to_phys(page) + offset;
-}
-
-static void nios2_dma_unmap_page(struct device *dev, dma_addr_t dma_address,
-		size_t size, enum dma_data_direction direction,
-		unsigned long attrs)
-{
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		__dma_sync_for_cpu(phys_to_virt(dma_address), size, direction);
-}
-
-static void nios2_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
-		int nhwentries, enum dma_data_direction direction,
-		unsigned long attrs)
-{
-	void *addr;
-	int i;
-
-	if (direction == DMA_TO_DEVICE)
-		return;
-
-	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-		return;
-
-	for_each_sg(sg, sg, nhwentries, i) {
-		addr = sg_virt(sg);
-		if (addr)
-			__dma_sync_for_cpu(addr, sg->length, direction);
-	}
-}
-
-static void nios2_dma_sync_single_for_cpu(struct device *dev,
-		dma_addr_t dma_handle, size_t size,
-		enum dma_data_direction direction)
-{
-	__dma_sync_for_cpu(phys_to_virt(dma_handle), size, direction);
-}
-
-static void nios2_dma_sync_single_for_device(struct device *dev,
-		dma_addr_t dma_handle, size_t size,
-		enum dma_data_direction direction)
-{
-	__dma_sync_for_device(phys_to_virt(dma_handle), size, direction);
-}
-
-static void nios2_dma_sync_sg_for_cpu(struct device *dev,
-		struct scatterlist *sg, int nelems,
-		enum dma_data_direction direction)
-{
-	int i;
-
-	/* Make sure that gcc doesn't leave the empty loop body.  */
-	for_each_sg(sg, sg, nelems, i)
-		__dma_sync_for_cpu(sg_virt(sg), sg->length, direction);
-}
-
-static void nios2_dma_sync_sg_for_device(struct device *dev,
-		struct scatterlist *sg, int nelems,
-		enum dma_data_direction direction)
-{
-	int i;
-
-	/* Make sure that gcc doesn't leave the empty loop body.  */
-	for_each_sg(sg, sg, nelems, i)
-		__dma_sync_for_device(sg_virt(sg), sg->length, direction);
-
-}
-
-const struct dma_map_ops nios2_dma_ops = {
-	.alloc			= nios2_dma_alloc,
-	.free			= nios2_dma_free,
-	.map_page		= nios2_dma_map_page,
-	.unmap_page		= nios2_dma_unmap_page,
-	.map_sg			= nios2_dma_map_sg,
-	.unmap_sg		= nios2_dma_unmap_sg,
-	.sync_single_for_device	= nios2_dma_sync_single_for_device,
-	.sync_single_for_cpu	= nios2_dma_sync_single_for_cpu,
-	.sync_sg_for_cpu	= nios2_dma_sync_sg_for_cpu,
-	.sync_sg_for_device	= nios2_dma_sync_sg_for_device,
-};
-EXPORT_SYMBOL(nios2_dma_ops);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 12/25] openrisc: remove the sync_single_for_cpu DMA operation
       [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
                     ` (5 preceding siblings ...)
  2018-05-22 12:04   ` [PATCH 11/25] nios2: " Christoph Hellwig
@ 2018-05-22 12:04   ` Christoph Hellwig
  6 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw, Michal Simek, Vincent Chen,
	linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ,
	linux-parisc-u79uwXL29TY76Z2rM5mHXA,
	linux-sh-u79uwXL29TY76Z2rM5mHXA,
	linux-hexagon-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b,
	openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b, Greentime Hu,
	linux-alpha-u79uwXL29TY76Z2rM5mHXA,
	sparclinux-u79uwXL29TY76Z2rM5mHXA,
	nios2-dev-g9ZBwUv/Ih/yUk5EbOjzuce+I+R0W71w,
	linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

openrisc does all the required cache maintenance at dma map time, and none
at unmap time.  It thus has to implement sync_single_for_device to match
the map case for buffer reuse, but there is no point in doing another
invalidation in the sync_single_for_cpu case, which in terms of cache
maintenance is equivalent to the unmap case.
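
To make the buffer-reuse argument concrete, here is an editorial sketch of
the driver-side pattern (not part of this patch; the function name is made
up for illustration):

#include <linux/dma-mapping.h>

/*
 * A driver reusing one streaming buffer for several transfers.  The map
 * call does the initial cpu -> device cache maintenance; every later reuse
 * relies on dma_sync_single_for_device() doing equivalent work, which is
 * why the sync_single_for_device path must match the map path.
 */
static void example_reuse_buffer(struct device *dev, void *buf, size_t len)
{
	dma_addr_t handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

	if (dma_mapping_error(dev, handle))
		return;

	/* ... start the device and wait for it to consume buf ... */

	dma_sync_single_for_cpu(dev, handle, len, DMA_TO_DEVICE);
	/* the CPU writes the next payload into buf here */
	dma_sync_single_for_device(dev, handle, len, DMA_TO_DEVICE);

	/* ... start the device again ... */

	dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);
}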

Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 arch/openrisc/kernel/dma.c | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index ec7fd45704d2..47601274abf7 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -208,20 +208,6 @@ or1k_unmap_sg(struct device *dev, struct scatterlist *sg,
 	}
 }
 
-static void
-or1k_sync_single_for_cpu(struct device *dev,
-			 dma_addr_t dma_handle, size_t size,
-			 enum dma_data_direction dir)
-{
-	unsigned long cl;
-	dma_addr_t addr = dma_handle;
-	struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
-
-	/* Invalidate the dcache for the requested range */
-	for (cl = addr; cl < addr + size; cl += cpuinfo->dcache_block_size)
-		mtspr(SPR_DCBIR, cl);
-}
-
 static void
 or1k_sync_single_for_device(struct device *dev,
 			    dma_addr_t dma_handle, size_t size,
@@ -243,7 +229,6 @@ const struct dma_map_ops or1k_dma_map_ops = {
 	.unmap_page = or1k_unmap_page,
 	.map_sg = or1k_map_sg,
 	.unmap_sg = or1k_unmap_sg,
-	.sync_single_for_cpu = or1k_sync_single_for_cpu,
 	.sync_single_for_device = or1k_sync_single_for_device,
 };
 EXPORT_SYMBOL(or1k_dma_map_ops);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 13/25] openrisc: remove the no-op unmap_page and unmap_sg DMA operations
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 09/25] nds32: implement the unmap_sg DMA operation Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 14/25] openrisc: fix cache maintenance in the sync_single_for_device DMA operation Christoph Hellwig
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/openrisc/kernel/dma.c | 23 -----------------------
 1 file changed, 23 deletions(-)

diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index 47601274abf7..7cadff93d179 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -171,14 +171,6 @@ or1k_map_page(struct device *dev, struct page *page,
 	return addr;
 }
 
-static void
-or1k_unmap_page(struct device *dev, dma_addr_t dma_handle,
-		size_t size, enum dma_data_direction dir,
-		unsigned long attrs)
-{
-	/* Nothing special to do here... */
-}
-
 static int
 or1k_map_sg(struct device *dev, struct scatterlist *sg,
 	    int nents, enum dma_data_direction dir,
@@ -195,19 +187,6 @@ or1k_map_sg(struct device *dev, struct scatterlist *sg,
 	return nents;
 }
 
-static void
-or1k_unmap_sg(struct device *dev, struct scatterlist *sg,
-	      int nents, enum dma_data_direction dir,
-	      unsigned long attrs)
-{
-	struct scatterlist *s;
-	int i;
-
-	for_each_sg(sg, s, nents, i) {
-		or1k_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir, 0);
-	}
-}
-
 static void
 or1k_sync_single_for_device(struct device *dev,
 			    dma_addr_t dma_handle, size_t size,
@@ -226,9 +205,7 @@ const struct dma_map_ops or1k_dma_map_ops = {
 	.alloc = or1k_dma_alloc,
 	.free = or1k_dma_free,
 	.map_page = or1k_map_page,
-	.unmap_page = or1k_unmap_page,
 	.map_sg = or1k_map_sg,
-	.unmap_sg = or1k_unmap_sg,
 	.sync_single_for_device = or1k_sync_single_for_device,
 };
 EXPORT_SYMBOL(or1k_dma_map_ops);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 14/25] openrisc: fix cache maintenance in the sync_single_for_device DMA operation
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 13/25] openrisc: remove the no-op unmap_page and unmap_sg DMA operations Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 15/25] openrisc: use generic dma_noncoherent_ops Christoph Hellwig
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

The cache maintenance in the sync_single_for_device operation should be
equivalent to that in the map_page operation to facilitate reusing buffers.
Fix the openrisc implementation by moving the cache maintenance performed in
map_page into the sync_single_for_device method, and calling that from
map_page.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/openrisc/kernel/dma.c | 42 +++++++++++++++++---------------------
 1 file changed, 19 insertions(+), 23 deletions(-)

diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index 7cadff93d179..d6a0bf1fa713 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -133,19 +133,15 @@ or1k_dma_free(struct device *dev, size_t size, void *vaddr,
 	free_pages_exact(vaddr, size);
 }
 
-static dma_addr_t
-or1k_map_page(struct device *dev, struct page *page,
-	      unsigned long offset, size_t size,
-	      enum dma_data_direction dir,
-	      unsigned long attrs)
+static void
+or1k_sync_single_for_device(struct device *dev,
+			    dma_addr_t dma_handle, size_t size,
+			    enum dma_data_direction dir)
 {
 	unsigned long cl;
-	dma_addr_t addr = page_to_phys(page) + offset;
+	dma_addr_t addr = dma_handle;
 	struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
 
-	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-		return addr;
-
 	switch (dir) {
 	case DMA_TO_DEVICE:
 		/* Flush the dcache for the requested range */
@@ -168,6 +164,20 @@ or1k_map_page(struct device *dev, struct page *page,
 		break;
 	}
 
+}
+
+static dma_addr_t
+or1k_map_page(struct device *dev, struct page *page,
+	      unsigned long offset, size_t size,
+	      enum dma_data_direction dir,
+	      unsigned long attrs)
+{
+	unsigned long cl;
+	dma_addr_t addr = page_to_phys(page) + offset;
+	struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		or1k_sync_single_for_device(dev, addr, size, dir);
 	return addr;
 }
 
@@ -187,20 +197,6 @@ or1k_map_sg(struct device *dev, struct scatterlist *sg,
 	return nents;
 }
 
-static void
-or1k_sync_single_for_device(struct device *dev,
-			    dma_addr_t dma_handle, size_t size,
-			    enum dma_data_direction dir)
-{
-	unsigned long cl;
-	dma_addr_t addr = dma_handle;
-	struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
-
-	/* Flush the dcache for the requested range */
-	for (cl = addr; cl < addr + size; cl += cpuinfo->dcache_block_size)
-		mtspr(SPR_DCBFR, cl);
-}
-
 const struct dma_map_ops or1k_dma_map_ops = {
 	.alloc = or1k_dma_alloc,
 	.free = or1k_dma_free,
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 15/25] openrisc: use generic dma_noncoherent_ops
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 14/25] openrisc: fix cache maintenance in the sync_single_for_device DMA operation Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 16/25] sh: simplify get_arch_dma_ops Christoph Hellwig
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, linux-xtensa, Michal Simek, Vincent Chen,
	linux-c6x-dev, linux-parisc, linux-sh, linux-hexagon,
	linux-kernel, linux-m68k, openrisc, Greentime Hu, linux-alpha,
	sparclinux, nios2-dev, linux-snps-arc, linux-arm-kernel

Switch to the generic noncoherent direct mapping implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/openrisc/Kconfig                   |  2 +
 arch/openrisc/include/asm/Kbuild        |  1 +
 arch/openrisc/include/asm/dma-mapping.h | 35 -------------
 arch/openrisc/kernel/dma.c              | 65 ++++---------------------
 4 files changed, 12 insertions(+), 91 deletions(-)
 delete mode 100644 arch/openrisc/include/asm/dma-mapping.h

diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index 9ecad05bfc73..65e3c574c9d3 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -6,6 +6,8 @@
 
 config OPENRISC
 	def_bool y
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
+	select DMA_NONCOHERENT_OPS
 	select OF
 	select OF_EARLY_FLATTREE
 	select IRQ_DOMAIN
diff --git a/arch/openrisc/include/asm/Kbuild b/arch/openrisc/include/asm/Kbuild
index f05c722a21f8..e663a996b612 100644
--- a/arch/openrisc/include/asm/Kbuild
+++ b/arch/openrisc/include/asm/Kbuild
@@ -6,6 +6,7 @@ generic-y += current.h
 generic-y += device.h
 generic-y += div64.h
 generic-y += dma.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
diff --git a/arch/openrisc/include/asm/dma-mapping.h b/arch/openrisc/include/asm/dma-mapping.h
deleted file mode 100644
index e212a1f0b6d2..000000000000
--- a/arch/openrisc/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,35 +0,0 @@
-/*
- * OpenRISC Linux
- *
- * Linux architectural port borrowing liberally from similar works of
- * others.  All original copyrights apply as per the original source
- * declaration.
- *
- * OpenRISC implementation:
- * Copyright (C) 2010-2011 Jonas Bonn <jonas@southpole.se>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#ifndef __ASM_OPENRISC_DMA_MAPPING_H
-#define __ASM_OPENRISC_DMA_MAPPING_H
-
-/*
- * See Documentation/DMA-API-HOWTO.txt and
- * Documentation/DMA-API.txt for documentation.
- */
-
-#include <linux/dma-debug.h>
-#include <linux/dma-mapping.h>
-
-extern const struct dma_map_ops or1k_dma_map_ops;
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-	return &or1k_dma_map_ops;
-}
-
-#endif	/* __ASM_OPENRISC_DMA_MAPPING_H */
diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index d6a0bf1fa713..159336adfa2f 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -19,9 +19,7 @@
  * the only thing implemented properly.  The rest need looking into...
  */
 
-#include <linux/dma-mapping.h>
-#include <linux/dma-debug.h>
-#include <linux/export.h>
+#include <linux/dma-noncoherent.h>
 
 #include <asm/cpuinfo.h>
 #include <asm/spr_defs.h>
@@ -80,10 +78,9 @@ page_clear_nocache(pte_t *pte, unsigned long addr,
  * is being ignored for now; uncached but write-combined memory is a
  * missing feature of the OR1K.
  */
-static void *
-or1k_dma_alloc(struct device *dev, size_t size,
-	       dma_addr_t *dma_handle, gfp_t gfp,
-	       unsigned long attrs)
+void *
+arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+		gfp_t gfp, unsigned long attrs)
 {
 	unsigned long va;
 	void *page;
@@ -115,9 +112,9 @@ or1k_dma_alloc(struct device *dev, size_t size,
 	return (void *)va;
 }
 
-static void
-or1k_dma_free(struct device *dev, size_t size, void *vaddr,
-	      dma_addr_t dma_handle, unsigned long attrs)
+void
+arch_dma_free(struct device *dev, size_t size, void *vaddr,
+		dma_addr_t dma_handle, unsigned long attrs)
 {
 	unsigned long va = (unsigned long)vaddr;
 	struct mm_walk walk = {
@@ -133,13 +130,10 @@ or1k_dma_free(struct device *dev, size_t size, void *vaddr,
 	free_pages_exact(vaddr, size);
 }
 
-static void
-or1k_sync_single_for_device(struct device *dev,
-			    dma_addr_t dma_handle, size_t size,
-			    enum dma_data_direction dir)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t addr, size_t size,
+		enum dma_data_direction dir)
 {
 	unsigned long cl;
-	dma_addr_t addr = dma_handle;
 	struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
 
 	switch (dir) {
@@ -163,45 +157,4 @@ or1k_sync_single_for_device(struct device *dev,
 		 */
 		break;
 	}
-
-}
-
-static dma_addr_t
-or1k_map_page(struct device *dev, struct page *page,
-	      unsigned long offset, size_t size,
-	      enum dma_data_direction dir,
-	      unsigned long attrs)
-{
-	unsigned long cl;
-	dma_addr_t addr = page_to_phys(page) + offset;
-	struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
-
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		or1k_sync_single_for_device(dev, addr, size, dir);
-	return addr;
 }
-
-static int
-or1k_map_sg(struct device *dev, struct scatterlist *sg,
-	    int nents, enum dma_data_direction dir,
-	    unsigned long attrs)
-{
-	struct scatterlist *s;
-	int i;
-
-	for_each_sg(sg, s, nents, i) {
-		s->dma_address = or1k_map_page(dev, sg_page(s), s->offset,
-					       s->length, dir, 0);
-	}
-
-	return nents;
-}
-
-const struct dma_map_ops or1k_dma_map_ops = {
-	.alloc = or1k_dma_alloc,
-	.free = or1k_dma_free,
-	.map_page = or1k_map_page,
-	.map_sg = or1k_map_sg,
-	.sync_single_for_device = or1k_sync_single_for_device,
-};
-EXPORT_SYMBOL(or1k_dma_map_ops);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 16/25] sh: simplify get_arch_dma_ops
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 15/25] openrisc: use generic dma_noncoherent_ops Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 17/25] sh: introduce a sh_cacheop_vaddr helper Christoph Hellwig
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, linux-xtensa, Michal Simek, Vincent Chen,
	linux-c6x-dev, linux-parisc, linux-sh, linux-hexagon,
	linux-kernel, linux-m68k, openrisc, Greentime Hu, linux-alpha,
	sparclinux, nios2-dev, linux-snps-arc, linux-arm-kernel

Remove the indirection through the dma_ops variable, and just return
nommu_dma_ops directly from get_arch_dma_ops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/sh/include/asm/dma-mapping.h |  5 ++---
 arch/sh/kernel/dma-nommu.c        |  8 +-------
 arch/sh/mm/consistent.c           |  3 ---
 arch/sh/mm/init.c                 | 10 ----------
 4 files changed, 3 insertions(+), 23 deletions(-)

diff --git a/arch/sh/include/asm/dma-mapping.h b/arch/sh/include/asm/dma-mapping.h
index 41167931e5d9..149e71f95be7 100644
--- a/arch/sh/include/asm/dma-mapping.h
+++ b/arch/sh/include/asm/dma-mapping.h
@@ -2,12 +2,11 @@
 #ifndef __ASM_SH_DMA_MAPPING_H
 #define __ASM_SH_DMA_MAPPING_H
 
-extern const struct dma_map_ops *dma_ops;
-extern void no_iommu_init(void);
+extern const struct dma_map_ops nommu_dma_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
-	return dma_ops;
+	return &nommu_dma_ops;
 }
 
 extern void *dma_generic_alloc_coherent(struct device *dev, size_t size,
diff --git a/arch/sh/kernel/dma-nommu.c b/arch/sh/kernel/dma-nommu.c
index 3e3a32fc676e..79a9edafa5b0 100644
--- a/arch/sh/kernel/dma-nommu.c
+++ b/arch/sh/kernel/dma-nommu.c
@@ -79,10 +79,4 @@ const struct dma_map_ops nommu_dma_ops = {
 	.sync_sg_for_device	= nommu_sync_sg_for_device,
 #endif
 };
-
-void __init no_iommu_init(void)
-{
-	if (dma_ops)
-		return;
-	dma_ops = &nommu_dma_ops;
-}
+EXPORT_SYMBOL(nommu_dma_ops);
diff --git a/arch/sh/mm/consistent.c b/arch/sh/mm/consistent.c
index 35ea3099a3b6..221832eec33b 100644
--- a/arch/sh/mm/consistent.c
+++ b/arch/sh/mm/consistent.c
@@ -20,9 +20,6 @@
 #include <asm/cacheflush.h>
 #include <asm/addrspace.h>
 
-const struct dma_map_ops *dma_ops;
-EXPORT_SYMBOL(dma_ops);
-
 void *dma_generic_alloc_coherent(struct device *dev, size_t size,
 				 dma_addr_t *dma_handle, gfp_t gfp,
 				 unsigned long attrs)
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index ce0bbaa7e404..32e09f03e6bf 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -395,22 +395,12 @@ void __init paging_init(void)
 	free_area_init_nodes(max_zone_pfns);
 }
 
-/*
- * Early initialization for any I/O MMUs we might have.
- */
-static void __init iommu_init(void)
-{
-	no_iommu_init();
-}
-
 unsigned int mem_init_done = 0;
 
 void __init mem_init(void)
 {
 	pg_data_t *pgdat;
 
-	iommu_init();
-
 	high_memory = NULL;
 	for_each_online_pgdat(pgdat)
 		high_memory = max_t(void *, high_memory,
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 17/25] sh: introduce a sh_cacheop_vaddr helper
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 16/25] sh: simplify get_arch_dma_ops Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 18/25] sh: use dma_direct_ops for the CONFIG_DMA_COHERENT case Christoph Hellwig
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

And use it in the maple bus code to avoid a dma API dependency.
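
As a rough usage sketch (editorial, not part of this patch; the function
name is invented): callers doing their own cache maintenance translate the
address first, so that 29-bit kernels operate on the cached (CAC_ADDR)
alias while other configurations get the address back unchanged.

#include <asm/cacheflush.h>

/* write back and invalidate a buffer without going through the DMA API */
static void example_purge_for_device(void *buf, size_t len)
{
	__flush_purge_region(sh_cacheop_vaddr(buf), len);
}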

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/sh/include/asm/cacheflush.h | 7 +++++++
 arch/sh/mm/consistent.c          | 6 +-----
 drivers/sh/maple/maple.c         | 7 ++++---
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/sh/include/asm/cacheflush.h b/arch/sh/include/asm/cacheflush.h
index d103ab5a4e4b..b932e42ef028 100644
--- a/arch/sh/include/asm/cacheflush.h
+++ b/arch/sh/include/asm/cacheflush.h
@@ -101,5 +101,12 @@ void kunmap_coherent(void *kvaddr);
 
 void cpu_cache_init(void);
 
+static inline void *sh_cacheop_vaddr(void *vaddr)
+{
+	if (__in_29bit_mode())
+		vaddr = (void *)CAC_ADDR((unsigned long)vaddr);
+	return vaddr;
+}
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_SH_CACHEFLUSH_H */
diff --git a/arch/sh/mm/consistent.c b/arch/sh/mm/consistent.c
index 221832eec33b..5f86ae24025b 100644
--- a/arch/sh/mm/consistent.c
+++ b/arch/sh/mm/consistent.c
@@ -69,10 +69,7 @@ void dma_generic_free_coherent(struct device *dev, size_t size,
 void sh_sync_dma_for_device(void *vaddr, size_t size,
 		    enum dma_data_direction direction)
 {
-	void *addr;
-
-	addr = __in_29bit_mode() ?
-	       (void *)CAC_ADDR((unsigned long)vaddr) : vaddr;
+	void *addr = sh_cacheop_vaddr(vaddr);
 
 	switch (direction) {
 	case DMA_FROM_DEVICE:		/* invalidate only */
@@ -88,7 +85,6 @@ void sh_sync_dma_for_device(void *vaddr, size_t size,
 		BUG();
 	}
 }
-EXPORT_SYMBOL(sh_sync_dma_for_device);
 
 static int __init memchunk_setup(char *str)
 {
diff --git a/drivers/sh/maple/maple.c b/drivers/sh/maple/maple.c
index 7525039d812c..c9c354bd713a 100644
--- a/drivers/sh/maple/maple.c
+++ b/drivers/sh/maple/maple.c
@@ -300,8 +300,8 @@ static void maple_send(void)
 	mutex_unlock(&maple_wlist_lock);
 	if (maple_packets > 0) {
 		for (i = 0; i < (1 << MAPLE_DMA_PAGES); i++)
-			sh_sync_dma_for_device(maple_sendbuf + i * PAGE_SIZE,
-				       PAGE_SIZE, DMA_BIDIRECTIONAL);
+			__flush_purge_region(maple_sendbuf + i * PAGE_SIZE,
+					PAGE_SIZE);
 	}
 
 finish:
@@ -642,7 +642,8 @@ static void maple_dma_handler(struct work_struct *work)
 		list_for_each_entry_safe(mq, nmq, &maple_sentq, list) {
 			mdev = mq->dev;
 			recvbuf = mq->recvbuf->buf;
-			sh_sync_dma_for_device(recvbuf, 0x400, DMA_FROM_DEVICE);
+			__flush_invalidate_region(sh_cacheop_vaddr(recvbuf),
+					0x400);
 			code = recvbuf[0];
 			kfree(mq->sendbuf);
 			list_del_init(&mq->list);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 18/25] sh: use dma_direct_ops for the CONFIG_DMA_COHERENT case
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (10 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 17/25] sh: introduce a sh_cacheop_vaddr helper Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 19/25] sh: split arch/sh/mm/consistent.c Christoph Hellwig
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

This is a slight change in behavior as we avoid the detour through the
virtual mapping for the coherent allocator, but if this CPU really is
coherent that should be the right thing to do.
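
An editorial sketch of the behavioural difference (the function names are
illustrative only, and error handling is omitted): the old sh allocator
always detoured through a second, uncached mapping of the pages, while
dma_direct_ops on a coherent configuration can hand out the cacheable
linear-map address directly.

#include <linux/mm.h>
#include <linux/gfp.h>
#include <linux/io.h>
#include <linux/dma-mapping.h>

static void *coherent_alloc_old(struct device *dev, size_t size,
		dma_addr_t *handle, gfp_t gfp)
{
	void *page = (void *)__get_free_pages(gfp | __GFP_ZERO,
			get_order(size));

	*handle = virt_to_phys(page);
	/* the detour: remap the same pages uncached */
	return (void __force *)ioremap_nocache(virt_to_phys(page), size);
}

static void *coherent_alloc_direct(struct device *dev, size_t size,
		dma_addr_t *handle, gfp_t gfp)
{
	void *page = (void *)__get_free_pages(gfp | __GFP_ZERO,
			get_order(size));

	*handle = virt_to_phys(page);
	/* coherent CPU: the linear mapping can be used as-is */
	return page;
}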

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/sh/Kconfig                   | 1 +
 arch/sh/include/asm/dma-mapping.h | 4 ++++
 arch/sh/kernel/Makefile           | 4 ++--
 arch/sh/kernel/dma-nommu.c        | 4 ----
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 7d521926041e..d0b095323d62 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -157,6 +157,7 @@ config SWAP_IO_SPACE
 	bool
 
 config DMA_COHERENT
+	select DMA_DIRECT_OPS
 	bool
 
 config DMA_NONCOHERENT
diff --git a/arch/sh/include/asm/dma-mapping.h b/arch/sh/include/asm/dma-mapping.h
index 149e71f95be7..1ebc6a4eb1c5 100644
--- a/arch/sh/include/asm/dma-mapping.h
+++ b/arch/sh/include/asm/dma-mapping.h
@@ -6,7 +6,11 @@ extern const struct dma_map_ops nommu_dma_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
+#ifdef CONFIG_DMA_NONCOHERENT
 	return &nommu_dma_ops;
+#else
+	return &dma_direct_ops;
+#endif
 }
 
 extern void *dma_generic_alloc_coherent(struct device *dev, size_t size,
diff --git a/arch/sh/kernel/Makefile b/arch/sh/kernel/Makefile
index dc80041f7363..cb5f1bfb52de 100644
--- a/arch/sh/kernel/Makefile
+++ b/arch/sh/kernel/Makefile
@@ -12,7 +12,7 @@ endif
 
 CFLAGS_REMOVE_return_address.o = -pg
 
-obj-y	:= debugtraps.o dma-nommu.o dumpstack.o 		\
+obj-y	:= debugtraps.o dumpstack.o 		\
 	   idle.o io.o irq.o irq_$(BITS).o kdebugfs.o			\
 	   machvec.o nmi_debug.o process.o				\
 	   process_$(BITS).o ptrace.o ptrace_$(BITS).o			\
@@ -45,7 +45,7 @@ obj-$(CONFIG_DUMP_CODE)		+= disassemble.o
 obj-$(CONFIG_HIBERNATION)	+= swsusp.o
 obj-$(CONFIG_DWARF_UNWINDER)	+= dwarf.o
 obj-$(CONFIG_PERF_EVENTS)	+= perf_event.o perf_callchain.o
-
+obj-$(CONFIG_DMA_NONCOHERENT)	+= dma-nommu.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)		+= hw_breakpoint.o
 
 ccflags-y := -Werror
diff --git a/arch/sh/kernel/dma-nommu.c b/arch/sh/kernel/dma-nommu.c
index 79a9edafa5b0..d8689b1cb743 100644
--- a/arch/sh/kernel/dma-nommu.c
+++ b/arch/sh/kernel/dma-nommu.c
@@ -51,7 +51,6 @@ static int nommu_map_sg(struct device *dev, struct scatterlist *sg,
 	return nents;
 }
 
-#ifdef CONFIG_DMA_NONCOHERENT
 static void nommu_sync_single_for_device(struct device *dev, dma_addr_t addr,
 			      size_t size, enum dma_data_direction dir)
 {
@@ -67,16 +66,13 @@ static void nommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 	for_each_sg(sg, s, nelems, i)
 		sh_sync_dma_for_device(sg_virt(s), s->length, dir);
 }
-#endif
 
 const struct dma_map_ops nommu_dma_ops = {
 	.alloc			= dma_generic_alloc_coherent,
 	.free			= dma_generic_free_coherent,
 	.map_page		= nommu_map_page,
 	.map_sg			= nommu_map_sg,
-#ifdef CONFIG_DMA_NONCOHERENT
 	.sync_single_for_device	= nommu_sync_single_for_device,
 	.sync_sg_for_device	= nommu_sync_sg_for_device,
-#endif
 };
 EXPORT_SYMBOL(nommu_dma_ops);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 19/25] sh: split arch/sh/mm/consistent.c
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (11 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 18/25] sh: use dma_direct_ops for the CONFIG_DMA_COHERENT case Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 20/25] sh: use generic dma_noncoherent_ops Christoph Hellwig
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

Half of the file just contains platform device memory setup code, which
is required for all builds, and half contains helpers for dma coherent
allocation, which are only needed if CONFIG_DMA_NONCOHERENT is enabled.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/sh/kernel/Makefile       |  2 +-
 arch/sh/kernel/dma-coherent.c | 80 +++++++++++++++++++++++++++++++++++
 arch/sh/mm/consistent.c       | 75 --------------------------------
 3 files changed, 81 insertions(+), 76 deletions(-)
 create mode 100644 arch/sh/kernel/dma-coherent.c

diff --git a/arch/sh/kernel/Makefile b/arch/sh/kernel/Makefile
index cb5f1bfb52de..d5ddb64bfffe 100644
--- a/arch/sh/kernel/Makefile
+++ b/arch/sh/kernel/Makefile
@@ -45,7 +45,7 @@ obj-$(CONFIG_DUMP_CODE)		+= disassemble.o
 obj-$(CONFIG_HIBERNATION)	+= swsusp.o
 obj-$(CONFIG_DWARF_UNWINDER)	+= dwarf.o
 obj-$(CONFIG_PERF_EVENTS)	+= perf_event.o perf_callchain.o
-obj-$(CONFIG_DMA_NONCOHERENT)	+= dma-nommu.o
+obj-$(CONFIG_DMA_NONCOHERENT)	+= dma-nommu.o dma-coherent.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)		+= hw_breakpoint.o
 
 ccflags-y := -Werror
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
new file mode 100644
index 000000000000..4f41e5cd5207
--- /dev/null
+++ b/arch/sh/kernel/dma-coherent.c
@@ -0,0 +1,80 @@
+/*
+ * Copyright (C) 2004 - 2007  Paul Mundt
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ */
+#include <linux/mm.h>
+#include <linux/init.h>
+#include <linux/dma-mapping.h>
+#include <linux/module.h>
+#include <asm/cacheflush.h>
+#include <asm/addrspace.h>
+
+void *dma_generic_alloc_coherent(struct device *dev, size_t size,
+				 dma_addr_t *dma_handle, gfp_t gfp,
+				 unsigned long attrs)
+{
+	void *ret, *ret_nocache;
+	int order = get_order(size);
+
+	gfp |= __GFP_ZERO;
+
+	ret = (void *)__get_free_pages(gfp, order);
+	if (!ret)
+		return NULL;
+
+	/*
+	 * Pages from the page allocator may have data present in
+	 * cache. So flush the cache before using uncached memory.
+	 */
+	sh_sync_dma_for_device(ret, size, DMA_BIDIRECTIONAL);
+
+	ret_nocache = (void __force *)ioremap_nocache(virt_to_phys(ret), size);
+	if (!ret_nocache) {
+		free_pages((unsigned long)ret, order);
+		return NULL;
+	}
+
+	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
+
+	*dma_handle = virt_to_phys(ret) - PFN_PHYS(dev->dma_pfn_offset);
+
+	return ret_nocache;
+}
+
+void dma_generic_free_coherent(struct device *dev, size_t size,
+			       void *vaddr, dma_addr_t dma_handle,
+			       unsigned long attrs)
+{
+	int order = get_order(size);
+	unsigned long pfn = (dma_handle >> PAGE_SHIFT) + dev->dma_pfn_offset;
+	int k;
+
+	for (k = 0; k < (1 << order); k++)
+		__free_pages(pfn_to_page(pfn + k), 0);
+
+	iounmap(vaddr);
+}
+
+void sh_sync_dma_for_device(void *vaddr, size_t size,
+		    enum dma_data_direction direction)
+{
+	void *addr = sh_cacheop_vaddr(vaddr);
+
+	switch (direction) {
+	case DMA_FROM_DEVICE:		/* invalidate only */
+		__flush_invalidate_region(addr, size);
+		break;
+	case DMA_TO_DEVICE:		/* writeback only */
+		__flush_wback_region(addr, size);
+		break;
+	case DMA_BIDIRECTIONAL:		/* writeback and invalidate */
+		__flush_purge_region(addr, size);
+		break;
+	default:
+		BUG();
+	}
+}
+EXPORT_SYMBOL(sh_sync_dma_for_device);
diff --git a/arch/sh/mm/consistent.c b/arch/sh/mm/consistent.c
index 5f86ae24025b..5da5be74fb6b 100644
--- a/arch/sh/mm/consistent.c
+++ b/arch/sh/mm/consistent.c
@@ -1,10 +1,6 @@
 /*
- * arch/sh/mm/consistent.c
- *
  * Copyright (C) 2004 - 2007  Paul Mundt
  *
- * Declared coherent memory functions based on arch/x86/kernel/pci-dma_32.c
- *
  * This file is subject to the terms and conditions of the GNU General Public
  * License.  See the file "COPYING" in the main directory of this archive
  * for more details.
@@ -13,78 +9,7 @@
 #include <linux/init.h>
 #include <linux/platform_device.h>
 #include <linux/dma-mapping.h>
-#include <linux/dma-debug.h>
 #include <linux/io.h>
-#include <linux/module.h>
-#include <linux/gfp.h>
-#include <asm/cacheflush.h>
-#include <asm/addrspace.h>
-
-void *dma_generic_alloc_coherent(struct device *dev, size_t size,
-				 dma_addr_t *dma_handle, gfp_t gfp,
-				 unsigned long attrs)
-{
-	void *ret, *ret_nocache;
-	int order = get_order(size);
-
-	gfp |= __GFP_ZERO;
-
-	ret = (void *)__get_free_pages(gfp, order);
-	if (!ret)
-		return NULL;
-
-	/*
-	 * Pages from the page allocator may have data present in
-	 * cache. So flush the cache before using uncached memory.
-	 */
-	sh_sync_dma_for_device(ret, size, DMA_BIDIRECTIONAL);
-
-	ret_nocache = (void __force *)ioremap_nocache(virt_to_phys(ret), size);
-	if (!ret_nocache) {
-		free_pages((unsigned long)ret, order);
-		return NULL;
-	}
-
-	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
-
-	*dma_handle = virt_to_phys(ret) - PFN_PHYS(dev->dma_pfn_offset);
-
-	return ret_nocache;
-}
-
-void dma_generic_free_coherent(struct device *dev, size_t size,
-			       void *vaddr, dma_addr_t dma_handle,
-			       unsigned long attrs)
-{
-	int order = get_order(size);
-	unsigned long pfn = (dma_handle >> PAGE_SHIFT) + dev->dma_pfn_offset;
-	int k;
-
-	for (k = 0; k < (1 << order); k++)
-		__free_pages(pfn_to_page(pfn + k), 0);
-
-	iounmap(vaddr);
-}
-
-void sh_sync_dma_for_device(void *vaddr, size_t size,
-		    enum dma_data_direction direction)
-{
-	void *addr = sh_cacheop_vaddr(vaddr);
-
-	switch (direction) {
-	case DMA_FROM_DEVICE:		/* invalidate only */
-		__flush_invalidate_region(addr, size);
-		break;
-	case DMA_TO_DEVICE:		/* writeback only */
-		__flush_wback_region(addr, size);
-		break;
-	case DMA_BIDIRECTIONAL:		/* writeback and invalidate */
-		__flush_purge_region(addr, size);
-		break;
-	default:
-		BUG();
-	}
-}
 
 static int __init memchunk_setup(char *str)
 {
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 20/25] sh: use generic dma_noncoherent_ops
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (12 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 19/25] sh: split arch/sh/mm/consistent.c Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 21/25] xtensa: " Christoph Hellwig
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

Switch to the generic noncoherent direct mapping implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/sh/Kconfig                   |  3 +-
 arch/sh/include/asm/Kbuild        |  1 +
 arch/sh/include/asm/dma-mapping.h | 26 -----------
 arch/sh/kernel/Makefile           |  2 +-
 arch/sh/kernel/dma-coherent.c     | 23 +++++----
 arch/sh/kernel/dma-nommu.c        | 78 -------------------------------
 6 files changed, 15 insertions(+), 118 deletions(-)
 delete mode 100644 arch/sh/include/asm/dma-mapping.h
 delete mode 100644 arch/sh/kernel/dma-nommu.c

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index d0b095323d62..9809e0604af3 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -49,7 +49,6 @@ config SUPERH
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_FUTEX_CMPXCHG if FUTEX
 	select HAVE_NMI
-	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
 
 	help
@@ -162,6 +161,8 @@ config DMA_COHERENT
 
 config DMA_NONCOHERENT
 	def_bool !DMA_COHERENT
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
+	select DMA_NONCOHERENT_OPS
 
 config PGTABLE_LEVELS
 	default 3 if X2TLB
diff --git a/arch/sh/include/asm/Kbuild b/arch/sh/include/asm/Kbuild
index 1efcce74997b..50f7e878ea1b 100644
--- a/arch/sh/include/asm/Kbuild
+++ b/arch/sh/include/asm/Kbuild
@@ -1,6 +1,7 @@
 generic-y += current.h
 generic-y += delay.h
 generic-y += div64.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += irq_regs.h
diff --git a/arch/sh/include/asm/dma-mapping.h b/arch/sh/include/asm/dma-mapping.h
deleted file mode 100644
index 1ebc6a4eb1c5..000000000000
--- a/arch/sh/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,26 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __ASM_SH_DMA_MAPPING_H
-#define __ASM_SH_DMA_MAPPING_H
-
-extern const struct dma_map_ops nommu_dma_ops;
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-#ifdef CONFIG_DMA_NONCOHERENT
-	return &nommu_dma_ops;
-#else
-	return &dma_direct_ops;
-#endif
-}
-
-extern void *dma_generic_alloc_coherent(struct device *dev, size_t size,
-					dma_addr_t *dma_addr, gfp_t flag,
-					unsigned long attrs);
-extern void dma_generic_free_coherent(struct device *dev, size_t size,
-				      void *vaddr, dma_addr_t dma_handle,
-				      unsigned long attrs);
-
-void sh_sync_dma_for_device(void *vaddr, size_t size,
-	    enum dma_data_direction dir);
-
-#endif /* __ASM_SH_DMA_MAPPING_H */
diff --git a/arch/sh/kernel/Makefile b/arch/sh/kernel/Makefile
index d5ddb64bfffe..59673f8a3379 100644
--- a/arch/sh/kernel/Makefile
+++ b/arch/sh/kernel/Makefile
@@ -45,7 +45,7 @@ obj-$(CONFIG_DUMP_CODE)		+= disassemble.o
 obj-$(CONFIG_HIBERNATION)	+= swsusp.o
 obj-$(CONFIG_DWARF_UNWINDER)	+= dwarf.o
 obj-$(CONFIG_PERF_EVENTS)	+= perf_event.o perf_callchain.o
-obj-$(CONFIG_DMA_NONCOHERENT)	+= dma-nommu.o dma-coherent.o
+obj-$(CONFIG_DMA_NONCOHERENT)	+= dma-coherent.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)		+= hw_breakpoint.o
 
 ccflags-y := -Werror
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
index 4f41e5cd5207..67b9c9e6e43a 100644
--- a/arch/sh/kernel/dma-coherent.c
+++ b/arch/sh/kernel/dma-coherent.c
@@ -7,14 +7,13 @@
  */
 #include <linux/mm.h>
 #include <linux/init.h>
-#include <linux/dma-mapping.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/module.h>
 #include <asm/cacheflush.h>
 #include <asm/addrspace.h>
 
-void *dma_generic_alloc_coherent(struct device *dev, size_t size,
-				 dma_addr_t *dma_handle, gfp_t gfp,
-				 unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+		gfp_t gfp, unsigned long attrs)
 {
 	void *ret, *ret_nocache;
 	int order = get_order(size);
@@ -29,7 +28,8 @@ void *dma_generic_alloc_coherent(struct device *dev, size_t size,
 	 * Pages from the page allocator may have data present in
 	 * cache. So flush the cache before using uncached memory.
 	 */
-	sh_sync_dma_for_device(ret, size, DMA_BIDIRECTIONAL);
+	arch_sync_dma_for_device(dev, virt_to_phys(ret), size,
+			DMA_BIDIRECTIONAL);
 
 	ret_nocache = (void __force *)ioremap_nocache(virt_to_phys(ret), size);
 	if (!ret_nocache) {
@@ -44,9 +44,8 @@ void *dma_generic_alloc_coherent(struct device *dev, size_t size,
 	return ret_nocache;
 }
 
-void dma_generic_free_coherent(struct device *dev, size_t size,
-			       void *vaddr, dma_addr_t dma_handle,
-			       unsigned long attrs)
+void arch_dma_free(struct device *dev, size_t size, void *vaddr,
+		dma_addr_t dma_handle, unsigned long attrs)
 {
 	int order = get_order(size);
 	unsigned long pfn = (dma_handle >> PAGE_SHIFT) + dev->dma_pfn_offset;
@@ -58,12 +57,12 @@ void dma_generic_free_coherent(struct device *dev, size_t size,
 	iounmap(vaddr);
 }
 
-void sh_sync_dma_for_device(void *vaddr, size_t size,
-		    enum dma_data_direction direction)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	void *addr = sh_cacheop_vaddr(vaddr);
+	void *addr = sh_cacheop_vaddr(phys_to_virt(paddr));
 
-	switch (direction) {
+	switch (dir) {
 	case DMA_FROM_DEVICE:		/* invalidate only */
 		__flush_invalidate_region(addr, size);
 		break;
diff --git a/arch/sh/kernel/dma-nommu.c b/arch/sh/kernel/dma-nommu.c
deleted file mode 100644
index d8689b1cb743..000000000000
--- a/arch/sh/kernel/dma-nommu.c
+++ /dev/null
@@ -1,78 +0,0 @@
-/*
- * DMA mapping support for platforms lacking IOMMUs.
- *
- * Copyright (C) 2009  Paul Mundt
- *
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- */
-#include <linux/dma-mapping.h>
-#include <linux/io.h>
-#include <asm/cacheflush.h>
-
-static dma_addr_t nommu_map_page(struct device *dev, struct page *page,
-				 unsigned long offset, size_t size,
-				 enum dma_data_direction dir,
-				 unsigned long attrs)
-{
-	dma_addr_t addr = page_to_phys(page) + offset
-		- PFN_PHYS(dev->dma_pfn_offset);
-
-	WARN_ON(size == 0);
-
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		sh_sync_dma_for_device(page_address(page) + offset, size, dir);
-
-	return addr;
-}
-
-static int nommu_map_sg(struct device *dev, struct scatterlist *sg,
-			int nents, enum dma_data_direction dir,
-			unsigned long attrs)
-{
-	struct scatterlist *s;
-	int i;
-
-	WARN_ON(nents == 0 || sg[0].length == 0);
-
-	for_each_sg(sg, s, nents, i) {
-		dma_addr_t offset = PFN_PHYS(dev->dma_pfn_offset);
-
-		BUG_ON(!sg_page(s));
-
-		if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-			sh_sync_dma_for_device(sg_virt(s), s->length, dir);
-
-		s->dma_address = sg_phys(s) - offset;
-		s->dma_length = s->length;
-	}
-
-	return nents;
-}
-
-static void nommu_sync_single_for_device(struct device *dev, dma_addr_t addr,
-			      size_t size, enum dma_data_direction dir)
-{
-	sh_sync_dma_for_device(phys_to_virt(addr), size, dir);
-}
-
-static void nommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
-			  int nelems, enum dma_data_direction dir)
-{
-	struct scatterlist *s;
-	int i;
-
-	for_each_sg(sg, s, nelems, i)
-		sh_sync_dma_for_device(sg_virt(s), s->length, dir);
-}
-
-const struct dma_map_ops nommu_dma_ops = {
-	.alloc			= dma_generic_alloc_coherent,
-	.free			= dma_generic_free_coherent,
-	.map_page		= nommu_map_page,
-	.map_sg			= nommu_map_sg,
-	.sync_single_for_device	= nommu_sync_single_for_device,
-	.sync_sg_for_device	= nommu_sync_sg_for_device,
-};
-EXPORT_SYMBOL(nommu_dma_ops);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 21/25] xtensa: use generic dma_noncoherent_ops
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (13 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 20/25] sh: use generic dma_noncoherent_ops Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 22/25] sparc: " Christoph Hellwig
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

Switch to the generic noncoherent direct mapping implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/xtensa/Kconfig                   |   3 +
 arch/xtensa/include/asm/Kbuild        |   1 +
 arch/xtensa/include/asm/dma-mapping.h |  26 ------
 arch/xtensa/kernel/pci-dma.c          | 130 +++-----------------------
 4 files changed, 19 insertions(+), 141 deletions(-)
 delete mode 100644 arch/xtensa/include/asm/dma-mapping.h

diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index 17df332269b2..ef114648e954 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -5,11 +5,14 @@ config ZONE_DMA
 config XTENSA
 	def_bool y
 	select ARCH_NO_COHERENT_DMA_MMAP if !MMU
+	select ARCH_HAS_SYNC_DMA_FOR_CPU
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select BUILDTIME_EXTABLE_SORT
 	select CLONE_BACKWARDS
 	select COMMON_CLK
+	select DMA_NONCOHERENT_OPS
 	select GENERIC_ATOMIC64
 	select GENERIC_CLOCKEVENTS
 	select GENERIC_IRQ_SHOW
diff --git a/arch/xtensa/include/asm/Kbuild b/arch/xtensa/include/asm/Kbuild
index 436b20337168..a8d6cd3bee4b 100644
--- a/arch/xtensa/include/asm/Kbuild
+++ b/arch/xtensa/include/asm/Kbuild
@@ -2,6 +2,7 @@ generic-y += bug.h
 generic-y += device.h
 generic-y += div64.h
 generic-y += dma-contiguous.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
diff --git a/arch/xtensa/include/asm/dma-mapping.h b/arch/xtensa/include/asm/dma-mapping.h
deleted file mode 100644
index 44098800dad7..000000000000
--- a/arch/xtensa/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,26 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2003 - 2005 Tensilica Inc.
- * Copyright (C) 2015 Cadence Design Systems Inc.
- */
-
-#ifndef _XTENSA_DMA_MAPPING_H
-#define _XTENSA_DMA_MAPPING_H
-
-#include <asm/cache.h>
-#include <asm/io.h>
-
-#include <linux/mm.h>
-#include <linux/scatterlist.h>
-
-extern const struct dma_map_ops xtensa_dma_map_ops;
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-	return &xtensa_dma_map_ops;
-}
-
-#endif	/* _XTENSA_DMA_MAPPING_H */
diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
index 392b4a80ebc2..a83d60e92908 100644
--- a/arch/xtensa/kernel/pci-dma.c
+++ b/arch/xtensa/kernel/pci-dma.c
@@ -16,26 +16,24 @@
  */
 
 #include <linux/dma-contiguous.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/dma-direct.h>
 #include <linux/gfp.h>
 #include <linux/highmem.h>
 #include <linux/mm.h>
-#include <linux/module.h>
-#include <linux/pci.h>
-#include <linux/string.h>
 #include <linux/types.h>
 #include <asm/cacheflush.h>
 #include <asm/io.h>
 
-static void do_cache_op(dma_addr_t dma_handle, size_t size,
+static void do_cache_op(phys_addr_t paddr, size_t size,
 			void (*fn)(unsigned long, unsigned long))
 {
-	unsigned long off = dma_handle & (PAGE_SIZE - 1);
-	unsigned long pfn = PFN_DOWN(dma_handle);
+	unsigned long off = paddr & (PAGE_SIZE - 1);
+	unsigned long pfn = PFN_DOWN(paddr);
 	struct page *page = pfn_to_page(pfn);
 
 	if (!PageHighMem(page))
-		fn((unsigned long)bus_to_virt(dma_handle), size);
+		fn((unsigned long)phys_to_virt(paddr), size);
 	else
 		while (size > 0) {
 			size_t sz = min_t(size_t, size, PAGE_SIZE - off);
@@ -49,14 +47,13 @@ static void do_cache_op(dma_addr_t dma_handle, size_t size,
 		}
 }
 
-static void xtensa_sync_single_for_cpu(struct device *dev,
-				       dma_addr_t dma_handle, size_t size,
-				       enum dma_data_direction dir)
+void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
 	switch (dir) {
 	case DMA_BIDIRECTIONAL:
 	case DMA_FROM_DEVICE:
-		do_cache_op(dma_handle, size, __invalidate_dcache_range);
+		do_cache_op(paddr, size, __invalidate_dcache_range);
 		break;
 
 	case DMA_NONE:
@@ -68,15 +65,14 @@ static void xtensa_sync_single_for_cpu(struct device *dev,
 	}
 }
 
-static void xtensa_sync_single_for_device(struct device *dev,
-					  dma_addr_t dma_handle, size_t size,
-					  enum dma_data_direction dir)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
 	switch (dir) {
 	case DMA_BIDIRECTIONAL:
 	case DMA_TO_DEVICE:
 		if (XCHAL_DCACHE_IS_WRITEBACK)
-			do_cache_op(dma_handle, size, __flush_dcache_range);
+			do_cache_op(paddr, size, __flush_dcache_range);
 		break;
 
 	case DMA_NONE:
@@ -88,40 +84,13 @@ static void xtensa_sync_single_for_device(struct device *dev,
 	}
 }
 
-static void xtensa_sync_sg_for_cpu(struct device *dev,
-				   struct scatterlist *sg, int nents,
-				   enum dma_data_direction dir)
-{
-	struct scatterlist *s;
-	int i;
-
-	for_each_sg(sg, s, nents, i) {
-		xtensa_sync_single_for_cpu(dev, sg_dma_address(s),
-					   sg_dma_len(s), dir);
-	}
-}
-
-static void xtensa_sync_sg_for_device(struct device *dev,
-				      struct scatterlist *sg, int nents,
-				      enum dma_data_direction dir)
-{
-	struct scatterlist *s;
-	int i;
-
-	for_each_sg(sg, s, nents, i) {
-		xtensa_sync_single_for_device(dev, sg_dma_address(s),
-					      sg_dma_len(s), dir);
-	}
-}
-
 /*
  * Note: We assume that the full memory space is always mapped to 'kseg'
  *	 Otherwise we have to use page attributes (not implemented).
  */
 
-static void *xtensa_dma_alloc(struct device *dev, size_t size,
-			      dma_addr_t *handle, gfp_t flag,
-			      unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
+		gfp_t flag, unsigned long attrs)
 {
 	unsigned long ret;
 	unsigned long uncached;
@@ -171,8 +140,8 @@ static void *xtensa_dma_alloc(struct device *dev, size_t size,
 	return (void *)uncached;
 }
 
-static void xtensa_dma_free(struct device *dev, size_t size, void *vaddr,
-			    dma_addr_t dma_handle, unsigned long attrs)
+void arch_dma_free(struct device *dev, size_t size, void *vaddr,
+		dma_addr_t dma_handle, unsigned long attrs)
 {
 	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	unsigned long addr = (unsigned long)vaddr;
@@ -192,72 +161,3 @@ static void xtensa_dma_free(struct device *dev, size_t size, void *vaddr,
 	if (!dma_release_from_contiguous(dev, page, count))
 		__free_pages(page, get_order(size));
 }
-
-static dma_addr_t xtensa_map_page(struct device *dev, struct page *page,
-				  unsigned long offset, size_t size,
-				  enum dma_data_direction dir,
-				  unsigned long attrs)
-{
-	dma_addr_t dma_handle = page_to_phys(page) + offset;
-
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		xtensa_sync_single_for_device(dev, dma_handle, size, dir);
-
-	return dma_handle;
-}
-
-static void xtensa_unmap_page(struct device *dev, dma_addr_t dma_handle,
-			      size_t size, enum dma_data_direction dir,
-			      unsigned long attrs)
-{
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		xtensa_sync_single_for_cpu(dev, dma_handle, size, dir);
-}
-
-static int xtensa_map_sg(struct device *dev, struct scatterlist *sg,
-			 int nents, enum dma_data_direction dir,
-			 unsigned long attrs)
-{
-	struct scatterlist *s;
-	int i;
-
-	for_each_sg(sg, s, nents, i) {
-		s->dma_address = xtensa_map_page(dev, sg_page(s), s->offset,
-						 s->length, dir, attrs);
-	}
-	return nents;
-}
-
-static void xtensa_unmap_sg(struct device *dev,
-			    struct scatterlist *sg, int nents,
-			    enum dma_data_direction dir,
-			    unsigned long attrs)
-{
-	struct scatterlist *s;
-	int i;
-
-	for_each_sg(sg, s, nents, i) {
-		xtensa_unmap_page(dev, sg_dma_address(s),
-				  sg_dma_len(s), dir, attrs);
-	}
-}
-
-int xtensa_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-	return 0;
-}
-
-const struct dma_map_ops xtensa_dma_map_ops = {
-	.alloc = xtensa_dma_alloc,
-	.free = xtensa_dma_free,
-	.map_page = xtensa_map_page,
-	.unmap_page = xtensa_unmap_page,
-	.map_sg = xtensa_map_sg,
-	.unmap_sg = xtensa_unmap_sg,
-	.sync_single_for_cpu = xtensa_sync_single_for_cpu,
-	.sync_single_for_device = xtensa_sync_single_for_device,
-	.sync_sg_for_cpu = xtensa_sync_sg_for_cpu,
-	.sync_sg_for_device = xtensa_sync_sg_for_device,
-	.mapping_error = xtensa_dma_mapping_error,
-};
-EXPORT_SYMBOL(xtensa_dma_map_ops);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 22/25] sparc: use generic dma_noncoherent_ops
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (14 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 21/25] xtensa: " Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 23/25] parisc: merge pcx_dma_ops and pcxl_dma_ops Christoph Hellwig
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

Switch to the generic noncoherent direct mapping implementation.

This removes the previous sync_single_for_device implementation, which
looks bogus given that no syncing is happening in the similar but more
important map_single case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/sparc/Kconfig                   |   2 +
 arch/sparc/include/asm/dma-mapping.h |   5 +-
 arch/sparc/kernel/ioport.c           | 151 ++-------------------------
 3 files changed, 14 insertions(+), 144 deletions(-)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 435dbc033afe..0889b4eabf8b 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -48,6 +48,8 @@ config SPARC
 
 config SPARC32
 	def_bool !64BIT
+	select ARCH_HAS_SYNC_DMA_FOR_CPU
+	select DMA_NONCOHERENT_OPS
 	select GENERIC_ATOMIC64
 	select CLZ_TAB
 	select HAVE_UID16
diff --git a/arch/sparc/include/asm/dma-mapping.h b/arch/sparc/include/asm/dma-mapping.h
index 12ae33daf52f..e17566376934 100644
--- a/arch/sparc/include/asm/dma-mapping.h
+++ b/arch/sparc/include/asm/dma-mapping.h
@@ -7,7 +7,6 @@
 #include <linux/dma-debug.h>
 
 extern const struct dma_map_ops *dma_ops;
-extern const struct dma_map_ops pci32_dma_ops;
 
 extern struct bus_type pci_bus_type;
 
@@ -15,11 +14,11 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
 #ifdef CONFIG_SPARC_LEON
 	if (sparc_cpu_model == sparc_leon)
-		return &pci32_dma_ops;
+		return &dma_noncoherent_ops;
 #endif
 #if defined(CONFIG_SPARC32) && defined(CONFIG_PCI)
 	if (bus == &pci_bus_type)
-		return &pci32_dma_ops;
+		return &dma_noncoherent_ops;
 #endif
 	return dma_ops;
 }
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index 3bcef9ce74df..7954512c42e7 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -38,6 +38,7 @@
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include <linux/scatterlist.h>
+#include <linux/dma-noncoherent.h>
 #include <linux/of_device.h>
 
 #include <asm/io.h>
@@ -434,9 +435,8 @@ arch_initcall(sparc_register_ioport);
 /* Allocate and map kernel buffer using consistent mode DMA for a device.
  * hwdev should be valid struct pci_dev pointer for PCI devices.
  */
-static void *pci32_alloc_coherent(struct device *dev, size_t len,
-				  dma_addr_t *pba, gfp_t gfp,
-				  unsigned long attrs)
+void *arch_dma_alloc(struct device *dev, size_t len, dma_addr_t *pba, gfp_t gfp,
+		unsigned long attrs)
 {
 	unsigned long len_total = PAGE_ALIGN(len);
 	void *va;
@@ -488,8 +488,8 @@ static void *pci32_alloc_coherent(struct device *dev, size_t len,
  * References to the memory and mappings associated with cpu_addr/dma_addr
  * past this call are illegal.
  */
-static void pci32_free_coherent(struct device *dev, size_t n, void *p,
-				dma_addr_t ba, unsigned long attrs)
+void arch_dma_free(struct device *dev, size_t n, void *p, dma_addr_t ba,
+		unsigned long attrs)
 {
 	struct resource *res;
 
@@ -519,146 +519,15 @@ static void pci32_free_coherent(struct device *dev, size_t n, void *p,
 	free_pages((unsigned long)phys_to_virt(ba), get_order(n));
 }
 
-/*
- * Same as pci_map_single, but with pages.
- */
-static dma_addr_t pci32_map_page(struct device *dev, struct page *page,
-				 unsigned long offset, size_t size,
-				 enum dma_data_direction dir,
-				 unsigned long attrs)
-{
-	/* IIep is write-through, not flushing. */
-	return page_to_phys(page) + offset;
-}
-
-static void pci32_unmap_page(struct device *dev, dma_addr_t ba, size_t size,
-			     enum dma_data_direction dir, unsigned long attrs)
-{
-	if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		dma_make_coherent(ba, PAGE_ALIGN(size));
-}
-
-/* Map a set of buffers described by scatterlist in streaming
- * mode for DMA.  This is the scatter-gather version of the
- * above pci_map_single interface.  Here the scatter gather list
- * elements are each tagged with the appropriate dma address
- * and length.  They are obtained via sg_dma_{address,length}(SG).
- *
- * NOTE: An implementation may be able to use a smaller number of
- *       DMA address/length pairs than there are SG table elements.
- *       (for example via virtual mapping capabilities)
- *       The routine returns the number of addr/length pairs actually
- *       used, at most nents.
- *
- * Device ownership issues as mentioned above for pci_map_single are
- * the same here.
- */
-static int pci32_map_sg(struct device *device, struct scatterlist *sgl,
-			int nents, enum dma_data_direction dir,
-			unsigned long attrs)
-{
-	struct scatterlist *sg;
-	int n;
-
-	/* IIep is write-through, not flushing. */
-	for_each_sg(sgl, sg, nents, n) {
-		sg->dma_address = sg_phys(sg);
-		sg->dma_length = sg->length;
-	}
-	return nents;
-}
-
-/* Unmap a set of streaming mode DMA translations.
- * Again, cpu read rules concerning calls here are the same as for
- * pci_unmap_single() above.
- */
-static void pci32_unmap_sg(struct device *dev, struct scatterlist *sgl,
-			   int nents, enum dma_data_direction dir,
-			   unsigned long attrs)
-{
-	struct scatterlist *sg;
-	int n;
-
-	if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
-		for_each_sg(sgl, sg, nents, n) {
-			dma_make_coherent(sg_phys(sg), PAGE_ALIGN(sg->length));
-		}
-	}
-}
+/* IIep is write-through, not flushing on cpu to device transfer. */
 
-/* Make physical memory consistent for a single
- * streaming mode DMA translation before or after a transfer.
- *
- * If you perform a pci_map_single() but wish to interrogate the
- * buffer using the cpu, yet do not wish to teardown the PCI dma
- * mapping, you must call this function before doing so.  At the
- * next point you give the PCI dma address back to the card, you
- * must first perform a pci_dma_sync_for_device, and then the
- * device again owns the buffer.
- */
-static void pci32_sync_single_for_cpu(struct device *dev, dma_addr_t ba,
-				      size_t size, enum dma_data_direction dir)
+void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	if (dir != PCI_DMA_TODEVICE) {
-		dma_make_coherent(ba, PAGE_ALIGN(size));
-	}
-}
-
-static void pci32_sync_single_for_device(struct device *dev, dma_addr_t ba,
-					 size_t size, enum dma_data_direction dir)
-{
-	if (dir != PCI_DMA_TODEVICE) {
-		dma_make_coherent(ba, PAGE_ALIGN(size));
-	}
+	if (dir != PCI_DMA_TODEVICE)
+		dma_make_coherent(paddr, PAGE_ALIGN(size));
 }
 
-/* Make physical memory consistent for a set of streaming
- * mode DMA translations after a transfer.
- *
- * The same as pci_dma_sync_single_* but for a scatter-gather list,
- * same rules and usage.
- */
-static void pci32_sync_sg_for_cpu(struct device *dev, struct scatterlist *sgl,
-				  int nents, enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int n;
-
-	if (dir != PCI_DMA_TODEVICE) {
-		for_each_sg(sgl, sg, nents, n) {
-			dma_make_coherent(sg_phys(sg), PAGE_ALIGN(sg->length));
-		}
-	}
-}
-
-static void pci32_sync_sg_for_device(struct device *device, struct scatterlist *sgl,
-				     int nents, enum dma_data_direction dir)
-{
-	struct scatterlist *sg;
-	int n;
-
-	if (dir != PCI_DMA_TODEVICE) {
-		for_each_sg(sgl, sg, nents, n) {
-			dma_make_coherent(sg_phys(sg), PAGE_ALIGN(sg->length));
-		}
-	}
-}
-
-/* note: leon re-uses pci32_dma_ops */
-const struct dma_map_ops pci32_dma_ops = {
-	.alloc			= pci32_alloc_coherent,
-	.free			= pci32_free_coherent,
-	.map_page		= pci32_map_page,
-	.unmap_page		= pci32_unmap_page,
-	.map_sg			= pci32_map_sg,
-	.unmap_sg		= pci32_unmap_sg,
-	.sync_single_for_cpu	= pci32_sync_single_for_cpu,
-	.sync_single_for_device	= pci32_sync_single_for_device,
-	.sync_sg_for_cpu	= pci32_sync_sg_for_cpu,
-	.sync_sg_for_device	= pci32_sync_sg_for_device,
-};
-EXPORT_SYMBOL(pci32_dma_ops);
-
 const struct dma_map_ops *dma_ops = &sbus_dma_ops;
 EXPORT_SYMBOL(dma_ops);
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 23/25] parisc: merge pcx_dma_ops and pcxl_dma_ops
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (15 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 22/25] sparc: " Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 24/25] parisc: always use flush_kernel_dcache_range for DMA cache maintenance Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 25/25] parisc: use generic dma_noncoherent_ops Christoph Hellwig
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

The only difference is that pcxl supports dma coherent allocations, while
pcx only supports non-consistent allocations and otherwise fails.

But dma_alloc* is not in the fast path, and merging these two allows an
easy migration path to the generic dma-noncoherent implementation, so
do it.
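
For context, the pcx path can only ever hand out non-consistent memory, so it
is reachable from a driver via dma_alloc_attrs() with DMA_ATTR_NON_CONSISTENT,
with the driver doing its own maintenance through dma_cache_sync().  A
hypothetical driver-side sketch (the helper name is illustrative and not part
of this series):

#include <linux/dma-mapping.h>

/* Hypothetical example: allocate a buffer even a pcx machine can satisfy
 * and flush it before the first device access; the caller keeps using
 * dma_cache_sync() around every CPU/device hand-off. */
static void *example_alloc_noncoherent(struct device *dev, size_t size,
		dma_addr_t *handle)
{
	void *buf = dma_alloc_attrs(dev, size, handle, GFP_KERNEL,
			DMA_ATTR_NON_CONSISTENT);

	if (buf)
		dma_cache_sync(dev, buf, size, DMA_TO_DEVICE);
	return buf;
}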

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/parisc/include/asm/dma-mapping.h |  3 +-
 arch/parisc/kernel/pci-dma.c          | 80 ++++++++++++---------------
 arch/parisc/kernel/setup.c            |  8 +--
 arch/parisc/mm/init.c                 | 11 +---
 4 files changed, 43 insertions(+), 59 deletions(-)

diff --git a/arch/parisc/include/asm/dma-mapping.h b/arch/parisc/include/asm/dma-mapping.h
index 01e1fc057c83..eeec8dd18e74 100644
--- a/arch/parisc/include/asm/dma-mapping.h
+++ b/arch/parisc/include/asm/dma-mapping.h
@@ -22,8 +22,7 @@
 */
 
 #ifdef CONFIG_PA11
-extern const struct dma_map_ops pcxl_dma_ops;
-extern const struct dma_map_ops pcx_dma_ops;
+extern const struct dma_map_ops pa11_dma_ops;
 #endif
 
 extern const struct dma_map_ops *hppa_dma_ops;
diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 91bc0cac03a1..52304cb290f9 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -408,7 +408,7 @@ pcxl_dma_init(void)
 
 __initcall(pcxl_dma_init);
 
-static void *pa11_dma_alloc(struct device *dev, size_t size,
+static void *pcxl_dma_alloc(struct device *dev, size_t size,
 		dma_addr_t *dma_handle, gfp_t flag, unsigned long attrs)
 {
 	unsigned long vaddr;
@@ -435,16 +435,44 @@ static void *pa11_dma_alloc(struct device *dev, size_t size,
 	return (void *)vaddr;
 }
 
+static void *pcx_dma_alloc(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t flag, unsigned long attrs)
+{
+	void *addr;
+
+	if ((attrs & DMA_ATTR_NON_CONSISTENT) == 0)
+		return NULL;
+
+	addr = (void *)__get_free_pages(flag, get_order(size));
+	if (addr)
+		*dma_handle = (dma_addr_t)virt_to_phys(addr);
+
+	return addr;
+}
+
+static void *pa11_dma_alloc(struct device *dev, size_t size,
+		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+{
+
+	if (boot_cpu_data.cpu_type == pcxl2 || boot_cpu_data.cpu_type == pcxl)
+		return pcxl_dma_alloc(dev, size, dma_handle, gfp, attrs);
+	else
+		return pcx_dma_alloc(dev, size, dma_handle, gfp, attrs);
+}
+
 static void pa11_dma_free(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, unsigned long attrs)
 {
-	int order;
+	int order = get_order(size);
 
-	order = get_order(size);
-	size = 1 << (order + PAGE_SHIFT);
-	unmap_uncached_pages((unsigned long)vaddr, size);
-	pcxl_free_range((unsigned long)vaddr, size);
-	free_pages((unsigned long)__va(dma_handle), order);
+	if (boot_cpu_data.cpu_type == pcxl2 || boot_cpu_data.cpu_type == pcxl) {
+		size = 1 << (order + PAGE_SHIFT);
+		unmap_uncached_pages((unsigned long)vaddr, size);
+		pcxl_free_range((unsigned long)vaddr, size);
+
+		vaddr = __va(dma_handle);
+	}
+	free_pages((unsigned long)vaddr, get_order(size));
 }
 
 static dma_addr_t pa11_dma_map_page(struct device *dev, struct page *page,
@@ -573,7 +601,7 @@ static void pa11_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
 	flush_kernel_dcache_range((unsigned long)vaddr, size);
 }
 
-const struct dma_map_ops pcxl_dma_ops = {
+const struct dma_map_ops pa11_dma_ops = {
 	.alloc =		pa11_dma_alloc,
 	.free =			pa11_dma_free,
 	.map_page =		pa11_dma_map_page,
@@ -586,39 +614,3 @@ const struct dma_map_ops pcxl_dma_ops = {
 	.sync_sg_for_device =	pa11_dma_sync_sg_for_device,
 	.cache_sync =		pa11_dma_cache_sync,
 };
-
-static void *pcx_dma_alloc(struct device *dev, size_t size,
-		dma_addr_t *dma_handle, gfp_t flag, unsigned long attrs)
-{
-	void *addr;
-
-	if ((attrs & DMA_ATTR_NON_CONSISTENT) == 0)
-		return NULL;
-
-	addr = (void *)__get_free_pages(flag, get_order(size));
-	if (addr)
-		*dma_handle = (dma_addr_t)virt_to_phys(addr);
-
-	return addr;
-}
-
-static void pcx_dma_free(struct device *dev, size_t size, void *vaddr,
-		dma_addr_t iova, unsigned long attrs)
-{
-	free_pages((unsigned long)vaddr, get_order(size));
-	return;
-}
-
-const struct dma_map_ops pcx_dma_ops = {
-	.alloc =		pcx_dma_alloc,
-	.free =			pcx_dma_free,
-	.map_page =		pa11_dma_map_page,
-	.unmap_page =		pa11_dma_unmap_page,
-	.map_sg =		pa11_dma_map_sg,
-	.unmap_sg =		pa11_dma_unmap_sg,
-	.sync_single_for_cpu =	pa11_dma_sync_single_for_cpu,
-	.sync_single_for_device = pa11_dma_sync_single_for_device,
-	.sync_sg_for_cpu =	pa11_dma_sync_sg_for_cpu,
-	.sync_sg_for_device =	pa11_dma_sync_sg_for_device,
-	.cache_sync =		pa11_dma_cache_sync,
-};
diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c
index 8d3a7b80ac42..5c8450a22255 100644
--- a/arch/parisc/kernel/setup.c
+++ b/arch/parisc/kernel/setup.c
@@ -97,14 +97,12 @@ void __init dma_ops_init(void)
 		panic(	"PA-RISC Linux currently only supports machines that conform to\n"
 			"the PA-RISC 1.1 or 2.0 architecture specification.\n");
 
-	case pcxs:
-	case pcxt:
-		hppa_dma_ops = &pcx_dma_ops;
-		break;
 	case pcxl2:
 		pa7300lc_init();
 	case pcxl: /* falls through */
-		hppa_dma_ops = &pcxl_dma_ops;
+	case pcxs:
+	case pcxt:
+		hppa_dma_ops = &pa11_dma_ops;
 		break;
 	default:
 		break;
diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
index cab32ee824d2..4ad91c28ecbe 100644
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -19,7 +19,6 @@
 #include <linux/gfp.h>
 #include <linux/delay.h>
 #include <linux/init.h>
-#include <linux/pci.h>		/* for hppa_dma_ops and pcxl_dma_ops */
 #include <linux/initrd.h>
 #include <linux/swap.h>
 #include <linux/unistd.h>
@@ -616,17 +615,13 @@ void __init mem_init(void)
 	free_all_bootmem();
 
 #ifdef CONFIG_PA11
-	if (hppa_dma_ops == &pcxl_dma_ops) {
+	if (boot_cpu_data.cpu_type == pcxl2 || boot_cpu_data.cpu_type == pcxl) {
 		pcxl_dma_start = (unsigned long)SET_MAP_OFFSET(MAP_START);
 		parisc_vmalloc_start = SET_MAP_OFFSET(pcxl_dma_start
 						+ PCXL_DMA_MAP_SIZE);
-	} else {
-		pcxl_dma_start = 0;
-		parisc_vmalloc_start = SET_MAP_OFFSET(MAP_START);
-	}
-#else
-	parisc_vmalloc_start = SET_MAP_OFFSET(MAP_START);
+	} else
 #endif
+		parisc_vmalloc_start = SET_MAP_OFFSET(MAP_START);
 
 	mem_init_print_info(NULL);
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 24/25] parisc: always use flush_kernel_dcache_range for DMA cache maintenance
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (16 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 23/25] parisc: merge pcx_dma_ops and pcxl_dma_ops Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  2018-05-22 12:04 ` [PATCH 25/25] parisc: use generic dma_noncoherent_ops Christoph Hellwig
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

Currently the S/G list based DMA ops use flush_kernel_vmap_range, which
contains a few UP optimizations, while the rest of the DMA operations
use flush_kernel_dcache_range.  The single-buffer and sg operations are
supposed to have the same effect, so they should use the same routines.
Use the more conservative version for now, but if people more familiar
with parisc think the vmap version is generally fine for DMA we should
switch all interfaces over to it.
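
The assumption here is that syncing an sg list is nothing more than the
single-buffer flush applied once per entry; a minimal sketch of that
invariant (mirroring the hunks below, helper name illustrative):

#include <linux/scatterlist.h>
#include <asm/cacheflush.h>

/* Sketch: per-entry dcache flush over the kernel mapping of each sg
 * buffer, i.e. the same operation the single-buffer sync paths perform. */
static void sync_sg_sketch(struct scatterlist *sglist, int nents)
{
	struct scatterlist *sg;
	int i;

	for_each_sg(sglist, sg, nents, i)
		flush_kernel_dcache_range((unsigned long)sg_virt(sg), sg->length);
}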

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/parisc/kernel/pci-dma.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 52304cb290f9..4d7506336918 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -550,7 +550,7 @@ static void pa11_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
 	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
 
 	for_each_sg(sglist, sg, nents, i)
-		flush_kernel_vmap_range(sg_virt(sg), sg->length);
+		flush_kernel_dcache_range(sg_virt(sg), sg->length);
 }
 
 static void pa11_dma_sync_single_for_cpu(struct device *dev,
@@ -581,7 +581,7 @@ static void pa11_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sgl
 	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
 
 	for_each_sg(sglist, sg, nents, i)
-		flush_kernel_vmap_range(sg_virt(sg), sg->length);
+		flush_kernel_dcache_range(sg_virt(sg), sg->length);
 }
 
 static void pa11_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sglist, int nents, enum dma_data_direction direction)
@@ -592,7 +592,7 @@ static void pa11_dma_sync_sg_for_device(struct device *dev, struct scatterlist *
 	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
 
 	for_each_sg(sglist, sg, nents, i)
-		flush_kernel_vmap_range(sg_virt(sg), sg->length);
+		flush_kernel_dcache_range(sg_virt(sg), sg->length);
 }
 
 static void pa11_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 25/25] parisc: use generic dma_noncoherent_ops
  2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
                   ` (17 preceding siblings ...)
  2018-05-22 12:04 ` [PATCH 24/25] parisc: always use flush_kernel_dcache_range for DMA cache maintenance Christoph Hellwig
@ 2018-05-22 12:04 ` Christoph Hellwig
  18 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-22 12:04 UTC (permalink / raw)
  To: iommu
  Cc: linux-arch, Michal Simek, Greentime Hu, Vincent Chen,
	linux-alpha, linux-snps-arc, linux-arm-kernel, linux-c6x-dev,
	linux-hexagon, linux-m68k, nios2-dev, openrisc, linux-parisc,
	linux-sh, sparclinux, linux-xtensa, linux-kernel

Switch to the generic noncoherent direct mapping implementation.

Fix sync_single_for_cpu to skip the cache flush unless the transfer is
to the device, to match the better tested unmap_single path, which
should have the same cache coherency implications.
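
With the generic ops the unmap and sync-for-cpu entry points funnel into the
new arch_sync_dma_for_cpu() hook; a simplified sketch of the common code, not
the exact lib/dma-noncoherent.c source:

/*
 * Sketch only: the common code converts the dma address back to a physical
 * address and lets the architecture flush/invalidate for CPU ownership.
 */
static void dma_noncoherent_unmap_page(struct device *dev, dma_addr_t addr,
		size_t size, enum dma_data_direction dir, unsigned long attrs)
{
	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		arch_sync_dma_for_cpu(dev, dma_to_phys(dev, addr), size, dir);
}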

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/parisc/Kconfig                   |   4 +
 arch/parisc/include/asm/dma-mapping.h |   4 -
 arch/parisc/kernel/pci-dma.c          | 145 ++------------------------
 arch/parisc/kernel/setup.c            |   2 +-
 4 files changed, 16 insertions(+), 139 deletions(-)

diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 4d8f64d48597..4993c6dc8358 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -188,6 +188,10 @@ config PA20
 config PA11
 	def_bool y
 	depends on PA7000 || PA7100LC || PA7200 || PA7300LC
+	select ARCH_HAS_SYNC_DMA_FOR_CPU
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
+	select DMA_NONCOHERENT_OPS
+	select DMA_NONCOHERENT_CACHE_SYNC
 
 config PREFETCH
 	def_bool y
diff --git a/arch/parisc/include/asm/dma-mapping.h b/arch/parisc/include/asm/dma-mapping.h
index eeec8dd18e74..44a9f97194aa 100644
--- a/arch/parisc/include/asm/dma-mapping.h
+++ b/arch/parisc/include/asm/dma-mapping.h
@@ -21,10 +21,6 @@
 ** flush/purge and allocate "regular" cacheable pages for everything.
 */
 
-#ifdef CONFIG_PA11
-extern const struct dma_map_ops pa11_dma_ops;
-#endif
-
 extern const struct dma_map_ops *hppa_dma_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 4d7506336918..87a7926cda6f 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -21,13 +21,12 @@
 #include <linux/init.h>
 #include <linux/gfp.h>
 #include <linux/mm.h>
-#include <linux/pci.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include <linux/string.h>
 #include <linux/types.h>
-#include <linux/scatterlist.h>
-#include <linux/export.h>
+#include <linux/dma-direct.h>
+#include <linux/dma-noncoherent.h>
 
 #include <asm/cacheflush.h>
 #include <asm/dma.h>    /* for DMA_CHUNK_SIZE */
@@ -450,7 +449,7 @@ static void *pcx_dma_alloc(struct device *dev, size_t size,
 	return addr;
 }
 
-static void *pa11_dma_alloc(struct device *dev, size_t size,
+void *arch_dma_alloc(struct device *dev, size_t size,
 		dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
 {
 
@@ -460,7 +459,7 @@ static void *pa11_dma_alloc(struct device *dev, size_t size,
 		return pcx_dma_alloc(dev, size, dma_handle, gfp, attrs);
 }
 
-static void pa11_dma_free(struct device *dev, size_t size, void *vaddr,
+void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, unsigned long attrs)
 {
 	int order = get_order(size);
@@ -475,142 +474,20 @@ static void pa11_dma_free(struct device *dev, size_t size, void *vaddr,
 	free_pages((unsigned long)vaddr, get_order(size));
 }
 
-static dma_addr_t pa11_dma_map_page(struct device *dev, struct page *page,
-		unsigned long offset, size_t size,
-		enum dma_data_direction direction, unsigned long attrs)
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	void *addr = page_address(page) + offset;
-	BUG_ON(direction == DMA_NONE);
-
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		flush_kernel_dcache_range((unsigned long) addr, size);
-
-	return virt_to_phys(addr);
-}
-
-static void pa11_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
-		size_t size, enum dma_data_direction direction,
-		unsigned long attrs)
-{
-	BUG_ON(direction == DMA_NONE);
-
-	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-		return;
-
-	if (direction == DMA_TO_DEVICE)
-		return;
-
-	/*
-	 * For PCI_DMA_FROMDEVICE this flush is not necessary for the
-	 * simple map/unmap case. However, it IS necessary if if
-	 * pci_dma_sync_single_* has been called and the buffer reused.
-	 */
-
-	flush_kernel_dcache_range((unsigned long) phys_to_virt(dma_handle), size);
-}
-
-static int pa11_dma_map_sg(struct device *dev, struct scatterlist *sglist,
-		int nents, enum dma_data_direction direction,
-		unsigned long attrs)
-{
-	int i;
-	struct scatterlist *sg;
-
-	BUG_ON(direction == DMA_NONE);
-
-	for_each_sg(sglist, sg, nents, i) {
-		unsigned long vaddr = (unsigned long)sg_virt(sg);
-
-		sg_dma_address(sg) = (dma_addr_t) virt_to_phys(vaddr);
-		sg_dma_len(sg) = sg->length;
-
-		if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-			continue;
-
-		flush_kernel_dcache_range(vaddr, sg->length);
-	}
-	return nents;
-}
-
-static void pa11_dma_unmap_sg(struct device *dev, struct scatterlist *sglist,
-		int nents, enum dma_data_direction direction,
-		unsigned long attrs)
-{
-	int i;
-	struct scatterlist *sg;
-
-	BUG_ON(direction == DMA_NONE);
-
-	if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-		return;
-
-	if (direction == DMA_TO_DEVICE)
-		return;
-
-	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
-
-	for_each_sg(sglist, sg, nents, i)
-		flush_kernel_dcache_range(sg_virt(sg), sg->length);
-}
-
-static void pa11_dma_sync_single_for_cpu(struct device *dev,
-		dma_addr_t dma_handle, size_t size,
-		enum dma_data_direction direction)
-{
-	BUG_ON(direction == DMA_NONE);
-
-	flush_kernel_dcache_range((unsigned long) phys_to_virt(dma_handle),
-			size);
-}
-
-static void pa11_dma_sync_single_for_device(struct device *dev,
-		dma_addr_t dma_handle, size_t size,
-		enum dma_data_direction direction)
-{
-	BUG_ON(direction == DMA_NONE);
-
-	flush_kernel_dcache_range((unsigned long) phys_to_virt(dma_handle),
-			size);
-}
-
-static void pa11_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sglist, int nents, enum dma_data_direction direction)
-{
-	int i;
-	struct scatterlist *sg;
-
-	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
-
-	for_each_sg(sglist, sg, nents, i)
-		flush_kernel_dcache_range(sg_virt(sg), sg->length);
+	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
 }
 
-static void pa11_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sglist, int nents, enum dma_data_direction direction)
+void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
 {
-	int i;
-	struct scatterlist *sg;
-
-	/* once we do combining we'll need to use phys_to_virt(sg_dma_address(sglist)) */
-
-	for_each_sg(sglist, sg, nents, i)
-		flush_kernel_dcache_range(sg_virt(sg), sg->length);
+	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
 }
 
-static void pa11_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
+void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
 	       enum dma_data_direction direction)
 {
 	flush_kernel_dcache_range((unsigned long)vaddr, size);
 }
-
-const struct dma_map_ops pa11_dma_ops = {
-	.alloc =		pa11_dma_alloc,
-	.free =			pa11_dma_free,
-	.map_page =		pa11_dma_map_page,
-	.unmap_page =		pa11_dma_unmap_page,
-	.map_sg =		pa11_dma_map_sg,
-	.unmap_sg =		pa11_dma_unmap_sg,
-	.sync_single_for_cpu =	pa11_dma_sync_single_for_cpu,
-	.sync_single_for_device = pa11_dma_sync_single_for_device,
-	.sync_sg_for_cpu =	pa11_dma_sync_sg_for_cpu,
-	.sync_sg_for_device =	pa11_dma_sync_sg_for_device,
-	.cache_sync =		pa11_dma_cache_sync,
-};
diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c
index 5c8450a22255..4e87c35c22b7 100644
--- a/arch/parisc/kernel/setup.c
+++ b/arch/parisc/kernel/setup.c
@@ -102,7 +102,7 @@ void __init dma_ops_init(void)
 	case pcxl: /* falls through */
 	case pcxs:
 	case pcxt:
-		hppa_dma_ops = &pa11_dma_ops;
+		hppa_dma_ops = &dma_noncoherent_ops;
 		break;
 	default:
 		break;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2018-05-22 12:04 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-22 12:04 common non-cache coherent direct dma mapping ops v2 Christoph Hellwig
2018-05-22 12:04 ` [PATCH 01/25] hexagon: remove the sync_single_for_cpu DMA operation Christoph Hellwig
2018-05-22 12:04 ` [PATCH 02/25] hexagon: implement the sync_sg_for_device " Christoph Hellwig
     [not found] ` <20180522120430.28709-1-hch-jcswGhMUV9g@public.gmane.org>
2018-05-22 12:04   ` [PATCH 03/25] hexagon: use generic dma_noncoherent_ops Christoph Hellwig
2018-05-22 12:04   ` [PATCH 04/25] m68k: " Christoph Hellwig
2018-05-22 12:04   ` [PATCH 07/25] nds32: remove the broken kmap code in nds32_dma_map_sg Christoph Hellwig
2018-05-22 12:04   ` [PATCH 08/25] nds32: consolidate DMA cache maintenance routines Christoph Hellwig
2018-05-22 12:04   ` [PATCH 10/25] nds32: use generic dma_noncoherent_ops Christoph Hellwig
2018-05-22 12:04   ` [PATCH 11/25] nios2: " Christoph Hellwig
2018-05-22 12:04   ` [PATCH 12/25] openrisc: remove the sync_single_for_cpu DMA operation Christoph Hellwig
2018-05-22 12:04 ` [PATCH 05/25] microblaze: use generic dma_noncoherent_ops Christoph Hellwig
2018-05-22 12:04 ` [PATCH 06/25] microblaze: remove the consistent_sync and consistent_sync_page Christoph Hellwig
2018-05-22 12:04 ` [PATCH 09/25] nds32: implement the unmap_sg DMA operation Christoph Hellwig
2018-05-22 12:04 ` [PATCH 13/25] openrisc: remove the no-op unmap_page and unmap_sg DMA operations Christoph Hellwig
2018-05-22 12:04 ` [PATCH 14/25] openrisc: fix cache maintenance in the sync_single_for_device DMA operation Christoph Hellwig
2018-05-22 12:04 ` [PATCH 15/25] openrisc: use generic dma_noncoherent_ops Christoph Hellwig
2018-05-22 12:04 ` [PATCH 16/25] sh: simplify get_arch_dma_ops Christoph Hellwig
2018-05-22 12:04 ` [PATCH 17/25] sh: introduce a sh_cacheop_vaddr helper Christoph Hellwig
2018-05-22 12:04 ` [PATCH 18/25] sh: use dma_direct_ops for the CONFIG_DMA_COHERENT case Christoph Hellwig
2018-05-22 12:04 ` [PATCH 19/25] sh: split arch/sh/mm/consistent.c Christoph Hellwig
2018-05-22 12:04 ` [PATCH 20/25] sh: use generic dma_noncoherent_ops Christoph Hellwig
2018-05-22 12:04 ` [PATCH 21/25] xtensa: " Christoph Hellwig
2018-05-22 12:04 ` [PATCH 22/25] sparc: " Christoph Hellwig
2018-05-22 12:04 ` [PATCH 23/25] parisc: merge pcx_dma_ops and pcxl_dma_ops Christoph Hellwig
2018-05-22 12:04 ` [PATCH 24/25] parisc: always use flush_kernel_dcache_range for DMA cache maintenance Christoph Hellwig
2018-05-22 12:04 ` [PATCH 25/25] parisc: use generic dma_noncoherent_ops Christoph Hellwig
