linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
@ 2018-07-30 16:26 Eugeniy Paltsev
  2018-07-30 16:26 ` [PATCH v2 1/4] ARC: DTS: mark DMA devices connected through IOC port as dma-coherent Eugeniy Paltsev
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Eugeniy Paltsev @ 2018-07-30 16:26 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: linux-kernel, linux-arch, Vineet Gupta, Alexey Brodkin, hch,
	Eugeniy Paltsev

The ARC HS processor provides an IOC port (I/O coherency bus
interface) that allows external devices such as DMA devices
to access memory through the cache hierarchy, providing
coherency between I/O transactions and the complete memory
hierarchy.

Some recent SoC with ARC HS (like HSDK) allow to select bus
port (IOC or non-IOC port) for connecting DMA devices in runtime.

With this patch we can use both HW-coherent and regular DMA
peripherals simultaneously.

NOTE:
This patch series was stress tested on HSDK with iperf3 (ethernet)
and bonie++ (usb and sdio) in three configurations:
 * IOC enabled globaly
 * IOC disabled globaly
 * IOC enabled partially (USB & SDIO are connected via IOC AXI port,
   ethernet is connected to DDR AXI port (non-IOC port) 

NOTE:
If you want to test some device without IOC it is not enough
to remove "dma-coherent" property from dts. You had to remap this
device to regular DDR AXI port intead of IOC AXI port.
You also need to apply 3 following patches firstly:
https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03865.html
https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03889.html
https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03887.html

NOTE:
We don't touch any aperture configuration in this patch series. So we
don't switch any devices between IOC and non-IOC AXI ports on any board.
It can be done later if it is required.

Changes v1->v2 (Thanks to Christoph):
 * Don't select DMA_DIRECT_OPS explicitly as it is already selected by
   DMA_NONCOHERENT_OPS
 * Remove check for HIGHMEM pages from arch_dma_{alloc, free}

Eugeniy Paltsev (4):
  ARC: DTS: mark DMA devices connected through IOC port as dma-coherent
  ARC: allow to use IOC and non-IOC DMA devices simultaneously
  ARC: IOC: panic if both IOC and ZONE_HIGHMEM enabled
  ARC: don't check for HIGHMEM pages in arch_dma_alloc

 arch/arc/boot/dts/axc003.dtsi      | 26 ++++++++++++++++
 arch/arc/boot/dts/axc003_idu.dtsi  | 26 ++++++++++++++++
 arch/arc/boot/dts/hsdk.dts         |  4 +++
 arch/arc/include/asm/dma-mapping.h | 13 ++++++++
 arch/arc/mm/cache.c                | 30 +++++++++---------
 arch/arc/mm/dma.c                  | 62 +++++++++++++++++++-------------------
 6 files changed, 115 insertions(+), 46 deletions(-)
 create mode 100644 arch/arc/include/asm/dma-mapping.h

-- 
2.14.4


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/4] ARC: DTS: mark DMA devices connected through IOC port as dma-coherent
  2018-07-30 16:26 [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
@ 2018-07-30 16:26 ` Eugeniy Paltsev
  2018-07-30 16:26 ` [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Eugeniy Paltsev @ 2018-07-30 16:26 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: linux-kernel, linux-arch, Vineet Gupta, Alexey Brodkin, hch,
	Eugeniy Paltsev

Mark DMA devices on AXS103 and HSDK boards connected through IOC
port as dma-coherent.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
---
Changes v1->v2:
 * None.

 arch/arc/boot/dts/axc003.dtsi     | 26 ++++++++++++++++++++++++++
 arch/arc/boot/dts/axc003_idu.dtsi | 26 ++++++++++++++++++++++++++
 arch/arc/boot/dts/hsdk.dts        |  4 ++++
 3 files changed, 56 insertions(+)

diff --git a/arch/arc/boot/dts/axc003.dtsi b/arch/arc/boot/dts/axc003.dtsi
index dc91c663bcc0..d75d65ddf8e3 100644
--- a/arch/arc/boot/dts/axc003.dtsi
+++ b/arch/arc/boot/dts/axc003.dtsi
@@ -93,6 +93,32 @@
 		};
 	};
 
+	/*
+	 * Mark DMA peripherals connected via IOC port as dma-coherent. We do
+	 * it via overlay because peripherals defined in axs10x_mb.dtsi are
+	 * used for both AXS101 and AXS103 boards and only AXS103 has IOC (so
+	 * only AXS103 board has HW-coherent DMA peripherals)
+	 * We don't need to mark pgu@17000 as dma-coherent because it uses
+	 * external DMA buffer located outside of IOC aperture.
+	 */
+	axs10x_mb {
+		ethernet@0x18000 {
+			dma-coherent;
+		};
+
+		ehci@0x40000 {
+			dma-coherent;
+		};
+
+		ohci@0x60000 {
+			dma-coherent;
+		};
+
+		mmc@0x15000 {
+			dma-coherent;
+		};
+	};
+
 	/*
 	 * The DW APB ICTL intc on MB is connected to CPU intc via a
 	 * DT "invisible" DW APB GPIO block, configured to simply pass thru
diff --git a/arch/arc/boot/dts/axc003_idu.dtsi b/arch/arc/boot/dts/axc003_idu.dtsi
index 69ff4895f2ba..a05bb737ea63 100644
--- a/arch/arc/boot/dts/axc003_idu.dtsi
+++ b/arch/arc/boot/dts/axc003_idu.dtsi
@@ -100,6 +100,32 @@
 		};
 	};
 
+	/*
+	 * Mark DMA peripherals connected via IOC port as dma-coherent. We do
+	 * it via overlay because peripherals defined in axs10x_mb.dtsi are
+	 * used for both AXS101 and AXS103 boards and only AXS103 has IOC (so
+	 * only AXS103 board has HW-coherent DMA peripherals)
+	 * We don't need to mark pgu@17000 as dma-coherent because it uses
+	 * external DMA buffer located outside of IOC aperture.
+	 */
+	axs10x_mb {
+		ethernet@0x18000 {
+			dma-coherent;
+		};
+
+		ehci@0x40000 {
+			dma-coherent;
+		};
+
+		ohci@0x60000 {
+			dma-coherent;
+		};
+
+		mmc@0x15000 {
+			dma-coherent;
+		};
+	};
+
 	/*
 	 * This INTC is actually connected to DW APB GPIO
 	 * which acts as a wire between MB INTC and CPU INTC.
diff --git a/arch/arc/boot/dts/hsdk.dts b/arch/arc/boot/dts/hsdk.dts
index 006aa3de5348..ebb686c21393 100644
--- a/arch/arc/boot/dts/hsdk.dts
+++ b/arch/arc/boot/dts/hsdk.dts
@@ -176,6 +176,7 @@
 			phy-handle = <&phy0>;
 			resets = <&cgu_rst HSDK_ETH_RESET>;
 			reset-names = "stmmaceth";
+			dma-coherent;
 
 			mdio {
 				#address-cells = <1>;
@@ -194,12 +195,14 @@
 			compatible = "snps,hsdk-v1.0-ohci", "generic-ohci";
 			reg = <0x60000 0x100>;
 			interrupts = <15>;
+			dma-coherent;
 		};
 
 		ehci@40000 {
 			compatible = "snps,hsdk-v1.0-ehci", "generic-ehci";
 			reg = <0x40000 0x100>;
 			interrupts = <15>;
+			dma-coherent;
 		};
 
 		mmc@a000 {
@@ -212,6 +215,7 @@
 			clock-names = "biu", "ciu";
 			interrupts = <12>;
 			bus-width = <4>;
+			dma-coherent;
 		};
 	};
 
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-07-30 16:26 [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
  2018-07-30 16:26 ` [PATCH v2 1/4] ARC: DTS: mark DMA devices connected through IOC port as dma-coherent Eugeniy Paltsev
@ 2018-07-30 16:26 ` Eugeniy Paltsev
  2018-08-13 16:24   ` Vineet Gupta
  2018-07-30 16:26 ` [PATCH v2 3/4] ARC: IOC: panic if both IOC and ZONE_HIGHMEM enabled Eugeniy Paltsev
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Eugeniy Paltsev @ 2018-07-30 16:26 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: linux-kernel, linux-arch, Vineet Gupta, Alexey Brodkin, hch,
	Eugeniy Paltsev

The ARC HS processor provides an IOC port (I/O coherency bus
interface) that allows external devices such as DMA devices
to access memory through the cache hierarchy, providing
coherency between I/O transactions and the complete memory
hierarchy.

Some recent SoC with ARC HS (like HSDK) allow to select bus
port (IOC or non-IOC port) for connecting DMA devices in runtime.

With this patch we can use both HW-coherent and regular DMA
peripherals simultaneously.

For example we can connect USB and SDIO controllers through IOC port
(so we don't need to need to maintain cache coherency for these
devices manualy. All cache sync ops will be nop)
And we can connect Ethernet directly to RAM port (so we had to
maintain cache coherency manualy. Cache sync ops will be real
flush/invalidate operations)

Cache ops are set per-device and depends on "dma-coherent" device
tree property:
"dma_noncoherent_ops" are used if no "dma-coherent" property is
present (or IOC is disabled)
"dma_direct_ops" are used if "dma-coherent" property is present.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
---
Changes v1->v2 (Thanks to Christoph):
 * Don't select DMA_DIRECT_OPS explicitly as it is already selected by
   DMA_NONCOHERENT_OPS

 arch/arc/include/asm/dma-mapping.h | 13 +++++++++++++
 arch/arc/mm/cache.c                | 17 ++---------------
 arch/arc/mm/dma.c                  | 34 +++++++++++++++++++---------------
 3 files changed, 34 insertions(+), 30 deletions(-)
 create mode 100644 arch/arc/include/asm/dma-mapping.h

diff --git a/arch/arc/include/asm/dma-mapping.h b/arch/arc/include/asm/dma-mapping.h
new file mode 100644
index 000000000000..c946c0a83e76
--- /dev/null
+++ b/arch/arc/include/asm/dma-mapping.h
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier:  GPL-2.0
+// (C) 2018 Synopsys, Inc. (www.synopsys.com)
+
+#ifndef ASM_ARC_DMA_MAPPING_H
+#define ASM_ARC_DMA_MAPPING_H
+
+#include <asm-generic/dma-mapping.h>
+
+void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
+			const struct iommu_ops *iommu, bool coherent);
+#define arch_setup_dma_ops arch_setup_dma_ops
+
+#endif
diff --git a/arch/arc/mm/cache.c b/arch/arc/mm/cache.c
index b95365e1253a..dac12afbb93a 100644
--- a/arch/arc/mm/cache.c
+++ b/arch/arc/mm/cache.c
@@ -65,7 +65,7 @@ char *arc_cache_mumbojumbo(int c, char *buf, int len)
 
 	n += scnprintf(buf + n, len - n, "Peripherals\t: %#lx%s%s\n",
 		       perip_base,
-		       IS_AVAIL3(ioc_exists, ioc_enable, ", IO-Coherency "));
+		       IS_AVAIL3(ioc_exists, ioc_enable, ", per device IO-Coherency "));
 
 	return buf;
 }
@@ -896,15 +896,6 @@ static void __dma_cache_wback_slc(phys_addr_t start, unsigned long sz)
 	slc_op(start, sz, OP_FLUSH);
 }
 
-/*
- * DMA ops for systems with IOC
- * IOC hardware snoops all DMA traffic keeping the caches consistent with
- * memory - eliding need for any explicit cache maintenance of DMA buffers
- */
-static void __dma_cache_wback_inv_ioc(phys_addr_t start, unsigned long sz) {}
-static void __dma_cache_inv_ioc(phys_addr_t start, unsigned long sz) {}
-static void __dma_cache_wback_ioc(phys_addr_t start, unsigned long sz) {}
-
 /*
  * Exported DMA API
  */
@@ -1263,11 +1254,7 @@ void __init arc_cache_init_master(void)
 	if (is_isa_arcv2() && ioc_enable)
 		arc_ioc_setup();
 
-	if (is_isa_arcv2() && ioc_enable) {
-		__dma_cache_wback_inv = __dma_cache_wback_inv_ioc;
-		__dma_cache_inv = __dma_cache_inv_ioc;
-		__dma_cache_wback = __dma_cache_wback_ioc;
-	} else if (is_isa_arcv2() && l2_line_sz && slc_enable) {
+	if (is_isa_arcv2() && l2_line_sz && slc_enable) {
 		__dma_cache_wback_inv = __dma_cache_wback_inv_slc;
 		__dma_cache_inv = __dma_cache_inv_slc;
 		__dma_cache_wback = __dma_cache_wback_slc;
diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index cefb776a99ff..4d1466905e48 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -33,19 +33,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 	if (!page)
 		return NULL;
 
-	/*
-	 * IOC relies on all data (even coherent DMA data) being in cache
-	 * Thus allocate normal cached memory
-	 *
-	 * The gains with IOC are two pronged:
-	 *   -For streaming data, elides need for cache maintenance, saving
-	 *    cycles in flush code, and bus bandwidth as all the lines of a
-	 *    buffer need to be flushed out to memory
-	 *   -For coherent data, Read/Write to buffers terminate early in cache
-	 *   (vs. always going to memory - thus are faster)
-	 */
-	if ((is_isa_arcv2() && ioc_enable) ||
-	    (attrs & DMA_ATTR_NON_CONSISTENT))
+	if (attrs & DMA_ATTR_NON_CONSISTENT)
 		need_coh = 0;
 
 	/*
@@ -95,8 +83,7 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 	struct page *page = virt_to_page(paddr);
 	int is_non_coh = 1;
 
-	is_non_coh = (attrs & DMA_ATTR_NON_CONSISTENT) ||
-			(is_isa_arcv2() && ioc_enable);
+	is_non_coh = (attrs & DMA_ATTR_NON_CONSISTENT);
 
 	if (PageHighMem(page) || !is_non_coh)
 		iounmap((void __force __iomem *)vaddr);
@@ -182,3 +169,20 @@ void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
 		break;
 	}
 }
+
+void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
+			const struct iommu_ops *iommu, bool coherent)
+{
+	/*
+	 * IOC hardware snoops all DMA traffic keeping the caches consistent
+	 * with memory - eliding need for any explicit cache maintenance of
+	 * DMA buffers - so we can use dma_direct cache ops.
+	 */
+	if (is_isa_arcv2() && ioc_enable && coherent) {
+		set_dma_ops(dev, &dma_direct_ops);
+		dev_info(dev, "use dma_direct_ops cache ops\n");
+	} else {
+		set_dma_ops(dev, &dma_noncoherent_ops);
+		dev_info(dev, "use dma_noncoherent_ops cache ops\n");
+	}
+}
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/4] ARC: IOC: panic if both IOC and ZONE_HIGHMEM enabled
  2018-07-30 16:26 [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
  2018-07-30 16:26 ` [PATCH v2 1/4] ARC: DTS: mark DMA devices connected through IOC port as dma-coherent Eugeniy Paltsev
  2018-07-30 16:26 ` [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
@ 2018-07-30 16:26 ` Eugeniy Paltsev
  2018-07-30 16:26 ` [PATCH v2 4/4] ARC: don't check for HIGHMEM pages in arch_dma_alloc Eugeniy Paltsev
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Eugeniy Paltsev @ 2018-07-30 16:26 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: linux-kernel, linux-arch, Vineet Gupta, Alexey Brodkin, hch,
	Eugeniy Paltsev

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
---
Changes v1->v2:
 * None.

 arch/arc/mm/cache.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/arc/mm/cache.c b/arch/arc/mm/cache.c
index dac12afbb93a..389602cf1594 100644
--- a/arch/arc/mm/cache.c
+++ b/arch/arc/mm/cache.c
@@ -1143,6 +1143,19 @@ noinline void __init arc_ioc_setup(void)
 {
 	unsigned int ioc_base, mem_sz;
 
+	/*
+	 * As for today we don't support both IOC and ZONE_HIGHMEM enabled
+	 * simultaneously. This happens because as of today IOC aperture covers
+	 * only ZONE_NORMAL (low mem) and any dma transactions outside this
+	 * region won't be HW coherent.
+	 * If we want to use both IOC and ZONE_HIGHMEM we can use
+	 * bounce_buffer to handle dma transactions to HIGHMEM.
+	 * Also it is possible to modify dma_direct cache ops or increase IOC
+	 * aperture size if we are planning to use HIGHMEM without PAE.
+	 */
+	if (IS_ENABLED(CONFIG_HIGHMEM))
+		panic("IOC and HIGHMEM can't be used simultaneously");
+
 	/* Flush + invalidate + disable L1 dcache */
 	__dc_disable();
 
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/4] ARC: don't check for HIGHMEM pages in arch_dma_alloc
  2018-07-30 16:26 [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
                   ` (2 preceding siblings ...)
  2018-07-30 16:26 ` [PATCH v2 3/4] ARC: IOC: panic if both IOC and ZONE_HIGHMEM enabled Eugeniy Paltsev
@ 2018-07-30 16:26 ` Eugeniy Paltsev
  2018-07-31  8:08   ` Christoph Hellwig
  2018-08-13 16:19 ` [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Vineet Gupta
  2018-09-04 20:14 ` Vineet Gupta
  5 siblings, 1 reply; 19+ messages in thread
From: Eugeniy Paltsev @ 2018-07-30 16:26 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: linux-kernel, linux-arch, Vineet Gupta, Alexey Brodkin, hch,
	Eugeniy Paltsev

__GFP_HIGHMEM flag is cleared by upper layer functions
(in include/linux/dma-mapping.h) so we'll never get a
__GFP_HIGHMEM flag in arch_dma_alloc gfp argument.
That's why alloc_pages will never return highmem page
here.

Get rid of highmem pages handling and cleanup arch_dma_alloc
and arch_dma_free functions.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
---
Changes v1->v2 (Thanks to Christoph):
 * Remove check for HIGHMEM pages from arch_dma_{alloc, free}

 arch/arc/mm/dma.c | 32 ++++++++++++++------------------
 1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index 4d1466905e48..46129d7a298d 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -27,30 +27,29 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 	struct page *page;
 	phys_addr_t paddr;
 	void *kvaddr;
-	int need_coh = 1, need_kvaddr = 0;
+	bool need_coh = !(attrs & DMA_ATTR_NON_CONSISTENT);
+
+	/*
+	 * __GFP_HIGHMEM flag is cleared by upper layer functions
+	 * (in include/linux/dma-mapping.h) so we should never get a
+	 * __GFP_HIGHMEM here.
+	 */
+	BUG_ON(gfp & __GFP_HIGHMEM);
 
 	page = alloc_pages(gfp, order);
 	if (!page)
 		return NULL;
 
-	if (attrs & DMA_ATTR_NON_CONSISTENT)
-		need_coh = 0;
-
-	/*
-	 * - A coherent buffer needs MMU mapping to enforce non-cachability
-	 * - A highmem page needs a virtual handle (hence MMU mapping)
-	 *   independent of cachability
-	 */
-	if (PageHighMem(page) || need_coh)
-		need_kvaddr = 1;
-
 	/* This is linear addr (0x8000_0000 based) */
 	paddr = page_to_phys(page);
 
 	*dma_handle = paddr;
 
-	/* This is kernel Virtual address (0x7000_0000 based) */
-	if (need_kvaddr) {
+	/*
+	 * A coherent buffer needs MMU mapping to enforce non-cachability.
+	 * kvaddr is kernel Virtual address (0x7000_0000 based).
+	 */
+	if (need_coh) {
 		kvaddr = ioremap_nocache(paddr, size);
 		if (kvaddr == NULL) {
 			__free_pages(page, order);
@@ -81,11 +80,8 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 {
 	phys_addr_t paddr = dma_handle;
 	struct page *page = virt_to_page(paddr);
-	int is_non_coh = 1;
-
-	is_non_coh = (attrs & DMA_ATTR_NON_CONSISTENT);
 
-	if (PageHighMem(page) || !is_non_coh)
+	if (!(attrs & DMA_ATTR_NON_CONSISTENT))
 		iounmap((void __force __iomem *)vaddr);
 
 	__free_pages(page, get_order(size));
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] ARC: don't check for HIGHMEM pages in arch_dma_alloc
  2018-07-30 16:26 ` [PATCH v2 4/4] ARC: don't check for HIGHMEM pages in arch_dma_alloc Eugeniy Paltsev
@ 2018-07-31  8:08   ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2018-07-31  8:08 UTC (permalink / raw)
  To: Eugeniy Paltsev
  Cc: linux-snps-arc, linux-kernel, linux-arch, Vineet Gupta,
	Alexey Brodkin, hch

On Mon, Jul 30, 2018 at 07:26:36PM +0300, Eugeniy Paltsev wrote:
> __GFP_HIGHMEM flag is cleared by upper layer functions
> (in include/linux/dma-mapping.h) so we'll never get a
> __GFP_HIGHMEM flag in arch_dma_alloc gfp argument.
> That's why alloc_pages will never return highmem page
> here.
> 
> Get rid of highmem pages handling and cleanup arch_dma_alloc
> and arch_dma_free functions.
> 
> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-07-30 16:26 [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
                   ` (3 preceding siblings ...)
  2018-07-30 16:26 ` [PATCH v2 4/4] ARC: don't check for HIGHMEM pages in arch_dma_alloc Eugeniy Paltsev
@ 2018-08-13 16:19 ` Vineet Gupta
  2018-08-13 17:27   ` Eugeniy Paltsev
  2018-09-04 20:14 ` Vineet Gupta
  5 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2018-08-13 16:19 UTC (permalink / raw)
  To: Eugeniy Paltsev, linux-snps-arc
  Cc: linux-kernel, linux-arch, Alexey Brodkin, hch

On 07/30/2018 09:26 AM, Eugeniy Paltsev wrote:
> The ARC HS processor provides an IOC port (I/O coherency bus
> interface) that allows external devices such as DMA devices
> to access memory through the cache hierarchy, providing
> coherency between I/O transactions and the complete memory
> hierarchy.

You didn't pay attention to my previous comment on this !
IOC port can be considered a micro-architecture optimization (an important one
though). The main thing is hardware snooping DMA transactions which enabled IOC in
first place.

> Some recent SoC with ARC HS (like HSDK) allow to select bus
> port (IOC or non-IOC port) for connecting DMA devices in runtime.

Again you mention the port but none of your 4 patches actually touch the port
itself in programming it.


> With this patch we can use both HW-coherent and regular DMA
> peripherals simultaneously.
>
> NOTE:
> This patch series was stress tested on HSDK with iperf3 (ethernet)
> and bonie++ (usb and sdio) in three configurations:
>  * IOC enabled globaly
>  * IOC disabled globaly
>  * IOC enabled partially (USB & SDIO are connected via IOC AXI port,
>    ethernet is connected to DDR AXI port (non-IOC port) 
>
> NOTE:
> If you want to test some device without IOC it is not enough
> to remove "dma-coherent" property from dts. You had to remap this
> device to regular DDR AXI port intead of IOC AXI port.

Why are we not doing that for the GPU as part of this series.

> You also need to apply 3 following patches firstly:
> https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03865.html
> https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03889.html
> https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03887.html
>
> NOTE:
> We don't touch any aperture configuration in this patch series. So we
> don't switch any devices between IOC and non-IOC AXI ports on any board.
> It can be done later if it is required.
>
> Changes v1->v2 (Thanks to Christoph):
>  * Don't select DMA_DIRECT_OPS explicitly as it is already selected by
>    DMA_NONCOHERENT_OPS
>  * Remove check for HIGHMEM pages from arch_dma_{alloc, free}
>
> Eugeniy Paltsev (4):
>   ARC: DTS: mark DMA devices connected through IOC port as dma-coherent
>   ARC: allow to use IOC and non-IOC DMA devices simultaneously
>   ARC: IOC: panic if both IOC and ZONE_HIGHMEM enabled
>   ARC: don't check for HIGHMEM pages in arch_dma_alloc
>
>  arch/arc/boot/dts/axc003.dtsi      | 26 ++++++++++++++++
>  arch/arc/boot/dts/axc003_idu.dtsi  | 26 ++++++++++++++++
>  arch/arc/boot/dts/hsdk.dts         |  4 +++
>  arch/arc/include/asm/dma-mapping.h | 13 ++++++++
>  arch/arc/mm/cache.c                | 30 +++++++++---------
>  arch/arc/mm/dma.c                  | 62 +++++++++++++++++++-------------------
>  6 files changed, 115 insertions(+), 46 deletions(-)
>  create mode 100644 arch/arc/include/asm/dma-mapping.h
>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-07-30 16:26 ` [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
@ 2018-08-13 16:24   ` Vineet Gupta
  2018-08-13 17:08     ` Eugeniy Paltsev
  0 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2018-08-13 16:24 UTC (permalink / raw)
  To: Eugeniy Paltsev, linux-snps-arc
  Cc: linux-kernel, linux-arch, Alexey Brodkin, hch

On 07/30/2018 09:26 AM, Eugeniy Paltsev wrote:
> @@ -1263,11 +1254,7 @@ void __init arc_cache_init_master(void)
>  	if (is_isa_arcv2() && ioc_enable)
>  		arc_ioc_setup();
>  
> -	if (is_isa_arcv2() && ioc_enable) {
> -		__dma_cache_wback_inv = __dma_cache_wback_inv_ioc;
> -		__dma_cache_inv = __dma_cache_inv_ioc;
> -		__dma_cache_wback = __dma_cache_wback_ioc;
> -	} else if (is_isa_arcv2() && l2_line_sz && slc_enable) {
> +	if (is_isa_arcv2() && l2_line_sz && slc_enable) {
>  		__dma_cache_wback_inv = __dma_cache_wback_inv_slc;
>  		__dma_cache_inv = __dma_cache_inv_slc;
>  		__dma_cache_wback = __dma_cache_wback_slc;
> diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
> index cefb776a99ff..4d1466905e48 100644
> --- a/arch/arc/mm/dma.c
> +++ b/arch/arc/mm/dma.c
> @@ -33,19 +33,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>  	if (!page)
>  		return NULL;
>  
> -	/*
> -	 * IOC relies on all data (even coherent DMA data) being in cache
> -	 * Thus allocate normal cached memory
> -	 *
> -	 * The gains with IOC are two pronged:
> -	 *   -For streaming data, elides need for cache maintenance, saving
> -	 *    cycles in flush code, and bus bandwidth as all the lines of a
> -	 *    buffer need to be flushed out to memory
> -	 *   -For coherent data, Read/Write to buffers terminate early in cache
> -	 *   (vs. always going to memory - thus are faster)
> -	 */
> -	if ((is_isa_arcv2() && ioc_enable) ||
> -	    (attrs & DMA_ATTR_NON_CONSISTENT))
> +	if (attrs & DMA_ATTR_NON_CONSISTENT)
>  		need_coh = 0;
>  
>  	/*
> @@ -95,8 +83,7 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
>  	struct page *page = virt_to_page(paddr);
>  	int is_non_coh = 1;
>  
> -	is_non_coh = (attrs & DMA_ATTR_NON_CONSISTENT) ||
> -			(is_isa_arcv2() && ioc_enable);
> +	is_non_coh = (attrs & DMA_ATTR_NON_CONSISTENT);
>  
>  	if (PageHighMem(page) || !is_non_coh)
>  		iounmap((void __force __iomem *)vaddr);
> @@ -182,3 +169,20 @@ void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
>  		break;
>  	}
>  }

I think we have some shenanigans with @ioc_enable now.
Do note that it was more of a debug hack using when the hw feature was introduced
to be able to run same kernel on various FPGA bitfiles but just flicking a global
variable via the debugger.

So per code below, if @ioc_enable is NOT set, we still use software assisted cache
maintenance, but dma_{alloc,free} don't do use that variable. Have you tried
testing the combination when @ioc_enable is set to 0 before boot ? And is that works ?

> +
> +void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> +			const struct iommu_ops *iommu, bool coherent)
> +{
> +	/*
> +	 * IOC hardware snoops all DMA traffic keeping the caches consistent
> +	 * with memory - eliding need for any explicit cache maintenance of
> +	 * DMA buffers - so we can use dma_direct cache ops.
> +	 */
> +	if (is_isa_arcv2() && ioc_enable && coherent) {
> +		set_dma_ops(dev, &dma_direct_ops);
> +		dev_info(dev, "use dma_direct_ops cache ops\n");
> +	} else {
> +		set_dma_ops(dev, &dma_noncoherent_ops);
> +		dev_info(dev, "use dma_noncoherent_ops cache ops\n");
> +	}
> +}


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-08-13 16:24   ` Vineet Gupta
@ 2018-08-13 17:08     ` Eugeniy Paltsev
  2018-08-20 22:34       ` Vineet Gupta
  0 siblings, 1 reply; 19+ messages in thread
From: Eugeniy Paltsev @ 2018-08-13 17:08 UTC (permalink / raw)
  To: Eugeniy.Paltsev, Vineet Gupta, linux-snps-arc
  Cc: hch, linux-kernel, linux-arch, Alexey Brodkin

On Mon, 2018-08-13 at 16:24 +0000, Vineet Gupta wrote:
> On 07/30/2018 09:26 AM, Eugeniy Paltsev wrote:
> > @@ -1263,11 +1254,7 @@ void __init arc_cache_init_master(void)
> >  	if (is_isa_arcv2() && ioc_enable)
> >  		arc_ioc_setup();
> >  
> > -	if (is_isa_arcv2() && ioc_enable) {
> > -		__dma_cache_wback_inv = __dma_cache_wback_inv_ioc;
> > -		__dma_cache_inv = __dma_cache_inv_ioc;
> > -		__dma_cache_wback = __dma_cache_wback_ioc;
> > -	} else if (is_isa_arcv2() && l2_line_sz && slc_enable) {
> > +	if (is_isa_arcv2() && l2_line_sz && slc_enable) {
> >  		__dma_cache_wback_inv = __dma_cache_wback_inv_slc;
> >  		__dma_cache_inv = __dma_cache_inv_slc;
> >  		__dma_cache_wback = __dma_cache_wback_slc;
> > diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
> > index cefb776a99ff..4d1466905e48 100644
> > --- a/arch/arc/mm/dma.c
> > +++ b/arch/arc/mm/dma.c
> > @@ -33,19 +33,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> >  	if (!page)
> >  		return NULL;
> >  
> > -	/*
> > -	 * IOC relies on all data (even coherent DMA data) being in cache
> > -	 * Thus allocate normal cached memory
> > -	 *
> > -	 * The gains with IOC are two pronged:
> > -	 *   -For streaming data, elides need for cache maintenance, saving
> > -	 *    cycles in flush code, and bus bandwidth as all the lines of a
> > -	 *    buffer need to be flushed out to memory
> > -	 *   -For coherent data, Read/Write to buffers terminate early in cache
> > -	 *   (vs. always going to memory - thus are faster)
> > -	 */
> > -	if ((is_isa_arcv2() && ioc_enable) ||
> > -	    (attrs & DMA_ATTR_NON_CONSISTENT))
> > +	if (attrs & DMA_ATTR_NON_CONSISTENT)
> >  		need_coh = 0;
> >  
> >  	/*
> > @@ -95,8 +83,7 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
> >  	struct page *page = virt_to_page(paddr);
> >  	int is_non_coh = 1;
> >  
> > -	is_non_coh = (attrs & DMA_ATTR_NON_CONSISTENT) ||
> > -			(is_isa_arcv2() && ioc_enable);
> > +	is_non_coh = (attrs & DMA_ATTR_NON_CONSISTENT);
> >  
> >  	if (PageHighMem(page) || !is_non_coh)
> >  		iounmap((void __force __iomem *)vaddr);
> > @@ -182,3 +169,20 @@ void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
> >  		break;
> >  	}
> >  }
> 
> I think we have some shenanigans with @ioc_enable now.
> Do note that it was more of a debug hack using when the hw feature was introduced
> to be able to run same kernel on various FPGA bitfiles but just flicking a global
> variable via the debugger.
> 
> So per code below, if @ioc_enable is NOT set, we still use software assisted cache
> maintenance, but dma_{alloc,free} don't do use that variable. Have you tried
> testing the combination when @ioc_enable is set to 0 before boot ? And is that works ?

Yep, I tested that.
And it works fine with both @ioc_enable == 0 and @ioc_enable == 1
Note that we check this variable in arch_setup_dma_ops() function now.

So this arch_dma_{alloc,free} are used ONLY in case of software assisted cache maintenance.
That's why we had to do MMU mapping to enforce non-cachability regardless of @ioc_enable.


Previously [before this patch] we used this ops for both HW/SW assisted cache maintenance
that's why we checked @ioc_enable in arch_dma_{alloc,free}. (in case of HW assisted cache
maintenance we only allocate memory, and in case of SW assisted cache maintenance we 
allocate memory and do MMU mapping to enforce non-cachability)

> > +
> > +void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> > +			const struct iommu_ops *iommu, bool coherent)
> > +{
> > +	/*
> > +	 * IOC hardware snoops all DMA traffic keeping the caches consistent
> > +	 * with memory - eliding need for any explicit cache maintenance of
> > +	 * DMA buffers - so we can use dma_direct cache ops.
> > +	 */
> > +	if (is_isa_arcv2() && ioc_enable && coherent) {
> > +		set_dma_ops(dev, &dma_direct_ops);
> > +		dev_info(dev, "use dma_direct_ops cache ops\n");
> > +	} else {
> > +		set_dma_ops(dev, &dma_noncoherent_ops);
> > +		dev_info(dev, "use dma_noncoherent_ops cache ops\n");
> > +	}
> > +}
> 
> 
-- 
 Eugeniy Paltsev

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-08-13 16:19 ` [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Vineet Gupta
@ 2018-08-13 17:27   ` Eugeniy Paltsev
  2018-08-13 17:39     ` Vineet Gupta
  0 siblings, 1 reply; 19+ messages in thread
From: Eugeniy Paltsev @ 2018-08-13 17:27 UTC (permalink / raw)
  To: Eugeniy.Paltsev, Vineet Gupta, linux-snps-arc
  Cc: hch, linux-kernel, linux-arch, Alexey Brodkin

On Mon, 2018-08-13 at 16:19 +0000, Vineet Gupta wrote:
> On 07/30/2018 09:26 AM, Eugeniy Paltsev wrote:
> > The ARC HS processor provides an IOC port (I/O coherency bus
> > interface) that allows external devices such as DMA devices
> > to access memory through the cache hierarchy, providing
> > coherency between I/O transactions and the complete memory
> > hierarchy.
> 
> You didn't pay attention to my previous comment on this !
> IOC port can be considered a micro-architecture optimization (an important one
> though). The main thing is hardware snooping DMA transactions which enabled IOC in
> first place.

Ok, I'll rewrite this commit message.

> > Some recent SoC with ARC HS (like HSDK) allow to select bus
> > port (IOC or non-IOC port) for connecting DMA devices in runtime.
> 
> Again you mention the port but none of your 4 patches actually touch the port
> itself in programming it.

Ok.

> 
> > With this patch we can use both HW-coherent and regular DMA
> > peripherals simultaneously.
> > 
> > NOTE:
> > This patch series was stress tested on HSDK with iperf3 (ethernet)
> > and bonie++ (usb and sdio) in three configurations:
> >  * IOC enabled globaly
> >  * IOC disabled globaly
> >  * IOC enabled partially (USB & SDIO are connected via IOC AXI port,
> >    ethernet is connected to DDR AXI port (non-IOC port) 
> > 
> > NOTE:
> > If you want to test some device without IOC it is not enough
> > to remove "dma-coherent" property from dts. You had to remap this
> > device to regular DDR AXI port intead of IOC AXI port.
> 
> Why are we not doing that for the GPU as part of this series.

We still need to upstream couple of minor changes in Vivante driver before enable
it for HSDK.

> 
> > You also need to apply 3 following patches firstly:
> > https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03865.html
> > https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03889.html
> > https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03887.html
> > 
> > NOTE:
> > We don't touch any aperture configuration in this patch series. So we
> > don't switch any devices between IOC and non-IOC AXI ports on any board.
> > It can be done later if it is required.
> > 
> > Changes v1->v2 (Thanks to Christoph):
> >  * Don't select DMA_DIRECT_OPS explicitly as it is already selected by
> >    DMA_NONCOHERENT_OPS
> >  * Remove check for HIGHMEM pages from arch_dma_{alloc, free}
> > 
> > Eugeniy Paltsev (4):
> >   ARC: DTS: mark DMA devices connected through IOC port as dma-coherent
> >   ARC: allow to use IOC and non-IOC DMA devices simultaneously
> >   ARC: IOC: panic if both IOC and ZONE_HIGHMEM enabled
> >   ARC: don't check for HIGHMEM pages in arch_dma_alloc
> > 
> >  arch/arc/boot/dts/axc003.dtsi      | 26 ++++++++++++++++
> >  arch/arc/boot/dts/axc003_idu.dtsi  | 26 ++++++++++++++++
> >  arch/arc/boot/dts/hsdk.dts         |  4 +++
> >  arch/arc/include/asm/dma-mapping.h | 13 ++++++++
> >  arch/arc/mm/cache.c                | 30 +++++++++---------
> >  arch/arc/mm/dma.c                  | 62 +++++++++++++++++++-------------------
> >  6 files changed, 115 insertions(+), 46 deletions(-)
> >  create mode 100644 arch/arc/include/asm/dma-mapping.h
> > 
> 
> 
-- 
 Eugeniy Paltsev

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-08-13 17:27   ` Eugeniy Paltsev
@ 2018-08-13 17:39     ` Vineet Gupta
  0 siblings, 0 replies; 19+ messages in thread
From: Vineet Gupta @ 2018-08-13 17:39 UTC (permalink / raw)
  To: Eugeniy Paltsev, Eugeniy.Paltsev, linux-snps-arc
  Cc: hch, linux-kernel, linux-arch, Alexey Brodkin

On 08/13/2018 10:27 AM, Eugeniy Paltsev wrote:
>> You didn't pay attention to my previous comment on this !
>> IOC port can be considered a micro-architecture optimization (an important one
>> though). The main thing is hardware snooping DMA transactions which enabled IOC in
>> first place.
> Ok, I'll rewrite this commit message.

No need since this is a cover letter and will get dropped anyways, but the reason
I emphasize it is that it sets the wrong expectations for someone down the line
googling for this stuff ! And that someone could be other or just us when we've
long forgotten about this ;-)

>
>>> Some recent SoC with ARC HS (like HSDK) allow to select bus
>>> port (IOC or non-IOC port) for connecting DMA devices in runtime.
>> Again you mention the port but none of your 4 patches actually touch the port
>> itself in programming it.
> Ok.
>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-08-13 17:08     ` Eugeniy Paltsev
@ 2018-08-20 22:34       ` Vineet Gupta
  2018-08-22 18:40         ` Eugeniy Paltsev
  0 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2018-08-20 22:34 UTC (permalink / raw)
  To: Eugeniy Paltsev, linux-snps-arc
  Cc: hch, linux-kernel, linux-arch, Alexey Brodkin

On 08/13/2018 10:08 AM, Eugeniy Paltsev wrote:
> On Mon, 2018-08-13 at 16:24 +0000, Vineet Gupta wrote:
>> On 07/30/2018 09:26 AM, Eugeniy Paltsev wrote:
>>> @@ -1263,11 +1254,7 @@ void __init arc_cache_init_master(void)
>>>  	if (is_isa_arcv2() && ioc_enable)
>>>  		arc_ioc_setup();
>>>  
>>> -	if (is_isa_arcv2() && ioc_enable) {
>>> -		__dma_cache_wback_inv = __dma_cache_wback_inv_ioc;
>>> -		__dma_cache_inv = __dma_cache_inv_ioc;
>>> -		__dma_cache_wback = __dma_cache_wback_ioc;
>>> -	} else if (is_isa_arcv2() && l2_line_sz && slc_enable) {

For the casual reader I'd add a comment why this was deleted.

>>> +	if (is_isa_arcv2() && l2_line_sz && slc_enable) {
>>>  		__dma_cache_wback_inv = __dma_cache_wback_inv_slc;
>>>  		__dma_cache_inv = __dma_cache_inv_slc;
>>>  		__dma_cache_wback = __dma_cache_wback_slc;

[snip]

>>>  
>>> -	/*
>>> -	 * IOC relies on all data (even coherent DMA data) being in cache
>>> -	 * Thus allocate normal cached memory
>>> -	 *
>>> -	 * The gains with IOC are two pronged:
>>> -	 *   -For streaming data, elides need for cache maintenance, saving
>>> -	 *    cycles in flush code, and bus bandwidth as all the lines of a
>>> -	 *    buffer need to be flushed out to memory
>>> -	 *   -For coherent data, Read/Write to buffers terminate early in cache
>>> -	 *   (vs. always going to memory - thus are faster)
>>> -	 */
>>> -	if ((is_isa_arcv2() && ioc_enable) ||
>>> -	    (attrs & DMA_ATTR_NON_CONSISTENT))
>>> +	if (attrs & DMA_ATTR_NON_CONSISTENT)
>>>  		need_coh = 0;
>>>  

[snip]

> Yep, I tested that.
> And it works fine with both @ioc_enable == 0 and @ioc_enable == 1
> Note that we check this variable in arch_setup_dma_ops() function now.
> 
> So this arch_dma_{alloc,free} are used ONLY in case of software assisted cache maintenance.
> That's why we had to do MMU mapping to enforce non-cachability regardless of @ioc_enable.

Reading kernel/dma/* I see what you mean. We check @ioc_enable at the time of
registering the dma op for coherent vs. non coherent case, so there's common vs.
ARC versions of alloc/free for coherent vs. noncoherent. But then I'm curious why
do we bother to check the following in new arch_dma_(alloc|free) at all.

	if (attrs & DMA_ATTR_NON_CONSISTENT)

Isn't it supposed to be NON_CONSISTENT always given the way new code works ?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-08-20 22:34       ` Vineet Gupta
@ 2018-08-22 18:40         ` Eugeniy Paltsev
  2018-08-23 14:05           ` hch
  2018-09-04 18:13           ` Vineet Gupta
  0 siblings, 2 replies; 19+ messages in thread
From: Eugeniy Paltsev @ 2018-08-22 18:40 UTC (permalink / raw)
  To: Vineet Gupta, linux-snps-arc
  Cc: hch, linux-kernel, linux-arch, Alexey.Brodkin

Hi Vineet,

On Mon, 2018-08-20 at 15:34 -0700, Vineet Gupta wrote:
> On 08/13/2018 10:08 AM, Eugeniy Paltsev wrote:
> > On Mon, 2018-08-13 at 16:24 +0000, Vineet Gupta wrote:
> > > On 07/30/2018 09:26 AM, Eugeniy Paltsev wrote:
> > > > @@ -1263,11 +1254,7 @@ void __init arc_cache_init_master(void)
> > > >  	if (is_isa_arcv2() && ioc_enable)
> > > >  		arc_ioc_setup();
> > > >  
> > > > -	if (is_isa_arcv2() && ioc_enable) {
> > > > -		__dma_cache_wback_inv = __dma_cache_wback_inv_ioc;
> > > > -		__dma_cache_inv = __dma_cache_inv_ioc;
> > > > -		__dma_cache_wback = __dma_cache_wback_ioc;
> > > > -	} else if (is_isa_arcv2() && l2_line_sz && slc_enable) {
> 
> For the casual reader I'd add a comment why this was deleted.
Do you mean comment in commit description or comment right in sources?

> 
> > > > +	if (is_isa_arcv2() && l2_line_sz && slc_enable) {
> > > >  		__dma_cache_wback_inv = __dma_cache_wback_inv_slc;
> > > >  		__dma_cache_inv = __dma_cache_inv_slc;
> > > >  		__dma_cache_wback = __dma_cache_wback_slc;
> 
> [snip]
> 
> > > >  
> > > > -	/*
> > > > -	 * IOC relies on all data (even coherent DMA data) being in cache
> > > > -	 * Thus allocate normal cached memory
> > > > -	 *
> > > > -	 * The gains with IOC are two pronged:
> > > > -	 *   -For streaming data, elides need for cache maintenance, saving
> > > > -	 *    cycles in flush code, and bus bandwidth as all the lines of a
> > > > -	 *    buffer need to be flushed out to memory
> > > > -	 *   -For coherent data, Read/Write to buffers terminate early in cache
> > > > -	 *   (vs. always going to memory - thus are faster)
> > > > -	 */
> > > > -	if ((is_isa_arcv2() && ioc_enable) ||
> > > > -	    (attrs & DMA_ATTR_NON_CONSISTENT))
> > > > +	if (attrs & DMA_ATTR_NON_CONSISTENT)
> > > >  		need_coh = 0;
> > > >  
> 
> [snip]
> 
> > Yep, I tested that.
> > And it works fine with both @ioc_enable == 0 and @ioc_enable == 1
> > Note that we check this variable in arch_setup_dma_ops() function now.
> > 
> > So this arch_dma_{alloc,free} are used ONLY in case of software assisted cache maintenance.
> > That's why we had to do MMU mapping to enforce non-cachability regardless of @ioc_enable.
> 
> Reading kernel/dma/* I see what you mean. We check @ioc_enable at the time of
> registering the dma op for coherent vs. non coherent case, so there's common vs.
> ARC versions of alloc/free for coherent vs. noncoherent.

Just to be sure that we understand both each other and source code correctly: 
- In coherent case we use dma_direct_* ops which doesn't use ARC alloc functions (arch_dma_{alloc|free})
- In non-coherent case we use dma_noncoherent_* ops which uses ARC alloc functions (arch_dma_{alloc|free})

> But then I'm curious why
> do we bother to check the following in new arch_dma_(alloc|free) at all.
> 
> 	if (attrs & DMA_ATTR_NON_CONSISTENT)
> 
> Isn't it supposed to be NON_CONSISTENT always given the way new code works ?

DMA_ATTR_NON_CONSISTENT flag is not related to IOC topic.
It is a flag which we can pass to dma_{alloc|free}_attrs function from driver side.

According to documentation:
  DMA_ATTR_NON_CONSISTENT: Lets the platform to choose to return either
  consistent or non-consistent memory as it sees fit.

We check this flag in arch_dma_alloc (which are used in non-coherent case) to
skip MMU mapping if we are advertised that consistency is not required.

So, actually we can get rid of this flag checking in arch_dma_alloc and 
simply always do MMU mapping to enforce non-cachability and return
non-cacheable memory even if DMA_ATTR_NON_CONSISTENT is passed.
But I don't sure we want to do that.

BTW: dma_alloc_coherent is simply dma_alloc_attrs with attrs == 0.

-- 
 Eugeniy Paltsev

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-08-22 18:40         ` Eugeniy Paltsev
@ 2018-08-23 14:05           ` hch
  2018-09-04 18:13           ` Vineet Gupta
  1 sibling, 0 replies; 19+ messages in thread
From: hch @ 2018-08-23 14:05 UTC (permalink / raw)
  To: Eugeniy Paltsev
  Cc: Vineet Gupta, linux-snps-arc, hch, linux-kernel, linux-arch,
	Alexey.Brodkin

Btw, given that I assume this is 4.20 material now, any chance we
could merge it through the dma-mapping tree?  I have some major changes
pending that would clash if done in a different tree, so I'd rather
get it all together.

> We check this flag in arch_dma_alloc (which are used in non-coherent case) to
> skip MMU mapping if we are advertised that consistency is not required.
> 
> So, actually we can get rid of this flag checking in arch_dma_alloc and 
> simply always do MMU mapping to enforce non-cachability and return
> non-cacheable memory even if DMA_ATTR_NON_CONSISTENT is passed.
> But I don't sure we want to do that.

I plan to kill this flag for 4.20 (or 4.20 at latest) in favor
of a better interface.  But your implementation looks ok, so I'm
fine with keeping it for now.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-08-22 18:40         ` Eugeniy Paltsev
  2018-08-23 14:05           ` hch
@ 2018-09-04 18:13           ` Vineet Gupta
  1 sibling, 0 replies; 19+ messages in thread
From: Vineet Gupta @ 2018-09-04 18:13 UTC (permalink / raw)
  To: Eugeniy Paltsev, linux-snps-arc
  Cc: hch, linux-kernel, linux-arch, Alexey.Brodkin

Hi,

On 08/22/2018 11:40 AM, Eugeniy Paltsev wrote:
>
>> Reading kernel/dma/* I see what you mean. We check @ioc_enable at the time of
>> registering the dma op for coherent vs. non coherent case, so there's common vs.
>> ARC versions of alloc/free for coherent vs. noncoherent.
> Just to be sure that we understand both each other and source code correctly: 
> - In coherent case we use dma_direct_* ops which doesn't use ARC alloc functions (arch_dma_{alloc|free})
> - In non-coherent case we use dma_noncoherent_* ops which uses ARC alloc functions (arch_dma_{alloc|free})

Right I see that.

>> But then I'm curious why
>> do we bother to check the following in new arch_dma_(alloc|free) at all.
>>
>> 	if (attrs & DMA_ATTR_NON_CONSISTENT)
>>
>> Isn't it supposed to be NON_CONSISTENT always given the way new code works ?
> DMA_ATTR_NON_CONSISTENT flag is not related to IOC topic.
> It is a flag which we can pass to dma_{alloc|free}_attrs function from driver side.
>
> According to documentation:
>   DMA_ATTR_NON_CONSISTENT: Lets the platform to choose to return either
>   consistent or non-consistent memory as it sees fit.

Right I'd them mixed up. But then in case of direct dma ops, the attr is simply
ignored in dma_alloc_attrs() -> dma_direct_alloc(). User always gets coherent memory.

>
> We check this flag in arch_dma_alloc (which are used in non-coherent case) to
> skip MMU mapping if we are advertised that consistency is not required.
>
> So, actually we can get rid of this flag checking in arch_dma_alloc and 
> simply always do MMU mapping to enforce non-cachability and return
> non-cacheable memory even if DMA_ATTR_NON_CONSISTENT is passed.
> But I don't sure we want to do that.
>
> BTW: dma_alloc_coherent is simply dma_alloc_attrs with attrs == 0.
>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-07-30 16:26 [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
                   ` (4 preceding siblings ...)
  2018-08-13 16:19 ` [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Vineet Gupta
@ 2018-09-04 20:14 ` Vineet Gupta
  2018-09-04 21:11   ` Christoph Hellwig
  5 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2018-09-04 20:14 UTC (permalink / raw)
  To: Eugeniy Paltsev, linux-snps-arc
  Cc: linux-kernel, linux-arch, Alexey Brodkin, hch

On 07/30/2018 09:26 AM, Eugeniy Paltsev wrote:
> The ARC HS processor provides an IOC port (I/O coherency bus
> interface) that allows external devices such as DMA devices
> to access memory through the cache hierarchy, providing
> coherency between I/O transactions and the complete memory
> hierarchy.
> 
> Some recent SoC with ARC HS (like HSDK) allow to select bus
> port (IOC or non-IOC port) for connecting DMA devices in runtime.
> 
> With this patch we can use both HW-coherent and regular DMA
> peripherals simultaneously.
> 
> NOTE:
> This patch series was stress tested on HSDK with iperf3 (ethernet)
> and bonie++ (usb and sdio) in three configurations:
>  * IOC enabled globaly
>  * IOC disabled globaly
>  * IOC enabled partially (USB & SDIO are connected via IOC AXI port,
>    ethernet is connected to DDR AXI port (non-IOC port) 
> 
> NOTE:
> If you want to test some device without IOC it is not enough
> to remove "dma-coherent" property from dts. You had to remap this
> device to regular DDR AXI port intead of IOC AXI port.
> You also need to apply 3 following patches firstly:
> https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03865.html
> https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03889.html
> https://www.mail-archive.com/linux-snps-arc@lists.infradead.org/msg03887.html
> 
> NOTE:
> We don't touch any aperture configuration in this patch series. So we
> don't switch any devices between IOC and non-IOC AXI ports on any board.
> It can be done later if it is required.
> 
> Changes v1->v2 (Thanks to Christoph):
>  * Don't select DMA_DIRECT_OPS explicitly as it is already selected by
>    DMA_NONCOHERENT_OPS
>  * Remove check for HIGHMEM pages from arch_dma_{alloc, free}
> 
> Eugeniy Paltsev (4):
>   ARC: DTS: mark DMA devices connected through IOC port as dma-coherent
>   ARC: allow to use IOC and non-IOC DMA devices simultaneously
>   ARC: IOC: panic if both IOC and ZONE_HIGHMEM enabled
>   ARC: don't check for HIGHMEM pages in arch_dma_alloc
> 
>  arch/arc/boot/dts/axc003.dtsi      | 26 ++++++++++++++++
>  arch/arc/boot/dts/axc003_idu.dtsi  | 26 ++++++++++++++++
>  arch/arc/boot/dts/hsdk.dts         |  4 +++
>  arch/arc/include/asm/dma-mapping.h | 13 ++++++++
>  arch/arc/mm/cache.c                | 30 +++++++++---------
>  arch/arc/mm/dma.c                  | 62 +++++++++++++++++++-------------------
>  6 files changed, 115 insertions(+), 46 deletions(-)
>  create mode 100644 arch/arc/include/asm/dma-mapping.h

Apologies for the delay in getting to this - series applied and pushed to for-curr

Thx,
-Vineet



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-09-04 20:14 ` Vineet Gupta
@ 2018-09-04 21:11   ` Christoph Hellwig
  2018-09-04 21:34     ` Vineet Gupta
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2018-09-04 21:11 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Eugeniy Paltsev, linux-snps-arc, linux-kernel, linux-arch,
	Alexey Brodkin, hch

On Tue, Sep 04, 2018 at 01:14:49PM -0700, Vineet Gupta wrote:
> Apologies for the delay in getting to this - series applied and pushed to for-curr

This is going to create really annoying merge conflicts with
work pending for the dma-mapping tree.  That's why I requested earlier
to merge it either via the dma-mapping tree or some shared branch.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-09-04 21:11   ` Christoph Hellwig
@ 2018-09-04 21:34     ` Vineet Gupta
  2018-09-04 21:38       ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: Vineet Gupta @ 2018-09-04 21:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Eugeniy Paltsev, linux-snps-arc, linux-kernel, linux-arch,
	Alexey Brodkin

On 09/04/2018 02:07 PM, Christoph Hellwig wrote:
> On Tue, Sep 04, 2018 at 01:14:49PM -0700, Vineet Gupta wrote:
>> Apologies for the delay in getting to this - series applied and pushed to for-curr
> This is going to create really annoying merge conflicts with
> work pending for the dma-mapping tree.  That's why I requested earlier
> to merge it either via the dma-mapping tree or some shared branch.

Sorry I missed that request of yours on top of the last msg. I would really want
this to get into 4.19 (understand that merge window is done etc) given that
there's a bunch of other changes lined up behind this one. I was just sucked up
into "other" stuff and could not get to it sooner. Please bear with me this once -
hopefully it would be smoother going fwd.

Thx,
-Vineet



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously
  2018-09-04 21:34     ` Vineet Gupta
@ 2018-09-04 21:38       ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2018-09-04 21:38 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Christoph Hellwig, Eugeniy Paltsev, linux-snps-arc, linux-kernel,
	linux-arch, Alexey Brodkin

On Tue, Sep 04, 2018 at 09:34:43PM +0000, Vineet Gupta wrote:
> Sorry I missed that request of yours on top of the last msg. I would really want
> this to get into 4.19 (understand that merge window is done etc) given that
> there's a bunch of other changes lined up behind this one. I was just sucked up
> into "other" stuff and could not get to it sooner. Please bear with me this once -
> hopefully it would be smoother going fwd.

Oh, if you manage to get it into 4.19 and do so quickly that is fine
as well.  I just don't want it in your 4.20 queue creating conflicts.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-09-04 21:35 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-30 16:26 [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
2018-07-30 16:26 ` [PATCH v2 1/4] ARC: DTS: mark DMA devices connected through IOC port as dma-coherent Eugeniy Paltsev
2018-07-30 16:26 ` [PATCH v2 2/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Eugeniy Paltsev
2018-08-13 16:24   ` Vineet Gupta
2018-08-13 17:08     ` Eugeniy Paltsev
2018-08-20 22:34       ` Vineet Gupta
2018-08-22 18:40         ` Eugeniy Paltsev
2018-08-23 14:05           ` hch
2018-09-04 18:13           ` Vineet Gupta
2018-07-30 16:26 ` [PATCH v2 3/4] ARC: IOC: panic if both IOC and ZONE_HIGHMEM enabled Eugeniy Paltsev
2018-07-30 16:26 ` [PATCH v2 4/4] ARC: don't check for HIGHMEM pages in arch_dma_alloc Eugeniy Paltsev
2018-07-31  8:08   ` Christoph Hellwig
2018-08-13 16:19 ` [PATCH v2 0/4] ARC: allow to use IOC and non-IOC DMA devices simultaneously Vineet Gupta
2018-08-13 17:27   ` Eugeniy Paltsev
2018-08-13 17:39     ` Vineet Gupta
2018-09-04 20:14 ` Vineet Gupta
2018-09-04 21:11   ` Christoph Hellwig
2018-09-04 21:34     ` Vineet Gupta
2018-09-04 21:38       ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).