* [(subset) PATCH v2 0/3] riscv: dma-mapping: unify support for cache flushes
@ 2023-08-14 20:28 Prabhakar
From: Prabhakar @ 2023-08-14 20:28 UTC
  To: Arnd Bergmann, Christoph Hellwig, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Conor Dooley, Anup Patel, Andrew Jones, Jisheng Zhang,
	linux-kernel
  Cc: Geert Uytterhoeven, Samuel Holland, linux-riscv,
	linux-renesas-soc, Lad Prabhakar

From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

This patch series is a subset of Arnd's original series [0]. I've just
picked up the bits required for the RISC-V unification of cache flushing.
The remaining patches from the series [0] will be taken care of by Arnd
soon.

Cheers,
Prabhakar

v1->v2
* Dropped other archs
* Included RB and ACKs
* Fixed checkpatch issues

V1:
[0] https://patchwork.kernel.org/project/linux-riscv/cover/20230327121317.4081816-1-arnd@kernel.org/

Arnd Bergmann (3):
  riscv: dma-mapping: only invalidate after DMA, not flush
  riscv: dma-mapping: skip invalidation before bidirectional DMA
  riscv: dma-mapping: replace custom code with generic implementation

 arch/riscv/mm/dma-noncoherent.c |  50 +++++++-------
 include/linux/dma-sync.h        | 113 ++++++++++++++++++++++++++++++++
 2 files changed, 136 insertions(+), 27 deletions(-)
 create mode 100644 include/linux/dma-sync.h

-- 
2.34.1

* [(subset) PATCH v2 1/3] riscv: dma-mapping: only invalidate after DMA, not flush
From: Prabhakar @ 2023-08-14 20:28 UTC
  To: Arnd Bergmann, Christoph Hellwig, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Conor Dooley, Anup Patel, Andrew Jones, Jisheng Zhang,
	linux-kernel
  Cc: Geert Uytterhoeven, Samuel Holland, linux-riscv,
	linux-renesas-soc, Lad Prabhakar, Palmer Dabbelt

From: Arnd Bergmann <arnd@arndb.de>

No other architecture intentionally writes back dirty cache lines into
a buffer that a device has just finished writing into. If the cache is
clean, this has no effect at all, but if a cacheline in the buffer has
actually been written by the CPU, there is a driver bug that is likely
made worse by overwriting that buffer.
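
As an illustration (not part of this patch, and assuming a streaming
mapping already set up with dma_map_single()), the driver-visible
effect on a typical receive path is:

  #include <linux/dma-mapping.h>

  /* sketch: hand a DMA_FROM_DEVICE buffer back to the CPU */
  static void rx_complete(struct device *dev, dma_addr_t handle,
                          void *buf, size_t size)
  {
          /*
           * With this patch, the FROM_DEVICE case only invalidates
           * stale cachelines instead of writing them back, so a line
           * dirtied by a buggy driver can no longer overwrite the
           * data the device just wrote into buf.
           */
          dma_sync_single_for_cpu(dev, handle, size, DMA_FROM_DEVICE);

          /* the CPU can now safely read the device-written data */
  }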

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
---
v1->v2
* Fixed typo drive->driver
* Included RB and ACKs
---
 arch/riscv/mm/dma-noncoherent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index d51a75864e53..94614cf61cdd 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -42,7 +42,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 		break;
 	case DMA_FROM_DEVICE:
 	case DMA_BIDIRECTIONAL:
-		ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+		ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
 		break;
 	default:
 		break;
-- 
2.34.1

* [(subset) PATCH v2 2/3] riscv: dma-mapping: skip invalidation before bidirectional DMA
From: Prabhakar @ 2023-08-14 20:28 UTC
  To: Arnd Bergmann, Christoph Hellwig, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Conor Dooley, Anup Patel, Andrew Jones, Jisheng Zhang,
	linux-kernel
  Cc: Geert Uytterhoeven, Samuel Holland, linux-riscv,
	linux-renesas-soc, Lad Prabhakar, Palmer Dabbelt, Guo Ren

From: Arnd Bergmann <arnd@arndb.de>

For a DMA_BIDIRECTIONAL transfer, the caches have to be cleaned
first to let the device see data written by the CPU, and invalidated
after the transfer to let the CPU see data written by the device.

riscv also invalidates the caches before the transfer, which does
not appear to serve any purpose.
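
For reference, a DMA_BIDIRECTIONAL round trip then needs exactly one
clean before and one invalidate after the transfer (sketch, not part
of this patch; assumes a mapping created with dma_map_single()):

  #include <linux/dma-mapping.h>

  static void bidir_transfer(struct device *dev, dma_addr_t handle,
                             size_t size)
  {
          /* clean: make CPU-written data visible to the device */
          dma_sync_single_for_device(dev, handle, size, DMA_BIDIRECTIONAL);

          /* ... the device reads and writes the buffer here ... */

          /* invalidate: make device-written data visible to the CPU */
          dma_sync_single_for_cpu(dev, handle, size, DMA_BIDIRECTIONAL);
  }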

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
---
v1->v2
* Included RB and ACKs
---
 arch/riscv/mm/dma-noncoherent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index 94614cf61cdd..fc6377a64c8d 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -25,7 +25,7 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
 		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
 		break;
 	case DMA_BIDIRECTIONAL:
-		ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
 		break;
 	default:
 		break;
-- 
2.34.1

* [(subset) PATCH v2 3/3] riscv: dma-mapping: replace custom code with generic implementation
From: Prabhakar @ 2023-08-14 20:28 UTC
  To: Arnd Bergmann, Christoph Hellwig, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Conor Dooley, Anup Patel, Andrew Jones, Jisheng Zhang,
	linux-kernel
  Cc: Geert Uytterhoeven, Samuel Holland, linux-riscv,
	linux-renesas-soc, Lad Prabhakar

From: Arnd Bergmann <arnd@arndb.de>

Now that all of these have consistent behavior, replace them with
a single shared implementation of arch_sync_dma_for_device() and
arch_sync_dma_for_cpu() and three parameters to pick how they should
operate:

 - If the CPU has speculative prefetching, then the cache
   has to be invalidated after a transfer from the device.
   On the rarer CPUs without prefetching, this can be skipped,
   with all cache management happening before the transfer.
   This flag can be detected at runtime, but is usually fixed
   per architecture.

 - Some architectures currently clean the caches before DMA
   from a device, while others invalidate them. There has not
   been a conclusion on whether we should change all architectures
   to use clean instead, so this adds an architecture-specific
   flag that we can change later on.

 - Architectures that select CONFIG_ARCH_DMA_MARK_DCACHE_CLEAN
   additionally mark whole pages as clean in their page flags
   after a transfer from the device, to avoid flushing them
   again later.

For the function naming, I picked 'wback' over 'clean', and 'wback_inv'
over 'flush', to avoid any ambiguity about what the helper functions are
supposed to do.
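
In riscv terms (see the diff below), the old ALT_CMO_OP() operations
map onto the new helpers as:

  clean -> arch_dma_cache_wback()          (writeback only)
  inval -> arch_dma_cache_inv()            (invalidate only)
  flush -> arch_dma_cache_wback_inv()      (writeback + invalidate)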

Moving the global functions into a header file is usually a bad idea
as it prevents the header from being included more than once, but it
helps keep the behavior as close as possible to the previous state,
including the possibility of inlining most of it into these functions
where that was done before. This also helps keep the global namespace
clean, by hiding the new arch_dma_cache{_wback,_inv,_wback_inv} helpers
from device drivers that might use them incorrectly.
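
Concretely, converting an architecture boils down to defining the three
cache helpers and the two policy hooks, then including the header. A
minimal sketch, mirroring the riscv diff below (the helper bodies are
arch-specific):

  static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
  {
          /* arch-specific writeback of [paddr, paddr + size) */
  }

  static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
  {
          /* arch-specific invalidate */
  }

  static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
  {
          /* arch-specific writeback + invalidate */
  }

  static inline bool arch_sync_dma_clean_before_fromdevice(void)
  {
          return true;    /* policy: clean (vs invalidate) before FROM_DEVICE */
  }

  static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
  {
          return true;    /* CPU prefetches speculatively */
  }

  #include <linux/dma-sync.h>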

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
[PL: Dropped other archs, updated commit message and fixed checkpatch issues]
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
---
v1->v2
* Updated commit message
* Fixed checkpatch issues
* Dropped other archs
---
 arch/riscv/mm/dma-noncoherent.c |  50 +++++++-------
 include/linux/dma-sync.h        | 113 ++++++++++++++++++++++++++++++++
 2 files changed, 136 insertions(+), 27 deletions(-)
 create mode 100644 include/linux/dma-sync.h

diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index fc6377a64c8d..b6a1e9cc9339 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -12,43 +12,39 @@
 
 static bool noncoherent_supported __ro_after_init;
 
-void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-			      enum dma_data_direction dir)
+static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	switch (dir) {
-	case DMA_TO_DEVICE:
-		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-		break;
-	case DMA_FROM_DEVICE:
-		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-		break;
-	case DMA_BIDIRECTIONAL:
-		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-		break;
-	default:
-		break;
-	}
+	ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
 }
 
-void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-			   enum dma_data_direction dir)
+static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	switch (dir) {
-	case DMA_TO_DEVICE:
-		break;
-	case DMA_FROM_DEVICE:
-	case DMA_BIDIRECTIONAL:
-		ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
-		break;
-	default:
-		break;
-	}
+	ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
 }
 
+static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
+{
+	void *vaddr = phys_to_virt(paddr);
+
+	ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+}
+
+static inline bool arch_sync_dma_clean_before_fromdevice(void)
+{
+	return true;
+}
+
+static inline bool arch_sync_dma_cpu_needs_post_dma_flush(void)
+{
+	return true;
+}
+
+#include <linux/dma-sync.h>
+
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
 	void *flush_addr = page_address(page);
diff --git a/include/linux/dma-sync.h b/include/linux/dma-sync.h
new file mode 100644
index 000000000000..be23e8dda2e2
--- /dev/null
+++ b/include/linux/dma-sync.h
@@ -0,0 +1,113 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Cache operations depending on function and direction argument, inspired by
+ * https://lore.kernel.org/lkml/20180518175004.GF17671@n2100.armlinux.org.uk
+ * "dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20]
+ * dma-mapping: provide a generic dma-noncoherent implementation)"
+ *
+ *          |   map          ==  for_device     |   unmap     ==  for_cpu
+ *          |----------------------------------------------------------------
+ * TO_DEV   |   writeback        writeback      |   none          none
+ * FROM_DEV |   invalidate       invalidate     |   invalidate*   invalidate*
+ * BIDIR    |   writeback        writeback      |   invalidate    invalidate
+ *
+ *     [*] needed for CPU speculative prefetches
+ *
+ * NOTE: we don't check the validity of direction argument as it is done in
+ * upper layer functions (in include/linux/dma-mapping.h)
+ *
+ * This file can be included by arch/.../kernel/dma-noncoherent.c to provide
+ * the respective high-level operations without having to expose the
+ * cache management ops to drivers.
+ */
+
+#ifndef __DMA_SYNC_H__
+#define __DMA_SYNC_H__
+
+void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
+			      enum dma_data_direction dir)
+{
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		/*
+		 * This may be an empty function on write-through caches,
+		 * and it might invalidate the cache if an architecture has
+		 * a write-back cache but no way to write it back without
+		 * invalidating
+		 */
+		arch_dma_cache_wback(paddr, size);
+		break;
+
+	case DMA_FROM_DEVICE:
+		/*
+		 * FIXME: this should be handled the same across all
+		 * architectures, see
+		 * https://lore.kernel.org/all/20220606152150.GA31568@willie-the-truck/
+		 */
+		if (!arch_sync_dma_clean_before_fromdevice()) {
+			arch_dma_cache_inv(paddr, size);
+			break;
+		}
+		fallthrough;
+
+	case DMA_BIDIRECTIONAL:
+		/* Skip the invalidate here if it's done later */
+		if (IS_ENABLED(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) &&
+		    arch_sync_dma_cpu_needs_post_dma_flush())
+			arch_dma_cache_wback(paddr, size);
+		else
+			arch_dma_cache_wback_inv(paddr, size);
+		break;
+
+	default:
+		break;
+	}
+}
+
+#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU
+/*
+ * Mark the D-cache clean for these pages to avoid extra flushing.
+ */
+static void arch_dma_mark_dcache_clean(phys_addr_t paddr, size_t size)
+{
+#ifdef CONFIG_ARCH_DMA_MARK_DCACHE_CLEAN
+	unsigned long pfn = PFN_UP(paddr);
+	unsigned long off = paddr & (PAGE_SIZE - 1);
+	size_t left = size;
+
+	if (off)
+		left -= PAGE_SIZE - off;
+
+	while (left >= PAGE_SIZE) {
+		struct page *page = pfn_to_page(pfn++);
+
+		set_bit(PG_dcache_clean, &page->flags);
+		left -= PAGE_SIZE;
+	}
+#endif
+}
+
+void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
+			   enum dma_data_direction dir)
+{
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		break;
+
+	case DMA_FROM_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		/* invalidate only needed for CPUs with speculative prefetching */
+		if (arch_sync_dma_cpu_needs_post_dma_flush())
+			arch_dma_cache_inv(paddr, size);
+
+		if (size > PAGE_SIZE)
+			arch_dma_mark_dcache_clean(paddr, size);
+		break;
+
+	default:
+		break;
+	}
+}
+#endif
+#endif /* __DMA_SYNC_H__ */
-- 
2.34.1