linux-renesas-soc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP
@ 2023-03-30 20:42 Prabhakar
  2023-03-30 20:42 ` [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management Prabhakar
                   ` (6 more replies)
  0 siblings, 7 replies; 40+ messages in thread
From: Prabhakar @ 2023-03-30 20:42 UTC (permalink / raw)
  To: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv
  Cc: Rob Herring, Krzysztof Kozlowski, devicetree, linux-kernel,
	linux-renesas-soc, Prabhakar, Biju Das, Lad Prabhakar

From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Hi All,

non-coherent DMA support for AX45MP
====================================

On the Andes AX45MP core, cache coherency is a specification option so it
may not be supported. In this case DMA will fail. To get around with this
issue this patch series does the below:

1] Andes alternative ports is implemented as errata which checks if the IOCP
is missing and only then applies to CMO errata. One vendor specific SBI EXT
(ANDES_SBI_EXT_IOCP_SW_WORKAROUND) is implemented as part of errata.

Below are the configs which Andes port provides (and are selected by RZ/Five):
      - ERRATA_ANDES
      - ERRATA_ANDES_CMO

OpenSBI patch supporting ANDES_SBI_EXT_IOCP_SW_WORKAROUND SBI can be found here,
https://patchwork.ozlabs.org/project/opensbi/patch/20230317140357.14819-1-prabhakar.mahadev-lad.rj@bp.renesas.com/

2] Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
block that allows dynamic adjustment of memory attributes in the runtime.
It contains a configurable amount of PMA entries implemented as CSR
registers to control the attributes of memory locations in interest.
OpenSBI configures the PMA regions as required and creates a reserve memory
node and propagates it to the higher boot stack.

Currently OpenSBI (upstream) configures the required PMA region and passes
this a shared DMA pool to Linux.

    reserved-memory {
        #address-cells = <2>;
        #size-cells = <2>;
        ranges;

        pma_resv0@58000000 {
            compatible = "shared-dma-pool";
            reg = <0x0 0x58000000 0x0 0x08000000>;
            no-map;
            linux,dma-default;
        };
    };

The above shared DMA pool gets appended to Linux DTB so the DMA memory
requests go through this region.

3] We provide callbacks to synchronize specific content between memory and
cache and register using riscv_noncoherent_register_cache_ops().

4] RZ/Five SoC selects the below configs
        - AX45MP_L2_CACHE
        - DMA_GLOBAL_POOL
        - ERRATA_ANDES
        - ERRATA_ANDES_CMO

----------x---------------------x--------------------x---------------x--------------

Note,
- This series requires testing on Cores with zicbom and T-Head SoCs
- Ive used GCC 12.2.0 for compilation
- Tested all the IP blocks on RZ/Five which use DMA
- Patch series is dependent on the series from Arnd,
  https://patchwork.kernel.org/project/linux-riscv/cover/20230327121317.4081816-1-arnd@kernel.org/
- Patches applies on top of palmer/for-next (d34a6b715a23)

v6 -> v7
* Reworked the code based on Arnd's work
* Fixed review comments pointed by Arnd
* Fixed review comments pointed by Conor

v5.1 -> v6
* Dropped use of ALTERNATIVE_x() macro
* Now switched to used function pointers for CMO
* Moved driver to drivers/cache folder

v5 -> v5.1
* https://patchwork.kernel.org/project/linux-riscv/list/?series=708610&state=%2A&archive=both

v4 -> v5
* Rebased ALTERNATIVE_3() macro on top of Andrew's patches
* Rebased the changes on top of Heiko's alternative call patches
* Dropped configuring the PMA from Linux
* Dropped configuring the L2 cache from Linux and dropped the binding for same
* Now using runtime patching mechanism instead of compile time config

RFC v3 -> v4
* Implemented ALTERNATIVE_3() macro 
* Now using runtime patching mechanism instead of compile time config
* Added Andes CMO as and errata
* Fixed comments pointed by Geert

RFC v2-> RFC v3
* Fixed review comments pointed by Conor
* Move DT binding into cache folder
* Fixed DT binding check issue
* Added andestech,ax45mp-cache.h header file
* Now passing the flags for the PMA setup as part of andestech,pma-regions
  property.
* Added andestech,inst/data-prefetch and andestech,tag/data-ram-ctl
  properties to configure the L2 cache.
* Registered the cache driver as platform driver

RFC v1-> RFC v2
* Moved out the code from arc/riscv to drivers/soc/renesas
* Now handling the PMA setup as part of the L2 cache
* Now making use of dma-noncoherent.c instead SoC specific implementation.
* Dropped arch_dma_alloc() and arch_dma_free()
* Switched to RISCV_DMA_NONCOHERENT
* Included DT binding doc

RFC v2: https://patchwork.kernel.org/project/linux-renesas-soc/cover/20221003223222.448551-1-prabhakar.mahadev-lad.rj@bp.renesas.com/
RFC v1: https://patchwork.kernel.org/project/linux-renesas-soc/cover/20220906102154.32526-1-prabhakar.mahadev-lad.rj@bp.renesas.com/

Cheers,
Prabhakar

Lad Prabhakar (6):
  riscv: mm: dma-noncoherent: Switch using function pointers for cache
    management
  riscv: asm: vendorid_list: Add Andes Technology to the vendors list
  riscv: errata: Add Andes alternative ports
  dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation
    for L2 cache controller
  cache: Add L2 cache management for Andes AX45MP RISC-V core
  soc: renesas: Kconfig: Select the required configs for RZ/Five SoC

 .../cache/andestech,ax45mp-cache.yaml         |  81 +++++++
 MAINTAINERS                                   |   8 +
 arch/riscv/Kconfig.errata                     |  21 ++
 arch/riscv/errata/Makefile                    |   1 +
 arch/riscv/errata/andes/Makefile              |   1 +
 arch/riscv/errata/andes/errata.c              |  71 ++++++
 arch/riscv/errata/thead/errata.c              |  76 ++++++
 arch/riscv/include/asm/alternative.h          |   3 +
 arch/riscv/include/asm/dma-noncoherent.h      |  73 ++++++
 arch/riscv/include/asm/errata_list.h          |  53 ----
 arch/riscv/include/asm/vendorid_list.h        |   1 +
 arch/riscv/kernel/alternative.c               |   5 +
 arch/riscv/kernel/setup.c                     |  49 +++-
 arch/riscv/mm/dma-noncoherent.c               |  25 +-
 arch/riscv/mm/pmem.c                          |   6 +-
 drivers/Kconfig                               |   2 +
 drivers/Makefile                              |   1 +
 drivers/cache/Kconfig                         |  10 +
 drivers/cache/Makefile                        |   3 +
 drivers/cache/ax45mp_cache.c                  | 229 ++++++++++++++++++
 drivers/soc/renesas/Kconfig                   |   4 +
 21 files changed, 662 insertions(+), 61 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
 create mode 100644 arch/riscv/errata/andes/Makefile
 create mode 100644 arch/riscv/errata/andes/errata.c
 create mode 100644 arch/riscv/include/asm/dma-noncoherent.h
 create mode 100644 drivers/cache/Kconfig
 create mode 100644 drivers/cache/Makefile
 create mode 100644 drivers/cache/ax45mp_cache.c

-- 
2.25.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-30 20:42 [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Prabhakar
@ 2023-03-30 20:42 ` Prabhakar
  2023-03-30 21:34   ` Arnd Bergmann
                     ` (3 more replies)
  2023-03-30 20:42 ` [PATCH v7 2/6] riscv: asm: vendorid_list: Add Andes Technology to the vendors list Prabhakar
                   ` (5 subsequent siblings)
  6 siblings, 4 replies; 40+ messages in thread
From: Prabhakar @ 2023-03-30 20:42 UTC (permalink / raw)
  To: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv
  Cc: Rob Herring, Krzysztof Kozlowski, devicetree, linux-kernel,
	linux-renesas-soc, Prabhakar, Biju Das, Lad Prabhakar

From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Currently, selecting which CMOs to use on a given platform is done using
and ALTERNATIVE_X() macro. This was manageable when there were just two
CMO implementations, but now that there are more and more platforms coming
needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.

To avoid such issues this patch switches to use of function pointers
instead of ALTERNATIVE_X() macro for cache management (the only drawback
being performance over the previous approach).

void (*clean_range)(unsigned long addr, unsigned long size);
void (*inv_range)(unsigned long addr, unsigned long size);
void (*flush_range)(unsigned long addr, unsigned long size);

The above function pointers are provided to be overridden for platforms
needing CMO.

Convert ZICBOM and T-HEAD CMO to use function pointers.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
---
v6->v7
* Updated commit description
* Fixed build issues when CONFIG_ERRATA_THEAD_CMO=n
* Used static const struct ptr to register CMO ops
* Dropped riscv_dma_noncoherent_cmo_ops
* Moved ZICBOM CMO setup to setup.c

v5->v6
* New patch
---
 arch/riscv/errata/thead/errata.c         | 76 ++++++++++++++++++++++++
 arch/riscv/include/asm/dma-noncoherent.h | 73 +++++++++++++++++++++++
 arch/riscv/include/asm/errata_list.h     | 53 -----------------
 arch/riscv/kernel/setup.c                | 49 ++++++++++++++-
 arch/riscv/mm/dma-noncoherent.c          | 25 ++++++--
 arch/riscv/mm/pmem.c                     |  6 +-
 6 files changed, 221 insertions(+), 61 deletions(-)
 create mode 100644 arch/riscv/include/asm/dma-noncoherent.h

diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
index 7e8d50ebb71a..4bb3f2baee03 100644
--- a/arch/riscv/errata/thead/errata.c
+++ b/arch/riscv/errata/thead/errata.c
@@ -11,10 +11,83 @@
 #include <linux/uaccess.h>
 #include <asm/alternative.h>
 #include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
 #include <asm/errata_list.h>
 #include <asm/patch.h>
 #include <asm/vendorid_list.h>
 
+#ifdef CONFIG_ERRATA_THEAD_CMO
+/*
+ * dcache.ipa rs1 (invalidate, physical address)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000001    01010      rs1       000      00000  0001011
+ * dache.iva rs1 (invalida, virtual address)
+ *   0000001    00110      rs1       000      00000  0001011
+ *
+ * dcache.cpa rs1 (clean, physical address)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000001    01001      rs1       000      00000  0001011
+ * dcache.cva rs1 (clean, virtual address)
+ *   0000001    00100      rs1       000      00000  0001011
+ *
+ * dcache.cipa rs1 (clean then invalidate, physical address)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000001    01011      rs1       000      00000  0001011
+ * dcache.civa rs1 (... virtual address)
+ *   0000001    00111      rs1       000      00000  0001011
+ *
+ * sync.s (make sure all cache operations finished)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000000    11001     00000      000      00000  0001011
+ */
+#define THEAD_inval_A0	".long 0x0265000b"
+#define THEAD_clean_A0	".long 0x0245000b"
+#define THEAD_flush_A0	".long 0x0275000b"
+#define THEAD_SYNC_S	".long 0x0190000b"
+
+#define THEAD_CMO_OP(_op, _start, _size, _cachesize)				\
+	asm volatile("mv a0, %1\n\t"						\
+		     "j 2f\n\t"							\
+		     "3:\n\t"							\
+		     THEAD_##_op##_A0 "\n\t"					\
+		     "add a0, a0, %0\n\t"					\
+		     "2:\n\t"							\
+		     "bltu a0, %2, 3b\n\t"					\
+		     THEAD_SYNC_S						\
+		     : : "r"(_cachesize),					\
+			 "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
+			 "r"((unsigned long)(_start) + (_size))			\
+		     : "a0")
+
+static void thead_cmo_clean_range(unsigned long addr, unsigned long size)
+{
+	THEAD_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+}
+
+static void thead_cmo_flush_range(unsigned long addr, unsigned long size)
+{
+	THEAD_CMO_OP(flush, addr, size, riscv_cbom_block_size);
+}
+
+static void thead_cmo_inval_range(unsigned long addr, unsigned long size)
+{
+	THEAD_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+}
+
+static const struct riscv_cache_ops thead_cmo_ops = {
+	.clean_range = &thead_cmo_clean_range,
+	.inv_range = &thead_cmo_inval_range,
+	.flush_range = &thead_cmo_flush_range,
+};
+
+static void thead_register_cmo_ops(void)
+{
+	riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
+}
+#else
+static void thead_register_cmo_ops(void) {}
+#endif
+
 static bool errata_probe_pbmt(unsigned int stage,
 			      unsigned long arch_id, unsigned long impid)
 {
@@ -45,6 +118,9 @@ static bool errata_probe_cmo(unsigned int stage,
 
 	riscv_cbom_block_size = L1_CACHE_BYTES;
 	riscv_noncoherent_supported();
+
+	thead_register_cmo_ops();
+
 	return true;
 }
 
diff --git a/arch/riscv/include/asm/dma-noncoherent.h b/arch/riscv/include/asm/dma-noncoherent.h
new file mode 100644
index 000000000000..fc8d16c89f01
--- /dev/null
+++ b/arch/riscv/include/asm/dma-noncoherent.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 Renesas Electronics Corp.
+ */
+
+#ifndef __ASM_DMA_NONCOHERENT_H
+#define __ASM_DMA_NONCOHERENT_H
+
+#include <linux/dma-direct.h>
+
+#ifdef CONFIG_RISCV_DMA_NONCOHERENT
+
+/*
+ * struct riscv_cache_ops - Structure for CMO function pointers
+ *
+ * @clean_range: Function pointer for clean cache
+ * @inv_range: Function pointer for invalidate cache
+ * @flush_range: Function pointer for flushing the cache
+ */
+struct riscv_cache_ops {
+	void (*clean_range)(unsigned long addr, unsigned long size);
+	void (*inv_range)(unsigned long addr, unsigned long size);
+	void (*flush_range)(unsigned long addr, unsigned long size);
+};
+
+extern struct riscv_cache_ops noncoherent_cache_ops;
+
+void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops);
+
+static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t size)
+{
+	if (noncoherent_cache_ops.clean_range) {
+		unsigned long addr = (unsigned long)vaddr;
+
+		noncoherent_cache_ops.clean_range(addr, size);
+	}
+}
+
+static inline void riscv_dma_noncoherent_flush(void *vaddr, size_t size)
+{
+	if (noncoherent_cache_ops.flush_range) {
+		unsigned long addr = (unsigned long)vaddr;
+
+		noncoherent_cache_ops.flush_range(addr, size);
+	}
+}
+
+static inline void riscv_dma_noncoherent_inval(void *vaddr, size_t size)
+{
+	if (noncoherent_cache_ops.inv_range) {
+		unsigned long addr = (unsigned long)vaddr;
+
+		noncoherent_cache_ops.inv_range(addr, size);
+	}
+}
+
+static inline void riscv_dma_noncoherent_pmem_clean(void *vaddr, size_t size)
+{
+	riscv_dma_noncoherent_clean(vaddr, size);
+}
+
+static inline void riscv_dma_noncoherent_pmem_inval(void *vaddr, size_t size)
+{
+	riscv_dma_noncoherent_inval(vaddr, size);
+}
+#else
+
+static inline void riscv_dma_noncoherent_pmem_clean(void *vaddr, size_t size) {}
+static inline void riscv_dma_noncoherent_pmem_inval(void *vaddr, size_t size) {}
+
+#endif /* CONFIG_RISCV_DMA_NONCOHERENT */
+
+#endif	/* __ASM_DMA_NONCOHERENT_H */
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index fb1a810f3d8c..112429910ee6 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -89,59 +89,6 @@ asm volatile(ALTERNATIVE(						\
 #define ALT_THEAD_PMA(_val)
 #endif
 
-/*
- * dcache.ipa rs1 (invalidate, physical address)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000001    01010      rs1       000      00000  0001011
- * dache.iva rs1 (invalida, virtual address)
- *   0000001    00110      rs1       000      00000  0001011
- *
- * dcache.cpa rs1 (clean, physical address)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000001    01001      rs1       000      00000  0001011
- * dcache.cva rs1 (clean, virtual address)
- *   0000001    00100      rs1       000      00000  0001011
- *
- * dcache.cipa rs1 (clean then invalidate, physical address)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000001    01011      rs1       000      00000  0001011
- * dcache.civa rs1 (... virtual address)
- *   0000001    00111      rs1       000      00000  0001011
- *
- * sync.s (make sure all cache operations finished)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000000    11001     00000      000      00000  0001011
- */
-#define THEAD_inval_A0	".long 0x0265000b"
-#define THEAD_clean_A0	".long 0x0245000b"
-#define THEAD_flush_A0	".long 0x0275000b"
-#define THEAD_SYNC_S	".long 0x0190000b"
-
-#define ALT_CMO_OP(_op, _start, _size, _cachesize)			\
-asm volatile(ALTERNATIVE_2(						\
-	__nops(6),							\
-	"mv a0, %1\n\t"							\
-	"j 2f\n\t"							\
-	"3:\n\t"							\
-	CBO_##_op(a0)							\
-	"add a0, a0, %0\n\t"						\
-	"2:\n\t"							\
-	"bltu a0, %2, 3b\n\t"						\
-	"nop", 0, RISCV_ISA_EXT_ZICBOM, CONFIG_RISCV_ISA_ZICBOM,	\
-	"mv a0, %1\n\t"							\
-	"j 2f\n\t"							\
-	"3:\n\t"							\
-	THEAD_##_op##_A0 "\n\t"						\
-	"add a0, a0, %0\n\t"						\
-	"2:\n\t"							\
-	"bltu a0, %2, 3b\n\t"						\
-	THEAD_SYNC_S, THEAD_VENDOR_ID,					\
-			ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO)	\
-	: : "r"(_cachesize),						\
-	    "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
-	    "r"((unsigned long)(_start) + (_size))			\
-	: "a0")
-
 #define THEAD_C9XX_RV_IRQ_PMU			17
 #define THEAD_C9XX_CSR_SCOUNTEROF		0x5c5
 
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 5d3184cbf518..b2b69d1dec23 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -24,6 +24,7 @@
 #include <asm/alternative.h>
 #include <asm/cacheflush.h>
 #include <asm/cpu_ops.h>
+#include <asm/dma-noncoherent.h>
 #include <asm/early_ioremap.h>
 #include <asm/pgtable.h>
 #include <asm/setup.h>
@@ -262,6 +263,50 @@ static void __init parse_dtb(void)
 #endif
 }
 
+#ifdef CONFIG_RISCV_ISA_ZICBOM
+
+#define ZICBOM_CMO_OP(_op, _start, _size, _cachesize)				\
+	asm volatile("mv a0, %1\n\t"						\
+		     "j 2f\n\t"							\
+		     "3:\n\t"							\
+		     CBO_##_op(a0)						\
+		     "add a0, a0, %0\n\t"					\
+		     "2:\n\t"							\
+		     "bltu a0, %2, 3b\n\t"					\
+		     : : "r"(_cachesize),					\
+			 "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
+			 "r"((unsigned long)(_start) + (_size))			\
+		     : "a0")
+
+static void zicbom_cmo_clean_range(unsigned long addr, unsigned long size)
+{
+	ZICBOM_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+}
+
+static void zicbom_cmo_flush_range(unsigned long addr, unsigned long size)
+{
+	ZICBOM_CMO_OP(flush, addr, size, riscv_cbom_block_size);
+}
+
+static void zicbom_cmo_inval_range(unsigned long addr, unsigned long size)
+{
+	ZICBOM_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+}
+
+const struct riscv_cache_ops zicbom_cmo_ops = {
+	.clean_range = &zicbom_cmo_clean_range,
+	.inv_range = &zicbom_cmo_inval_range,
+	.flush_range = &zicbom_cmo_flush_range,
+};
+
+static void zicbom_register_cmo_ops(void)
+{
+	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
+}
+#else
+static void zicbom_register_cmo_ops(void) {}
+#endif
+
 void __init setup_arch(char **cmdline_p)
 {
 	parse_dtb();
@@ -301,8 +346,10 @@ void __init setup_arch(char **cmdline_p)
 	riscv_fill_hwcap();
 	apply_boot_alternatives();
 	if (IS_ENABLED(CONFIG_RISCV_ISA_ZICBOM) &&
-	    riscv_isa_extension_available(NULL, ZICBOM))
+	    riscv_isa_extension_available(NULL, ZICBOM)) {
 		riscv_noncoherent_supported();
+		zicbom_register_cmo_ops();
+	}
 }
 
 static int __init topology_init(void)
diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index b9a9f57e02be..ab8f6dc67914 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -9,28 +9,36 @@
 #include <linux/dma-map-ops.h>
 #include <linux/mm.h>
 #include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
 
 static bool noncoherent_supported;
 
+struct riscv_cache_ops noncoherent_cache_ops = {
+	.clean_range = NULL,
+	.inv_range = NULL,
+	.flush_range = NULL,
+};
+EXPORT_SYMBOL_GPL(noncoherent_cache_ops);
+
 static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_clean(vaddr, size);
 }
 
 static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_inval(vaddr, size);
 }
 
 static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_flush(vaddr, size);
 }
 
 static inline bool arch_sync_dma_clean_before_fromdevice(void)
@@ -50,7 +58,7 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
 {
 	void *flush_addr = page_address(page);
 
-	ALT_CMO_OP(flush, flush_addr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_flush(flush_addr, size);
 }
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
@@ -75,3 +83,12 @@ void riscv_noncoherent_supported(void)
 	     "Non-coherent DMA support enabled without a block size\n");
 	noncoherent_supported = true;
 }
+
+void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
+{
+	if (!ops)
+		return;
+
+	noncoherent_cache_ops = *ops;
+}
+EXPORT_SYMBOL_GPL(riscv_noncoherent_register_cache_ops);
diff --git a/arch/riscv/mm/pmem.c b/arch/riscv/mm/pmem.c
index 089df92ae876..aad7c2209eca 100644
--- a/arch/riscv/mm/pmem.c
+++ b/arch/riscv/mm/pmem.c
@@ -6,16 +6,16 @@
 #include <linux/export.h>
 #include <linux/libnvdimm.h>
 
-#include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
 
 void arch_wb_cache_pmem(void *addr, size_t size)
 {
-	ALT_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_pmem_clean(addr, size);
 }
 EXPORT_SYMBOL_GPL(arch_wb_cache_pmem);
 
 void arch_invalidate_pmem(void *addr, size_t size)
 {
-	ALT_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_pmem_inval(addr, size);
 }
 EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v7 2/6] riscv: asm: vendorid_list: Add Andes Technology to the vendors list
  2023-03-30 20:42 [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Prabhakar
  2023-03-30 20:42 ` [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management Prabhakar
@ 2023-03-30 20:42 ` Prabhakar
  2023-03-30 20:42 ` [PATCH v7 3/6] riscv: errata: Add Andes alternative ports Prabhakar
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Prabhakar @ 2023-03-30 20:42 UTC (permalink / raw)
  To: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv
  Cc: Rob Herring, Krzysztof Kozlowski, devicetree, linux-kernel,
	linux-renesas-soc, Prabhakar, Biju Das, Lad Prabhakar

From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Add Andes Technology to the vendors list.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
v6 -> v7
* No change

v5 -> v6
* No change

v4 -> v5
* Included RB tags

RFC v3 -> v4
* New patch
---
 arch/riscv/include/asm/vendorid_list.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/riscv/include/asm/vendorid_list.h b/arch/riscv/include/asm/vendorid_list.h
index cb89af3f0704..e55407ace0c3 100644
--- a/arch/riscv/include/asm/vendorid_list.h
+++ b/arch/riscv/include/asm/vendorid_list.h
@@ -5,6 +5,7 @@
 #ifndef ASM_VENDOR_LIST_H
 #define ASM_VENDOR_LIST_H
 
+#define ANDESTECH_VENDOR_ID	0x31e
 #define SIFIVE_VENDOR_ID	0x489
 #define THEAD_VENDOR_ID		0x5b7
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v7 3/6] riscv: errata: Add Andes alternative ports
  2023-03-30 20:42 [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Prabhakar
  2023-03-30 20:42 ` [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management Prabhakar
  2023-03-30 20:42 ` [PATCH v7 2/6] riscv: asm: vendorid_list: Add Andes Technology to the vendors list Prabhakar
@ 2023-03-30 20:42 ` Prabhakar
  2023-03-30 20:42 ` [PATCH v7 4/6] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller Prabhakar
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Prabhakar @ 2023-03-30 20:42 UTC (permalink / raw)
  To: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv
  Cc: Rob Herring, Krzysztof Kozlowski, devicetree, linux-kernel,
	linux-renesas-soc, Prabhakar, Biju Das, Lad Prabhakar

From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Add required ports of the Alternative scheme for Andes CPU cores.

I/O Coherence Port (IOCP) provides an AXI interface for connecting external
non-caching masters, such as DMA controllers. IOCP is a specification
option and is disabled on the Renesas RZ/Five SoC due to this reason cache
management needs a software workaround.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
v6 -> v7
* Renamed RZFIVE_SBI_EXT_IOCP_SW_WORKAROUND -> ANDES_SBI_EXT_IOCP_SW_WORKAROUND
* Dropped "depends on !XIP_KERNEL" for ERRATA_ANDES config

v5 -> v6
* Dropped patching alternative and now just probing IOCP

v4 -> v5
* Sorted the Kconfig/Makefile/Switch based on Core name
* Added a comments
* Introduced RZFIVE_SBI_EXT_IOCP_SW_WORKAROUND SBI EXT ID to check if
  CMO needs to be applied. Is there a way we can access the DTB while patching
  as we can drop this SBI EXT ID and add a DT property instead for cmo?

RFC v3 -> v4
* New patch
---
 arch/riscv/Kconfig.errata            | 21 ++++++++
 arch/riscv/errata/Makefile           |  1 +
 arch/riscv/errata/andes/Makefile     |  1 +
 arch/riscv/errata/andes/errata.c     | 71 ++++++++++++++++++++++++++++
 arch/riscv/include/asm/alternative.h |  3 ++
 arch/riscv/kernel/alternative.c      |  5 ++
 6 files changed, 102 insertions(+)
 create mode 100644 arch/riscv/errata/andes/Makefile
 create mode 100644 arch/riscv/errata/andes/errata.c

diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
index 0c8f4652cd82..92c779764b27 100644
--- a/arch/riscv/Kconfig.errata
+++ b/arch/riscv/Kconfig.errata
@@ -1,5 +1,26 @@
 menu "CPU errata selection"
 
+config ERRATA_ANDES
+	bool "Andes AX45MP errata"
+	depends on RISCV_ALTERNATIVE
+	help
+	  All Andes errata Kconfig depend on this Kconfig. Disabling
+	  this Kconfig will disable all Andes errata. Please say "Y"
+	  here if your platform uses Andes CPU cores.
+
+	  Otherwise, please say "N" here to avoid unnecessary overhead.
+
+config ERRATA_ANDES_CMO
+	bool "Apply Andes cache management errata"
+	depends on ERRATA_ANDES && MMU && ARCH_R9A07G043
+	select RISCV_DMA_NONCOHERENT
+	default y
+	help
+	  This will apply the cache management errata to handle the
+	  non-standard handling on non-coherent operations on Andes cores.
+
+	  If you don't know what to do here, say "Y".
+
 config ERRATA_SIFIVE
 	bool "SiFive errata"
 	depends on RISCV_ALTERNATIVE
diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
index a1055965fbee..6f1c693af92d 100644
--- a/arch/riscv/errata/Makefile
+++ b/arch/riscv/errata/Makefile
@@ -1,2 +1,3 @@
+obj-$(CONFIG_ERRATA_ANDES) += andes/
 obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
 obj-$(CONFIG_ERRATA_THEAD) += thead/
diff --git a/arch/riscv/errata/andes/Makefile b/arch/riscv/errata/andes/Makefile
new file mode 100644
index 000000000000..2d644e19caef
--- /dev/null
+++ b/arch/riscv/errata/andes/Makefile
@@ -0,0 +1 @@
+obj-y += errata.o
diff --git a/arch/riscv/errata/andes/errata.c b/arch/riscv/errata/andes/errata.c
new file mode 100644
index 000000000000..b7f3dbd04e58
--- /dev/null
+++ b/arch/riscv/errata/andes/errata.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Erratas to be applied for Andes CPU cores
+ *
+ *  Copyright (C) 2023 Renesas Electronics Corporation.
+ *
+ * Author: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
+ */
+
+#include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
+#include <asm/errata_list.h>
+#include <asm/patch.h>
+#include <asm/sbi.h>
+#include <asm/vendorid_list.h>
+
+#define ANDESTECH_AX45MP_MARCHID	0x8000000000008a45UL
+#define ANDESTECH_AX45MP_MIMPID		0x500UL
+#define ANDESTECH_SBI_EXT_ANDES		0x0900031E
+
+#define ANDES_SBI_EXT_IOCP_SW_WORKAROUND	1
+
+static long ax45mp_iocp_sw_workaround(void)
+{
+	struct sbiret ret;
+
+	/*
+	 * ANDES_SBI_EXT_IOCP_SW_WORKAROUND SBI EXT checks if the IOCP is missing and
+	 * cache is controllable only then CMO will be applied to the platform.
+	 */
+	ret = sbi_ecall(ANDESTECH_SBI_EXT_ANDES, ANDES_SBI_EXT_IOCP_SW_WORKAROUND,
+			0, 0, 0, 0, 0, 0);
+
+	return ret.error ? 0 : ret.value;
+}
+
+static bool errata_probe_iocp(unsigned int stage, unsigned long arch_id, unsigned long impid)
+{
+	if (!IS_ENABLED(CONFIG_ERRATA_ANDES_CMO))
+		return false;
+
+	if (arch_id != ANDESTECH_AX45MP_MARCHID || impid != ANDESTECH_AX45MP_MIMPID)
+		return false;
+
+	if (!ax45mp_iocp_sw_workaround())
+		return false;
+
+	/* Set this just to make core cbo code happy */
+	riscv_cbom_block_size = 1;
+	riscv_noncoherent_supported();
+
+	return true;
+}
+
+static void andes_errata_probe(unsigned int stage, unsigned long archid, unsigned long impid)
+{
+	/*
+	 * In the absence of the I/O Coherency Port, access to certain peripherals
+	 * requires vendor specific DMA handling.
+	 */
+	errata_probe_iocp(stage, archid, impid);
+}
+
+void __init_or_module andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
+					      unsigned long archid, unsigned long impid,
+					      unsigned int stage)
+{
+	andes_errata_probe(stage, archid, impid);
+
+	/* we have nothing to patch here ATM so just return back */
+}
diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
index 58ccd2f8cab7..3c2b59b25017 100644
--- a/arch/riscv/include/asm/alternative.h
+++ b/arch/riscv/include/asm/alternative.h
@@ -45,6 +45,9 @@ struct alt_entry {
 	u32 patch_id;		/* The patch ID (erratum ID or cpufeature ID) */
 };
 
+void andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
+			     unsigned long archid, unsigned long impid,
+			     unsigned int stage);
 void sifive_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
 			      unsigned long archid, unsigned long impid,
 			      unsigned int stage);
diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
index 2354c69dc7d1..8a34e72c2b5e 100644
--- a/arch/riscv/kernel/alternative.c
+++ b/arch/riscv/kernel/alternative.c
@@ -42,6 +42,11 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
 #endif
 
 	switch (cpu_mfr_info->vendor_id) {
+#ifdef CONFIG_ERRATA_ANDES
+	case ANDESTECH_VENDOR_ID:
+		cpu_mfr_info->patch_func = andes_errata_patch_func;
+		break;
+#endif
 #ifdef CONFIG_ERRATA_SIFIVE
 	case SIFIVE_VENDOR_ID:
 		cpu_mfr_info->patch_func = sifive_errata_patch_func;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v7 4/6] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller
  2023-03-30 20:42 [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Prabhakar
                   ` (2 preceding siblings ...)
  2023-03-30 20:42 ` [PATCH v7 3/6] riscv: errata: Add Andes alternative ports Prabhakar
@ 2023-03-30 20:42 ` Prabhakar
  2023-03-31 10:21   ` Conor Dooley
  2023-03-30 20:42 ` [PATCH v7 5/6] cache: Add L2 cache management for Andes AX45MP RISC-V core Prabhakar
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 40+ messages in thread
From: Prabhakar @ 2023-03-30 20:42 UTC (permalink / raw)
  To: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv
  Cc: Rob Herring, Krzysztof Kozlowski, devicetree, linux-kernel,
	linux-renesas-soc, Prabhakar, Biju Das, Lad Prabhakar,
	Rob Herring

From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Add DT binding documentation for L2 cache controller found on RZ/Five SoC.

The Renesas RZ/Five microprocessor includes a RISC-V CPU Core (AX45MP
Single) from Andes. The AX45MP core has an L2 cache controller, this patch
describes the L2 cache block.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
v6 -> v7
* No Change

v5 -> v6
* Included RB tag from Rob

v4 -> v5
* Dropped L2 cache configuration properties
* Dropped PMA configuration properties
* Ordered the required list to match the properties list

RFC v3 -> v4
* Dropped l2 cache configuration parameters
* s/larger/large
* Added minItems/maxItems for andestech,pma-regions
---
 .../cache/andestech,ax45mp-cache.yaml         | 81 +++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml

diff --git a/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
new file mode 100644
index 000000000000..9ab5f0c435d4
--- /dev/null
+++ b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
@@ -0,0 +1,81 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+# Copyright (C) 2023 Renesas Electronics Corp.
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/cache/andestech,ax45mp-cache.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Andestech AX45MP L2 Cache Controller
+
+maintainers:
+  - Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
+
+description:
+  A level-2 cache (L2C) is used to improve the system performance by providing
+  a large amount of cache line entries and reasonable access delays. The L2C
+  is shared between cores, and a non-inclusive non-exclusive policy is used.
+
+select:
+  properties:
+    compatible:
+      contains:
+        enum:
+          - andestech,ax45mp-cache
+
+  required:
+    - compatible
+
+properties:
+  compatible:
+    items:
+      - const: andestech,ax45mp-cache
+      - const: cache
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  cache-line-size:
+    const: 64
+
+  cache-level:
+    const: 2
+
+  cache-sets:
+    const: 1024
+
+  cache-size:
+    enum: [131072, 262144, 524288, 1048576, 2097152]
+
+  cache-unified: true
+
+  next-level-cache: true
+
+additionalProperties: false
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - cache-line-size
+  - cache-level
+  - cache-sets
+  - cache-size
+  - cache-unified
+
+examples:
+  - |
+    #include <dt-bindings/interrupt-controller/irq.h>
+
+    cache-controller@2010000 {
+        compatible = "andestech,ax45mp-cache", "cache";
+        reg = <0x13400000 0x100000>;
+        interrupts = <508 IRQ_TYPE_LEVEL_HIGH>;
+        cache-line-size = <64>;
+        cache-level = <2>;
+        cache-sets = <1024>;
+        cache-size = <262144>;
+        cache-unified;
+    };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v7 5/6] cache: Add L2 cache management for Andes AX45MP RISC-V core
  2023-03-30 20:42 [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Prabhakar
                   ` (3 preceding siblings ...)
  2023-03-30 20:42 ` [PATCH v7 4/6] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller Prabhakar
@ 2023-03-30 20:42 ` Prabhakar
  2023-03-31 12:45   ` Conor Dooley
  2023-03-30 20:42 ` [PATCH v7 6/6] soc: renesas: Kconfig: Select the required configs for RZ/Five SoC Prabhakar
  2023-03-31 18:05 ` [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Conor Dooley
  6 siblings, 1 reply; 40+ messages in thread
From: Prabhakar @ 2023-03-30 20:42 UTC (permalink / raw)
  To: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv
  Cc: Rob Herring, Krzysztof Kozlowski, devicetree, linux-kernel,
	linux-renesas-soc, Prabhakar, Biju Das, Lad Prabhakar

From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

I/O Coherence Port (IOCP) provides an AXI interface for connecting
external non-caching masters, such as DMA controllers. The accesses
from IOCP are coherent with D-Caches and L2 Cache.

IOCP is a specification option and is disabled on the Renesas RZ/Five
SoC due to this reason IP blocks using DMA will fail.

The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
block that allows dynamic adjustment of memory attributes in the runtime.
It contains a configurable amount of PMA entries implemented as CSR
registers to control the attributes of memory locations in interest.
Below are the memory attributes supported:
* Device, Non-bufferable
* Device, bufferable
* Memory, Non-cacheable, Non-bufferable
* Memory, Non-cacheable, Bufferable
* Memory, Write-back, No-allocate
* Memory, Write-back, Read-allocate
* Memory, Write-back, Write-allocate
* Memory, Write-back, Read and Write-allocate

More info about PMA (section 10.3):
Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf

As a workaround for SoCs with IOCP disabled CMO needs to be handled by
software. Firstly OpenSBI configures the memory region as
"Memory, Non-cacheable, Bufferable" and passes this region as a global
shared dma pool as a DT node. With DMA_GLOBAL_POOL enabled all DMA
allocations happen from this region and synchronization callbacks are
implemented to synchronize when doing DMA transactions.

Example PMA region passes as a DT node from OpenSBI:
    reserved-memory {
        #address-cells = <2>;
        #size-cells = <2>;
        ranges;

        pma_resv0@58000000 {
            compatible = "shared-dma-pool";
            reg = <0x0 0x58000000 0x0 0x08000000>;
            no-map;
            linux,dma-default;
        };
    };

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
---
v6 -> v7
* Implemented flush callback
* Dropped using riscv_dma_noncoherent_cmo_ops

v5 -> v6
* Moved driver to cache folder
* Switched to new API for CMO

v4 -> v5
* Dropped code for configuring L2 cache
* Dropped code for configuring PMA
* Updated commit message
* Added comments
* Changed static branch enable/disable order

RFC v3 -> v4
* Made use of runtime patching instead of compile time
* Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
* Added a check to make sure cache line size is always 64 bytes
* Renamed folder rzf -> rzfive
* Improved Kconfig description
* Dropped L2 cache configuration
* Dropped unnecessary casts
* Fixed comments pointed by Geert.
---
 MAINTAINERS                  |   8 ++
 drivers/Kconfig              |   2 +
 drivers/Makefile             |   1 +
 drivers/cache/Kconfig        |  10 ++
 drivers/cache/Makefile       |   3 +
 drivers/cache/ax45mp_cache.c | 229 +++++++++++++++++++++++++++++++++++
 6 files changed, 253 insertions(+)
 create mode 100644 drivers/cache/Kconfig
 create mode 100644 drivers/cache/Makefile
 create mode 100644 drivers/cache/ax45mp_cache.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 258fa89de965..921a96859530 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19897,6 +19897,14 @@ S:	Supported
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
 F:	drivers/staging/
 
+STANDALONE CACHE CONTROLLER DRIVERS
+M:	Conor Dooley <conor@kernel.org>
+L:	linux-riscv@lists.infradead.org
+S:	Maintained
+T:	git https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/
+F:	drivers/cache
+F:	include/cache
+
 STARFIRE/DURALAN NETWORK DRIVER
 M:	Ion Badulescu <ionut@badula.org>
 S:	Odd Fixes
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 968bd0a6fd78..44abd2cba3a3 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -15,6 +15,8 @@ source "drivers/base/Kconfig"
 
 source "drivers/bus/Kconfig"
 
+source "drivers/cache/Kconfig"
+
 source "drivers/connector/Kconfig"
 
 source "drivers/firmware/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 20b118dca999..db5a8115093f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -11,6 +11,7 @@ ifdef building_out_of_srctree
 MAKEFLAGS += --include-dir=$(srctree)
 endif
 
+obj-y				+= cache/
 obj-y				+= irqchip/
 obj-y				+= bus/
 
diff --git a/drivers/cache/Kconfig b/drivers/cache/Kconfig
new file mode 100644
index 000000000000..5478adff3d88
--- /dev/null
+++ b/drivers/cache/Kconfig
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0
+menu "Cache Drivers"
+
+config AX45MP_L2_CACHE
+	bool "Andes Technology AX45MP L2 Cache controller"
+	depends on RISCV && RISCV_DMA_NONCOHERENT
+	help
+	  Support for the L2 cache controller on Andes Technology AX45MP platforms.
+
+endmenu
diff --git a/drivers/cache/Makefile b/drivers/cache/Makefile
new file mode 100644
index 000000000000..2012e7fb978d
--- /dev/null
+++ b/drivers/cache/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_AX45MP_L2_CACHE) += ax45mp_cache.o
diff --git a/drivers/cache/ax45mp_cache.c b/drivers/cache/ax45mp_cache.c
new file mode 100644
index 000000000000..cb230b544be8
--- /dev/null
+++ b/drivers/cache/ax45mp_cache.c
@@ -0,0 +1,229 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * non-coherent cache functions for Andes AX45MP
+ *
+ * Copyright (C) 2023 Renesas Electronics Corp.
+ */
+
+#include <asm/dma-noncoherent.h>
+#include <linux/cacheflush.h>
+#include <linux/cacheinfo.h>
+#include <linux/dma-direction.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+
+/* L2 cache registers */
+#define AX45MP_L2C_REG_CTL_OFFSET		0x8
+
+#define AX45MP_L2C_REG_C0_CMD_OFFSET		0x40
+#define AX45MP_L2C_REG_C0_ACC_OFFSET		0x48
+#define AX45MP_L2C_REG_STATUS_OFFSET		0x80
+
+/* D-cache operation */
+#define AX45MP_CCTL_L1D_VA_INVAL		0 /* Invalidate an L1 cache entry */
+#define AX45MP_CCTL_L1D_VA_WB			1 /* Write-back an L1 cache entry */
+
+/* L2 CCTL status */
+#define AX45MP_CCTL_L2_STATUS_IDLE		0
+
+/* L2 CCTL status cores mask */
+#define AX45MP_CCTL_L2_STATUS_C0_MASK		0xf
+
+/* L2 cache operation */
+#define AX45MP_CCTL_L2_PA_INVAL			0x8 /* Invalidate an L2 cache entry */
+#define AX45MP_CCTL_L2_PA_WB			0x9 /* Write-back an L2 cache entry */
+
+#define AX45MP_L2C_REG_PER_CORE_OFFSET		0x10
+#define AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET	4
+
+#define AX45MP_L2C_REG_CN_CMD_OFFSET(n)	\
+	(AX45MP_L2C_REG_C0_CMD_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_L2C_REG_CN_ACC_OFFSET(n)	\
+	(AX45MP_L2C_REG_C0_ACC_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_CCTL_L2_STATUS_CN_MASK(n)	\
+	(AX45MP_CCTL_L2_STATUS_C0_MASK << ((n) * AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET))
+
+#define AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM	0x80b
+#define AX45MP_CCTL_REG_UCCTLCOMMAND_NUM	0x80c
+
+#define AX45MP_CACHE_LINE_SIZE			64
+
+struct ax45mp_priv {
+	void __iomem *l2c_base;
+	u32 ax45mp_cache_line_size;
+};
+
+static struct ax45mp_priv *ax45mp_priv;
+
+/* L2 Cache operations */
+static inline uint32_t ax45mp_cpu_l2c_get_cctl_status(void)
+{
+	return readl(ax45mp_priv->l2c_base + AX45MP_L2C_REG_STATUS_OFFSET);
+}
+
+static void ax45mp_cpu_cache_operation(unsigned long start, unsigned long end,
+				       unsigned long line_size, unsigned int l1_op,
+				       unsigned int l2_op)
+{
+	void __iomem *base = ax45mp_priv->l2c_base;
+	int mhartid = smp_processor_id();
+	unsigned long pa;
+
+	while (end > start) {
+		csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
+		csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, l1_op);
+
+		pa = virt_to_phys((void *)start);
+		writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
+		writel(l2_op, base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
+		while ((ax45mp_cpu_l2c_get_cctl_status() &
+			AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
+			AX45MP_CCTL_L2_STATUS_IDLE)
+			;
+
+		start += line_size;
+	}
+}
+
+/* Write-back L1 and L2 cache entry */
+static inline void ax45mp_cpu_dcache_wb_range(unsigned long start, unsigned long end,
+					      unsigned long line_size)
+{
+	ax45mp_cpu_cache_operation(start, end, line_size,
+				   AX45MP_CCTL_L1D_VA_WB,
+				   AX45MP_CCTL_L2_PA_WB);
+}
+
+/* Invalidate the L1 and L2 cache entry */
+static inline void ax45mp_cpu_dcache_inval_range(unsigned long start, unsigned long end,
+						 unsigned long line_size)
+{
+	ax45mp_cpu_cache_operation(start, end, line_size,
+				   AX45MP_CCTL_L1D_VA_INVAL,
+				   AX45MP_CCTL_L2_PA_INVAL);
+}
+
+static void ax45mp_cpu_dma_inval_range(unsigned long vaddr, unsigned long size)
+{
+	char cache_buf[2][AX45MP_CACHE_LINE_SIZE];
+	unsigned long start = vaddr;
+	unsigned long end = start + size;
+	unsigned long old_start = start;
+	unsigned long old_end = end;
+	unsigned long line_size;
+	unsigned long flags;
+
+	if (unlikely(start == end))
+		return;
+
+	line_size = ax45mp_priv->ax45mp_cache_line_size;
+
+	memset(&cache_buf, 0x0, sizeof(cache_buf));
+	start = start & (~(line_size - 1));
+	end = ((end + line_size - 1) & (~(line_size - 1)));
+
+	local_irq_save(flags);
+	if (unlikely(start != old_start))
+		memcpy(&cache_buf[0][0], (void *)start, line_size);
+
+	if (unlikely(end != old_end))
+		memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))), line_size);
+
+	ax45mp_cpu_dcache_inval_range(start, end, line_size);
+
+	if (unlikely(start != old_start))
+		memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size - 1)));
+
+	local_irq_restore(flags);
+}
+
+static void ax45mp_cpu_dma_wb_range(unsigned long vaddr, unsigned long size)
+{
+	unsigned long start = vaddr;
+	unsigned long end = start + size;
+	unsigned long line_size;
+	unsigned long flags;
+
+	line_size = ax45mp_priv->ax45mp_cache_line_size;
+	local_irq_save(flags);
+	start = start & (~(line_size - 1));
+	ax45mp_cpu_dcache_wb_range(start, end, line_size);
+	local_irq_restore(flags);
+}
+
+static void ax45mp_cpu_dma_flush_range(unsigned long vaddr, unsigned long size)
+{
+	ax45mp_cpu_dma_wb_range(vaddr, size);
+	ax45mp_cpu_dma_inval_range(vaddr, size);
+}
+
+static void ax45mp_get_l2_line_size(struct platform_device *pdev)
+{
+	struct device_node *np = pdev->dev.of_node;
+	struct device *dev = &pdev->dev;
+	int ret;
+
+	ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv->ax45mp_cache_line_size);
+	if (ret) {
+		dev_err(dev, "Failed to get cache-line-size, defaulting to 64 bytes\n");
+		ax45mp_priv->ax45mp_cache_line_size = AX45MP_CACHE_LINE_SIZE;
+	}
+
+	if (ax45mp_priv->ax45mp_cache_line_size != AX45MP_CACHE_LINE_SIZE) {
+		dev_err(dev, "Expected cache-line-size to be 64 bytes (found:%u). Defaulting to 64 bytes\n",
+			ax45mp_priv->ax45mp_cache_line_size);
+		ax45mp_priv->ax45mp_cache_line_size = AX45MP_CACHE_LINE_SIZE;
+	}
+}
+
+static const struct riscv_cache_ops ax45mp_cmo_ops = {
+	.clean_range = &ax45mp_cpu_dma_wb_range,
+	.inv_range = &ax45mp_cpu_dma_inval_range,
+	.flush_range = &ax45mp_cpu_dma_flush_range,
+};
+
+static int ax45mp_l2c_probe(struct platform_device *pdev)
+{
+	/*
+	 * If IOCP is present on the Andes AX45MP core riscv_cbom_block_size
+	 * will be 0 for sure, so we can definitely rely on it. If
+	 * riscv_cbom_block_size = 0 we don't need to handle CMO using SW any
+	 * more so we just return success here and only if its being set we
+	 * continue further in the probe path.
+	 */
+	if (!riscv_cbom_block_size)
+		return 0;
+
+	ax45mp_priv = devm_kzalloc(&pdev->dev, sizeof(*ax45mp_priv), GFP_KERNEL);
+	if (!ax45mp_priv)
+		return -ENOMEM;
+
+	ax45mp_priv->l2c_base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(ax45mp_priv->l2c_base))
+		return PTR_ERR(ax45mp_priv->l2c_base);
+
+	ax45mp_get_l2_line_size(pdev);
+
+	riscv_noncoherent_register_cache_ops(&ax45mp_cmo_ops);
+
+	return 0;
+}
+
+static const struct of_device_id ax45mp_cache_ids[] = {
+	{ .compatible = "andestech,ax45mp-cache" },
+	{ /* sentinel */ }
+};
+
+static struct platform_driver ax45mp_l2c_driver = {
+	.driver = {
+		.name = "ax45mp-l2c",
+		.of_match_table = ax45mp_cache_ids,
+	},
+	.probe = ax45mp_l2c_probe,
+};
+
+static int __init ax45mp_cache_init(void)
+{
+	return platform_driver_register(&ax45mp_l2c_driver);
+}
+arch_initcall(ax45mp_cache_init);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v7 6/6] soc: renesas: Kconfig: Select the required configs for RZ/Five SoC
  2023-03-30 20:42 [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Prabhakar
                   ` (4 preceding siblings ...)
  2023-03-30 20:42 ` [PATCH v7 5/6] cache: Add L2 cache management for Andes AX45MP RISC-V core Prabhakar
@ 2023-03-30 20:42 ` Prabhakar
  2023-03-31  7:37   ` Geert Uytterhoeven
  2023-03-31 18:05 ` [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Conor Dooley
  6 siblings, 1 reply; 40+ messages in thread
From: Prabhakar @ 2023-03-30 20:42 UTC (permalink / raw)
  To: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv
  Cc: Rob Herring, Krzysztof Kozlowski, devicetree, linux-kernel,
	linux-renesas-soc, Prabhakar, Biju Das, Lad Prabhakar

From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Explicitly select the required Cache management and Errata configs
required for the RZ/Five SoC.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
v6->v7
* Included RB tag from Conor

v5->v6
* New patch
---
 drivers/soc/renesas/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/soc/renesas/Kconfig b/drivers/soc/renesas/Kconfig
index de31589ed054..67604f24973e 100644
--- a/drivers/soc/renesas/Kconfig
+++ b/drivers/soc/renesas/Kconfig
@@ -334,6 +334,10 @@ if RISCV
 config ARCH_R9A07G043
 	bool "RISC-V Platform support for RZ/Five"
 	select ARCH_RZG2L
+	select AX45MP_L2_CACHE
+	select DMA_GLOBAL_POOL
+	select ERRATA_ANDES
+	select ERRATA_ANDES_CMO
 	help
 	  This enables support for the Renesas RZ/Five SoC.
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-30 20:42 ` [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management Prabhakar
@ 2023-03-30 21:34   ` Arnd Bergmann
  2023-03-31  7:54     ` Conor Dooley
  2023-03-31 10:37     ` Lad, Prabhakar
  2023-03-31  7:31   ` Geert Uytterhoeven
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 40+ messages in thread
From: Arnd Bergmann @ 2023-03-30 21:34 UTC (permalink / raw)
  To: Prabhakar, Conor.Dooley, Geert Uytterhoeven, Heiko Stübner,
	guoren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv
  Cc: Rob Herring, Krzysztof Kozlowski, devicetree, linux-kernel,
	Linux-Renesas, Biju Das, Lad, Prabhakar

On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two
> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
>
> To avoid such issues this patch switches to use of function pointers
> instead of ALTERNATIVE_X() macro for cache management (the only drawback
> being performance over the previous approach).
>
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);
>
> The above function pointers are provided to be overridden for platforms
> needing CMO.
>
> Convert ZICBOM and T-HEAD CMO to use function pointers.
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

This is looking pretty good. There are a few things that I
still see sticking out, and I think I've mentioned some of 
them before, but don't remember if there was a reason for
doing it like this:

> +#ifdef CONFIG_ERRATA_THEAD_CMO

I would rename this to not call this an 'ERRATA' but
just make it a driver. Not important though, and there
was probably a reason you did it like this.

> +extern struct riscv_cache_ops noncoherent_cache_ops;
> +
> +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops 
> *ops);
> +
> +static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t 
> size)
> +{
> +	if (noncoherent_cache_ops.clean_range) {
> +		unsigned long addr = (unsigned long)vaddr;
> +
> +		noncoherent_cache_ops.clean_range(addr, size);
> +	}
> +}

The type case should not be necessary here. Instead I would 
make the callbacks use 'void *' as well, not 'unsigned long'.

It's possible that some future cache controller driver requires
passing physical addresses, as is common for last level cache,
but as long as all drivers pass virtual addresses, it's easier
to do the phys_to_virt() in common code.

It also seems wrong to have the fallback be to do nothing
when the pointer is NULL, since that cannot actually work
when a device is not cache coherent.

I would either initialize the function pointer to the
zicbom version and remove the NULL check, or keep the
pointer NULL and have an explicit
'else zicbom_clean_range()' fallback.

> +static void zicbom_register_cmo_ops(void)
> +{
> +	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> +}
> +#else
> +static void zicbom_register_cmo_ops(void) {}
> +#endif

As far as I recall, the #else path here was needed previously
to work around a binutils dependency, but with the current
code, it should be possible to just always enable
CONFIG_RISCV_ISA_ZICBOM when RISCV_DMA_NONCOHERENT is used.

     Arnd

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-30 20:42 ` [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management Prabhakar
  2023-03-30 21:34   ` Arnd Bergmann
@ 2023-03-31  7:31   ` Geert Uytterhoeven
  2023-03-31 10:45     ` Lad, Prabhakar
  2023-03-31 12:24   ` Conor Dooley
  2023-04-04  5:29   ` Christoph Hellwig
  3 siblings, 1 reply; 40+ messages in thread
From: Geert Uytterhoeven @ 2023-03-31  7:31 UTC (permalink / raw)
  To: Prabhakar
  Cc: Arnd Bergmann, Conor Dooley, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

Hi Prabhakar,

Thanks for your patch!

On Thu, Mar 30, 2023 at 10:42 PM Prabhakar <prabhakar.csengg@gmail.com> wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two

the ALTERNATIVE_X()

> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
>
> To avoid such issues this patch switches to use of function pointers

"the use" or "using"

> instead of ALTERNATIVE_X() macro for cache management (the only drawback

the ALTERNATIVE_X()

> being performance over the previous approach).
>
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);
>
> The above function pointers are provided to be overridden for platforms
> needing CMO.
>
> Convert ZICBOM and T-HEAD CMO to use function pointers.
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c

> +#ifdef CONFIG_ERRATA_THEAD_CMO

> +static void thead_register_cmo_ops(void)
> +{
> +       riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
> +}
> +#else
> +static void thead_register_cmo_ops(void) {}
> +#endif

> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c

> @@ -75,3 +83,12 @@ void riscv_noncoherent_supported(void)
>              "Non-coherent DMA support enabled without a block size\n");
>         noncoherent_supported = true;
>  }
> +
> +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
> +{
> +       if (!ops)
> +               return;

This is never true.
I guess originally you wanted to call riscv_noncoherent_register_cache_ops()
unconditionally from common code, instead of the various *register_cmo_ops()?
But that would have required something like

#ifdef CONFIG_ERRATA_THEAD_CMO
#define THEAD_CMO_OPS_PTR   (&thead_cmo_ops)
#else
#define THEAD_CMO_OPS_PTR   NULL
#endif

Or can we come up with some macro like pm_ptr(), but that also takes
care of the "&", so we can do "#define thead_cmo_ops NULL"?

> +
> +       noncoherent_cache_ops = *ops;
> +}
> +EXPORT_SYMBOL_GPL(riscv_noncoherent_register_cache_ops);

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 6/6] soc: renesas: Kconfig: Select the required configs for RZ/Five SoC
  2023-03-30 20:42 ` [PATCH v7 6/6] soc: renesas: Kconfig: Select the required configs for RZ/Five SoC Prabhakar
@ 2023-03-31  7:37   ` Geert Uytterhoeven
  2023-03-31  7:37     ` Geert Uytterhoeven
  0 siblings, 1 reply; 40+ messages in thread
From: Geert Uytterhoeven @ 2023-03-31  7:37 UTC (permalink / raw)
  To: Prabhakar
  Cc: Arnd Bergmann, Conor Dooley, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

On Thu, Mar 30, 2023 at 10:42 PM Prabhakar <prabhakar.csengg@gmail.com> wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> Explicitly select the required Cache management and Errata configs
> required for the RZ/Five SoC.
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 6/6] soc: renesas: Kconfig: Select the required configs for RZ/Five SoC
  2023-03-31  7:37   ` Geert Uytterhoeven
@ 2023-03-31  7:37     ` Geert Uytterhoeven
  0 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2023-03-31  7:37 UTC (permalink / raw)
  To: Prabhakar
  Cc: Arnd Bergmann, Conor Dooley, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

On Fri, Mar 31, 2023 at 9:37 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Thu, Mar 30, 2023 at 10:42 PM Prabhakar <prabhakar.csengg@gmail.com> wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Explicitly select the required Cache management and Errata configs
> > required for the RZ/Five SoC.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
>
> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-30 21:34   ` Arnd Bergmann
@ 2023-03-31  7:54     ` Conor Dooley
  2023-03-31  7:58       ` Arnd Bergmann
  2023-03-31 10:37     ` Lad, Prabhakar
  1 sibling, 1 reply; 40+ messages in thread
From: Conor Dooley @ 2023-03-31  7:54 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Prabhakar, Geert Uytterhoeven, Heiko Stübner, guoren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, Linux-Renesas, Biju Das, Lad,
	Prabhakar

[-- Attachment #1: Type: text/plain, Size: 524 bytes --]

On Thu, Mar 30, 2023 at 11:34:02PM +0200, Arnd Bergmann wrote:
> On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

> > +#ifdef CONFIG_ERRATA_THEAD_CMO
> 
> I would rename this to not call this an 'ERRATA' but
> just make it a driver. Not important though, and there
> was probably a reason you did it like this.

I think what was discussed in a prior iteration was that we'd leave
refactoring the T-HEAD bits into a driver for a subsequent work.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-31  7:54     ` Conor Dooley
@ 2023-03-31  7:58       ` Arnd Bergmann
  0 siblings, 0 replies; 40+ messages in thread
From: Arnd Bergmann @ 2023-03-31  7:58 UTC (permalink / raw)
  To: Conor.Dooley
  Cc: Prabhakar, Geert Uytterhoeven, Heiko Stübner, guoren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, Linux-Renesas, Biju Das, Lad,
	Prabhakar

On Fri, Mar 31, 2023, at 09:54, Conor Dooley wrote:
> On Thu, Mar 30, 2023 at 11:34:02PM +0200, Arnd Bergmann wrote:
>> On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
>> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
>> > +#ifdef CONFIG_ERRATA_THEAD_CMO
>> 
>> I would rename this to not call this an 'ERRATA' but
>> just make it a driver. Not important though, and there
>> was probably a reason you did it like this.
>
> I think what was discussed in a prior iteration was that we'd leave
> refactoring the T-HEAD bits into a driver for a subsequent work.

Ok, makes sense.

     Arnd

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 4/6] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller
  2023-03-30 20:42 ` [PATCH v7 4/6] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller Prabhakar
@ 2023-03-31 10:21   ` Conor Dooley
  2023-03-31 10:47     ` Lad, Prabhakar
  0 siblings, 1 reply; 40+ messages in thread
From: Conor Dooley @ 2023-03-31 10:21 UTC (permalink / raw)
  To: Prabhakar
  Cc: Arnd Bergmann, Geert Uytterhoeven, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar, Rob Herring

[-- Attachment #1: Type: text/plain, Size: 226 bytes --]

$subject: t-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller
                             ^^^^^^^^^^^^^^^^^^^
I assume this should be updated to be ax45mp-foo instead?

Cheers,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-30 21:34   ` Arnd Bergmann
  2023-03-31  7:54     ` Conor Dooley
@ 2023-03-31 10:37     ` Lad, Prabhakar
  2023-03-31 10:44       ` Arnd Bergmann
  2023-03-31 10:55       ` Conor Dooley
  1 sibling, 2 replies; 40+ messages in thread
From: Lad, Prabhakar @ 2023-03-31 10:37 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Conor.Dooley, Geert Uytterhoeven, Heiko Stübner, guoren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, Linux-Renesas, Biju Das, Lad,
	Prabhakar

Hi Arnd,

Thank you for the review.

On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
> >
> > The above function pointers are provided to be overridden for platforms
> > needing CMO.
> >
> > Convert ZICBOM and T-HEAD CMO to use function pointers.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> This is looking pretty good. There are a few things that I
> still see sticking out, and I think I've mentioned some of
> them before, but don't remember if there was a reason for
> doing it like this:
>
> > +#ifdef CONFIG_ERRATA_THEAD_CMO
>
> I would rename this to not call this an 'ERRATA' but
> just make it a driver. Not important though, and there
> was probably a reason you did it like this.
>
As agreed, we will keep this as is.

> > +extern struct riscv_cache_ops noncoherent_cache_ops;
> > +
> > +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops
> > *ops);
> > +
> > +static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t
> > size)
> > +{
> > +     if (noncoherent_cache_ops.clean_range) {
> > +             unsigned long addr = (unsigned long)vaddr;
> > +
> > +             noncoherent_cache_ops.clean_range(addr, size);
> > +     }
> > +}
>
> The type case should not be necessary here. Instead I would
> make the callbacks use 'void *' as well, not 'unsigned long'.
>
Ok, I'll update this on using void *

> It's possible that some future cache controller driver requires
> passing physical addresses, as is common for last level cache,
> but as long as all drivers pass virtual addresses, it's easier
> to do the phys_to_virt() in common code.
>
Agreed.

> It also seems wrong to have the fallback be to do nothing
> when the pointer is NULL, since that cannot actually work
> when a device is not cache coherent.
>
If the device is non cache coherent and if it doesn't support ZICBOM
ISA extension the device won't work anyway. So non-cache coherent
devices until they have their CMO config enabled won't work anyway. So
I didn't see any benefit in enabling ZICBOM by default. Please let me
know if I am misunderstanding.

> I would either initialize the function pointer to the
> zicbom version and remove the NULL check, or keep the
> pointer NULL and have an explicit
> 'else zicbom_clean_range()' fallback.
>
> > +static void zicbom_register_cmo_ops(void)
> > +{
> > +     riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> > +}
> > +#else
> > +static void zicbom_register_cmo_ops(void) {}
> > +#endif
>
> As far as I recall, the #else path here was needed previously
> to work around a binutils dependency, but with the current
> code, it should be possible to just always enable
> CONFIG_RISCV_ISA_ZICBOM when RISCV_DMA_NONCOHERENT is used.
>
Are you suggesting something like below?

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 4dadf35ac721..a55dee98ccf8 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -242,6 +242,7 @@ config RISCV_DMA_NONCOHERENT
        select ARCH_HAS_SYNC_DMA_FOR_CPU
        select ARCH_HAS_SYNC_DMA_FOR_DEVICE
        select DMA_DIRECT_REMAP
+       select RISCV_ISA_ZICBOM

 config AS_HAS_INSN
        def_bool $(as-instr,.insn r 51$(comma) 0$(comma) 0$(comma)
t0$(comma) t0$(comma) zero)
@@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
        depends on MMU
        depends on RISCV_ALTERNATIVE
        default y
-       select RISCV_DMA_NONCOHERENT
        help
           Adds support to dynamically detect the presence of the ZICBOM
           extension (Cache Block Management Operations) and enable its

But what if the platform doesn't have the ZICBOM ISA extension?

Cheers,
Prabhakar

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-31 10:37     ` Lad, Prabhakar
@ 2023-03-31 10:44       ` Arnd Bergmann
  2023-03-31 12:11         ` Lad, Prabhakar
  2023-04-03 17:00         ` Lad, Prabhakar
  2023-03-31 10:55       ` Conor Dooley
  1 sibling, 2 replies; 40+ messages in thread
From: Arnd Bergmann @ 2023-03-31 10:44 UTC (permalink / raw)
  To: Prabhakar
  Cc: Conor.Dooley, Geert Uytterhoeven, Heiko Stübner, guoren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, Linux-Renesas, Biju Das, Lad,
	Prabhakar

On Fri, Mar 31, 2023, at 12:37, Lad, Prabhakar wrote:
> On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
>> It also seems wrong to have the fallback be to do nothing
>> when the pointer is NULL, since that cannot actually work
>> when a device is not cache coherent.
>>
> If the device is non cache coherent and if it doesn't support ZICBOM
> ISA extension the device won't work anyway. So non-cache coherent
> devices until they have their CMO config enabled won't work anyway. So
> I didn't see any benefit in enabling ZICBOM by default. Please let me
> know if I am misunderstanding.

Two things:

- Having a broken machine crash with in invalid instruction
  exception is better than having it run into silent data
  corruption.

- a correctly predicted branch is typically faster than an
  indirect function call, so the fallback to zicbom makes the
  expected (at least in the future) case the fast one.

> @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
>         depends on MMU
>         depends on RISCV_ALTERNATIVE
>         default y
> -       select RISCV_DMA_NONCOHERENT
>         help
>            Adds support to dynamically detect the presence of the ZICBOM
>            extension (Cache Block Management Operations) and enable its
>
> But what if the platform doesn't have the ZICBOM ISA extension?

Then it needs to register its cache operations before the first
DMA, which is something that it should do anyway. With your
current code, it may work by accident depending on the state of
the cache, but with the version I suggested, it will either work
correctly all the time or crash in an obvious way when misconfigured.

     Arnd

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-31  7:31   ` Geert Uytterhoeven
@ 2023-03-31 10:45     ` Lad, Prabhakar
  0 siblings, 0 replies; 40+ messages in thread
From: Lad, Prabhakar @ 2023-03-31 10:45 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Arnd Bergmann, Conor Dooley, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

Hi Geert,

Thank you for the review.

On Fri, Mar 31, 2023 at 8:31 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Prabhakar,
>
> Thanks for your patch!
>
> On Thu, Mar 30, 2023 at 10:42 PM Prabhakar <prabhakar.csengg@gmail.com> wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
>
> the ALTERNATIVE_X()
>
OK.

> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
>
> "the use" or "using"
>
OK.

> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
>
> the ALTERNATIVE_X()
>
OK.

> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
> >
> > The above function pointers are provided to be overridden for platforms
> > needing CMO.
> >
> > Convert ZICBOM and T-HEAD CMO to use function pointers.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> > --- a/arch/riscv/errata/thead/errata.c
> > +++ b/arch/riscv/errata/thead/errata.c
>
> > +#ifdef CONFIG_ERRATA_THEAD_CMO
>
> > +static void thead_register_cmo_ops(void)
> > +{
> > +       riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
> > +}
> > +#else
> > +static void thead_register_cmo_ops(void) {}
> > +#endif
>
> > --- a/arch/riscv/mm/dma-noncoherent.c
> > +++ b/arch/riscv/mm/dma-noncoherent.c
>
> > @@ -75,3 +83,12 @@ void riscv_noncoherent_supported(void)
> >              "Non-coherent DMA support enabled without a block size\n");
> >         noncoherent_supported = true;
> >  }
> > +
> > +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
> > +{
> > +       if (!ops)
> > +               return;
>
> This is never true.
I just wanted to add a sanity check.

> I guess originally you wanted to call riscv_noncoherent_register_cache_ops()
> unconditionally from common code, instead of the various *register_cmo_ops()?
> But that would have required something like
>
> #ifdef CONFIG_ERRATA_THEAD_CMO
> #define THEAD_CMO_OPS_PTR   (&thead_cmo_ops)
> #else
> #define THEAD_CMO_OPS_PTR   NULL
> #endif
>
riscv_noncoherent_register_cache_ops() will only be called if the
ISA/Errata needs to be applied so I'll drop this check.

Cheers,
Prabhakar

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 4/6] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller
  2023-03-31 10:21   ` Conor Dooley
@ 2023-03-31 10:47     ` Lad, Prabhakar
  0 siblings, 0 replies; 40+ messages in thread
From: Lad, Prabhakar @ 2023-03-31 10:47 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Arnd Bergmann, Geert Uytterhoeven, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar, Rob Herring

Hi Conor,

THank you for the review.

On Fri, Mar 31, 2023 at 11:22 AM Conor Dooley
<conor.dooley@microchip.com> wrote:
>
> $subject: t-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller
>                              ^^^^^^^^^^^^^^^^^^^
> I assume this should be updated to be ax45mp-foo instead?
>
Agreed, I'll fix this in the next version.

Cheers,
Prabhakar

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-31 10:37     ` Lad, Prabhakar
  2023-03-31 10:44       ` Arnd Bergmann
@ 2023-03-31 10:55       ` Conor Dooley
  2023-03-31 11:36         ` Arnd Bergmann
  1 sibling, 1 reply; 40+ messages in thread
From: Conor Dooley @ 2023-03-31 10:55 UTC (permalink / raw)
  To: Lad, Prabhakar
  Cc: Arnd Bergmann, Geert Uytterhoeven, Heiko Stübner, guoren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, Linux-Renesas, Biju Das, Lad,
	Prabhakar

[-- Attachment #1: Type: text/plain, Size: 1653 bytes --]

On Fri, Mar 31, 2023 at 11:37:30AM +0100, Lad, Prabhakar wrote:

> > As far as I recall, the #else path here was needed previously
> > to work around a binutils dependency, but with the current
> > code, it should be possible to just always enable
> > CONFIG_RISCV_ISA_ZICBOM when RISCV_DMA_NONCOHERENT is used.
> >
> Are you suggesting something like below?
> 
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 4dadf35ac721..a55dee98ccf8 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -242,6 +242,7 @@ config RISCV_DMA_NONCOHERENT
>         select ARCH_HAS_SYNC_DMA_FOR_CPU
>         select ARCH_HAS_SYNC_DMA_FOR_DEVICE
>         select DMA_DIRECT_REMAP
> +       select RISCV_ISA_ZICBOM
> 
>  config AS_HAS_INSN
>         def_bool $(as-instr,.insn r 51$(comma) 0$(comma) 0$(comma)
> t0$(comma) t0$(comma) zero)
> @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
>         depends on MMU
>         depends on RISCV_ALTERNATIVE
>         default y
> -       select RISCV_DMA_NONCOHERENT
>         help
>            Adds support to dynamically detect the presence of the ZICBOM
>            extension (Cache Block Management Operations) and enable its
>

Does that actually work? I don't think it does.
If you try to enable RISCV_ISA_ZICBOM then you won't get
RISC_DMA_NONCOHERENT turned on. Run menuconfig and disable support for
Renesas, SiFive and T-Head SoCs & you can replicate.

I think one of RISCV_ISA_ZICBOM and RISCV_DMA_NONCOHERENT should just be
dropped, although I don't know which one to pick!
Making RISCV_DMA_NONCOHERENT user selectable probably makes the most
sense.


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-31 10:55       ` Conor Dooley
@ 2023-03-31 11:36         ` Arnd Bergmann
  0 siblings, 0 replies; 40+ messages in thread
From: Arnd Bergmann @ 2023-03-31 11:36 UTC (permalink / raw)
  To: Conor.Dooley, Prabhakar
  Cc: Geert Uytterhoeven, Heiko Stübner, guoren, Andrew Jones,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Samuel Holland,
	linux-riscv, Rob Herring, Krzysztof Kozlowski, devicetree,
	linux-kernel, Linux-Renesas, Biju Das, Lad, Prabhakar

On Fri, Mar 31, 2023, at 12:55, Conor Dooley wrote:
> On Fri, Mar 31, 2023 at 11:37:30AM +0100, Lad, Prabhakar wrote:
>>
>
> Does that actually work? I don't think it does.
> If you try to enable RISCV_ISA_ZICBOM then you won't get
> RISC_DMA_NONCOHERENT turned on. Run menuconfig and disable support for
> Renesas, SiFive and T-Head SoCs & you can replicate.

Right, the circular dependency has to be broken in some form.

> I think one of RISCV_ISA_ZICBOM and RISCV_DMA_NONCOHERENT should just be
> dropped, although I don't know which one to pick!
> Making RISCV_DMA_NONCOHERENT user selectable probably makes the most
> sense.

That sounds good to me.

    Arnd

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-31 10:44       ` Arnd Bergmann
@ 2023-03-31 12:11         ` Lad, Prabhakar
  2023-04-03 17:00         ` Lad, Prabhakar
  1 sibling, 0 replies; 40+ messages in thread
From: Lad, Prabhakar @ 2023-03-31 12:11 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Conor.Dooley, Geert Uytterhoeven, Heiko Stübner, guoren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, Linux-Renesas, Biju Das, Lad,
	Prabhakar

On Fri, Mar 31, 2023 at 11:45 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Mar 31, 2023, at 12:37, Lad, Prabhakar wrote:
> > On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> >> It also seems wrong to have the fallback be to do nothing
> >> when the pointer is NULL, since that cannot actually work
> >> when a device is not cache coherent.
> >>
> > If the device is non cache coherent and if it doesn't support ZICBOM
> > ISA extension the device won't work anyway. So non-cache coherent
> > devices until they have their CMO config enabled won't work anyway. So
> > I didn't see any benefit in enabling ZICBOM by default. Please let me
> > know if I am misunderstanding.
>
> Two things:
>
> - Having a broken machine crash with in invalid instruction
>   exception is better than having it run into silent data
>   corruption.
>
> - a correctly predicted branch is typically faster than an
>   indirect function call, so the fallback to zicbom makes the
>   expected (at least in the future) case the fast one.
>
Ok, thank you for the clarification. I'll default to zicbom.

> > @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
> >         depends on MMU
> >         depends on RISCV_ALTERNATIVE
> >         default y
> > -       select RISCV_DMA_NONCOHERENT
> >         help
> >            Adds support to dynamically detect the presence of the ZICBOM
> >            extension (Cache Block Management Operations) and enable its
> >
> > But what if the platform doesn't have the ZICBOM ISA extension?
>
> Then it needs to register its cache operations before the first
> DMA, which is something that it should do anyway. With your
> current code, it may work by accident depending on the state of
> the cache, but with the version I suggested, it will either work
> correctly all the time or crash in an obvious way when misconfigured.
>
Okay, agreed.

Cheers,
Prabhakar

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-30 20:42 ` [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management Prabhakar
  2023-03-30 21:34   ` Arnd Bergmann
  2023-03-31  7:31   ` Geert Uytterhoeven
@ 2023-03-31 12:24   ` Conor Dooley
  2023-04-03 18:23     ` Lad, Prabhakar
  2023-04-04  5:29   ` Christoph Hellwig
  3 siblings, 1 reply; 40+ messages in thread
From: Conor Dooley @ 2023-03-31 12:24 UTC (permalink / raw)
  To: Prabhakar
  Cc: Arnd Bergmann, Geert Uytterhoeven, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

[-- Attachment #1: Type: text/plain, Size: 4256 bytes --]

Hey,

I think most of what I wanted to talk about has been raised by Arnd or
Geert, so I kinda only have a couple of small comments for you here.

On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> 
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two
> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> 
> To avoid such issues this patch switches to use of function pointers
> instead of ALTERNATIVE_X() macro for cache management (the only drawback
> being performance over the previous approach).
> 
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);

So, given Arnd has renamed the generic helpers, should we use the
writeback/invalidate/writeback & invalidate terminology here too?
I think it'd be nice to make the "driver" functions match the generic
implementations's names (even though that means not making the
instruction naming).

> The above function pointers are provided to be overridden for platforms
> needing CMO.
> 
> Convert ZICBOM and T-HEAD CMO to use function pointers.
> 
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> ---
> v6->v7
> * Updated commit description
> * Fixed build issues when CONFIG_ERRATA_THEAD_CMO=n
> * Used static const struct ptr to register CMO ops
> * Dropped riscv_dma_noncoherent_cmo_ops

> * Moved ZICBOM CMO setup to setup.c

Why'd you do that?
What is the reason that the Zicbom stuff cannot live in
dma-noncoherent.[ch] and only expose, say:
void riscv_cbom_register_cmo_ops(void)
{
	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
}
which you then call from setup.c?

> v5->v6
> * New patch
> ---
>  arch/riscv/errata/thead/errata.c         | 76 ++++++++++++++++++++++++
>  arch/riscv/include/asm/dma-noncoherent.h | 73 +++++++++++++++++++++++
>  arch/riscv/include/asm/errata_list.h     | 53 -----------------
>  arch/riscv/kernel/setup.c                | 49 ++++++++++++++-
>  arch/riscv/mm/dma-noncoherent.c          | 25 ++++++--
>  arch/riscv/mm/pmem.c                     |  6 +-
>  6 files changed, 221 insertions(+), 61 deletions(-)
>  create mode 100644 arch/riscv/include/asm/dma-noncoherent.h

> +#ifdef CONFIG_RISCV_ISA_ZICBOM
> +
> +#define ZICBOM_CMO_OP(_op, _start, _size, _cachesize)				\
> +	asm volatile("mv a0, %1\n\t"						\
> +		     "j 2f\n\t"							\
> +		     "3:\n\t"							\
> +		     CBO_##_op(a0)						\
> +		     "add a0, a0, %0\n\t"					\
> +		     "2:\n\t"							\
> +		     "bltu a0, %2, 3b\n\t"					\
> +		     : : "r"(_cachesize),					\
> +			 "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
> +			 "r"((unsigned long)(_start) + (_size))			\
> +		     : "a0")
> +
> +static void zicbom_cmo_clean_range(unsigned long addr, unsigned long size)
> +{
> +	ZICBOM_CMO_OP(clean, addr, size, riscv_cbom_block_size);
> +}
> +
> +static void zicbom_cmo_flush_range(unsigned long addr, unsigned long size)
> +{
> +	ZICBOM_CMO_OP(flush, addr, size, riscv_cbom_block_size);
> +}
> +
> +static void zicbom_cmo_inval_range(unsigned long addr, unsigned long size)
> +{
> +	ZICBOM_CMO_OP(inval, addr, size, riscv_cbom_block_size);
> +}
> +
> +const struct riscv_cache_ops zicbom_cmo_ops = {
> +	.clean_range = &zicbom_cmo_clean_range,
> +	.inv_range = &zicbom_cmo_inval_range,
> +	.flush_range = &zicbom_cmo_flush_range,
> +};
> +
> +static void zicbom_register_cmo_ops(void)
> +{
> +	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> +}
> +#else
> +static void zicbom_register_cmo_ops(void) {}
> +#endif

I think all of the above should be prefixed with riscv_cbom to match the
other riscv_cbom stuff we already have.
Although, given the discussion elsewhere about just falling back to
zicbom in the absence of specific ops, most of these functions will
probably disappear (if not all of them).

Cheers,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 5/6] cache: Add L2 cache management for Andes AX45MP RISC-V core
  2023-03-30 20:42 ` [PATCH v7 5/6] cache: Add L2 cache management for Andes AX45MP RISC-V core Prabhakar
@ 2023-03-31 12:45   ` Conor Dooley
  2023-03-31 20:17     ` Lad, Prabhakar
  0 siblings, 1 reply; 40+ messages in thread
From: Conor Dooley @ 2023-03-31 12:45 UTC (permalink / raw)
  To: Prabhakar
  Cc: Arnd Bergmann, Geert Uytterhoeven, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

[-- Attachment #1: Type: text/plain, Size: 1729 bytes --]

On Thu, Mar 30, 2023 at 09:42:16PM +0100, Prabhakar wrote:

> +STANDALONE CACHE CONTROLLER DRIVERS

> +F:	include/cache

This can go since the file no longer exists.

> +config AX45MP_L2_CACHE
> +	bool "Andes Technology AX45MP L2 Cache controller"
> +	depends on RISCV && RISCV_DMA_NONCOHERENT

This can just be depends on RISCV_DMA_NONCOHERENT, since that's only
defined on RISC-V.

> +static void ax45mp_get_l2_line_size(struct platform_device *pdev)
> +{
> +	struct device_node *np = pdev->dev.of_node;
> +	struct device *dev = &pdev->dev;
> +	int ret;
> +
> +	ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv->ax45mp_cache_line_size);
> +	if (ret) {
> +		dev_err(dev, "Failed to get cache-line-size, defaulting to 64 bytes\n");
> +		ax45mp_priv->ax45mp_cache_line_size = AX45MP_CACHE_LINE_SIZE;
> +	}
> +
> +	if (ax45mp_priv->ax45mp_cache_line_size != AX45MP_CACHE_LINE_SIZE) {
> +		dev_err(dev, "Expected cache-line-size to be 64 bytes (found:%u). Defaulting to 64 bytes\n",
> +			ax45mp_priv->ax45mp_cache_line_size);
> +		ax45mp_priv->ax45mp_cache_line_size = AX45MP_CACHE_LINE_SIZE;
> +	}

I forget, why are we doing this defaulting rather than falling over
immediately if we detect the property is missing or wrong?

> +}

> +static const struct riscv_cache_ops ax45mp_cmo_ops = {
> +	.clean_range = &ax45mp_cpu_dma_wb_range,
> +	.inv_range = &ax45mp_cpu_dma_inval_range,
> +	.flush_range = &ax45mp_cpu_dma_flush_range,
> +};

I think it would be nice if your driver functions matched the names used
by the ops. (and as I said on the other patch, I think the ops should
match the cross-arch naming.

Otherwise, looks grand - although I think I was mostly happy with the
last revision too.a

Cheers,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP
  2023-03-30 20:42 [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Prabhakar
                   ` (5 preceding siblings ...)
  2023-03-30 20:42 ` [PATCH v7 6/6] soc: renesas: Kconfig: Select the required configs for RZ/Five SoC Prabhakar
@ 2023-03-31 18:05 ` Conor Dooley
  2023-03-31 20:09   ` Lad, Prabhakar
  6 siblings, 1 reply; 40+ messages in thread
From: Conor Dooley @ 2023-03-31 18:05 UTC (permalink / raw)
  To: Prabhakar
  Cc: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

[-- Attachment #1: Type: text/plain, Size: 428 bytes --]

On Thu, Mar 30, 2023 at 09:42:11PM +0100, Prabhakar wrote:

> - This series requires testing on Cores with zicbom and T-Head SoCs

I don't actually know if there are Zicbom parts, may need to test that
on QEMU.
I had to revert unrelated content to boot, but my D1 NFS setup seems to
work fine with these changes, so where it is relevant:
Tested-by: Conor Dooley <conor.dooley@microchip.com> # tyre-kicking on D1

Cheers,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP
  2023-03-31 18:05 ` [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Conor Dooley
@ 2023-03-31 20:09   ` Lad, Prabhakar
  2023-03-31 20:15     ` Conor Dooley
  0 siblings, 1 reply; 40+ messages in thread
From: Lad, Prabhakar @ 2023-03-31 20:09 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

Hi Conor,

On Fri, Mar 31, 2023 at 7:05 PM Conor Dooley <conor@kernel.org> wrote:
>
> On Thu, Mar 30, 2023 at 09:42:11PM +0100, Prabhakar wrote:
>
> > - This series requires testing on Cores with zicbom and T-Head SoCs
>
> I don't actually know if there are Zicbom parts, may need to test that
> on QEMU.
> I had to revert unrelated content to boot, but my D1 NFS setup seems to
> work fine with these changes, so where it is relevant:
> Tested-by: Conor Dooley <conor.dooley@microchip.com> # tyre-kicking on D1
>
Thank you for testing this. By any chance did you compare the performance?

Cheers,
Prabhakar

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP
  2023-03-31 20:09   ` Lad, Prabhakar
@ 2023-03-31 20:15     ` Conor Dooley
  2023-04-01  1:47       ` Icenowy Zheng
  0 siblings, 1 reply; 40+ messages in thread
From: Conor Dooley @ 2023-03-31 20:15 UTC (permalink / raw)
  To: Lad, Prabhakar
  Cc: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar, uwu

[-- Attachment #1: Type: text/plain, Size: 811 bytes --]

On Fri, Mar 31, 2023 at 08:09:16PM +0000, Lad, Prabhakar wrote:
> Hi Conor,
> 
> On Fri, Mar 31, 2023 at 7:05 PM Conor Dooley <conor@kernel.org> wrote:
> >
> > On Thu, Mar 30, 2023 at 09:42:11PM +0100, Prabhakar wrote:
> >
> > > - This series requires testing on Cores with zicbom and T-Head SoCs
> >
> > I don't actually know if there are Zicbom parts, may need to test that
> > on QEMU.
> > I had to revert unrelated content to boot, but my D1 NFS setup seems to
> > work fine with these changes, so where it is relevant:
> > Tested-by: Conor Dooley <conor.dooley@microchip.com> # tyre-kicking on D1
> >
> Thank you for testing this. By any chance did you compare the performance?

No, just tyre kicking. Icenowy had some benchmark for it IIRC, I think
mining some coin or w/e. +CC them.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 5/6] cache: Add L2 cache management for Andes AX45MP RISC-V core
  2023-03-31 12:45   ` Conor Dooley
@ 2023-03-31 20:17     ` Lad, Prabhakar
  0 siblings, 0 replies; 40+ messages in thread
From: Lad, Prabhakar @ 2023-03-31 20:17 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Arnd Bergmann, Geert Uytterhoeven, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

Hi Conor,

Thank you for the review.

On Fri, Mar 31, 2023 at 1:45 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> On Thu, Mar 30, 2023 at 09:42:16PM +0100, Prabhakar wrote:
>
> > +STANDALONE CACHE CONTROLLER DRIVERS
>
> > +F:   include/cache
>
> This can go since the file no longer exists.
>
Agreed I will drop this.

> > +config AX45MP_L2_CACHE
> > +     bool "Andes Technology AX45MP L2 Cache controller"
> > +     depends on RISCV && RISCV_DMA_NONCOHERENT
>
> This can just be depends on RISCV_DMA_NONCOHERENT, since that's only
> defined on RISC-V.
>
Agreed.

> > +static void ax45mp_get_l2_line_size(struct platform_device *pdev)
> > +{
> > +     struct device_node *np = pdev->dev.of_node;
> > +     struct device *dev = &pdev->dev;
> > +     int ret;
> > +
> > +     ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv->ax45mp_cache_line_size);
> > +     if (ret) {
> > +             dev_err(dev, "Failed to get cache-line-size, defaulting to 64 bytes\n");
> > +             ax45mp_priv->ax45mp_cache_line_size = AX45MP_CACHE_LINE_SIZE;
> > +     }
> > +
> > +     if (ax45mp_priv->ax45mp_cache_line_size != AX45MP_CACHE_LINE_SIZE) {
> > +             dev_err(dev, "Expected cache-line-size to be 64 bytes (found:%u). Defaulting to 64 bytes\n",
> > +                     ax45mp_priv->ax45mp_cache_line_size);
> > +             ax45mp_priv->ax45mp_cache_line_size = AX45MP_CACHE_LINE_SIZE;
> > +     }
>
> I forget, why are we doing this defaulting rather than falling over
> immediately if we detect the property is missing or wrong?
>
No reason as such on not failing on property not existing/Invalid. I
will bail out in an error case now.

> > +}
>
> > +static const struct riscv_cache_ops ax45mp_cmo_ops = {
> > +     .clean_range = &ax45mp_cpu_dma_wb_range,
> > +     .inv_range = &ax45mp_cpu_dma_inval_range,
> > +     .flush_range = &ax45mp_cpu_dma_flush_range,
> > +};
>
> I think it would be nice if your driver functions matched the names used
> by the ops. (and as I said on the other patch, I think the ops should
> match the cross-arch naming.
>
Agreed, will do.

> Otherwise, looks grand - although I think I was mostly happy with the
> last revision too.a
>
I know you had provided the RB for the last version ;)

Cheers,
Prabhakar

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP
  2023-03-31 20:15     ` Conor Dooley
@ 2023-04-01  1:47       ` Icenowy Zheng
  0 siblings, 0 replies; 40+ messages in thread
From: Icenowy Zheng @ 2023-04-01  1:47 UTC (permalink / raw)
  To: Conor Dooley, Lad, Prabhakar
  Cc: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

在 2023-03-31星期五的 21:15 +0100,Conor Dooley写道:
> On Fri, Mar 31, 2023 at 08:09:16PM +0000, Lad, Prabhakar wrote:
> > Hi Conor,
> > 
> > On Fri, Mar 31, 2023 at 7:05 PM Conor Dooley <conor@kernel.org>
> > wrote:
> > > 
> > > On Thu, Mar 30, 2023 at 09:42:11PM +0100, Prabhakar wrote:
> > > 
> > > > - This series requires testing on Cores with zicbom and T-Head
> > > > SoCs
> > > 
> > > I don't actually know if there are Zicbom parts, may need to test
> > > that
> > > on QEMU.
> > > I had to revert unrelated content to boot, but my D1 NFS setup
> > > seems to
> > > work fine with these changes, so where it is relevant:
> > > Tested-by: Conor Dooley <conor.dooley@microchip.com> # tyre-
> > > kicking on D1
> > > 
> > Thank you for testing this. By any chance did you compare the
> > performance?
> 
> No, just tyre kicking. Icenowy had some benchmark for it IIRC, I
> think
> mining some coin or w/e. +CC them.

I previously tested the function pointer based CMO, it do not affect
the performance beyond the measurement error. Maybe it's because CMO
operations are only done at the start and end of DMA operations.

My previous test system is LiteX + OpenC906.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-31 10:44       ` Arnd Bergmann
  2023-03-31 12:11         ` Lad, Prabhakar
@ 2023-04-03 17:00         ` Lad, Prabhakar
  1 sibling, 0 replies; 40+ messages in thread
From: Lad, Prabhakar @ 2023-04-03 17:00 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Conor.Dooley, Geert Uytterhoeven, Heiko Stübner, guoren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, Linux-Renesas, Biju Das, Lad,
	Prabhakar

Hi Arnd,

On Fri, Mar 31, 2023 at 11:45 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Mar 31, 2023, at 12:37, Lad, Prabhakar wrote:
> > On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> >> It also seems wrong to have the fallback be to do nothing
> >> when the pointer is NULL, since that cannot actually work
> >> when a device is not cache coherent.
> >>
> > If the device is non cache coherent and if it doesn't support ZICBOM
> > ISA extension the device won't work anyway. So non-cache coherent
> > devices until they have their CMO config enabled won't work anyway. So
> > I didn't see any benefit in enabling ZICBOM by default. Please let me
> > know if I am misunderstanding.
>
> Two things:
>
> - Having a broken machine crash with in invalid instruction
>   exception is better than having it run into silent data
>   corruption.
>
> - a correctly predicted branch is typically faster than an
>   indirect function call, so the fallback to zicbom makes the
>   expected (at least in the future) case the fast one.
>
> > @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
> >         depends on MMU
> >         depends on RISCV_ALTERNATIVE
> >         default y
> > -       select RISCV_DMA_NONCOHERENT
> >         help
> >            Adds support to dynamically detect the presence of the ZICBOM
> >            extension (Cache Block Management Operations) and enable its
> >
> > But what if the platform doesn't have the ZICBOM ISA extension?
>
> Then it needs to register its cache operations before the first
> DMA, which is something that it should do anyway. With your
> current code, it may work by accident depending on the state of
> the cache, but with the version I suggested, it will either work
> correctly all the time or crash in an obvious way when misconfigured.
>
You were right, defaulting to ZICBOM is giving me a crash. So I need
to switch to something else rather than using arch_initcall().

I tried early_initcall() but this doesn't let me register a platform
driver. I should be able to access the resources and DT from init
callback and then register CMO callbacks (I havent tried this yet) but
it wont be a platform driver. Should this be OK?

Cheers,
Prabhakar

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-31 12:24   ` Conor Dooley
@ 2023-04-03 18:23     ` Lad, Prabhakar
  2023-04-03 18:31       ` Conor Dooley
  0 siblings, 1 reply; 40+ messages in thread
From: Lad, Prabhakar @ 2023-04-03 18:23 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Arnd Bergmann, Geert Uytterhoeven, Heiko Stuebner, Guo Ren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

Hi Conor,

Thank you for the review.

On Fri, Mar 31, 2023 at 1:24 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> Hey,
>
> I think most of what I wanted to talk about has been raised by Arnd or
> Geert, so I kinda only have a couple of small comments for you here.
>
> On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
>
> So, given Arnd has renamed the generic helpers, should we use the
> writeback/invalidate/writeback & invalidate terminology here too?
> I think it'd be nice to make the "driver" functions match the generic
> implementations's names (even though that means not making the
> instruction naming).
>
> > The above function pointers are provided to be overridden for platforms
> > needing CMO.
> >
> > Convert ZICBOM and T-HEAD CMO to use function pointers.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> > ---
> > v6->v7
> > * Updated commit description
> > * Fixed build issues when CONFIG_ERRATA_THEAD_CMO=n
> > * Used static const struct ptr to register CMO ops
> > * Dropped riscv_dma_noncoherent_cmo_ops
>
> > * Moved ZICBOM CMO setup to setup.c
>
> Why'd you do that?
> What is the reason that the Zicbom stuff cannot live in
> dma-noncoherent.[ch] and only expose, say:
> void riscv_cbom_register_cmo_ops(void)
> {
>         riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> }
> which you then call from setup.c?
>
Commit abcc445acd ("riscv: move riscv_noncoherent_supported() out of
ZICBOM probe) moved the zicbom the setup to setup.c hence I moved the
CMO stuff here. Said that, now I am defaulting to zicbom ops so I have
mode the functions to dma-noncoherent.c  .

> > v5->v6
> > * New patch
> > ---
> >  arch/riscv/errata/thead/errata.c         | 76 ++++++++++++++++++++++++
> >  arch/riscv/include/asm/dma-noncoherent.h | 73 +++++++++++++++++++++++
> >  arch/riscv/include/asm/errata_list.h     | 53 -----------------
> >  arch/riscv/kernel/setup.c                | 49 ++++++++++++++-
> >  arch/riscv/mm/dma-noncoherent.c          | 25 ++++++--
> >  arch/riscv/mm/pmem.c                     |  6 +-
> >  6 files changed, 221 insertions(+), 61 deletions(-)
> >  create mode 100644 arch/riscv/include/asm/dma-noncoherent.h
>
> > +#ifdef CONFIG_RISCV_ISA_ZICBOM
> > +
> > +#define ZICBOM_CMO_OP(_op, _start, _size, _cachesize)                                \
> > +     asm volatile("mv a0, %1\n\t"                                            \
> > +                  "j 2f\n\t"                                                 \
> > +                  "3:\n\t"                                                   \
> > +                  CBO_##_op(a0)                                              \
> > +                  "add a0, a0, %0\n\t"                                       \
> > +                  "2:\n\t"                                                   \
> > +                  "bltu a0, %2, 3b\n\t"                                      \
> > +                  : : "r"(_cachesize),                                       \
> > +                      "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),  \
> > +                      "r"((unsigned long)(_start) + (_size))                 \
> > +                  : "a0")
> > +
> > +static void zicbom_cmo_clean_range(unsigned long addr, unsigned long size)
> > +{
> > +     ZICBOM_CMO_OP(clean, addr, size, riscv_cbom_block_size);
> > +}
> > +
> > +static void zicbom_cmo_flush_range(unsigned long addr, unsigned long size)
> > +{
> > +     ZICBOM_CMO_OP(flush, addr, size, riscv_cbom_block_size);
> > +}
> > +
> > +static void zicbom_cmo_inval_range(unsigned long addr, unsigned long size)
> > +{
> > +     ZICBOM_CMO_OP(inval, addr, size, riscv_cbom_block_size);
> > +}
> > +
> > +const struct riscv_cache_ops zicbom_cmo_ops = {
> > +     .clean_range = &zicbom_cmo_clean_range,
> > +     .inv_range = &zicbom_cmo_inval_range,
> > +     .flush_range = &zicbom_cmo_flush_range,
> > +};
> > +
> > +static void zicbom_register_cmo_ops(void)
> > +{
> > +     riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> > +}
> > +#else
> > +static void zicbom_register_cmo_ops(void) {}
> > +#endif
>
> I think all of the above should be prefixed with riscv_cbom to match the
> other riscv_cbom stuff we already have.
Just to clarify, the riscv_cbom prefix should just be applied to the
ZICOM functions and not to T-HEAD?

Cheers,
Prabhakar

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-04-03 18:23     ` Lad, Prabhakar
@ 2023-04-03 18:31       ` Conor Dooley
  0 siblings, 0 replies; 40+ messages in thread
From: Conor Dooley @ 2023-04-03 18:31 UTC (permalink / raw)
  To: Lad, Prabhakar
  Cc: Conor Dooley, Arnd Bergmann, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

[-- Attachment #1: Type: text/plain, Size: 353 bytes --]

On Mon, Apr 03, 2023 at 07:23:37PM +0100, Lad, Prabhakar wrote:
> > I think all of the above should be prefixed with riscv_cbom to match the
> > other riscv_cbom stuff we already have.
> Just to clarify, the riscv_cbom prefix should just be applied to the
> ZICOM functions and not to T-HEAD?

Yah, can leave the T-HEAD stuff as is IMO.

Cheers,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-03-30 20:42 ` [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management Prabhakar
                     ` (2 preceding siblings ...)
  2023-03-31 12:24   ` Conor Dooley
@ 2023-04-04  5:29   ` Christoph Hellwig
  2023-04-04  6:24     ` Biju Das
                       ` (2 more replies)
  3 siblings, 3 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-04-04  5:29 UTC (permalink / raw)
  To: Prabhakar
  Cc: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> 
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two
> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> 
> To avoid such issues this patch switches to use of function pointers
> instead of ALTERNATIVE_X() macro for cache management (the only drawback
> being performance over the previous approach).
> 
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);
> 

NAK.  Function pointers for somthing high performance as cache
maintainance is a complete no-go.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-04-04  5:29   ` Christoph Hellwig
@ 2023-04-04  6:24     ` Biju Das
  2023-04-04 15:42       ` Christoph Hellwig
  2023-04-04  6:50     ` Arnd Bergmann
  2023-04-06 18:59     ` Lad, Prabhakar
  2 siblings, 1 reply; 40+ messages in thread
From: Biju Das @ 2023-04-04  6:24 UTC (permalink / raw)
  To: Christoph Hellwig, Prabhakar
  Cc: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc,
	Prabhakar Mahadev Lad, Icenowy Zheng

Hi Christoph Hellwig,

> -----Original Message-----
> From: Christoph Hellwig <hch@infradead.org>
> Sent: Tuesday, April 4, 2023 6:29 AM
> To: Prabhakar <prabhakar.csengg@gmail.com>
> Cc: Arnd Bergmann <arnd@arndb.de>; Conor Dooley
> <conor.dooley@microchip.com>; Geert Uytterhoeven <geert+renesas@glider.be>;
> Heiko Stuebner <heiko@sntech.de>; Guo Ren <guoren@kernel.org>; Andrew Jones
> <ajones@ventanamicro.com>; Paul Walmsley <paul.walmsley@sifive.com>; Palmer
> Dabbelt <palmer@dabbelt.com>; Albert Ou <aou@eecs.berkeley.edu>; Samuel
> Holland <samuel@sholland.org>; linux-riscv@lists.infradead.org; Rob Herring
> <robh+dt@kernel.org>; Krzysztof Kozlowski
> <krzysztof.kozlowski+dt@linaro.org>; devicetree@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-renesas-soc@vger.kernel.org; Biju Das
> <biju.das.jz@bp.renesas.com>; Prabhakar Mahadev Lad <prabhakar.mahadev-
> lad.rj@bp.renesas.com>
> Subject: Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using
> function pointers for cache management
> 
> On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done
> > using and ALTERNATIVE_X() macro. This was manageable when there were
> > just two CMO implementations, but now that there are more and more
> > platforms coming needing custom CMOs, the use of the ALTERNATIVE_X() macro
> is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only
> > drawback being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size); void
> > (*inv_range)(unsigned long addr, unsigned long size); void
> > (*flush_range)(unsigned long addr, unsigned long size);
> >
> 
> NAK.  Function pointers for somthing high performance as cache maintainance
> is a complete no-go.

Just a question, how does function pointer makes a performance difference compared to
ALTERNATIVE_X() macros?

On both cases, we are pushing function parameters to stack, jumping to the actual routine
And then on return pop the variables from stack. Am I missing something here?

Benchmark results by [1], shows that there is no performance degradation. I am not sure
What type of benchmarking used in this case and How accurate is this benchmark?

https://lore.kernel.org/linux-renesas-soc/40cdea465fef49a8a337b48e2a748138c66a08eb.camel@icenowy.me/T/#m093c1f3d8f7f0e15bd2909900bf137d5714c553c

Cheers,
Biju


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-04-04  5:29   ` Christoph Hellwig
  2023-04-04  6:24     ` Biju Das
@ 2023-04-04  6:50     ` Arnd Bergmann
  2023-04-04  6:59       ` Conor Dooley
  2023-04-06 18:59     ` Lad, Prabhakar
  2 siblings, 1 reply; 40+ messages in thread
From: Arnd Bergmann @ 2023-04-04  6:50 UTC (permalink / raw)
  To: Christoph Hellwig, Prabhakar
  Cc: Conor.Dooley, Geert Uytterhoeven, Heiko Stübner, guoren,
	Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, Linux-Renesas, Biju Das, Lad,
	Prabhakar

On Tue, Apr 4, 2023, at 07:29, Christoph Hellwig wrote:
> On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
>> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>> 
>> Currently, selecting which CMOs to use on a given platform is done using
>> and ALTERNATIVE_X() macro. This was manageable when there were just two
>> CMO implementations, but now that there are more and more platforms coming
>> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
>> 
>> To avoid such issues this patch switches to use of function pointers
>> instead of ALTERNATIVE_X() macro for cache management (the only drawback
>> being performance over the previous approach).
>> 
>> void (*clean_range)(unsigned long addr, unsigned long size);
>> void (*inv_range)(unsigned long addr, unsigned long size);
>> void (*flush_range)(unsigned long addr, unsigned long size);
>> 
>
> NAK.  Function pointers for somthing high performance as cache
> maintainance is a complete no-go.

As we already discussed, this is now planned to use a direct
branch to the zicbom version when the function pointer is NULL,
which should be a fast predicted branch on all standard implementations
and only be slow on the nonstandard ones, which I think is a reasonable
compromise.

I'm also not sure I'd actually consider this a fast path, since
there is already a significant cost in accessing the invalidated
cache lines afterwards, which is likely going to be much higher than
the cost of an indirect branch.

I suppose an alternative would be to use the linux/static_call.h
infrastructure to avoid the overhead of indirect branches altogether.
Right now, this has no riscv specific implementation though, so
using it just turns into a regular indirect branch until someone
implements static_call.

       Arnd

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-04-04  6:50     ` Arnd Bergmann
@ 2023-04-04  6:59       ` Conor Dooley
  0 siblings, 0 replies; 40+ messages in thread
From: Conor Dooley @ 2023-04-04  6:59 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Prabhakar, Geert Uytterhoeven,
	Heiko Stübner, guoren, Andrew Jones, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Samuel Holland, linux-riscv,
	Rob Herring, Krzysztof Kozlowski, devicetree, linux-kernel,
	Linux-Renesas, Biju Das, Lad, Prabhakar

[-- Attachment #1: Type: text/plain, Size: 2304 bytes --]

On Tue, Apr 04, 2023 at 08:50:16AM +0200, Arnd Bergmann wrote:
> On Tue, Apr 4, 2023, at 07:29, Christoph Hellwig wrote:
> > On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> >> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >> 
> >> Currently, selecting which CMOs to use on a given platform is done using
> >> and ALTERNATIVE_X() macro. This was manageable when there were just two
> >> CMO implementations, but now that there are more and more platforms coming
> >> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >> 
> >> To avoid such issues this patch switches to use of function pointers
> >> instead of ALTERNATIVE_X() macro for cache management (the only drawback
> >> being performance over the previous approach).
> >> 
> >> void (*clean_range)(unsigned long addr, unsigned long size);
> >> void (*inv_range)(unsigned long addr, unsigned long size);
> >> void (*flush_range)(unsigned long addr, unsigned long size);
> >> 
> >
> > NAK.  Function pointers for somthing high performance as cache
> > maintainance is a complete no-go.
> 
> As we already discussed, this is now planned to use a direct
> branch to the zicbom version when the function pointer is NULL,
> which should be a fast predicted branch on all standard implementations
> and only be slow on the nonstandard ones, which I think is a reasonable
> compromise.
> 
> I'm also not sure I'd actually consider this a fast path, since
> there is already a significant cost in accessing the invalidated
> cache lines afterwards, which is likely going to be much higher than
> the cost of an indirect branch.
> 
> I suppose an alternative would be to use the linux/static_call.h
> infrastructure to avoid the overhead of indirect branches altogether.
> Right now, this has no riscv specific implementation though, so
> using it just turns into a regular indirect branch until someone
> implements static_call.

Someone actually posted an implementation for that the other day:
https://patchwork.kernel.org/project/linux-riscv/patch/tencent_A8A256967B654625AEE1DB222514B0613B07@qq.com/

I haven't looked at it at all, other than noticing the build issues
outside of for !rv64 || !mmu, so I have no idea what state it is
actually in.


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-04-04  6:24     ` Biju Das
@ 2023-04-04 15:42       ` Christoph Hellwig
  2023-04-05  6:08         ` Biju Das
  2023-04-07  0:03         ` Andrea Parri
  0 siblings, 2 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-04-04 15:42 UTC (permalink / raw)
  To: Biju Das
  Cc: Christoph Hellwig, Prabhakar, devicetree, Albert Ou,
	Arnd Bergmann, Geert Uytterhoeven, Samuel Holland,
	Heiko Stuebner, linux-kernel, Prabhakar Mahadev Lad,
	linux-renesas-soc, Conor Dooley, Rob Herring, Guo Ren,
	Krzysztof Kozlowski, Paul Walmsley, Palmer Dabbelt, linux-riscv,
	Andrew Jones

On Tue, Apr 04, 2023 at 06:24:16AM +0000, Biju Das wrote:
> Just a question, how does function pointer makes a performance difference compared to
> ALTERNATIVE_X() macros?
> 
> On both cases, we are pushing function parameters to stack, jumping to the actual routine
> And then on return pop the variables from stack. Am I missing something here?

Indirect calls have always been more expensive, and with the hard- and
software mitigations for spectre-like attacks they are becoming even
more expensive.

But other other point is adding more cache flushing variants should not
be easy.  Everyone should be using the standardize version.  If it's not
implemented in hardware despite having ratified extensions you can fake
it up in SBI.  Yes, that's way more expensive than indirect calls, but
that's what you get for taping out new hardware that ignores the actual
architecture specification and just does their own made up shit.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-04-04 15:42       ` Christoph Hellwig
@ 2023-04-05  6:08         ` Biju Das
  2023-04-07  0:03         ` Andrea Parri
  1 sibling, 0 replies; 40+ messages in thread
From: Biju Das @ 2023-04-05  6:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Prabhakar, devicetree, Albert Ou, Arnd Bergmann,
	Geert Uytterhoeven, Samuel Holland, Heiko Stuebner, linux-kernel,
	Prabhakar Mahadev Lad, linux-renesas-soc, Conor Dooley,
	Rob Herring, Guo Ren, Krzysztof Kozlowski, Paul Walmsley,
	Palmer Dabbelt, linux-riscv, Andrew Jones

Hi Christoph Hellwig,

Thanks for the feedback.

> Subject: Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using
> function pointers for cache management
> 
> On Tue, Apr 04, 2023 at 06:24:16AM +0000, Biju Das wrote:
> > Just a question, how does function pointer makes a performance
> > difference compared to
> > ALTERNATIVE_X() macros?
> >
> > On both cases, we are pushing function parameters to stack, jumping to
> > the actual routine And then on return pop the variables from stack. Am I
> missing something here?
> 
> Indirect calls have always been more expensive, and with the hard- and
> software mitigations for spectre-like attacks they are becoming even more
> expensive.

Thanks for the info. I agree, it will be more expensive with software mitigations
for spectre-like attacks.

Cheers,
Biju



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-04-04  5:29   ` Christoph Hellwig
  2023-04-04  6:24     ` Biju Das
  2023-04-04  6:50     ` Arnd Bergmann
@ 2023-04-06 18:59     ` Lad, Prabhakar
  2 siblings, 0 replies; 40+ messages in thread
From: Lad, Prabhakar @ 2023-04-06 18:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Conor Dooley, Geert Uytterhoeven, Heiko Stuebner,
	Guo Ren, Andrew Jones, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Samuel Holland, linux-riscv, Rob Herring, Krzysztof Kozlowski,
	devicetree, linux-kernel, linux-renesas-soc, Biju Das,
	Lad Prabhakar

Hi Christoph,

On Tue, Apr 4, 2023 at 6:29 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
> >
>
> NAK.  Function pointers for somthing high performance as cache
> maintainance is a complete no-go.
>
Ok, I will take the ALTERNATIVE() macro route.

Cheers,
Prabhakar

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-04-04 15:42       ` Christoph Hellwig
  2023-04-05  6:08         ` Biju Das
@ 2023-04-07  0:03         ` Andrea Parri
  2023-04-07  5:33           ` Christoph Hellwig
  1 sibling, 1 reply; 40+ messages in thread
From: Andrea Parri @ 2023-04-07  0:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Biju Das, Prabhakar, devicetree, Albert Ou, Arnd Bergmann,
	Geert Uytterhoeven, Samuel Holland, Heiko Stuebner, linux-kernel,
	Prabhakar Mahadev Lad, linux-renesas-soc, Conor Dooley,
	Rob Herring, Guo Ren, Krzysztof Kozlowski, Paul Walmsley,
	Palmer Dabbelt, linux-riscv, Andrew Jones

> But other other point is adding more cache flushing variants should not
> be easy.  Everyone should be using the standardize version.  If it's not
> implemented in hardware despite having ratified extensions you can fake
> it up in SBI.  Yes, that's way more expensive than indirect calls, but
> that's what you get for taping out new hardware that ignores the actual
> architecture specification and just does their own made up shit.

FWIW, ALTERNATIVE_X() for "three instructions with (what should be a)
crystal-clear semantics" already smells like "we're doing it wrong" to
me, function pointers would be closer to "we're looking for trouble".

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management
  2023-04-07  0:03         ` Andrea Parri
@ 2023-04-07  5:33           ` Christoph Hellwig
  0 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-04-07  5:33 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Christoph Hellwig, Biju Das, Prabhakar, devicetree, Albert Ou,
	Arnd Bergmann, Geert Uytterhoeven, Samuel Holland,
	Heiko Stuebner, linux-kernel, Prabhakar Mahadev Lad,
	linux-renesas-soc, Conor Dooley, Rob Herring, Guo Ren,
	Krzysztof Kozlowski, Paul Walmsley, Palmer Dabbelt, linux-riscv,
	Andrew Jones

On Fri, Apr 07, 2023 at 02:03:36AM +0200, Andrea Parri wrote:
> > But other other point is adding more cache flushing variants should not
> > be easy.  Everyone should be using the standardize version.  If it's not
> > implemented in hardware despite having ratified extensions you can fake
> > it up in SBI.  Yes, that's way more expensive than indirect calls, but
> > that's what you get for taping out new hardware that ignores the actual
> > architecture specification and just does their own made up shit.
> 
> FWIW, ALTERNATIVE_X() for "three instructions with (what should be a)
> crystal-clear semantics" already smells like "we're doing it wrong" to
> me, function pointers would be closer to "we're looking for trouble".

Thanks for putting my feelings into such concise words.

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2023-04-07  5:33 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-30 20:42 [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Prabhakar
2023-03-30 20:42 ` [PATCH v7 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management Prabhakar
2023-03-30 21:34   ` Arnd Bergmann
2023-03-31  7:54     ` Conor Dooley
2023-03-31  7:58       ` Arnd Bergmann
2023-03-31 10:37     ` Lad, Prabhakar
2023-03-31 10:44       ` Arnd Bergmann
2023-03-31 12:11         ` Lad, Prabhakar
2023-04-03 17:00         ` Lad, Prabhakar
2023-03-31 10:55       ` Conor Dooley
2023-03-31 11:36         ` Arnd Bergmann
2023-03-31  7:31   ` Geert Uytterhoeven
2023-03-31 10:45     ` Lad, Prabhakar
2023-03-31 12:24   ` Conor Dooley
2023-04-03 18:23     ` Lad, Prabhakar
2023-04-03 18:31       ` Conor Dooley
2023-04-04  5:29   ` Christoph Hellwig
2023-04-04  6:24     ` Biju Das
2023-04-04 15:42       ` Christoph Hellwig
2023-04-05  6:08         ` Biju Das
2023-04-07  0:03         ` Andrea Parri
2023-04-07  5:33           ` Christoph Hellwig
2023-04-04  6:50     ` Arnd Bergmann
2023-04-04  6:59       ` Conor Dooley
2023-04-06 18:59     ` Lad, Prabhakar
2023-03-30 20:42 ` [PATCH v7 2/6] riscv: asm: vendorid_list: Add Andes Technology to the vendors list Prabhakar
2023-03-30 20:42 ` [PATCH v7 3/6] riscv: errata: Add Andes alternative ports Prabhakar
2023-03-30 20:42 ` [PATCH v7 4/6] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller Prabhakar
2023-03-31 10:21   ` Conor Dooley
2023-03-31 10:47     ` Lad, Prabhakar
2023-03-30 20:42 ` [PATCH v7 5/6] cache: Add L2 cache management for Andes AX45MP RISC-V core Prabhakar
2023-03-31 12:45   ` Conor Dooley
2023-03-31 20:17     ` Lad, Prabhakar
2023-03-30 20:42 ` [PATCH v7 6/6] soc: renesas: Kconfig: Select the required configs for RZ/Five SoC Prabhakar
2023-03-31  7:37   ` Geert Uytterhoeven
2023-03-31  7:37     ` Geert Uytterhoeven
2023-03-31 18:05 ` [PATCH v7 0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP Conor Dooley
2023-03-31 20:09   ` Lad, Prabhakar
2023-03-31 20:15     ` Conor Dooley
2023-04-01  1:47       ` Icenowy Zheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).