linux-riscv.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page
@ 2023-02-24 16:26 Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 1/8] RISC-V: alternatives: Support patching multiple insns in assembly Andrew Jones
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 16:26 UTC (permalink / raw)
  To: linux-riscv, kvm-riscv, devicetree
  Cc: 'Conor Dooley ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Sudip Mukherjee ',
	'Ben Dooks ', 'Atish Patra ',
	'Albert Ou ', 'Anup Patel ',
	'Krzysztof Kozlowski ', 'Rob Herring ',
	'Jisheng Zhang ', 'Heiko Stuebner '

When the Zicboz extension is available we can more rapidly zero naturally
aligned Zicboz block sized chunks of memory. As pages are always page
aligned and are larger than any Zicboz block size will be, then
clear_page() appears to be a good candidate for the extension. While cycle
count and energy consumption should also be considered, we can be pretty
certain that implementing clear_page() with the Zicboz extension is a win
by comparing the new dynamic instruction count with its current count[1].
Doing so we see that the new count is just over a quarter of the old count
(see patch6's commit message for more details).

For those of you who reviewed v1[2], you may be looking for the memset()
patches. As pointed out in v1, and a couple follow-up emails, it's not
clear that patching memset() is a win yet. When I get a chance to test
on real hardware with a comprehensive benchmark collection then I can
post the memset() patches separately (assuming the benchmarks show it's
worthwhile).

Based on riscv-linux/for-next plus the dependencies listed below.

Dependencies:
https://lore.kernel.org/all/20230224154601.88163-1-ajones@ventanamicro.com/

The patches are also available here
https://github.com/jones-drew/linux/commits/riscv/zicboz-v6

To test over QEMU this branch may be used to enable Zicboz
https://gitlab.com/jones-drew/qemu/-/commits/riscv/zicboz

To test running a KVM guest with Zicboz this kvmtool branch may be used
https://github.com/jones-drew/kvmtool/commits/riscv/zicboz

Thanks,
drew

[1] I ported the functions under test to userspace and linked them with
    a test program. Then, I ran them under gdb with a script[3] which
    counted instructions by single stepping.
[2] https://lore.kernel.org/all/20221027130247.31634-1-ajones@ventanamicro.com/
[3] https://gist.github.com/jones-drew/487791c956ceca8c18adc2847eec9c60

v6:
  - Rebased on latest riscv-linux/for-next
  - Used upper/lower_16_bits() in PATCH_ID_* macros, thanks to Conor
    for pointing out that they can be improved
  - Switched to SYM_FUNC_START/END for clear_page and exported the
    symbol since other building modules which use clear_page was
    busted [Ben Dooks and Sudip Mukherjee]
  - Picked up an R-b from Conor

v5:
  - Rebased on latest for-next which allowed dropping v4's dependencies
  - Added a new dependency on an alternatives/cpufeatures cleanup series
  - Replaced "RISC-V: cpufeature: Put vendor_id to work" with "riscv:
    cpufeatures: Put the upper 16 bits of patch ID to work" since touching
    vendor_id is probably a bad idea [Conor]
  - Picked up an r-b from Conor

v4:
  - Rebased on latest for-next which allowed dropping one dependency
  - Added "RISC-V: alternatives: Support patching multiple insns in assembly"
    since I needed to use more than one instruction in an ALTERNATIVE call
    from assembly. I can post this patch separately as a fix if desired.
  - Improved the dt-binding patch commit message [Conor]
  - Picked up some tags from Conor and Rob (I kept Conor's a-b on the
    clear_page patch, even though there are several changes to it, because
    I interpreted the a-b as "OK by me to implement a Zicboz clear_page")
  - Added "RISC-V: cpufeature: Put vendor_id to work", which was necessary
    for how I wanted to use ALTERNATIVE in clear_page.
  - Fallback to memset when the Zicboz block size is too big for the
    Zicboz implementation of clear_page [Palmer]
  - Use lw without la in clear_page impl [Palmer]
  - Implement a Duff device for clear_page using the new 'vendor_id'
    alternatives support [drew]

v3:
  - CC'ed DT list
  - Improved commit message of DT bindings patch to point out relationship
    with cbom-block-size
  - Picked up an a-b from Conor

v2:
  - s/blksz/block_size/, improved commit message for "RISC-V: Add Zicboz
    detection and block size parsing", isa ext sorting [Conor]
  - Added dt binding patch [Heiko]
  - Picked up r-b's from Conor, Heiko, and Anup
  - Moved config symbol and CBO_zero() introduction to "RISC-V: Use Zicboz
    in clear_page when available" and improved its commit message and
    implementation (unrolled four times) [drew]
  - Dropped memset() patches [drew]
  - Rebased on ae4d39f75308 ("Merge patch "RISC-V: fix incorrect type of
    ARCH_CANAAN_K210_DTB_SOURCE"") plus the dependencies

Andrew Jones (8):
  RISC-V: alternatives: Support patching multiple insns in assembly
  RISC-V: Factor out body of riscv_init_cbom_blocksize loop
  dt-bindings: riscv: Document cboz-block-size
  RISC-V: Add Zicboz detection and block size parsing
  RISC-V: cpufeatures: Put the upper 16 bits of patch ID to work
  RISC-V: Use Zicboz in clear_page when available
  RISC-V: KVM: Provide UAPI for Zicboz block size
  RISC-V: KVM: Expose Zicboz to the guest

 .../devicetree/bindings/riscv/cpus.yaml       |  5 ++
 arch/riscv/Kconfig                            | 13 ++++
 arch/riscv/include/asm/alternative-macros.h   |  6 +-
 arch/riscv/include/asm/alternative.h          |  4 +
 arch/riscv/include/asm/cacheflush.h           |  3 +-
 arch/riscv/include/asm/hwcap.h                |  1 +
 arch/riscv/include/asm/insn-def.h             |  4 +
 arch/riscv/include/asm/page.h                 |  6 +-
 arch/riscv/include/uapi/asm/kvm.h             |  2 +
 arch/riscv/kernel/cpu.c                       |  1 +
 arch/riscv/kernel/cpufeature.c                | 58 ++++++++++++++-
 arch/riscv/kernel/setup.c                     |  2 +-
 arch/riscv/kvm/vcpu.c                         | 11 +++
 arch/riscv/lib/Makefile                       |  1 +
 arch/riscv/lib/clear_page.S                   | 74 +++++++++++++++++++
 arch/riscv/mm/cacheflush.c                    | 64 +++++++++-------
 16 files changed, 219 insertions(+), 36 deletions(-)
 create mode 100644 arch/riscv/lib/clear_page.S

-- 
2.39.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v6 1/8] RISC-V: alternatives: Support patching multiple insns in assembly
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
@ 2023-02-24 16:26 ` Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 2/8] RISC-V: Factor out body of riscv_init_cbom_blocksize loop Andrew Jones
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 16:26 UTC (permalink / raw)
  To: linux-riscv, kvm-riscv, devicetree
  Cc: 'Conor Dooley ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Sudip Mukherjee ',
	'Ben Dooks ', 'Atish Patra ',
	'Albert Ou ', 'Anup Patel ',
	'Krzysztof Kozlowski ', 'Rob Herring ',
	'Jisheng Zhang ', 'Heiko Stuebner '

As pointed out in commit d374a16539b1 ("RISC-V: fix compile error
from deduplicated __ALTERNATIVE_CFG_2"), we need quotes around
parameters passed to macros within macros to avoid spaces being
interpreted as separators. ALT_NEW_CONTENT was trying to handle
this by defining new_c has a vararg, but this isn't sufficient
for calling ALTERNATIVE() from assembly with multiple instructions
in the new/old sequences. Remove the vararg "hack" and use quotes.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/include/asm/alternative-macros.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
index 993a44a8fdac..b8c55fb3ab2c 100644
--- a/arch/riscv/include/asm/alternative-macros.h
+++ b/arch/riscv/include/asm/alternative-macros.h
@@ -14,7 +14,7 @@
 	.4byte \patch_id
 .endm
 
-.macro ALT_NEW_CONTENT vendor_id, patch_id, enable = 1, new_c : vararg
+.macro ALT_NEW_CONTENT vendor_id, patch_id, enable = 1, new_c
 	.if \enable
 	.pushsection .alternative, "a"
 	ALT_ENTRY 886b, 888f, \vendor_id, \patch_id, 889f - 888f
@@ -41,13 +41,13 @@
 	\old_c
 	.option pop
 887 :
-	ALT_NEW_CONTENT \vendor_id, \patch_id, \enable, \new_c
+	ALT_NEW_CONTENT \vendor_id, \patch_id, \enable, "\new_c"
 .endm
 
 .macro ALTERNATIVE_CFG_2 old_c, new_c_1, vendor_id_1, patch_id_1, enable_1,	\
 				new_c_2, vendor_id_2, patch_id_2, enable_2
 	ALTERNATIVE_CFG "\old_c", "\new_c_1", \vendor_id_1, \patch_id_1, \enable_1
-	ALT_NEW_CONTENT \vendor_id_2, \patch_id_2, \enable_2, \new_c_2
+	ALT_NEW_CONTENT \vendor_id_2, \patch_id_2, \enable_2, "\new_c_2"
 .endm
 
 #define __ALTERNATIVE_CFG(...)		ALTERNATIVE_CFG __VA_ARGS__
-- 
2.39.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 2/8] RISC-V: Factor out body of riscv_init_cbom_blocksize loop
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 1/8] RISC-V: alternatives: Support patching multiple insns in assembly Andrew Jones
@ 2023-02-24 16:26 ` Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 3/8] dt-bindings: riscv: Document cboz-block-size Andrew Jones
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 16:26 UTC (permalink / raw)
  To: linux-riscv, kvm-riscv, devicetree
  Cc: 'Conor Dooley ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Sudip Mukherjee ',
	'Ben Dooks ', 'Atish Patra ',
	'Albert Ou ', 'Anup Patel ',
	'Krzysztof Kozlowski ', 'Rob Herring ',
	'Jisheng Zhang ', 'Heiko Stuebner '

Refactor riscv_init_cbom_blocksize() to prepare for it to be used
for both cbom block size and cboz block size.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/mm/cacheflush.c | 45 +++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 3cc07ed45aeb..eaf23fc14966 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -98,34 +98,39 @@ void flush_icache_pte(pte_t pte)
 unsigned int riscv_cbom_block_size;
 EXPORT_SYMBOL_GPL(riscv_cbom_block_size);
 
+static void cbo_get_block_size(struct device_node *node,
+			       const char *name, u32 *block_size,
+			       unsigned long *first_hartid)
+{
+	unsigned long hartid;
+	u32 val;
+
+	if (riscv_of_processor_hartid(node, &hartid))
+		return;
+
+	if (of_property_read_u32(node, name, &val))
+		return;
+
+	if (!*block_size) {
+		*block_size = val;
+		*first_hartid = hartid;
+	} else if (*block_size != val) {
+		pr_warn("%s mismatched between harts %lu and %lu\n",
+			name, *first_hartid, hartid);
+	}
+}
+
 void riscv_init_cbom_blocksize(void)
 {
 	struct device_node *node;
 	unsigned long cbom_hartid;
-	u32 val, probed_block_size;
-	int ret;
+	u32 probed_block_size;
 
 	probed_block_size = 0;
 	for_each_of_cpu_node(node) {
-		unsigned long hartid;
-
-		ret = riscv_of_processor_hartid(node, &hartid);
-		if (ret)
-			continue;
-
 		/* set block-size for cbom extension if available */
-		ret = of_property_read_u32(node, "riscv,cbom-block-size", &val);
-		if (ret)
-			continue;
-
-		if (!probed_block_size) {
-			probed_block_size = val;
-			cbom_hartid = hartid;
-		} else {
-			if (probed_block_size != val)
-				pr_warn("cbom-block-size mismatched between harts %lu and %lu\n",
-					cbom_hartid, hartid);
-		}
+		cbo_get_block_size(node, "riscv,cbom-block-size",
+				   &probed_block_size, &cbom_hartid);
 	}
 
 	if (probed_block_size)
-- 
2.39.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 3/8] dt-bindings: riscv: Document cboz-block-size
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 1/8] RISC-V: alternatives: Support patching multiple insns in assembly Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 2/8] RISC-V: Factor out body of riscv_init_cbom_blocksize loop Andrew Jones
@ 2023-02-24 16:26 ` Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 4/8] RISC-V: Add Zicboz detection and block size parsing Andrew Jones
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 16:26 UTC (permalink / raw)
  To: linux-riscv, kvm-riscv, devicetree
  Cc: 'Conor Dooley ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Sudip Mukherjee ',
	'Ben Dooks ', 'Atish Patra ',
	'Albert Ou ', 'Anup Patel ',
	'Krzysztof Kozlowski ', 'Rob Herring ',
	'Jisheng Zhang ', 'Heiko Stuebner '

The Zicboz operation (cbo.zero) operates on a block-size defined
for the cpu-core. While we already have the riscv,cbom-block-size
property, it only provides the block size for Zicbom operations.
Even though it's likely Zicboz and Zicbom will use the same size,
that's not required by the specification. Create another property
specifically for Zicboz.

Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 Documentation/devicetree/bindings/riscv/cpus.yaml | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
index 001931d526ec..f24cf9601c6e 100644
--- a/Documentation/devicetree/bindings/riscv/cpus.yaml
+++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
@@ -72,6 +72,11 @@ properties:
     description:
       The blocksize in bytes for the Zicbom cache operations.
 
+  riscv,cboz-block-size:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description:
+      The blocksize in bytes for the Zicboz cache operations.
+
   riscv,isa:
     description:
       Identifies the specific RISC-V instruction set architecture
-- 
2.39.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 4/8] RISC-V: Add Zicboz detection and block size parsing
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (2 preceding siblings ...)
  2023-02-24 16:26 ` [PATCH v6 3/8] dt-bindings: riscv: Document cboz-block-size Andrew Jones
@ 2023-02-24 16:26 ` Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 5/8] RISC-V: cpufeatures: Put the upper 16 bits of patch ID to work Andrew Jones
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 16:26 UTC (permalink / raw)
  To: linux-riscv, kvm-riscv, devicetree
  Cc: 'Conor Dooley ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Sudip Mukherjee ',
	'Ben Dooks ', 'Atish Patra ',
	'Albert Ou ', 'Anup Patel ',
	'Krzysztof Kozlowski ', 'Rob Herring ',
	'Jisheng Zhang ', 'Heiko Stuebner '

Parse "riscv,cboz-block-size" from the DT by piggybacking on Zicbom's
riscv_init_cbom_blocksize(). Additionally check the DT for the presence
of the "zicboz" extension and, when it's present, validate the parsed
cboz block size as we do Zicbom's cbom block size with
riscv_isa_extension_check().

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/include/asm/cacheflush.h |  3 ++-
 arch/riscv/include/asm/hwcap.h      |  1 +
 arch/riscv/kernel/cpu.c             |  1 +
 arch/riscv/kernel/cpufeature.c      | 10 ++++++++++
 arch/riscv/kernel/setup.c           |  2 +-
 arch/riscv/mm/cacheflush.c          | 23 +++++++++++++++--------
 6 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
index 03e3b95ae6da..8091b8bf4883 100644
--- a/arch/riscv/include/asm/cacheflush.h
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -50,7 +50,8 @@ void flush_icache_mm(struct mm_struct *mm, bool local);
 #endif /* CONFIG_SMP */
 
 extern unsigned int riscv_cbom_block_size;
-void riscv_init_cbom_blocksize(void);
+extern unsigned int riscv_cboz_block_size;
+void riscv_init_cbo_blocksizes(void);
 
 #ifdef CONFIG_RISCV_DMA_NONCOHERENT
 void riscv_noncoherent_supported(void);
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 8f3994a7f0ca..a96d3d7d7d28 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -42,6 +42,7 @@
 #define RISCV_ISA_EXT_ZBB		30
 #define RISCV_ISA_EXT_ZICBOM		31
 #define RISCV_ISA_EXT_ZIHINTPAUSE	32
+#define RISCV_ISA_EXT_ZICBOZ		33
 
 #define RISCV_ISA_EXT_MAX		64
 #define RISCV_ISA_EXT_NAME_LEN_MAX	32
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 8400f0cc9704..1b0411280141 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -186,6 +186,7 @@ arch_initcall(riscv_cpuinfo_init);
  */
 static struct riscv_isa_ext_data isa_ext_arr[] = {
 	__RISCV_ISA_EXT_DATA(zicbom, RISCV_ISA_EXT_ZICBOM),
+	__RISCV_ISA_EXT_DATA(zicboz, RISCV_ISA_EXT_ZICBOZ),
 	__RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE),
 	__RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB),
 	__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 6569d963fc7d..538779d03311 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -74,6 +74,15 @@ static bool riscv_isa_extension_check(int id)
 			return false;
 		}
 		return true;
+	case RISCV_ISA_EXT_ZICBOZ:
+		if (!riscv_cboz_block_size) {
+			pr_err("Zicboz detected in ISA string, but no cboz-block-size found\n");
+			return false;
+		} else if (!is_power_of_2(riscv_cboz_block_size)) {
+			pr_err("cboz-block-size present, but is not a power-of-2\n");
+			return false;
+		}
+		return true;
 	}
 
 	return true;
@@ -222,6 +231,7 @@ void __init riscv_fill_hwcap(void)
 				SET_ISA_EXT_MAP("svpbmt", RISCV_ISA_EXT_SVPBMT);
 				SET_ISA_EXT_MAP("zbb", RISCV_ISA_EXT_ZBB);
 				SET_ISA_EXT_MAP("zicbom", RISCV_ISA_EXT_ZICBOM);
+				SET_ISA_EXT_MAP("zicboz", RISCV_ISA_EXT_ZICBOZ);
 				SET_ISA_EXT_MAP("zihintpause", RISCV_ISA_EXT_ZIHINTPAUSE);
 			}
 #undef SET_ISA_EXT_MAP
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 376d2827e736..5d3184cbf518 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -297,7 +297,7 @@ void __init setup_arch(char **cmdline_p)
 	setup_smp();
 #endif
 
-	riscv_init_cbom_blocksize();
+	riscv_init_cbo_blocksizes();
 	riscv_fill_hwcap();
 	apply_boot_alternatives();
 	if (IS_ENABLED(CONFIG_RISCV_ISA_ZICBOM) &&
diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index eaf23fc14966..ba4832bb949b 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -98,6 +98,9 @@ void flush_icache_pte(pte_t pte)
 unsigned int riscv_cbom_block_size;
 EXPORT_SYMBOL_GPL(riscv_cbom_block_size);
 
+unsigned int riscv_cboz_block_size;
+EXPORT_SYMBOL_GPL(riscv_cboz_block_size);
+
 static void cbo_get_block_size(struct device_node *node,
 			       const char *name, u32 *block_size,
 			       unsigned long *first_hartid)
@@ -120,19 +123,23 @@ static void cbo_get_block_size(struct device_node *node,
 	}
 }
 
-void riscv_init_cbom_blocksize(void)
+void riscv_init_cbo_blocksizes(void)
 {
+	unsigned long cbom_hartid, cboz_hartid;
+	u32 cbom_block_size = 0, cboz_block_size = 0;
 	struct device_node *node;
-	unsigned long cbom_hartid;
-	u32 probed_block_size;
 
-	probed_block_size = 0;
 	for_each_of_cpu_node(node) {
-		/* set block-size for cbom extension if available */
+		/* set block-size for cbom and/or cboz extension if available */
 		cbo_get_block_size(node, "riscv,cbom-block-size",
-				   &probed_block_size, &cbom_hartid);
+				   &cbom_block_size, &cbom_hartid);
+		cbo_get_block_size(node, "riscv,cboz-block-size",
+				   &cboz_block_size, &cboz_hartid);
 	}
 
-	if (probed_block_size)
-		riscv_cbom_block_size = probed_block_size;
+	if (cbom_block_size)
+		riscv_cbom_block_size = cbom_block_size;
+
+	if (cboz_block_size)
+		riscv_cboz_block_size = cboz_block_size;
 }
-- 
2.39.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 5/8] RISC-V: cpufeatures: Put the upper 16 bits of patch ID to work
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (3 preceding siblings ...)
  2023-02-24 16:26 ` [PATCH v6 4/8] RISC-V: Add Zicboz detection and block size parsing Andrew Jones
@ 2023-02-24 16:26 ` Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 6/8] RISC-V: Use Zicboz in clear_page when available Andrew Jones
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 16:26 UTC (permalink / raw)
  To: linux-riscv, kvm-riscv, devicetree
  Cc: 'Conor Dooley ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Sudip Mukherjee ',
	'Ben Dooks ', 'Atish Patra ',
	'Albert Ou ', 'Anup Patel ',
	'Krzysztof Kozlowski ', 'Rob Herring ',
	'Jisheng Zhang ', 'Heiko Stuebner '

cpufeature IDs are consecutive integers starting at 26, so a 32-bit
patch ID allows an aircraft carrier load of feature IDs. Repurposing
the upper 16 bits still leaves a boat load of feature IDs and gains
16 bits which may be used to control patching on a per patch-site
basis.

This will be initially used in Zicboz's application to clear_page(),
as Zicboz's block size must also be considered. In that case, the
upper 16-bit value's role will be to convey the maximum block size
which the Zicboz clear_page() implementation supports.

cpufeature patch sites which need to check for the existence or
absence of other cpufeatures may also be able to make use of this.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/include/asm/alternative.h |  4 +++
 arch/riscv/kernel/cpufeature.c       | 37 +++++++++++++++++++++++++---
 2 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
index c8dea9e94310..58ccd2f8cab7 100644
--- a/arch/riscv/include/asm/alternative.h
+++ b/arch/riscv/include/asm/alternative.h
@@ -13,10 +13,14 @@
 #ifdef CONFIG_RISCV_ALTERNATIVE
 
 #include <linux/init.h>
+#include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/stddef.h>
 #include <asm/hwcap.h>
 
+#define PATCH_ID_CPUFEATURE_ID(p)		lower_16_bits(p)
+#define PATCH_ID_CPUFEATURE_VALUE(p)		upper_16_bits(p)
+
 #define RISCV_ALTERNATIVES_BOOT		0 /* alternatives applied during regular boot */
 #define RISCV_ALTERNATIVES_MODULE	1 /* alternatives applied during module-init */
 #define RISCV_ALTERNATIVES_EARLY_BOOT	2 /* alternatives applied before mmu start */
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 538779d03311..d424cd76beb1 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -274,12 +274,35 @@ void __init riscv_fill_hwcap(void)
 }
 
 #ifdef CONFIG_RISCV_ALTERNATIVE
+/*
+ * Alternative patch sites consider 48 bits when determining when to patch
+ * the old instruction sequence with the new. These bits are broken into a
+ * 16-bit vendor ID and a 32-bit patch ID. A non-zero vendor ID means the
+ * patch site is for an erratum, identified by the 32-bit patch ID. When
+ * the vendor ID is zero, the patch site is for a cpufeature. cpufeatures
+ * further break down patch ID into two 16-bit numbers. The lower 16 bits
+ * are the cpufeature ID and the upper 16 bits are used for a value specific
+ * to the cpufeature and patch site. If the upper 16 bits are zero, then it
+ * implies no specific value is specified. cpufeatures that want to control
+ * patching on a per-site basis will provide non-zero values and implement
+ * checks here. The checks return true when patching should be done, and
+ * false otherwise.
+ */
+static bool riscv_cpufeature_patch_check(u16 id, u16 value)
+{
+	if (!value)
+		return true;
+
+	return false;
+}
+
 void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
 						  struct alt_entry *end,
 						  unsigned int stage)
 {
 	struct alt_entry *alt;
 	void *oldptr, *altptr;
+	u16 id, value;
 
 	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
 		return;
@@ -287,13 +310,19 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
 	for (alt = begin; alt < end; alt++) {
 		if (alt->vendor_id != 0)
 			continue;
-		if (alt->patch_id >= RISCV_ISA_EXT_MAX) {
-			WARN(1, "This extension id:%d is not in ISA extension list",
-				alt->patch_id);
+
+		id = PATCH_ID_CPUFEATURE_ID(alt->patch_id);
+
+		if (id >= RISCV_ISA_EXT_MAX) {
+			WARN(1, "This extension id:%d is not in ISA extension list", id);
 			continue;
 		}
 
-		if (!__riscv_isa_extension_available(NULL, alt->patch_id))
+		if (!__riscv_isa_extension_available(NULL, id))
+			continue;
+
+		value = PATCH_ID_CPUFEATURE_VALUE(alt->patch_id);
+		if (!riscv_cpufeature_patch_check(id, value))
 			continue;
 
 		oldptr = ALT_OLD_PTR(alt);
-- 
2.39.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 6/8] RISC-V: Use Zicboz in clear_page when available
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (4 preceding siblings ...)
  2023-02-24 16:26 ` [PATCH v6 5/8] RISC-V: cpufeatures: Put the upper 16 bits of patch ID to work Andrew Jones
@ 2023-02-24 16:26 ` Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 7/8] RISC-V: KVM: Provide UAPI for Zicboz block size Andrew Jones
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 16:26 UTC (permalink / raw)
  To: linux-riscv, kvm-riscv, devicetree
  Cc: 'Conor Dooley ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Sudip Mukherjee ',
	'Ben Dooks ', 'Atish Patra ',
	'Albert Ou ', 'Anup Patel ',
	'Krzysztof Kozlowski ', 'Rob Herring ',
	'Jisheng Zhang ', 'Heiko Stuebner '

Using memset() to zero a 4K page takes 563 total instructions, where
20 are branches. clear_page(), with Zicboz and a 64 byte block size,
takes 169 total instructions, where 4 are branches and 33 are nops.
Even though the block size is a variable, thanks to alternatives, we
can still implement a Duff device without having to do any preliminary
calculations. This is achieved by using the alternatives' cpufeature
value (the upper 16 bits of patch_id). The value used is the maximum
zicboz block size order accepted at the patch site. This enables us
to stop patching / unrolling when 4K bytes have been zeroed (we would
loop and continue after 4K if the page size would be larger)

For 4K pages, unrolling 16 times allows block sizes of 64 and 128 to
only loop a few times and larger block sizes to not loop at all. Since
cbo.zero doesn't take an offset, we also need an 'add' after each
instruction, making the loop body 112 to 160 bytes. Hopefully this
is small enough to not cause icache misses.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/Kconfig                | 13 ++++++
 arch/riscv/include/asm/insn-def.h |  4 ++
 arch/riscv/include/asm/page.h     |  6 ++-
 arch/riscv/kernel/cpufeature.c    | 11 +++++
 arch/riscv/lib/Makefile           |  1 +
 arch/riscv/lib/clear_page.S       | 74 +++++++++++++++++++++++++++++++
 6 files changed, 108 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/lib/clear_page.S

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index aa951fe2bc56..f715aa39c465 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -458,6 +458,19 @@ config RISCV_ISA_ZICBOM
 
 	   If you don't know what to do here, say Y.
 
+config RISCV_ISA_ZICBOZ
+	bool "Zicboz extension support for faster zeroing of memory"
+	depends on !XIP_KERNEL && MMU
+	select RISCV_ALTERNATIVE
+	default y
+	help
+	   Enable the use of the ZICBOZ extension (cbo.zero instruction)
+	   when available.
+
+	   The Zicboz extension is used for faster zeroing of memory.
+
+	   If you don't know what to do here, say Y.
+
 config TOOLCHAIN_HAS_ZIHINTPAUSE
 	bool
 	default y
diff --git a/arch/riscv/include/asm/insn-def.h b/arch/riscv/include/asm/insn-def.h
index e01ab51f50d2..6960beb75f32 100644
--- a/arch/riscv/include/asm/insn-def.h
+++ b/arch/riscv/include/asm/insn-def.h
@@ -192,4 +192,8 @@
 	INSN_I(OPCODE_MISC_MEM, FUNC3(2), __RD(0),		\
 	       RS1(base), SIMM12(2))
 
+#define CBO_zero(base)						\
+	INSN_I(OPCODE_MISC_MEM, FUNC3(2), __RD(0),		\
+	       RS1(base), SIMM12(4))
+
 #endif /* __ASM_INSN_DEF_H */
diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index 9f432c1b5289..ccd168fe29d2 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -49,10 +49,14 @@
 
 #ifndef __ASSEMBLY__
 
+#ifdef CONFIG_RISCV_ISA_ZICBOZ
+void clear_page(void *page);
+#else
 #define clear_page(pgaddr)			memset((pgaddr), 0, PAGE_SIZE)
+#endif
 #define copy_page(to, from)			memcpy((to), (from), PAGE_SIZE)
 
-#define clear_user_page(pgaddr, vaddr, page)	memset((pgaddr), 0, PAGE_SIZE)
+#define clear_user_page(pgaddr, vaddr, page)	clear_page(pgaddr)
 #define copy_user_page(vto, vfrom, vaddr, topg) \
 			memcpy((vto), (vfrom), PAGE_SIZE)
 
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index d424cd76beb1..8e7b0d703841 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -293,6 +293,17 @@ static bool riscv_cpufeature_patch_check(u16 id, u16 value)
 	if (!value)
 		return true;
 
+	switch (id) {
+	case RISCV_ISA_EXT_ZICBOZ:
+		/*
+		 * Zicboz alternative applications provide the maximum
+		 * supported block size order, or zero when it doesn't
+		 * matter. If the current block size exceeds the maximum,
+		 * then the alternative cannot be applied.
+		 */
+		return riscv_cboz_block_size <= (1U << value);
+	}
+
 	return false;
 }
 
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 6c74b0bedd60..26cb2502ecf8 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -8,5 +8,6 @@ lib-y			+= strlen.o
 lib-y			+= strncmp.o
 lib-$(CONFIG_MMU)	+= uaccess.o
 lib-$(CONFIG_64BIT)	+= tishift.o
+lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
diff --git a/arch/riscv/lib/clear_page.S b/arch/riscv/lib/clear_page.S
new file mode 100644
index 000000000000..d7a256eb53f4
--- /dev/null
+++ b/arch/riscv/lib/clear_page.S
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2023 Ventana Micro Systems Inc.
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm/alternative-macros.h>
+#include <asm-generic/export.h>
+#include <asm/hwcap.h>
+#include <asm/insn-def.h>
+#include <asm/page.h>
+
+#define CBOZ_ALT(order, old, new)				\
+	ALTERNATIVE(old, new, 0,				\
+		    ((order) << 16) | RISCV_ISA_EXT_ZICBOZ,	\
+		    CONFIG_RISCV_ISA_ZICBOZ)
+
+/* void clear_page(void *page) */
+SYM_FUNC_START(clear_page)
+	li	a2, PAGE_SIZE
+
+	/*
+	 * If Zicboz isn't present, or somehow has a block
+	 * size larger than 4K, then fallback to memset.
+	 */
+	CBOZ_ALT(12, "j .Lno_zicboz", "nop")
+
+	lw	a1, riscv_cboz_block_size
+	add	a2, a0, a2
+.Lzero_loop:
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBOZ_ALT(11, "bltu a0, a2, .Lzero_loop; ret", "nop; nop")
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBOZ_ALT(10, "bltu a0, a2, .Lzero_loop; ret", "nop; nop")
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBOZ_ALT(9, "bltu a0, a2, .Lzero_loop; ret", "nop; nop")
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBOZ_ALT(8, "bltu a0, a2, .Lzero_loop; ret", "nop; nop")
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	bltu	a0, a2, .Lzero_loop
+	ret
+.Lno_zicboz:
+	li	a1, 0
+	tail	__memset
+SYM_FUNC_END(clear_page)
+EXPORT_SYMBOL(clear_page)
-- 
2.39.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 7/8] RISC-V: KVM: Provide UAPI for Zicboz block size
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (5 preceding siblings ...)
  2023-02-24 16:26 ` [PATCH v6 6/8] RISC-V: Use Zicboz in clear_page when available Andrew Jones
@ 2023-02-24 16:26 ` Andrew Jones
  2023-02-24 16:26 ` [PATCH v6 8/8] RISC-V: KVM: Expose Zicboz to the guest Andrew Jones
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 16:26 UTC (permalink / raw)
  To: linux-riscv, kvm-riscv, devicetree
  Cc: 'Conor Dooley ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Sudip Mukherjee ',
	'Ben Dooks ', 'Atish Patra ',
	'Albert Ou ', 'Anup Patel ',
	'Krzysztof Kozlowski ', 'Rob Herring ',
	'Jisheng Zhang ', 'Heiko Stuebner ',
	Anup Patel

We're about to allow guests to use the Zicboz extension. KVM
userspace needs to know the cache block size in order to
properly advertise it to the guest. Provide a virtual config
register for userspace to get it with the GET_ONE_REG API, but
setting it cannot be supported, so disallow SET_ONE_REG.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
---
 arch/riscv/include/uapi/asm/kvm.h | 1 +
 arch/riscv/kvm/vcpu.c             | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
index 92af6f3f057c..c1a1bb0fa91c 100644
--- a/arch/riscv/include/uapi/asm/kvm.h
+++ b/arch/riscv/include/uapi/asm/kvm.h
@@ -52,6 +52,7 @@ struct kvm_riscv_config {
 	unsigned long mvendorid;
 	unsigned long marchid;
 	unsigned long mimpid;
+	unsigned long zicboz_block_size;
 };
 
 /* CORE registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index 7c08567097f0..e5126cefbc87 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -276,6 +276,11 @@ static int kvm_riscv_vcpu_get_reg_config(struct kvm_vcpu *vcpu,
 			return -EINVAL;
 		reg_val = riscv_cbom_block_size;
 		break;
+	case KVM_REG_RISCV_CONFIG_REG(zicboz_block_size):
+		if (!riscv_isa_extension_available(vcpu->arch.isa, ZICBOZ))
+			return -EINVAL;
+		reg_val = riscv_cboz_block_size;
+		break;
 	case KVM_REG_RISCV_CONFIG_REG(mvendorid):
 		reg_val = vcpu->arch.mvendorid;
 		break;
@@ -347,6 +352,8 @@ static int kvm_riscv_vcpu_set_reg_config(struct kvm_vcpu *vcpu,
 		break;
 	case KVM_REG_RISCV_CONFIG_REG(zicbom_block_size):
 		return -EOPNOTSUPP;
+	case KVM_REG_RISCV_CONFIG_REG(zicboz_block_size):
+		return -EOPNOTSUPP;
 	case KVM_REG_RISCV_CONFIG_REG(mvendorid):
 		if (!vcpu->arch.ran_atleast_once)
 			vcpu->arch.mvendorid = reg_val;
-- 
2.39.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 8/8] RISC-V: KVM: Expose Zicboz to the guest
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (6 preceding siblings ...)
  2023-02-24 16:26 ` [PATCH v6 7/8] RISC-V: KVM: Provide UAPI for Zicboz block size Andrew Jones
@ 2023-02-24 16:26 ` Andrew Jones
  2023-03-15  4:38   ` Palmer Dabbelt
  2023-03-15  4:35 ` [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Palmer Dabbelt
  2023-03-18  1:00 ` patchwork-bot+linux-riscv
  9 siblings, 1 reply; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 16:26 UTC (permalink / raw)
  To: linux-riscv, kvm-riscv, devicetree
  Cc: 'Conor Dooley ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Sudip Mukherjee ',
	'Ben Dooks ', 'Atish Patra ',
	'Albert Ou ', 'Anup Patel ',
	'Krzysztof Kozlowski ', 'Rob Herring ',
	'Jisheng Zhang ', 'Heiko Stuebner ',
	Anup Patel

Guests may use the cbo.zero instruction when the CPU has the Zicboz
extension and the hypervisor sets henvcfg.CBZE.

Add Zicboz support for KVM guests which may be enabled and
disabled from KVM userspace using the ISA extension ONE_REG API.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
---
 arch/riscv/include/uapi/asm/kvm.h | 1 +
 arch/riscv/kvm/vcpu.c             | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
index c1a1bb0fa91c..e44c1e90eaa7 100644
--- a/arch/riscv/include/uapi/asm/kvm.h
+++ b/arch/riscv/include/uapi/asm/kvm.h
@@ -106,6 +106,7 @@ enum KVM_RISCV_ISA_EXT_ID {
 	KVM_RISCV_ISA_EXT_SVINVAL,
 	KVM_RISCV_ISA_EXT_ZIHINTPAUSE,
 	KVM_RISCV_ISA_EXT_ZICBOM,
+	KVM_RISCV_ISA_EXT_ZICBOZ,
 	KVM_RISCV_ISA_EXT_MAX,
 };
 
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index e5126cefbc87..198ee86cad38 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -63,6 +63,7 @@ static const unsigned long kvm_isa_ext_arr[] = {
 	KVM_ISA_EXT_ARR(SVPBMT),
 	KVM_ISA_EXT_ARR(ZIHINTPAUSE),
 	KVM_ISA_EXT_ARR(ZICBOM),
+	KVM_ISA_EXT_ARR(ZICBOZ),
 };
 
 static unsigned long kvm_riscv_vcpu_base2isa_ext(unsigned long base_ext)
@@ -865,6 +866,9 @@ static void kvm_riscv_vcpu_update_config(const unsigned long *isa)
 	if (riscv_isa_extension_available(isa, ZICBOM))
 		henvcfg |= (ENVCFG_CBIE | ENVCFG_CBCFE);
 
+	if (riscv_isa_extension_available(isa, ZICBOZ))
+		henvcfg |= ENVCFG_CBZE;
+
 	csr_write(CSR_HENVCFG, henvcfg);
 #ifdef CONFIG_32BIT
 	csr_write(CSR_HENVCFGH, henvcfg >> 32);
-- 
2.39.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (7 preceding siblings ...)
  2023-02-24 16:26 ` [PATCH v6 8/8] RISC-V: KVM: Expose Zicboz to the guest Andrew Jones
@ 2023-03-15  4:35 ` Palmer Dabbelt
  2023-03-15  8:53   ` Andrew Jones
  2023-03-18  1:00 ` patchwork-bot+linux-riscv
  9 siblings, 1 reply; 14+ messages in thread
From: Palmer Dabbelt @ 2023-03-15  4:35 UTC (permalink / raw)
  To: ajones
  Cc: linux-riscv, kvm-riscv, devicetree, Conor Dooley, Paul Walmsley,
	sudip.mukherjee, ben.dooks, Atish Patra, aou, apatel,
	krzysztof.kozlowski+dt, robh, jszhang, heiko

On Fri, 24 Feb 2023 08:26:23 PST (-0800), ajones@ventanamicro.com wrote:
> When the Zicboz extension is available we can more rapidly zero naturally
> aligned Zicboz block sized chunks of memory. As pages are always page

I guess we're sort of in a grey area here: this is just a performance 
thing so in theory some sort of benchmark should be required, but IMO 
it's OK to just hand wave this one away -- if the "zero a cache block" 
instruction doesn't make "zero a page" go faster then something has gone 
so far off the rails it's probably better to just pretend the machine 
doesn't implement Zicboz.

This all LGTM, so unless anyone's opposed or it blows up on testing 
overnight then it'll be on for-next.

Thanks!

> aligned and are larger than any Zicboz block size will be, then
> clear_page() appears to be a good candidate for the extension. While cycle
> count and energy consumption should also be considered, we can be pretty
> certain that implementing clear_page() with the Zicboz extension is a win
> by comparing the new dynamic instruction count with its current count[1].
> Doing so we see that the new count is just over a quarter of the old count
> (see patch6's commit message for more details).
>
> For those of you who reviewed v1[2], you may be looking for the memset()
> patches. As pointed out in v1, and a couple follow-up emails, it's not
> clear that patching memset() is a win yet. When I get a chance to test
> on real hardware with a comprehensive benchmark collection then I can
> post the memset() patches separately (assuming the benchmarks show it's
> worthwhile).
>
> Based on riscv-linux/for-next plus the dependencies listed below.
>
> Dependencies:
> https://lore.kernel.org/all/20230224154601.88163-1-ajones@ventanamicro.com/
>
> The patches are also available here
> https://github.com/jones-drew/linux/commits/riscv/zicboz-v6
>
> To test over QEMU this branch may be used to enable Zicboz
> https://gitlab.com/jones-drew/qemu/-/commits/riscv/zicboz
>
> To test running a KVM guest with Zicboz this kvmtool branch may be used
> https://github.com/jones-drew/kvmtool/commits/riscv/zicboz
>
> Thanks,
> drew
>
> [1] I ported the functions under test to userspace and linked them with
>     a test program. Then, I ran them under gdb with a script[3] which
>     counted instructions by single stepping.
> [2] https://lore.kernel.org/all/20221027130247.31634-1-ajones@ventanamicro.com/
> [3] https://gist.github.com/jones-drew/487791c956ceca8c18adc2847eec9c60
>
> v6:
>   - Rebased on latest riscv-linux/for-next
>   - Used upper/lower_16_bits() in PATCH_ID_* macros, thanks to Conor
>     for pointing out that they can be improved
>   - Switched to SYM_FUNC_START/END for clear_page and exported the
>     symbol since other building modules which use clear_page was
>     busted [Ben Dooks and Sudip Mukherjee]
>   - Picked up an R-b from Conor
>
> v5:
>   - Rebased on latest for-next which allowed dropping v4's dependencies
>   - Added a new dependency on an alternatives/cpufeatures cleanup series
>   - Replaced "RISC-V: cpufeature: Put vendor_id to work" with "riscv:
>     cpufeatures: Put the upper 16 bits of patch ID to work" since touching
>     vendor_id is probably a bad idea [Conor]
>   - Picked up an r-b from Conor
>
> v4:
>   - Rebased on latest for-next which allowed dropping one dependency
>   - Added "RISC-V: alternatives: Support patching multiple insns in assembly"
>     since I needed to use more than one instruction in an ALTERNATIVE call
>     from assembly. I can post this patch separately as a fix if desired.
>   - Improved the dt-binding patch commit message [Conor]
>   - Picked up some tags from Conor and Rob (I kept Conor's a-b on the
>     clear_page patch, even though there are several changes to it, because
>     I interpreted the a-b as "OK by me to implement a Zicboz clear_page")
>   - Added "RISC-V: cpufeature: Put vendor_id to work", which was necessary
>     for how I wanted to use ALTERNATIVE in clear_page.
>   - Fallback to memset when the Zicboz block size is too big for the
>     Zicboz implementation of clear_page [Palmer]
>   - Use lw without la in clear_page impl [Palmer]
>   - Implement a Duff device for clear_page using the new 'vendor_id'
>     alternatives support [drew]
>
> v3:
>   - CC'ed DT list
>   - Improved commit message of DT bindings patch to point out relationship
>     with cbom-block-size
>   - Picked up an a-b from Conor
>
> v2:
>   - s/blksz/block_size/, improved commit message for "RISC-V: Add Zicboz
>     detection and block size parsing", isa ext sorting [Conor]
>   - Added dt binding patch [Heiko]
>   - Picked up r-b's from Conor, Heiko, and Anup
>   - Moved config symbol and CBO_zero() introduction to "RISC-V: Use Zicboz
>     in clear_page when available" and improved its commit message and
>     implementation (unrolled four times) [drew]
>   - Dropped memset() patches [drew]
>   - Rebased on ae4d39f75308 ("Merge patch "RISC-V: fix incorrect type of
>     ARCH_CANAAN_K210_DTB_SOURCE"") plus the dependencies
>
> Andrew Jones (8):
>   RISC-V: alternatives: Support patching multiple insns in assembly
>   RISC-V: Factor out body of riscv_init_cbom_blocksize loop
>   dt-bindings: riscv: Document cboz-block-size
>   RISC-V: Add Zicboz detection and block size parsing
>   RISC-V: cpufeatures: Put the upper 16 bits of patch ID to work
>   RISC-V: Use Zicboz in clear_page when available
>   RISC-V: KVM: Provide UAPI for Zicboz block size
>   RISC-V: KVM: Expose Zicboz to the guest
>
>  .../devicetree/bindings/riscv/cpus.yaml       |  5 ++
>  arch/riscv/Kconfig                            | 13 ++++
>  arch/riscv/include/asm/alternative-macros.h   |  6 +-
>  arch/riscv/include/asm/alternative.h          |  4 +
>  arch/riscv/include/asm/cacheflush.h           |  3 +-
>  arch/riscv/include/asm/hwcap.h                |  1 +
>  arch/riscv/include/asm/insn-def.h             |  4 +
>  arch/riscv/include/asm/page.h                 |  6 +-
>  arch/riscv/include/uapi/asm/kvm.h             |  2 +
>  arch/riscv/kernel/cpu.c                       |  1 +
>  arch/riscv/kernel/cpufeature.c                | 58 ++++++++++++++-
>  arch/riscv/kernel/setup.c                     |  2 +-
>  arch/riscv/kvm/vcpu.c                         | 11 +++
>  arch/riscv/lib/Makefile                       |  1 +
>  arch/riscv/lib/clear_page.S                   | 74 +++++++++++++++++++
>  arch/riscv/mm/cacheflush.c                    | 64 +++++++++-------
>  16 files changed, 219 insertions(+), 36 deletions(-)
>  create mode 100644 arch/riscv/lib/clear_page.S

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 8/8] RISC-V: KVM: Expose Zicboz to the guest
  2023-02-24 16:26 ` [PATCH v6 8/8] RISC-V: KVM: Expose Zicboz to the guest Andrew Jones
@ 2023-03-15  4:38   ` Palmer Dabbelt
  2023-03-15  4:54     ` Anup Patel
  0 siblings, 1 reply; 14+ messages in thread
From: Palmer Dabbelt @ 2023-03-15  4:38 UTC (permalink / raw)
  To: ajones, apatel
  Cc: linux-riscv, kvm-riscv, devicetree, Conor Dooley, Paul Walmsley,
	sudip.mukherjee, ben.dooks, Atish Patra, aou,
	krzysztof.kozlowski+dt, robh, jszhang, heiko, anup

On Fri, 24 Feb 2023 08:26:31 PST (-0800), ajones@ventanamicro.com wrote:
> Guests may use the cbo.zero instruction when the CPU has the Zicboz
> extension and the hypervisor sets henvcfg.CBZE.
>
> Add Zicboz support for KVM guests which may be enabled and
> disabled from KVM userspace using the ISA extension ONE_REG API.
>
> Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Reviewed-by: Anup Patel <anup@brainfault.org>

Sorry, I guess I wasn't looking closely enough.  It's just a review, not 
an ack.

Anup: is it OK if this goes along with the others?

> ---
>  arch/riscv/include/uapi/asm/kvm.h | 1 +
>  arch/riscv/kvm/vcpu.c             | 4 ++++
>  2 files changed, 5 insertions(+)
>
> diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
> index c1a1bb0fa91c..e44c1e90eaa7 100644
> --- a/arch/riscv/include/uapi/asm/kvm.h
> +++ b/arch/riscv/include/uapi/asm/kvm.h
> @@ -106,6 +106,7 @@ enum KVM_RISCV_ISA_EXT_ID {
>  	KVM_RISCV_ISA_EXT_SVINVAL,
>  	KVM_RISCV_ISA_EXT_ZIHINTPAUSE,
>  	KVM_RISCV_ISA_EXT_ZICBOM,
> +	KVM_RISCV_ISA_EXT_ZICBOZ,
>  	KVM_RISCV_ISA_EXT_MAX,
>  };
>
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index e5126cefbc87..198ee86cad38 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -63,6 +63,7 @@ static const unsigned long kvm_isa_ext_arr[] = {
>  	KVM_ISA_EXT_ARR(SVPBMT),
>  	KVM_ISA_EXT_ARR(ZIHINTPAUSE),
>  	KVM_ISA_EXT_ARR(ZICBOM),
> +	KVM_ISA_EXT_ARR(ZICBOZ),
>  };
>
>  static unsigned long kvm_riscv_vcpu_base2isa_ext(unsigned long base_ext)
> @@ -865,6 +866,9 @@ static void kvm_riscv_vcpu_update_config(const unsigned long *isa)
>  	if (riscv_isa_extension_available(isa, ZICBOM))
>  		henvcfg |= (ENVCFG_CBIE | ENVCFG_CBCFE);
>
> +	if (riscv_isa_extension_available(isa, ZICBOZ))
> +		henvcfg |= ENVCFG_CBZE;
> +
>  	csr_write(CSR_HENVCFG, henvcfg);
>  #ifdef CONFIG_32BIT
>  	csr_write(CSR_HENVCFGH, henvcfg >> 32);

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 8/8] RISC-V: KVM: Expose Zicboz to the guest
  2023-03-15  4:38   ` Palmer Dabbelt
@ 2023-03-15  4:54     ` Anup Patel
  0 siblings, 0 replies; 14+ messages in thread
From: Anup Patel @ 2023-03-15  4:54 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: ajones, apatel, linux-riscv, kvm-riscv, devicetree, Conor Dooley,
	Paul Walmsley, sudip.mukherjee, ben.dooks, Atish Patra, aou,
	krzysztof.kozlowski+dt, robh, jszhang, heiko

On Wed, Mar 15, 2023 at 10:08 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Fri, 24 Feb 2023 08:26:31 PST (-0800), ajones@ventanamicro.com wrote:
> > Guests may use the cbo.zero instruction when the CPU has the Zicboz
> > extension and the hypervisor sets henvcfg.CBZE.
> >
> > Add Zicboz support for KVM guests which may be enabled and
> > disabled from KVM userspace using the ISA extension ONE_REG API.
> >
> > Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Reviewed-by: Anup Patel <anup@brainfault.org>
>
> Sorry, I guess I wasn't looking closely enough.  It's just a review, not
> an ack.
>
> Anup: is it OK if this goes along with the others?

Yes, no problem.

Acked-by: Anup Patel <anup@brainfault.org>

Regards,
Anup

>
> > ---
> >  arch/riscv/include/uapi/asm/kvm.h | 1 +
> >  arch/riscv/kvm/vcpu.c             | 4 ++++
> >  2 files changed, 5 insertions(+)
> >
> > diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
> > index c1a1bb0fa91c..e44c1e90eaa7 100644
> > --- a/arch/riscv/include/uapi/asm/kvm.h
> > +++ b/arch/riscv/include/uapi/asm/kvm.h
> > @@ -106,6 +106,7 @@ enum KVM_RISCV_ISA_EXT_ID {
> >       KVM_RISCV_ISA_EXT_SVINVAL,
> >       KVM_RISCV_ISA_EXT_ZIHINTPAUSE,
> >       KVM_RISCV_ISA_EXT_ZICBOM,
> > +     KVM_RISCV_ISA_EXT_ZICBOZ,
> >       KVM_RISCV_ISA_EXT_MAX,
> >  };
> >
> > diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> > index e5126cefbc87..198ee86cad38 100644
> > --- a/arch/riscv/kvm/vcpu.c
> > +++ b/arch/riscv/kvm/vcpu.c
> > @@ -63,6 +63,7 @@ static const unsigned long kvm_isa_ext_arr[] = {
> >       KVM_ISA_EXT_ARR(SVPBMT),
> >       KVM_ISA_EXT_ARR(ZIHINTPAUSE),
> >       KVM_ISA_EXT_ARR(ZICBOM),
> > +     KVM_ISA_EXT_ARR(ZICBOZ),
> >  };
> >
> >  static unsigned long kvm_riscv_vcpu_base2isa_ext(unsigned long base_ext)
> > @@ -865,6 +866,9 @@ static void kvm_riscv_vcpu_update_config(const unsigned long *isa)
> >       if (riscv_isa_extension_available(isa, ZICBOM))
> >               henvcfg |= (ENVCFG_CBIE | ENVCFG_CBCFE);
> >
> > +     if (riscv_isa_extension_available(isa, ZICBOZ))
> > +             henvcfg |= ENVCFG_CBZE;
> > +
> >       csr_write(CSR_HENVCFG, henvcfg);
> >  #ifdef CONFIG_32BIT
> >       csr_write(CSR_HENVCFGH, henvcfg >> 32);

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page
  2023-03-15  4:35 ` [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Palmer Dabbelt
@ 2023-03-15  8:53   ` Andrew Jones
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-03-15  8:53 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: linux-riscv, kvm-riscv, devicetree, Conor Dooley, Paul Walmsley,
	sudip.mukherjee, ben.dooks, Atish Patra, aou, apatel,
	krzysztof.kozlowski+dt, robh, jszhang, heiko

On Tue, Mar 14, 2023 at 09:35:09PM -0700, Palmer Dabbelt wrote:
> On Fri, 24 Feb 2023 08:26:23 PST (-0800), ajones@ventanamicro.com wrote:
> > When the Zicboz extension is available we can more rapidly zero naturally
> > aligned Zicboz block sized chunks of memory. As pages are always page
> 
> I guess we're sort of in a grey area here: this is just a performance thing
> so in theory some sort of benchmark should be required, but IMO it's OK to
> just hand wave this one away -- if the "zero a cache block" instruction
> doesn't make "zero a page" go faster then something has gone so far off the
> rails it's probably better to just pretend the machine doesn't implement
> Zicboz.
> 
> This all LGTM, so unless anyone's opposed or it blows up on testing
> overnight then it'll be on for-next.

Thanks, Palmer!

drew

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page
  2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (8 preceding siblings ...)
  2023-03-15  4:35 ` [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Palmer Dabbelt
@ 2023-03-18  1:00 ` patchwork-bot+linux-riscv
  9 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+linux-riscv @ 2023-03-18  1:00 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-riscv, kvm-riscv, devicetree, conor.dooley, paul.walmsley,
	palmer, sudip.mukherjee, ben.dooks, atishp, aou, apatel,
	krzysztof.kozlowski+dt, robh, jszhang, heiko

Hello:

This series was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Fri, 24 Feb 2023 17:26:23 +0100 you wrote:
> When the Zicboz extension is available we can more rapidly zero naturally
> aligned Zicboz block sized chunks of memory. As pages are always page
> aligned and are larger than any Zicboz block size will be, then
> clear_page() appears to be a good candidate for the extension. While cycle
> count and energy consumption should also be considered, we can be pretty
> certain that implementing clear_page() with the Zicboz extension is a win
> by comparing the new dynamic instruction count with its current count[1].
> Doing so we see that the new count is just over a quarter of the old count
> (see patch6's commit message for more details).
> 
> [...]

Here is the summary with links:
  - [v6,1/8] RISC-V: alternatives: Support patching multiple insns in assembly
    https://git.kernel.org/riscv/c/0b2f658f5370
  - [v6,2/8] RISC-V: Factor out body of riscv_init_cbom_blocksize loop
    https://git.kernel.org/riscv/c/8b05e7d0408a
  - [v6,3/8] dt-bindings: riscv: Document cboz-block-size
    https://git.kernel.org/riscv/c/ea20f117ab99
  - [v6,4/8] RISC-V: Add Zicboz detection and block size parsing
    https://git.kernel.org/riscv/c/7ea5a73617e9
  - [v6,5/8] RISC-V: cpufeatures: Put the upper 16 bits of patch ID to work
    https://git.kernel.org/riscv/c/d25f256332cc
  - [v6,6/8] RISC-V: Use Zicboz in clear_page when available
    https://git.kernel.org/riscv/c/ab0f77465e3e
  - [v6,7/8] RISC-V: KVM: Provide UAPI for Zicboz block size
    https://git.kernel.org/riscv/c/665fd8862413
  - [v6,8/8] RISC-V: KVM: Expose Zicboz to the guest
    https://git.kernel.org/riscv/c/b20f67994f35

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-03-18  1:00 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-24 16:26 [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
2023-02-24 16:26 ` [PATCH v6 1/8] RISC-V: alternatives: Support patching multiple insns in assembly Andrew Jones
2023-02-24 16:26 ` [PATCH v6 2/8] RISC-V: Factor out body of riscv_init_cbom_blocksize loop Andrew Jones
2023-02-24 16:26 ` [PATCH v6 3/8] dt-bindings: riscv: Document cboz-block-size Andrew Jones
2023-02-24 16:26 ` [PATCH v6 4/8] RISC-V: Add Zicboz detection and block size parsing Andrew Jones
2023-02-24 16:26 ` [PATCH v6 5/8] RISC-V: cpufeatures: Put the upper 16 bits of patch ID to work Andrew Jones
2023-02-24 16:26 ` [PATCH v6 6/8] RISC-V: Use Zicboz in clear_page when available Andrew Jones
2023-02-24 16:26 ` [PATCH v6 7/8] RISC-V: KVM: Provide UAPI for Zicboz block size Andrew Jones
2023-02-24 16:26 ` [PATCH v6 8/8] RISC-V: KVM: Expose Zicboz to the guest Andrew Jones
2023-03-15  4:38   ` Palmer Dabbelt
2023-03-15  4:54     ` Anup Patel
2023-03-15  4:35 ` [PATCH v6 0/8] RISC-V: Apply Zicboz to clear_page Palmer Dabbelt
2023-03-15  8:53   ` Andrew Jones
2023-03-18  1:00 ` patchwork-bot+linux-riscv

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).