* [PATCH v3 0/14] Zbb string optimizations and call support in alternatives
@ 2022-11-30 22:56 ` Heiko Stuebner
  0 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

The Zbb extension can be used to make string functions run a lot
faster.

There are essentially two problems to solve:
- making it possible for the str* functions to replace
  their implementation in a performant way

  This is done by inlining the core functions and then
  using alternatives to call the actual variant.

  This of course will need a more intelligent selection mechanism
  down the road, once more variants using different available
  extensions exist.

- actually allowing calls in alternatives
  Function calls use an auipc + jalr pair to reach targets through
  a 32-bit pc-relative offset, but that offset is calculated at
  compile time while alternatives live in a different section, so
  when the patch gets applied the call would point to the wrong
  location.

  So, similar to arm64, the target addresses need to be updated
  when the alternative is applied.

  This is probably also helpful for other things needing more
  complex code in alternatives.
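To illustrate the fixup described above, here is a minimal sketch in
plain C - an illustration under simplified assumptions, not the
kernel's actual patching code - of how the immediates of a relocated
auipc + jalr pair can be rewritten so the call still reaches its
original target:

```c
#include <stdint.h>

/* Illustrative sketch only: rewrite the immediates of an auipc+jalr
 * pair that was assembled to run at alt_pc but will actually run at
 * patch_pc, keeping the original call target. */
static void fixup_auipc_jalr(uint32_t *insn, uintptr_t alt_pc, uintptr_t patch_pc)
{
	uint32_t auipc = insn[0];
	uint32_t jalr  = insn[1];

	/* recover the target the pair was assembled to reach from alt_pc */
	int32_t hi = (int32_t)(auipc & 0xfffff000);   /* auipc U-imm, already shifted */
	int32_t lo = (int32_t)jalr >> 20;             /* jalr I-imm, sign-extended */
	uintptr_t target = alt_pc + hi + lo;

	/* recompute the offset from where the code will actually run */
	int64_t off = (int64_t)target - (int64_t)patch_pc;

	/* +0x800 compensates for the sign-extension of the low 12 bits */
	uint32_t new_hi = (uint32_t)(off + 0x800) & 0xfffff000;
	uint32_t new_lo = (uint32_t)off & 0xfff;

	insn[0] = (auipc & 0x00000fff) | new_hi;         /* keep opcode + rd */
	insn[1] = (jalr & 0x000fffff) | (new_lo << 20);  /* keep opcode, rd, funct3, rs1 */
}
```

The +0x800 rounding is what any auipc/jalr relocation has to do,
because jalr sign-extends its low 12 bits.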


In my half-scientific test case of running the functions in question
on a 95-character string in a loop of 10000 iterations, the Zbb
variants shave off around 2/3 of the original runtime.


For v2 I got into some sort of cleanup spree for the general instruction
parsing that already existed. A number of places do their own
instruction parsing and I tried consolidating some of them.

Notably, the kvm parts still do their own parsing, but I had to stop somewhere :-)

The series is based on v6.1-rc7 right now.

changes since v2:
- add patch fixing the c.jalr funct4 value
- reword some commit messages
- fix position of auipc addition patch (earlier)
- fix compile errors from patch-reordering gone wrong
  (worked at the end of v2, but compiling individual patches
   caused issues) - patches are now tested individually
- limit Zbb variants to GNU as for now
  (LLVM support for .option arch is still under review)
- prevent str-functions from getting optimized to builtin-variants

changes since v1:
- a number of generalizations/cleanups for instruction parsing
- use accessor function to access instructions (Emil)
- actually patch the correct location when having more than one
  instruction in an alternative block
- string function cleanups (comments etc) (Conor)
- move zbb extension above s* extensions in cpu.c lists

changes since rfc:
- make Zbb code actually work
- drop some unneeded patches
- a lot of cleanups

Heiko Stuebner (14):
  RISC-V: fix funct4 definition for c.jalr in parse_asm.h
  RISC-V: add prefix to all constants/macros in parse_asm.h
  RISC-V: detach funct-values from their offset
  RISC-V: add ebreak instructions to definitions
  RISC-V: add auipc elements to parse_asm header
  RISC-V: Move riscv_insn_is_* macros into a common header
  RISC-V: rename parse_asm.h to insn.h
  RISC-V: kprobes: use central defined funct3 constants
  RISC-V: add U-type imm parsing to insn.h header
  RISC-V: add rd reg parsing to insn.h header
  RISC-V: fix auipc-jalr addresses in patched alternatives
  efi/riscv: libstub: mark when compiling libstub
  RISC-V: add infrastructure to allow different str* implementations
  RISC-V: add zbb support to string functions

 arch/riscv/Kconfig                       |  24 ++
 arch/riscv/Makefile                      |   3 +
 arch/riscv/include/asm/alternative.h     |   3 +
 arch/riscv/include/asm/errata_list.h     |   3 +-
 arch/riscv/include/asm/hwcap.h           |   1 +
 arch/riscv/include/asm/insn.h            | 292 +++++++++++++++++++++++
 arch/riscv/include/asm/parse_asm.h       | 219 -----------------
 arch/riscv/include/asm/string.h          |  83 +++++++
 arch/riscv/kernel/alternative.c          |  72 ++++++
 arch/riscv/kernel/cpu.c                  |   1 +
 arch/riscv/kernel/cpufeature.c           |  29 ++-
 arch/riscv/kernel/image-vars.h           |   6 +-
 arch/riscv/kernel/kgdb.c                 |  63 ++---
 arch/riscv/kernel/probes/simulate-insn.c |  19 +-
 arch/riscv/kernel/probes/simulate-insn.h |  26 +-
 arch/riscv/lib/Makefile                  |   6 +
 arch/riscv/lib/strcmp.S                  |  38 +++
 arch/riscv/lib/strcmp_zbb.S              |  96 ++++++++
 arch/riscv/lib/strlen.S                  |  29 +++
 arch/riscv/lib/strlen_zbb.S              | 115 +++++++++
 arch/riscv/lib/strncmp.S                 |  41 ++++
 arch/riscv/lib/strncmp_zbb.S             | 112 +++++++++
 drivers/firmware/efi/libstub/Makefile    |   2 +-
 23 files changed, 982 insertions(+), 301 deletions(-)
 create mode 100644 arch/riscv/include/asm/insn.h
 delete mode 100644 arch/riscv/include/asm/parse_asm.h
 create mode 100644 arch/riscv/lib/strcmp.S
 create mode 100644 arch/riscv/lib/strcmp_zbb.S
 create mode 100644 arch/riscv/lib/strlen.S
 create mode 100644 arch/riscv/lib/strlen_zbb.S
 create mode 100644 arch/riscv/lib/strncmp.S
 create mode 100644 arch/riscv/lib/strncmp_zbb.S

-- 
2.35.1


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [PATCH v3 01/14] RISC-V: fix funct4 definition for c.jalr in parse_asm.h
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

The opcode definition for c.jalr is
    c.jalr c_rs1_n0  1..0=2 15..13=4 12=1 6..2=0

This means funct4, consisting of bits [15:12], is 1001b, so the value is 0x9.

Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/parse_asm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
index f36368de839f..7fee806805c1 100644
--- a/arch/riscv/include/asm/parse_asm.h
+++ b/arch/riscv/include/asm/parse_asm.h
@@ -125,7 +125,7 @@
 #define FUNCT3_C_J		0xa000
 #define FUNCT3_C_JAL		0x2000
 #define FUNCT4_C_JR		0x8000
-#define FUNCT4_C_JALR		0xf000
+#define FUNCT4_C_JALR		0x9000
 
 #define FUNCT12_SRET		0x10200000
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread
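The corrected value can be sanity-checked against the encoding. The
following is an illustrative sketch (not kernel code) of extracting
funct4 from a 16-bit compressed instruction and matching c.jalr the
way the MATCH/MASK pairs do:

```c
#include <stdint.h>
#include <stdbool.h>

/* funct4 occupies bits [15:12] of an RVC instruction */
static uint16_t rvc_funct4(uint16_t insn)
{
	return (insn >> 12) & 0xf;
}

/* c.jalr: funct4 = 1001b, opcode C2 (10b), rs2 = 0, rs1 != 0.
 * Mask/match mirror MASK_C_JALR (0xf07f) and the fixed 0x9002 value. */
static bool is_c_jalr(uint16_t insn)
{
	return (insn & 0xf07f) == 0x9002 && ((insn >> 7) & 0x1f) != 0;
}
```

With the old (wrong) 0xf value for funct4, c.jalr instructions such as
`c.jalr a0` (0x9502) would never have matched.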

* [PATCH v3 02/14] RISC-V: add prefix to all constants/macros in parse_asm.h
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Some of the constants and macros already have suitable RV_, RVG_ or
RVC_ prefixes.

Extend this to the rest of the file as well, as we want to use these
things in a broader scope soon.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/parse_asm.h | 194 ++++++++++++++---------------
 arch/riscv/kernel/kgdb.c           |  40 +++---
 2 files changed, 117 insertions(+), 117 deletions(-)

diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
index 7fee806805c1..ea51542e0c65 100644
--- a/arch/riscv/include/asm/parse_asm.h
+++ b/arch/riscv/include/asm/parse_asm.h
@@ -6,37 +6,37 @@
 #include <linux/bits.h>
 
 /* The bit field of immediate value in I-type instruction */
-#define I_IMM_SIGN_OPOFF	31
-#define I_IMM_11_0_OPOFF	20
-#define I_IMM_SIGN_OFF		12
-#define I_IMM_11_0_OFF		0
-#define I_IMM_11_0_MASK		GENMASK(11, 0)
+#define RV_I_IMM_SIGN_OPOFF	31
+#define RV_I_IMM_11_0_OPOFF	20
+#define RV_I_IMM_SIGN_OFF	12
+#define RV_I_IMM_11_0_OFF	0
+#define RV_I_IMM_11_0_MASK	GENMASK(11, 0)
 
 /* The bit field of immediate value in J-type instruction */
-#define J_IMM_SIGN_OPOFF	31
-#define J_IMM_10_1_OPOFF	21
-#define J_IMM_11_OPOFF		20
-#define J_IMM_19_12_OPOFF	12
-#define J_IMM_SIGN_OFF		20
-#define J_IMM_10_1_OFF		1
-#define J_IMM_11_OFF		11
-#define J_IMM_19_12_OFF		12
-#define J_IMM_10_1_MASK		GENMASK(9, 0)
-#define J_IMM_11_MASK		GENMASK(0, 0)
-#define J_IMM_19_12_MASK	GENMASK(7, 0)
+#define RV_J_IMM_SIGN_OPOFF	31
+#define RV_J_IMM_10_1_OPOFF	21
+#define RV_J_IMM_11_OPOFF	20
+#define RV_J_IMM_19_12_OPOFF	12
+#define RV_J_IMM_SIGN_OFF	20
+#define RV_J_IMM_10_1_OFF	1
+#define RV_J_IMM_11_OFF		11
+#define RV_J_IMM_19_12_OFF	12
+#define RV_J_IMM_10_1_MASK	GENMASK(9, 0)
+#define RV_J_IMM_11_MASK	GENMASK(0, 0)
+#define RV_J_IMM_19_12_MASK	GENMASK(7, 0)
 
 /* The bit field of immediate value in B-type instruction */
-#define B_IMM_SIGN_OPOFF	31
-#define B_IMM_10_5_OPOFF	25
-#define B_IMM_4_1_OPOFF		8
-#define B_IMM_11_OPOFF		7
-#define B_IMM_SIGN_OFF		12
-#define B_IMM_10_5_OFF		5
-#define B_IMM_4_1_OFF		1
-#define B_IMM_11_OFF		11
-#define B_IMM_10_5_MASK		GENMASK(5, 0)
-#define B_IMM_4_1_MASK		GENMASK(3, 0)
-#define B_IMM_11_MASK		GENMASK(0, 0)
+#define RV_B_IMM_SIGN_OPOFF	31
+#define RV_B_IMM_10_5_OPOFF	25
+#define RV_B_IMM_4_1_OPOFF	8
+#define RV_B_IMM_11_OPOFF	7
+#define RV_B_IMM_SIGN_OFF	12
+#define RV_B_IMM_10_5_OFF	5
+#define RV_B_IMM_4_1_OFF	1
+#define RV_B_IMM_11_OFF		11
+#define RV_B_IMM_10_5_MASK	GENMASK(5, 0)
+#define RV_B_IMM_4_1_MASK	GENMASK(3, 0)
+#define RV_B_IMM_11_MASK	GENMASK(0, 0)
 
 /* The register offset in RVG instruction */
 #define RVG_RS1_OPOFF		15
@@ -100,71 +100,71 @@
 #define RVC_C2_RD_OPOFF		7
 
 /* parts of opcode for RVG*/
-#define OPCODE_BRANCH		0x63
-#define OPCODE_JALR		0x67
-#define OPCODE_JAL		0x6f
-#define OPCODE_SYSTEM		0x73
+#define RVG_OPCODE_BRANCH	0x63
+#define RVG_OPCODE_JALR		0x67
+#define RVG_OPCODE_JAL		0x6f
+#define RVG_OPCODE_SYSTEM	0x73
 
 /* parts of opcode for RVC*/
-#define OPCODE_C_0		0x0
-#define OPCODE_C_1		0x1
-#define OPCODE_C_2		0x2
+#define RVC_OPCODE_C0		0x0
+#define RVC_OPCODE_C1		0x1
+#define RVC_OPCODE_C2		0x2
 
 /* parts of funct3 code for I, M, A extension*/
-#define FUNCT3_JALR		0x0
-#define FUNCT3_BEQ		0x0
-#define FUNCT3_BNE		0x1000
-#define FUNCT3_BLT		0x4000
-#define FUNCT3_BGE		0x5000
-#define FUNCT3_BLTU		0x6000
-#define FUNCT3_BGEU		0x7000
+#define RVG_FUNCT3_JALR		0x0
+#define RVG_FUNCT3_BEQ		0x0
+#define RVG_FUNCT3_BNE		0x1000
+#define RVG_FUNCT3_BLT		0x4000
+#define RVG_FUNCT3_BGE		0x5000
+#define RVG_FUNCT3_BLTU		0x6000
+#define RVG_FUNCT3_BGEU		0x7000
 
 /* parts of funct3 code for C extension*/
-#define FUNCT3_C_BEQZ		0xc000
-#define FUNCT3_C_BNEZ		0xe000
-#define FUNCT3_C_J		0xa000
-#define FUNCT3_C_JAL		0x2000
-#define FUNCT4_C_JR		0x8000
-#define FUNCT4_C_JALR		0x9000
-
-#define FUNCT12_SRET		0x10200000
-
-#define MATCH_JALR		(FUNCT3_JALR | OPCODE_JALR)
-#define MATCH_JAL		(OPCODE_JAL)
-#define MATCH_BEQ		(FUNCT3_BEQ | OPCODE_BRANCH)
-#define MATCH_BNE		(FUNCT3_BNE | OPCODE_BRANCH)
-#define MATCH_BLT		(FUNCT3_BLT | OPCODE_BRANCH)
-#define MATCH_BGE		(FUNCT3_BGE | OPCODE_BRANCH)
-#define MATCH_BLTU		(FUNCT3_BLTU | OPCODE_BRANCH)
-#define MATCH_BGEU		(FUNCT3_BGEU | OPCODE_BRANCH)
-#define MATCH_SRET		(FUNCT12_SRET | OPCODE_SYSTEM)
-#define MATCH_C_BEQZ		(FUNCT3_C_BEQZ | OPCODE_C_1)
-#define MATCH_C_BNEZ		(FUNCT3_C_BNEZ | OPCODE_C_1)
-#define MATCH_C_J		(FUNCT3_C_J | OPCODE_C_1)
-#define MATCH_C_JAL		(FUNCT3_C_JAL | OPCODE_C_1)
-#define MATCH_C_JR		(FUNCT4_C_JR | OPCODE_C_2)
-#define MATCH_C_JALR		(FUNCT4_C_JALR | OPCODE_C_2)
-
-#define MASK_JALR		0x707f
-#define MASK_JAL		0x7f
-#define MASK_C_JALR		0xf07f
-#define MASK_C_JR		0xf07f
-#define MASK_C_JAL		0xe003
-#define MASK_C_J		0xe003
-#define MASK_BEQ		0x707f
-#define MASK_BNE		0x707f
-#define MASK_BLT		0x707f
-#define MASK_BGE		0x707f
-#define MASK_BLTU		0x707f
-#define MASK_BGEU		0x707f
-#define MASK_C_BEQZ		0xe003
-#define MASK_C_BNEZ		0xe003
-#define MASK_SRET		0xffffffff
+#define RVC_FUNCT3_C_BEQZ	0xc000
+#define RVC_FUNCT3_C_BNEZ	0xe000
+#define RVC_FUNCT3_C_J		0xa000
+#define RVC_FUNCT3_C_JAL	0x2000
+#define RVC_FUNCT4_C_JR		0x8000
+#define RVC_FUNCT4_C_JALR	0x9000
+
+#define RVG_FUNCT12_SRET	0x10200000
+
+#define RVG_MATCH_JALR		(RVG_FUNCT3_JALR | RVG_OPCODE_JALR)
+#define RVG_MATCH_JAL		(RVG_OPCODE_JAL)
+#define RVG_MATCH_BEQ		(RVG_FUNCT3_BEQ | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BNE		(RVG_FUNCT3_BNE | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BLT		(RVG_FUNCT3_BLT | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BGE		(RVG_FUNCT3_BGE | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BLTU		(RVG_FUNCT3_BLTU | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BGEU		(RVG_FUNCT3_BGEU | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_SRET		(RVG_FUNCT12_SRET | RVG_OPCODE_SYSTEM)
+#define RVC_MATCH_C_BEQZ	(RVC_FUNCT3_C_BEQZ | RVC_OPCODE_C1)
+#define RVC_MATCH_C_BNEZ	(RVC_FUNCT3_C_BNEZ | RVC_OPCODE_C1)
+#define RVC_MATCH_C_J		(RVC_FUNCT3_C_J | RVC_OPCODE_C1)
+#define RVC_MATCH_C_JAL		(RVC_FUNCT3_C_JAL | RVC_OPCODE_C1)
+#define RVC_MATCH_C_JR		(RVC_FUNCT4_C_JR | RVC_OPCODE_C2)
+#define RVC_MATCH_C_JALR	(RVC_FUNCT4_C_JALR | RVC_OPCODE_C2)
+
+#define RVG_MASK_JALR		0x707f
+#define RVG_MASK_JAL		0x7f
+#define RVC_MASK_C_JALR		0xf07f
+#define RVC_MASK_C_JR		0xf07f
+#define RVC_MASK_C_JAL		0xe003
+#define RVC_MASK_C_J		0xe003
+#define RVG_MASK_BEQ		0x707f
+#define RVG_MASK_BNE		0x707f
+#define RVG_MASK_BLT		0x707f
+#define RVG_MASK_BGE		0x707f
+#define RVG_MASK_BLTU		0x707f
+#define RVG_MASK_BGEU		0x707f
+#define RVC_MASK_C_BEQZ		0xe003
+#define RVC_MASK_C_BNEZ		0xe003
+#define RVG_MASK_SRET		0xffffffff
 
 #define __INSN_LENGTH_MASK	_UL(0x3)
 #define __INSN_LENGTH_GE_32	_UL(0x3)
 #define __INSN_OPCODE_MASK	_UL(0x7F)
-#define __INSN_BRANCH_OPCODE	_UL(OPCODE_BRANCH)
+#define __INSN_BRANCH_OPCODE	_UL(RVG_OPCODE_BRANCH)
 
 /* Define a series of is_XXX_insn functions to check if the value INSN
  * is an instance of instruction XXX.
@@ -180,26 +180,26 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
 #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
 #define RVC_X(X, s, mask) RV_X(X, s, mask)
 
-#define EXTRACT_JTYPE_IMM(x) \
+#define RV_EXTRACT_JTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
-	(RV_X(x_, J_IMM_10_1_OPOFF, J_IMM_10_1_MASK) << J_IMM_10_1_OFF) | \
-	(RV_X(x_, J_IMM_11_OPOFF, J_IMM_11_MASK) << J_IMM_11_OFF) | \
-	(RV_X(x_, J_IMM_19_12_OPOFF, J_IMM_19_12_MASK) << J_IMM_19_12_OFF) | \
-	(RV_IMM_SIGN(x_) << J_IMM_SIGN_OFF); })
+	(RV_X(x_, RV_J_IMM_10_1_OPOFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OFF) | \
+	(RV_X(x_, RV_J_IMM_11_OPOFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OFF) | \
+	(RV_X(x_, RV_J_IMM_19_12_OPOFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OFF) | \
+	(RV_IMM_SIGN(x_) << RV_J_IMM_SIGN_OFF); })
 
-#define EXTRACT_ITYPE_IMM(x) \
+#define RV_EXTRACT_ITYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
-	(RV_X(x_, I_IMM_11_0_OPOFF, I_IMM_11_0_MASK)) | \
-	(RV_IMM_SIGN(x_) << I_IMM_SIGN_OFF); })
+	(RV_X(x_, RV_I_IMM_11_0_OPOFF, RV_I_IMM_11_0_MASK)) | \
+	(RV_IMM_SIGN(x_) << RV_I_IMM_SIGN_OFF); })
 
-#define EXTRACT_BTYPE_IMM(x) \
+#define RV_EXTRACT_BTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
-	(RV_X(x_, B_IMM_4_1_OPOFF, B_IMM_4_1_MASK) << B_IMM_4_1_OFF) | \
-	(RV_X(x_, B_IMM_10_5_OPOFF, B_IMM_10_5_MASK) << B_IMM_10_5_OFF) | \
-	(RV_X(x_, B_IMM_11_OPOFF, B_IMM_11_MASK) << B_IMM_11_OFF) | \
-	(RV_IMM_SIGN(x_) << B_IMM_SIGN_OFF); })
+	(RV_X(x_, RV_B_IMM_4_1_OPOFF, RV_B_IMM_4_1_MASK) << RV_B_IMM_4_1_OFF) | \
+	(RV_X(x_, RV_B_IMM_10_5_OPOFF, RV_B_IMM_10_5_MASK) << RV_B_IMM_10_5_OFF) | \
+	(RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \
+	(RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); })
 
-#define EXTRACT_RVC_J_IMM(x) \
+#define RVC_EXTRACT_JTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
 	(RVC_X(x_, RVC_J_IMM_3_1_OPOFF, RVC_J_IMM_3_1_MASK) << RVC_J_IMM_3_1_OFF) | \
 	(RVC_X(x_, RVC_J_IMM_4_OPOFF, RVC_J_IMM_4_MASK) << RVC_J_IMM_4_OFF) | \
@@ -210,7 +210,7 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
 	(RVC_X(x_, RVC_J_IMM_10_OPOFF, RVC_J_IMM_10_MASK) << RVC_J_IMM_10_OFF) | \
 	(RVC_IMM_SIGN(x_) << RVC_J_IMM_SIGN_OFF); })
 
-#define EXTRACT_RVC_B_IMM(x) \
+#define RVC_EXTRACT_BTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
 	(RVC_X(x_, RVC_B_IMM_2_1_OPOFF, RVC_B_IMM_2_1_MASK) << RVC_B_IMM_2_1_OFF) | \
 	(RVC_X(x_, RVC_B_IMM_4_3_OPOFF, RVC_B_IMM_4_3_MASK) << RVC_B_IMM_4_3_OFF) | \
diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
index 963ed7edcff2..137c6d870d58 100644
--- a/arch/riscv/kernel/kgdb.c
+++ b/arch/riscv/kernel/kgdb.c
@@ -29,20 +29,20 @@ DECLARE_INSN(c_jal, MATCH_C_JAL, MASK_C_JAL)
 #else
 #define is_c_jal_insn(opcode) 0
 #endif
-DECLARE_INSN(jalr, MATCH_JALR, MASK_JALR)
-DECLARE_INSN(jal, MATCH_JAL, MASK_JAL)
-DECLARE_INSN(c_jr, MATCH_C_JR, MASK_C_JR)
-DECLARE_INSN(c_jalr, MATCH_C_JALR, MASK_C_JALR)
-DECLARE_INSN(c_j, MATCH_C_J, MASK_C_J)
-DECLARE_INSN(beq, MATCH_BEQ, MASK_BEQ)
-DECLARE_INSN(bne, MATCH_BNE, MASK_BNE)
-DECLARE_INSN(blt, MATCH_BLT, MASK_BLT)
-DECLARE_INSN(bge, MATCH_BGE, MASK_BGE)
-DECLARE_INSN(bltu, MATCH_BLTU, MASK_BLTU)
-DECLARE_INSN(bgeu, MATCH_BGEU, MASK_BGEU)
-DECLARE_INSN(c_beqz, MATCH_C_BEQZ, MASK_C_BEQZ)
-DECLARE_INSN(c_bnez, MATCH_C_BNEZ, MASK_C_BNEZ)
-DECLARE_INSN(sret, MATCH_SRET, MASK_SRET)
+DECLARE_INSN(jalr, RVG_MATCH_JALR, RVG_MASK_JALR)
+DECLARE_INSN(jal, RVG_MATCH_JAL, RVG_MASK_JAL)
+DECLARE_INSN(c_jr, RVC_MATCH_C_JR, RVC_MASK_C_JR)
+DECLARE_INSN(c_jalr, RVC_MATCH_C_JALR, RVC_MASK_C_JALR)
+DECLARE_INSN(c_j, RVC_MATCH_C_J, RVC_MASK_C_J)
+DECLARE_INSN(beq, RVG_MATCH_BEQ, RVG_MASK_BEQ)
+DECLARE_INSN(bne, RVG_MATCH_BNE, RVG_MASK_BNE)
+DECLARE_INSN(blt, RVG_MATCH_BLT, RVG_MASK_BLT)
+DECLARE_INSN(bge, RVG_MATCH_BGE, RVG_MASK_BGE)
+DECLARE_INSN(bltu, RVG_MATCH_BLTU, RVG_MASK_BLTU)
+DECLARE_INSN(bgeu, RVG_MATCH_BGEU, RVG_MASK_BGEU)
+DECLARE_INSN(c_beqz, RVC_MATCH_C_BEQZ, RVC_MASK_C_BEQZ)
+DECLARE_INSN(c_bnez, RVC_MATCH_C_BNEZ, RVC_MASK_C_BNEZ)
+DECLARE_INSN(sret, RVG_MATCH_SRET, RVG_MASK_SRET)
 
 static int decode_register_index(unsigned long opcode, int offset)
 {
@@ -69,19 +69,19 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
 			rs1_num = decode_register_index(op_code, RVC_C2_RS1_OPOFF);
 			*next_addr = regs_ptr[rs1_num];
 		} else if (is_c_j_insn(op_code) || is_c_jal_insn(op_code)) {
-			*next_addr = EXTRACT_RVC_J_IMM(op_code) + pc;
+			*next_addr = RVC_EXTRACT_JTYPE_IMM(op_code) + pc;
 		} else if (is_c_beqz_insn(op_code)) {
 			rs1_num = decode_register_index_short(op_code,
 							      RVC_C1_RS1_OPOFF);
 			if (!rs1_num || regs_ptr[rs1_num] == 0)
-				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
+				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
 			else
 				*next_addr = pc + 2;
 		} else if (is_c_bnez_insn(op_code)) {
 			rs1_num =
 			    decode_register_index_short(op_code, RVC_C1_RS1_OPOFF);
 			if (rs1_num && regs_ptr[rs1_num] != 0)
-				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
+				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
 			else
 				*next_addr = pc + 2;
 		} else {
@@ -90,7 +90,7 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
 	} else {
 		if ((op_code & __INSN_OPCODE_MASK) == __INSN_BRANCH_OPCODE) {
 			bool result = false;
-			long imm = EXTRACT_BTYPE_IMM(op_code);
+			long imm = RV_EXTRACT_BTYPE_IMM(op_code);
 			unsigned long rs1_val = 0, rs2_val = 0;
 
 			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
@@ -121,12 +121,12 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
 			else
 				*next_addr = pc + 4;
 		} else if (is_jal_insn(op_code)) {
-			*next_addr = EXTRACT_JTYPE_IMM(op_code) + pc;
+			*next_addr = RV_EXTRACT_JTYPE_IMM(op_code) + pc;
 		} else if (is_jalr_insn(op_code)) {
 			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
 			if (rs1_num)
 				*next_addr = ((unsigned long *)regs)[rs1_num];
-			*next_addr += EXTRACT_ITYPE_IMM(op_code);
+			*next_addr += RV_EXTRACT_ITYPE_IMM(op_code);
 		} else if (is_sret_insn(op_code)) {
 			*next_addr = pc;
 		} else {
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread

+#define RVC_MASK_C_JALR		0xf07f
+#define RVC_MASK_C_JR		0xf07f
+#define RVC_MASK_C_JAL		0xe003
+#define RVC_MASK_C_J		0xe003
+#define RVG_MASK_BEQ		0x707f
+#define RVG_MASK_BNE		0x707f
+#define RVG_MASK_BLT		0x707f
+#define RVG_MASK_BGE		0x707f
+#define RVG_MASK_BLTU		0x707f
+#define RVG_MASK_BGEU		0x707f
+#define RVC_MASK_C_BEQZ		0xe003
+#define RVC_MASK_C_BNEZ		0xe003
+#define RVG_MASK_SRET		0xffffffff
 
 #define __INSN_LENGTH_MASK	_UL(0x3)
 #define __INSN_LENGTH_GE_32	_UL(0x3)
 #define __INSN_OPCODE_MASK	_UL(0x7F)
-#define __INSN_BRANCH_OPCODE	_UL(OPCODE_BRANCH)
+#define __INSN_BRANCH_OPCODE	_UL(RVG_OPCODE_BRANCH)
 
 /* Define a series of is_XXX_insn functions to check if the value INSN
  * is an instance of instruction XXX.
@@ -180,26 +180,26 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
 #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
 #define RVC_X(X, s, mask) RV_X(X, s, mask)
 
-#define EXTRACT_JTYPE_IMM(x) \
+#define RV_EXTRACT_JTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
-	(RV_X(x_, J_IMM_10_1_OPOFF, J_IMM_10_1_MASK) << J_IMM_10_1_OFF) | \
-	(RV_X(x_, J_IMM_11_OPOFF, J_IMM_11_MASK) << J_IMM_11_OFF) | \
-	(RV_X(x_, J_IMM_19_12_OPOFF, J_IMM_19_12_MASK) << J_IMM_19_12_OFF) | \
-	(RV_IMM_SIGN(x_) << J_IMM_SIGN_OFF); })
+	(RV_X(x_, RV_J_IMM_10_1_OPOFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OFF) | \
+	(RV_X(x_, RV_J_IMM_11_OPOFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OFF) | \
+	(RV_X(x_, RV_J_IMM_19_12_OPOFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OFF) | \
+	(RV_IMM_SIGN(x_) << RV_J_IMM_SIGN_OFF); })
 
-#define EXTRACT_ITYPE_IMM(x) \
+#define RV_EXTRACT_ITYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
-	(RV_X(x_, I_IMM_11_0_OPOFF, I_IMM_11_0_MASK)) | \
-	(RV_IMM_SIGN(x_) << I_IMM_SIGN_OFF); })
+	(RV_X(x_, RV_I_IMM_11_0_OPOFF, RV_I_IMM_11_0_MASK)) | \
+	(RV_IMM_SIGN(x_) << RV_I_IMM_SIGN_OFF); })
 
-#define EXTRACT_BTYPE_IMM(x) \
+#define RV_EXTRACT_BTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
-	(RV_X(x_, B_IMM_4_1_OPOFF, B_IMM_4_1_MASK) << B_IMM_4_1_OFF) | \
-	(RV_X(x_, B_IMM_10_5_OPOFF, B_IMM_10_5_MASK) << B_IMM_10_5_OFF) | \
-	(RV_X(x_, B_IMM_11_OPOFF, B_IMM_11_MASK) << B_IMM_11_OFF) | \
-	(RV_IMM_SIGN(x_) << B_IMM_SIGN_OFF); })
+	(RV_X(x_, RV_B_IMM_4_1_OPOFF, RV_B_IMM_4_1_MASK) << RV_B_IMM_4_1_OFF) | \
+	(RV_X(x_, RV_B_IMM_10_5_OPOFF, RV_B_IMM_10_5_MASK) << RV_B_IMM_10_5_OFF) | \
+	(RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \
+	(RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); })
 
-#define EXTRACT_RVC_J_IMM(x) \
+#define RVC_EXTRACT_JTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
 	(RVC_X(x_, RVC_J_IMM_3_1_OPOFF, RVC_J_IMM_3_1_MASK) << RVC_J_IMM_3_1_OFF) | \
 	(RVC_X(x_, RVC_J_IMM_4_OPOFF, RVC_J_IMM_4_MASK) << RVC_J_IMM_4_OFF) | \
@@ -210,7 +210,7 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
 	(RVC_X(x_, RVC_J_IMM_10_OPOFF, RVC_J_IMM_10_MASK) << RVC_J_IMM_10_OFF) | \
 	(RVC_IMM_SIGN(x_) << RVC_J_IMM_SIGN_OFF); })
 
-#define EXTRACT_RVC_B_IMM(x) \
+#define RVC_EXTRACT_BTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
 	(RVC_X(x_, RVC_B_IMM_2_1_OPOFF, RVC_B_IMM_2_1_MASK) << RVC_B_IMM_2_1_OFF) | \
 	(RVC_X(x_, RVC_B_IMM_4_3_OPOFF, RVC_B_IMM_4_3_MASK) << RVC_B_IMM_4_3_OFF) | \
diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
index 963ed7edcff2..137c6d870d58 100644
--- a/arch/riscv/kernel/kgdb.c
+++ b/arch/riscv/kernel/kgdb.c
@@ -29,20 +29,20 @@ DECLARE_INSN(c_jal, MATCH_C_JAL, MASK_C_JAL)
 #else
 #define is_c_jal_insn(opcode) 0
 #endif
-DECLARE_INSN(jalr, MATCH_JALR, MASK_JALR)
-DECLARE_INSN(jal, MATCH_JAL, MASK_JAL)
-DECLARE_INSN(c_jr, MATCH_C_JR, MASK_C_JR)
-DECLARE_INSN(c_jalr, MATCH_C_JALR, MASK_C_JALR)
-DECLARE_INSN(c_j, MATCH_C_J, MASK_C_J)
-DECLARE_INSN(beq, MATCH_BEQ, MASK_BEQ)
-DECLARE_INSN(bne, MATCH_BNE, MASK_BNE)
-DECLARE_INSN(blt, MATCH_BLT, MASK_BLT)
-DECLARE_INSN(bge, MATCH_BGE, MASK_BGE)
-DECLARE_INSN(bltu, MATCH_BLTU, MASK_BLTU)
-DECLARE_INSN(bgeu, MATCH_BGEU, MASK_BGEU)
-DECLARE_INSN(c_beqz, MATCH_C_BEQZ, MASK_C_BEQZ)
-DECLARE_INSN(c_bnez, MATCH_C_BNEZ, MASK_C_BNEZ)
-DECLARE_INSN(sret, MATCH_SRET, MASK_SRET)
+DECLARE_INSN(jalr, RVG_MATCH_JALR, RVG_MASK_JALR)
+DECLARE_INSN(jal, RVG_MATCH_JAL, RVG_MASK_JAL)
+DECLARE_INSN(c_jr, RVC_MATCH_C_JR, RVC_MASK_C_JR)
+DECLARE_INSN(c_jalr, RVC_MATCH_C_JALR, RVC_MASK_C_JALR)
+DECLARE_INSN(c_j, RVC_MATCH_C_J, RVC_MASK_C_J)
+DECLARE_INSN(beq, RVG_MATCH_BEQ, RVG_MASK_BEQ)
+DECLARE_INSN(bne, RVG_MATCH_BNE, RVG_MASK_BNE)
+DECLARE_INSN(blt, RVG_MATCH_BLT, RVG_MASK_BLT)
+DECLARE_INSN(bge, RVG_MATCH_BGE, RVG_MASK_BGE)
+DECLARE_INSN(bltu, RVG_MATCH_BLTU, RVG_MASK_BLTU)
+DECLARE_INSN(bgeu, RVG_MATCH_BGEU, RVG_MASK_BGEU)
+DECLARE_INSN(c_beqz, RVC_MATCH_C_BEQZ, RVC_MASK_C_BEQZ)
+DECLARE_INSN(c_bnez, RVC_MATCH_C_BNEZ, RVC_MASK_C_BNEZ)
+DECLARE_INSN(sret, RVG_MATCH_SRET, RVG_MASK_SRET)
 
 static int decode_register_index(unsigned long opcode, int offset)
 {
@@ -69,19 +69,19 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
 			rs1_num = decode_register_index(op_code, RVC_C2_RS1_OPOFF);
 			*next_addr = regs_ptr[rs1_num];
 		} else if (is_c_j_insn(op_code) || is_c_jal_insn(op_code)) {
-			*next_addr = EXTRACT_RVC_J_IMM(op_code) + pc;
+			*next_addr = RVC_EXTRACT_JTYPE_IMM(op_code) + pc;
 		} else if (is_c_beqz_insn(op_code)) {
 			rs1_num = decode_register_index_short(op_code,
 							      RVC_C1_RS1_OPOFF);
 			if (!rs1_num || regs_ptr[rs1_num] == 0)
-				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
+				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
 			else
 				*next_addr = pc + 2;
 		} else if (is_c_bnez_insn(op_code)) {
 			rs1_num =
 			    decode_register_index_short(op_code, RVC_C1_RS1_OPOFF);
 			if (rs1_num && regs_ptr[rs1_num] != 0)
-				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
+				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
 			else
 				*next_addr = pc + 2;
 		} else {
@@ -90,7 +90,7 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
 	} else {
 		if ((op_code & __INSN_OPCODE_MASK) == __INSN_BRANCH_OPCODE) {
 			bool result = false;
-			long imm = EXTRACT_BTYPE_IMM(op_code);
+			long imm = RV_EXTRACT_BTYPE_IMM(op_code);
 			unsigned long rs1_val = 0, rs2_val = 0;
 
 			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
@@ -121,12 +121,12 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
 			else
 				*next_addr = pc + 4;
 		} else if (is_jal_insn(op_code)) {
-			*next_addr = EXTRACT_JTYPE_IMM(op_code) + pc;
+			*next_addr = RV_EXTRACT_JTYPE_IMM(op_code) + pc;
 		} else if (is_jalr_insn(op_code)) {
 			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
 			if (rs1_num)
 				*next_addr = ((unsigned long *)regs)[rs1_num];
-			*next_addr += EXTRACT_ITYPE_IMM(op_code);
+			*next_addr += RV_EXTRACT_ITYPE_IMM(op_code);
 		} else if (is_sret_insn(op_code)) {
 			*next_addr = pc;
 		} else {
-- 
2.35.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH v3 03/14] RISC-V: detach funct-values from their offset
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Rather than defining funct3, funct4, etc values pre-shifted to their
target-position in an instruction, define the values themselves and
only shift them where needed.

This allows using these funct-values in other places as well, for example
in instruction-decoding functions.

At the same time this also reduces the use of magic numbers that one
would need a spec manual to understand.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/parse_asm.h | 100 +++++++++++++++++------------
 1 file changed, 59 insertions(+), 41 deletions(-)

diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
index ea51542e0c65..e3f87da108f4 100644
--- a/arch/riscv/include/asm/parse_asm.h
+++ b/arch/riscv/include/asm/parse_asm.h
@@ -5,6 +5,15 @@
 
 #include <linux/bits.h>
 
+#define RV_INSN_FUNCT3_MASK	GENMASK(14, 12)
+#define RV_INSN_FUNCT3_OPOFF	12
+#define RV_INSN_OPCODE_MASK	GENMASK(6, 0)
+#define RV_INSN_OPCODE_OPOFF	0
+#define RV_INSN_FUNCT12_OPOFF	20
+
+#define RV_ENCODE_FUNCT3(f_)	(RVG_FUNCT3_##f_ << RV_INSN_FUNCT3_OPOFF)
+#define RV_ENCODE_FUNCT12(f_)	(RVG_FUNCT12_##f_ << RV_INSN_FUNCT12_OPOFF)
+
 /* The bit field of immediate value in I-type instruction */
 #define RV_I_IMM_SIGN_OPOFF	31
 #define RV_I_IMM_11_0_OPOFF	20
@@ -84,6 +93,15 @@
 #define RVC_B_IMM_2_1_MASK	GENMASK(1, 0)
 #define RVC_B_IMM_5_MASK	GENMASK(0, 0)
 
+#define RVC_INSN_FUNCT4_MASK	GENMASK(15, 12)
+#define RVC_INSN_FUNCT4_OPOFF	12
+#define RVC_INSN_FUNCT3_MASK	GENMASK(15, 13)
+#define RVC_INSN_FUNCT3_OPOFF	13
+#define RVC_INSN_J_RS2_MASK	GENMASK(6, 2)
+#define RVC_INSN_OPCODE_MASK	GENMASK(1, 0)
+#define RVC_ENCODE_FUNCT3(f_)	(RVC_FUNCT3_##f_ << RVC_INSN_FUNCT3_OPOFF)
+#define RVC_ENCODE_FUNCT4(f_)	(RVC_FUNCT4_##f_ << RVC_INSN_FUNCT4_OPOFF)
+
 /* The register offset in RVC op=C0 instruction */
 #define RVC_C0_RS1_OPOFF	7
 #define RVC_C0_RS2_OPOFF	2
@@ -113,52 +131,52 @@
 /* parts of funct3 code for I, M, A extension*/
 #define RVG_FUNCT3_JALR		0x0
 #define RVG_FUNCT3_BEQ		0x0
-#define RVG_FUNCT3_BNE		0x1000
-#define RVG_FUNCT3_BLT		0x4000
-#define RVG_FUNCT3_BGE		0x5000
-#define RVG_FUNCT3_BLTU		0x6000
-#define RVG_FUNCT3_BGEU		0x7000
+#define RVG_FUNCT3_BNE		0x1
+#define RVG_FUNCT3_BLT		0x4
+#define RVG_FUNCT3_BGE		0x5
+#define RVG_FUNCT3_BLTU		0x6
+#define RVG_FUNCT3_BGEU		0x7
 
 /* parts of funct3 code for C extension*/
-#define RVC_FUNCT3_C_BEQZ	0xc000
-#define RVC_FUNCT3_C_BNEZ	0xe000
-#define RVC_FUNCT3_C_J		0xa000
-#define RVC_FUNCT3_C_JAL	0x2000
-#define RVC_FUNCT4_C_JR		0x8000
-#define RVC_FUNCT4_C_JALR	0x9000
+#define RVC_FUNCT3_C_BEQZ	0x6
+#define RVC_FUNCT3_C_BNEZ	0x7
+#define RVC_FUNCT3_C_J		0x5
+#define RVC_FUNCT3_C_JAL	0x1
+#define RVC_FUNCT4_C_JR		0x8
+#define RVC_FUNCT4_C_JALR	0x9
 
-#define RVG_FUNCT12_SRET	0x10200000
+#define RVG_FUNCT12_SRET	0x102
 
-#define RVG_MATCH_JALR		(RVG_FUNCT3_JALR | RVG_OPCODE_JALR)
+#define RVG_MATCH_JALR		(RV_ENCODE_FUNCT3(JALR) | RVG_OPCODE_JALR)
 #define RVG_MATCH_JAL		(RVG_OPCODE_JAL)
-#define RVG_MATCH_BEQ		(RVG_FUNCT3_BEQ | RVG_OPCODE_BRANCH)
-#define RVG_MATCH_BNE		(RVG_FUNCT3_BNE | RVG_OPCODE_BRANCH)
-#define RVG_MATCH_BLT		(RVG_FUNCT3_BLT | RVG_OPCODE_BRANCH)
-#define RVG_MATCH_BGE		(RVG_FUNCT3_BGE | RVG_OPCODE_BRANCH)
-#define RVG_MATCH_BLTU		(RVG_FUNCT3_BLTU | RVG_OPCODE_BRANCH)
-#define RVG_MATCH_BGEU		(RVG_FUNCT3_BGEU | RVG_OPCODE_BRANCH)
-#define RVG_MATCH_SRET		(RVG_FUNCT12_SRET | RVG_OPCODE_SYSTEM)
-#define RVC_MATCH_C_BEQZ	(RVC_FUNCT3_C_BEQZ | RVC_OPCODE_C1)
-#define RVC_MATCH_C_BNEZ	(RVC_FUNCT3_C_BNEZ | RVC_OPCODE_C1)
-#define RVC_MATCH_C_J		(RVC_FUNCT3_C_J | RVC_OPCODE_C1)
-#define RVC_MATCH_C_JAL		(RVC_FUNCT3_C_JAL | RVC_OPCODE_C1)
-#define RVC_MATCH_C_JR		(RVC_FUNCT4_C_JR | RVC_OPCODE_C2)
-#define RVC_MATCH_C_JALR	(RVC_FUNCT4_C_JALR | RVC_OPCODE_C2)
-
-#define RVG_MASK_JALR		0x707f
-#define RVG_MASK_JAL		0x7f
-#define RVC_MASK_C_JALR		0xf07f
-#define RVC_MASK_C_JR		0xf07f
-#define RVC_MASK_C_JAL		0xe003
-#define RVC_MASK_C_J		0xe003
-#define RVG_MASK_BEQ		0x707f
-#define RVG_MASK_BNE		0x707f
-#define RVG_MASK_BLT		0x707f
-#define RVG_MASK_BGE		0x707f
-#define RVG_MASK_BLTU		0x707f
-#define RVG_MASK_BGEU		0x707f
-#define RVC_MASK_C_BEQZ		0xe003
-#define RVC_MASK_C_BNEZ		0xe003
+#define RVG_MATCH_BEQ		(RV_ENCODE_FUNCT3(BEQ) | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BNE		(RV_ENCODE_FUNCT3(BNE) | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BLT		(RV_ENCODE_FUNCT3(BLT) | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BGE		(RV_ENCODE_FUNCT3(BGE) | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BLTU		(RV_ENCODE_FUNCT3(BLTU) | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_BGEU		(RV_ENCODE_FUNCT3(BGEU) | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_SRET		(RV_ENCODE_FUNCT12(SRET) | RVG_OPCODE_SYSTEM)
+#define RVC_MATCH_C_BEQZ	(RVC_ENCODE_FUNCT3(C_BEQZ) | RVC_OPCODE_C1)
+#define RVC_MATCH_C_BNEZ	(RVC_ENCODE_FUNCT3(C_BNEZ) | RVC_OPCODE_C1)
+#define RVC_MATCH_C_J		(RVC_ENCODE_FUNCT3(C_J) | RVC_OPCODE_C1)
+#define RVC_MATCH_C_JAL		(RVC_ENCODE_FUNCT3(C_JAL) | RVC_OPCODE_C1)
+#define RVC_MATCH_C_JR		(RVC_ENCODE_FUNCT4(C_JR) | RVC_OPCODE_C2)
+#define RVC_MATCH_C_JALR	(RVC_ENCODE_FUNCT4(C_JALR) | RVC_OPCODE_C2)
+
+#define RVG_MASK_JALR		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
+#define RVG_MASK_JAL		(RV_INSN_OPCODE_MASK)
+#define RVC_MASK_C_JALR		(RVC_INSN_FUNCT4_MASK | RVC_INSN_J_RS2_MASK | RVC_INSN_OPCODE_MASK)
+#define RVC_MASK_C_JR		(RVC_INSN_FUNCT4_MASK | RVC_INSN_J_RS2_MASK | RVC_INSN_OPCODE_MASK)
+#define RVC_MASK_C_JAL		(RVC_INSN_FUNCT3_MASK | RVC_INSN_OPCODE_MASK)
+#define RVC_MASK_C_J		(RVC_INSN_FUNCT3_MASK | RVC_INSN_OPCODE_MASK)
+#define RVG_MASK_BEQ		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
+#define RVG_MASK_BNE		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
+#define RVG_MASK_BLT		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
+#define RVG_MASK_BGE		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
+#define RVG_MASK_BLTU		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
+#define RVG_MASK_BGEU		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
+#define RVC_MASK_C_BEQZ		(RVC_INSN_FUNCT3_MASK | RVC_INSN_OPCODE_MASK)
+#define RVC_MASK_C_BNEZ		(RVC_INSN_FUNCT3_MASK | RVC_INSN_OPCODE_MASK)
 #define RVG_MASK_SRET		0xffffffff
 
 #define __INSN_LENGTH_MASK	_UL(0x3)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH v3 04/14] RISC-V: add ebreak instructions to definitions
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

kprobes need to match ebreak instructions, so add the necessary
data to enable us to centralize that functionality.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/parse_asm.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
index e3f87da108f4..e8303250f598 100644
--- a/arch/riscv/include/asm/parse_asm.h
+++ b/arch/riscv/include/asm/parse_asm.h
@@ -144,7 +144,9 @@
 #define RVC_FUNCT3_C_JAL	0x1
 #define RVC_FUNCT4_C_JR		0x8
 #define RVC_FUNCT4_C_JALR	0x9
+#define RVC_FUNCT4_C_EBREAK	0x9
 
+#define RVG_FUNCT12_EBREAK	0x1
 #define RVG_FUNCT12_SRET	0x102
 
 #define RVG_MATCH_JALR		(RV_ENCODE_FUNCT3(JALR) | RVG_OPCODE_JALR)
@@ -155,6 +157,7 @@
 #define RVG_MATCH_BGE		(RV_ENCODE_FUNCT3(BGE) | RVG_OPCODE_BRANCH)
 #define RVG_MATCH_BLTU		(RV_ENCODE_FUNCT3(BLTU) | RVG_OPCODE_BRANCH)
 #define RVG_MATCH_BGEU		(RV_ENCODE_FUNCT3(BGEU) | RVG_OPCODE_BRANCH)
+#define RVG_MATCH_EBREAK	(RV_ENCODE_FUNCT12(EBREAK) | RVG_OPCODE_SYSTEM)
 #define RVG_MATCH_SRET		(RV_ENCODE_FUNCT12(SRET) | RVG_OPCODE_SYSTEM)
 #define RVC_MATCH_C_BEQZ	(RVC_ENCODE_FUNCT3(C_BEQZ) | RVC_OPCODE_C1)
 #define RVC_MATCH_C_BNEZ	(RVC_ENCODE_FUNCT3(C_BNEZ) | RVC_OPCODE_C1)
@@ -162,6 +165,7 @@
 #define RVC_MATCH_C_JAL		(RVC_ENCODE_FUNCT3(C_JAL) | RVC_OPCODE_C1)
 #define RVC_MATCH_C_JR		(RVC_ENCODE_FUNCT4(C_JR) | RVC_OPCODE_C2)
 #define RVC_MATCH_C_JALR	(RVC_ENCODE_FUNCT4(C_JALR) | RVC_OPCODE_C2)
+#define RVC_MATCH_C_EBREAK	(RVC_ENCODE_FUNCT4(C_EBREAK) | RVC_OPCODE_C2)
 
 #define RVG_MASK_JALR		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
 #define RVG_MASK_JAL		(RV_INSN_OPCODE_MASK)
@@ -177,6 +181,8 @@
 #define RVG_MASK_BGEU		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
 #define RVC_MASK_C_BEQZ		(RVC_INSN_FUNCT3_MASK | RVC_INSN_OPCODE_MASK)
 #define RVC_MASK_C_BNEZ		(RVC_INSN_FUNCT3_MASK | RVC_INSN_OPCODE_MASK)
+#define RVC_MASK_C_EBREAK	0xffff
+#define RVG_MASK_EBREAK		0xffffffff
 #define RVG_MASK_SRET		0xffffffff
 
 #define __INSN_LENGTH_MASK	_UL(0x3)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH v3 05/14] RISC-V: add auipc elements to parse_asm header
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Instruction parsing should not be done in individual code, but instead
supported by central infrastructure.

Right now kgdb and kprobes parse instructions, and at least kprobes (and
the upcoming auipc+jalr alternative fixer-function) need the auipc
instruction.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/parse_asm.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
index e8303250f598..28742eb19034 100644
--- a/arch/riscv/include/asm/parse_asm.h
+++ b/arch/riscv/include/asm/parse_asm.h
@@ -118,6 +118,7 @@
 #define RVC_C2_RD_OPOFF		7
 
 /* parts of opcode for RVG*/
+#define RVG_OPCODE_AUIPC	0x17
 #define RVG_OPCODE_BRANCH	0x63
 #define RVG_OPCODE_JALR		0x67
 #define RVG_OPCODE_JAL		0x6f
@@ -149,6 +150,7 @@
 #define RVG_FUNCT12_EBREAK	0x1
 #define RVG_FUNCT12_SRET	0x102
 
+#define RVG_MATCH_AUIPC		(RVG_OPCODE_AUIPC)
 #define RVG_MATCH_JALR		(RV_ENCODE_FUNCT3(JALR) | RVG_OPCODE_JALR)
 #define RVG_MATCH_JAL		(RVG_OPCODE_JAL)
 #define RVG_MATCH_BEQ		(RV_ENCODE_FUNCT3(BEQ) | RVG_OPCODE_BRANCH)
@@ -167,6 +169,7 @@
 #define RVC_MATCH_C_JALR	(RVC_ENCODE_FUNCT4(C_JALR) | RVC_OPCODE_C2)
 #define RVC_MATCH_C_EBREAK	(RVC_ENCODE_FUNCT4(C_EBREAK) | RVC_OPCODE_C2)
 
+#define RVG_MASK_AUIPC		(RV_INSN_OPCODE_MASK)
 #define RVG_MASK_JALR		(RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK)
 #define RVG_MASK_JAL		(RV_INSN_OPCODE_MASK)
 #define RVC_MASK_C_JALR		(RVC_INSN_FUNCT4_MASK | RVC_INSN_J_RS2_MASK | RVC_INSN_OPCODE_MASK)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH v3 06/14] RISC-V: Move riscv_insn_is_* macros into a common header
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Right now the riscv kernel has (at least) two independent sets
of functions to check if an encoded instruction is of a specific
type: one in kgdb and one in the kprobes simulate-insn code.

More parts of the kernel will probably need this in the future,
so instead of allowing this duplication to spread further,
move the macros that generate these functions into a common header,
similar to at least aarch64.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/parse_asm.h       | 42 ++++++++++++++++----
 arch/riscv/kernel/kgdb.c                 | 49 ++++++++----------------
 arch/riscv/kernel/probes/simulate-insn.h | 26 +++----------
 3 files changed, 55 insertions(+), 62 deletions(-)

diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
index 28742eb19034..50c899cf4de5 100644
--- a/arch/riscv/include/asm/parse_asm.h
+++ b/arch/riscv/include/asm/parse_asm.h
@@ -193,13 +193,41 @@
 #define __INSN_OPCODE_MASK	_UL(0x7F)
 #define __INSN_BRANCH_OPCODE	_UL(RVG_OPCODE_BRANCH)
 
-/* Define a series of is_XXX_insn functions to check if the value INSN
- * is an instance of instruction XXX.
- */
-#define DECLARE_INSN(INSN_NAME, INSN_MATCH, INSN_MASK) \
-static inline bool is_ ## INSN_NAME ## _insn(long insn) \
-{ \
-	return (insn & (INSN_MASK)) == (INSN_MATCH); \
+#define __RISCV_INSN_FUNCS(name, mask, val)				\
+static __always_inline bool riscv_insn_is_##name(u32 code)		\
+{									\
+	BUILD_BUG_ON(~(mask) & (val));					\
+	return (code & (mask)) == (val);				\
+}									\
+
+#if __riscv_xlen == 32
+/* C.JAL is an RV32C-only instruction */
+__RISCV_INSN_FUNCS(c_jal, RVC_MASK_C_JAL, RVC_MATCH_C_JAL)
+#else
+#define riscv_insn_is_c_jal(opcode) 0
+#endif
+__RISCV_INSN_FUNCS(auipc, RVG_MASK_AUIPC, RVG_MATCH_AUIPC)
+__RISCV_INSN_FUNCS(jalr, RVG_MASK_JALR, RVG_MATCH_JALR)
+__RISCV_INSN_FUNCS(jal, RVG_MASK_JAL, RVG_MATCH_JAL)
+__RISCV_INSN_FUNCS(c_jr, RVC_MASK_C_JR, RVC_MATCH_C_JR)
+__RISCV_INSN_FUNCS(c_jalr, RVC_MASK_C_JALR, RVC_MATCH_C_JALR)
+__RISCV_INSN_FUNCS(c_j, RVC_MASK_C_J, RVC_MATCH_C_J)
+__RISCV_INSN_FUNCS(beq, RVG_MASK_BEQ, RVG_MATCH_BEQ)
+__RISCV_INSN_FUNCS(bne, RVG_MASK_BNE, RVG_MATCH_BNE)
+__RISCV_INSN_FUNCS(blt, RVG_MASK_BLT, RVG_MATCH_BLT)
+__RISCV_INSN_FUNCS(bge, RVG_MASK_BGE, RVG_MATCH_BGE)
+__RISCV_INSN_FUNCS(bltu, RVG_MASK_BLTU, RVG_MATCH_BLTU)
+__RISCV_INSN_FUNCS(bgeu, RVG_MASK_BGEU, RVG_MATCH_BGEU)
+__RISCV_INSN_FUNCS(c_beqz, RVC_MASK_C_BEQZ, RVC_MATCH_C_BEQZ)
+__RISCV_INSN_FUNCS(c_bnez, RVC_MASK_C_BNEZ, RVC_MATCH_C_BNEZ)
+__RISCV_INSN_FUNCS(c_ebreak, RVC_MASK_C_EBREAK, RVC_MATCH_C_EBREAK)
+__RISCV_INSN_FUNCS(ebreak, RVG_MASK_EBREAK, RVG_MATCH_EBREAK)
+__RISCV_INSN_FUNCS(sret, RVG_MASK_SRET, RVG_MATCH_SRET)
+
+/* special case to catch _any_ branch instruction */
+static __always_inline bool riscv_insn_is_branch(u32 code)
+{
+	return (code & RV_INSN_OPCODE_MASK) == RVG_OPCODE_BRANCH;
 }
 
 #define RV_IMM_SIGN(x) (-(((x) >> 31) & 1))
diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
index 137c6d870d58..61237aeb493c 100644
--- a/arch/riscv/kernel/kgdb.c
+++ b/arch/riscv/kernel/kgdb.c
@@ -23,27 +23,6 @@ enum {
 static unsigned long stepped_address;
 static unsigned int stepped_opcode;
 
-#if __riscv_xlen == 32
-/* C.JAL is an RV32C-only instruction */
-DECLARE_INSN(c_jal, MATCH_C_JAL, MASK_C_JAL)
-#else
-#define is_c_jal_insn(opcode) 0
-#endif
-DECLARE_INSN(jalr, RVG_MATCH_JALR, RVG_MASK_JALR)
-DECLARE_INSN(jal, RVG_MATCH_JAL, RVG_MASK_JAL)
-DECLARE_INSN(c_jr, RVC_MATCH_C_JR, RVC_MASK_C_JR)
-DECLARE_INSN(c_jalr, RVC_MATCH_C_JALR, RVC_MASK_C_JALR)
-DECLARE_INSN(c_j, RVC_MATCH_C_J, RVC_MASK_C_J)
-DECLARE_INSN(beq, RVG_MATCH_BEQ, RVG_MASK_BEQ)
-DECLARE_INSN(bne, RVG_MATCH_BNE, RVG_MASK_BNE)
-DECLARE_INSN(blt, RVG_MATCH_BLT, RVG_MASK_BLT)
-DECLARE_INSN(bge, RVG_MATCH_BGE, RVG_MASK_BGE)
-DECLARE_INSN(bltu, RVG_MATCH_BLTU, RVG_MASK_BLTU)
-DECLARE_INSN(bgeu, RVG_MATCH_BGEU, RVG_MASK_BGEU)
-DECLARE_INSN(c_beqz, RVC_MATCH_C_BEQZ, RVC_MASK_C_BEQZ)
-DECLARE_INSN(c_bnez, RVC_MATCH_C_BNEZ, RVC_MASK_C_BNEZ)
-DECLARE_INSN(sret, RVG_MATCH_SRET, RVG_MASK_SRET)
-
 static int decode_register_index(unsigned long opcode, int offset)
 {
 	return (opcode >> offset) & 0x1F;
@@ -65,19 +44,21 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
 	if (get_kernel_nofault(op_code, (void *)pc))
 		return -EINVAL;
 	if ((op_code & __INSN_LENGTH_MASK) != __INSN_LENGTH_GE_32) {
-		if (is_c_jalr_insn(op_code) || is_c_jr_insn(op_code)) {
+		if (riscv_insn_is_c_jalr(op_code) ||
+		    riscv_insn_is_c_jr(op_code)) {
 			rs1_num = decode_register_index(op_code, RVC_C2_RS1_OPOFF);
 			*next_addr = regs_ptr[rs1_num];
-		} else if (is_c_j_insn(op_code) || is_c_jal_insn(op_code)) {
+		} else if (riscv_insn_is_c_j(op_code) ||
+			   riscv_insn_is_c_jal(op_code)) {
 			*next_addr = RVC_EXTRACT_JTYPE_IMM(op_code) + pc;
-		} else if (is_c_beqz_insn(op_code)) {
+		} else if (riscv_insn_is_c_beqz(op_code)) {
 			rs1_num = decode_register_index_short(op_code,
 							      RVC_C1_RS1_OPOFF);
 			if (!rs1_num || regs_ptr[rs1_num] == 0)
 				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
 			else
 				*next_addr = pc + 2;
-		} else if (is_c_bnez_insn(op_code)) {
+		} else if (riscv_insn_is_c_bnez(op_code)) {
 			rs1_num =
 			    decode_register_index_short(op_code, RVC_C1_RS1_OPOFF);
 			if (rs1_num && regs_ptr[rs1_num] != 0)
@@ -100,34 +81,34 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
 			if (rs2_num)
 				rs2_val = regs_ptr[rs2_num];
 
-			if (is_beq_insn(op_code))
+			if (riscv_insn_is_beq(op_code))
 				result = (rs1_val == rs2_val) ? true : false;
-			else if (is_bne_insn(op_code))
+			else if (riscv_insn_is_bne(op_code))
 				result = (rs1_val != rs2_val) ? true : false;
-			else if (is_blt_insn(op_code))
+			else if (riscv_insn_is_blt(op_code))
 				result =
 				    ((long)rs1_val <
 				     (long)rs2_val) ? true : false;
-			else if (is_bge_insn(op_code))
+			else if (riscv_insn_is_bge(op_code))
 				result =
 				    ((long)rs1_val >=
 				     (long)rs2_val) ? true : false;
-			else if (is_bltu_insn(op_code))
+			else if (riscv_insn_is_bltu(op_code))
 				result = (rs1_val < rs2_val) ? true : false;
-			else if (is_bgeu_insn(op_code))
+			else if (riscv_insn_is_bgeu(op_code))
 				result = (rs1_val >= rs2_val) ? true : false;
 			if (result)
 				*next_addr = imm + pc;
 			else
 				*next_addr = pc + 4;
-		} else if (is_jal_insn(op_code)) {
+		} else if (riscv_insn_is_jal(op_code)) {
 			*next_addr = RV_EXTRACT_JTYPE_IMM(op_code) + pc;
-		} else if (is_jalr_insn(op_code)) {
+		} else if (riscv_insn_is_jalr(op_code)) {
 			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
 			if (rs1_num)
 				*next_addr = ((unsigned long *)regs)[rs1_num];
 			*next_addr += RV_EXTRACT_ITYPE_IMM(op_code);
-		} else if (is_sret_insn(op_code)) {
+		} else if (riscv_insn_is_sret(op_code)) {
 			*next_addr = pc;
 		} else {
 			*next_addr = pc + 4;
diff --git a/arch/riscv/kernel/probes/simulate-insn.h b/arch/riscv/kernel/probes/simulate-insn.h
index cb6ff7dccb92..29fb16cd335c 100644
--- a/arch/riscv/kernel/probes/simulate-insn.h
+++ b/arch/riscv/kernel/probes/simulate-insn.h
@@ -3,14 +3,7 @@
 #ifndef _RISCV_KERNEL_PROBES_SIMULATE_INSN_H
 #define _RISCV_KERNEL_PROBES_SIMULATE_INSN_H
 
-#define __RISCV_INSN_FUNCS(name, mask, val)				\
-static __always_inline bool riscv_insn_is_##name(probe_opcode_t code)	\
-{									\
-	BUILD_BUG_ON(~(mask) & (val));					\
-	return (code & (mask)) == (val);				\
-}									\
-bool simulate_##name(u32 opcode, unsigned long addr,			\
-		     struct pt_regs *regs)
+#include <asm/parse_asm.h>
 
 #define RISCV_INSN_REJECTED(name, code)					\
 	do {								\
@@ -30,18 +23,9 @@ __RISCV_INSN_FUNCS(fence,	0x7f, 0x0f);
 		}							\
 	} while (0)
 
-__RISCV_INSN_FUNCS(c_j,		0xe003, 0xa001);
-__RISCV_INSN_FUNCS(c_jr,	0xf007, 0x8002);
-__RISCV_INSN_FUNCS(c_jal,	0xe003, 0x2001);
-__RISCV_INSN_FUNCS(c_jalr,	0xf007, 0x9002);
-__RISCV_INSN_FUNCS(c_beqz,	0xe003, 0xc001);
-__RISCV_INSN_FUNCS(c_bnez,	0xe003, 0xe001);
-__RISCV_INSN_FUNCS(c_ebreak,	0xffff, 0x9002);
-
-__RISCV_INSN_FUNCS(auipc,	0x7f, 0x17);
-__RISCV_INSN_FUNCS(branch,	0x7f, 0x63);
-
-__RISCV_INSN_FUNCS(jal,		0x7f, 0x6f);
-__RISCV_INSN_FUNCS(jalr,	0x707f, 0x67);
+bool simulate_auipc(u32 opcode, unsigned long addr, struct pt_regs *regs);
+bool simulate_branch(u32 opcode, unsigned long addr, struct pt_regs *regs);
+bool simulate_jal(u32 opcode, unsigned long addr, struct pt_regs *regs);
+bool simulate_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs);
 
 #endif /* _RISCV_KERNEL_PROBES_SIMULATE_INSN_H */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread


* [PATCH v3 07/14] RISC-V: rename parse_asm.h to insn.h
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

The current parse_asm header should become a more centralized place
for everything concerning parsing and constructing instructions.

We already have a header insn-def.h similar to aarch64, so rename
parse_asm.h to insn.h (again similar to aarch64) to show that it's
meant for more than simple instruction parsing.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/{parse_asm.h => insn.h} | 0
 arch/riscv/kernel/kgdb.c                       | 2 +-
 arch/riscv/kernel/probes/simulate-insn.h       | 2 +-
 3 files changed, 2 insertions(+), 2 deletions(-)
 rename arch/riscv/include/asm/{parse_asm.h => insn.h} (100%)

diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/insn.h
similarity index 100%
rename from arch/riscv/include/asm/parse_asm.h
rename to arch/riscv/include/asm/insn.h
diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
index 61237aeb493c..2e0266ae6bd7 100644
--- a/arch/riscv/kernel/kgdb.c
+++ b/arch/riscv/kernel/kgdb.c
@@ -11,7 +11,7 @@
 #include <linux/string.h>
 #include <asm/cacheflush.h>
 #include <asm/gdb_xml.h>
-#include <asm/parse_asm.h>
+#include <asm/insn.h>
 
 enum {
 	NOT_KGDB_BREAK = 0,
diff --git a/arch/riscv/kernel/probes/simulate-insn.h b/arch/riscv/kernel/probes/simulate-insn.h
index 29fb16cd335c..a19aaa0feb44 100644
--- a/arch/riscv/kernel/probes/simulate-insn.h
+++ b/arch/riscv/kernel/probes/simulate-insn.h
@@ -3,7 +3,7 @@
 #ifndef _RISCV_KERNEL_PROBES_SIMULATE_INSN_H
 #define _RISCV_KERNEL_PROBES_SIMULATE_INSN_H
 
-#include <asm/parse_asm.h>
+#include <asm/insn.h>
 
 #define RISCV_INSN_REJECTED(name, code)					\
 	do {								\
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread


* [PATCH v3 08/14] RISC-V: kprobes: use central defined funct3 constants
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Don't redefine values that are already available in the central
asm/insn.h header. Use the values from there instead.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/kernel/probes/simulate-insn.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/arch/riscv/kernel/probes/simulate-insn.c b/arch/riscv/kernel/probes/simulate-insn.c
index d73e96f6ed7c..330afe9331a8 100644
--- a/arch/riscv/kernel/probes/simulate-insn.c
+++ b/arch/riscv/kernel/probes/simulate-insn.c
@@ -136,13 +136,6 @@ bool __kprobes simulate_auipc(u32 opcode, unsigned long addr, struct pt_regs *re
 #define branch_offset(opcode) \
 	sign_extend32((branch_imm(opcode)), 12)
 
-#define BRANCH_BEQ	0x0
-#define BRANCH_BNE	0x1
-#define BRANCH_BLT	0x4
-#define BRANCH_BGE	0x5
-#define BRANCH_BLTU	0x6
-#define BRANCH_BGEU	0x7
-
 bool __kprobes simulate_branch(u32 opcode, unsigned long addr, struct pt_regs *regs)
 {
 	/*
@@ -169,22 +162,22 @@ bool __kprobes simulate_branch(u32 opcode, unsigned long addr, struct pt_regs *r
 
 	offset_tmp = branch_offset(opcode);
 	switch (branch_funct3(opcode)) {
-	case BRANCH_BEQ:
+	case RVG_FUNCT3_BEQ:
 		offset = (rs1_val == rs2_val) ? offset_tmp : 4;
 		break;
-	case BRANCH_BNE:
+	case RVG_FUNCT3_BNE:
 		offset = (rs1_val != rs2_val) ? offset_tmp : 4;
 		break;
-	case BRANCH_BLT:
+	case RVG_FUNCT3_BLT:
 		offset = ((long)rs1_val < (long)rs2_val) ? offset_tmp : 4;
 		break;
-	case BRANCH_BGE:
+	case RVG_FUNCT3_BGE:
 		offset = ((long)rs1_val >= (long)rs2_val) ? offset_tmp : 4;
 		break;
-	case BRANCH_BLTU:
+	case RVG_FUNCT3_BLTU:
 		offset = (rs1_val < rs2_val) ? offset_tmp : 4;
 		break;
-	case BRANCH_BGEU:
+	case RVG_FUNCT3_BGEU:
 		offset = (rs1_val >= rs2_val) ? offset_tmp : 4;
 		break;
 	default:
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread


* [PATCH v3 09/14] RISC-V: add U-type imm parsing to insn.h header
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Similar to other existing types, allow extracting the immediate
for a U-type instruction.

U-type immediates are special in that the value stored in
bits [31:12] of the instruction already represents the same
bits of the immediate, so no shifting is required.

U-type immediates are for example used in the auipc instruction,
so these constants make it easier to parse such instructions.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/insn.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/riscv/include/asm/insn.h b/arch/riscv/include/asm/insn.h
index 50c899cf4de5..21ec817abec1 100644
--- a/arch/riscv/include/asm/insn.h
+++ b/arch/riscv/include/asm/insn.h
@@ -34,6 +34,15 @@
 #define RV_J_IMM_11_MASK	GENMASK(0, 0)
 #define RV_J_IMM_19_12_MASK	GENMASK(7, 0)
 
+/*
+ * U-type IMMs contain the upper 20 bits [31:12] of an immediate with
+ * the rest filled in by zeros, so no shifting is required. Similarly,
+ * bit 31 contains the sign bit, so no sign extension is necessary.
+ */
+#define RV_U_IMM_SIGN_OPOFF	31
+#define RV_U_IMM_31_12_OPOFF	0
+#define RV_U_IMM_31_12_MASK	GENMASK(31, 12)
+
 /* The bit field of immediate value in B-type instruction */
 #define RV_B_IMM_SIGN_OPOFF	31
 #define RV_B_IMM_10_5_OPOFF	25
@@ -235,6 +244,10 @@ static __always_inline bool riscv_insn_is_branch(u32 code)
 #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
 #define RVC_X(X, s, mask) RV_X(X, s, mask)
 
+#define RV_EXTRACT_UTYPE_IMM(x) \
+	({typeof(x) x_ = (x); \
+	(RV_X(x_, RV_U_IMM_31_12_OPOFF, RV_U_IMM_31_12_MASK)); })
+
 #define RV_EXTRACT_JTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
 	(RV_X(x_, RV_J_IMM_10_1_OPOFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OFF) | \
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH v3 10/14] RISC-V: add rd reg parsing to insn.h header
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Add a macro to allow parsing of the rd register from an instruction.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/insn.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/riscv/include/asm/insn.h b/arch/riscv/include/asm/insn.h
index 21ec817abec1..2a23890b4577 100644
--- a/arch/riscv/include/asm/insn.h
+++ b/arch/riscv/include/asm/insn.h
@@ -60,6 +60,7 @@
 #define RVG_RS1_OPOFF		15
 #define RVG_RS2_OPOFF		20
 #define RVG_RD_OPOFF		7
+#define RVG_RD_MASK		GENMASK(4, 0)
 
 /* The bit field of immediate value in RVC J instruction */
 #define RVC_J_IMM_SIGN_OPOFF	12
@@ -244,6 +245,10 @@ static __always_inline bool riscv_insn_is_branch(u32 code)
 #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
 #define RVC_X(X, s, mask) RV_X(X, s, mask)
 
+#define RV_EXTRACT_RD_REG(x) \
+	({typeof(x) x_ = (x); \
+	(RV_X(x_, RVG_RD_OPOFF, RVG_RD_MASK)); })
+
 #define RV_EXTRACT_UTYPE_IMM(x) \
 	({typeof(x) x_ = (x); \
 	(RV_X(x_, RV_U_IMM_31_12_OPOFF, RV_U_IMM_31_12_MASK)); })
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Alternatives live in a different section, so addresses used by function
calls will point to the wrong locations after the patch has been applied.

Similar to arm64, adjust the target addresses to account for that offset.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/alternative.h |  3 ++
 arch/riscv/kernel/alternative.c      | 72 ++++++++++++++++++++++++++++
 arch/riscv/kernel/cpufeature.c       | 11 ++++-
 3 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
index 6511dd73e812..c58ec3cc4bc3 100644
--- a/arch/riscv/include/asm/alternative.h
+++ b/arch/riscv/include/asm/alternative.h
@@ -27,6 +27,9 @@ void __init apply_boot_alternatives(void);
 void __init apply_early_boot_alternatives(void);
 void apply_module_alternatives(void *start, size_t length);
 
+void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
+				      int patch_offset);
+
 struct alt_entry {
 	void *old_ptr;		 /* address of original instruciton or data  */
 	void *alt_ptr;		 /* address of replacement instruction or data */
diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
index a7d26a00beea..292cc42dc3be 100644
--- a/arch/riscv/kernel/alternative.c
+++ b/arch/riscv/kernel/alternative.c
@@ -15,6 +15,8 @@
 #include <asm/vendorid_list.h>
 #include <asm/sbi.h>
 #include <asm/csr.h>
+#include <asm/insn.h>
+#include <asm/patch.h>
 
 struct cpu_manufacturer_info_t {
 	unsigned long vendor_id;
@@ -53,6 +55,76 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
 	}
 }
 
+static unsigned int riscv_instruction_at(void *p, unsigned int offset)
+{
+	u16 *parcel = p + (offset * sizeof(u32));
+
+	return (unsigned int)parcel[0] | (unsigned int)parcel[1] << 16;
+}
+
+static inline bool riscv_insn_is_auipc_jalr(u32 insn1, u32 insn2)
+{
+	return riscv_insn_is_auipc(insn1) && riscv_insn_is_jalr(insn2);
+}
+
+#define JALR_SIGN_MASK		BIT(RV_I_IMM_SIGN_OPOFF - RV_I_IMM_11_0_OPOFF)
+#define AUIPC_PAD		(0x00001000)
+
+#define to_jalr_imm(value)						\
+	((value & RV_I_IMM_11_0_MASK) << RV_I_IMM_11_0_OPOFF)
+
+#define to_auipc_imm(value)						\
+	((value & JALR_SIGN_MASK) ?					\
+	((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :	\
+	(value & RV_U_IMM_31_12_MASK))
+
+void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
+				      int patch_offset)
+{
+	int num_instr = len / sizeof(u32);
+	unsigned int call[2];
+	int i;
+	int imm;
+	u32 rd1;
+
+	/*
+	 * stop one instruction before the end, as we're checking
+	 * for auipc + jalr
+	 */
+	for (i = 0; i < num_instr - 1; i++) {
+		u32 inst1 = riscv_instruction_at(alt_ptr, i);
+		u32 inst2 = riscv_instruction_at(alt_ptr, i + 1);
+
+		if (!riscv_insn_is_auipc_jalr(inst1, inst2))
+			continue;
+
+		/* call will use ra register */
+		rd1 = RV_EXTRACT_RD_REG(inst1);
+		if (rd1 != 1)
+			continue;
+
+		/* get and adjust new target address */
+		imm = RV_EXTRACT_UTYPE_IMM(inst1);
+		imm += RV_EXTRACT_ITYPE_IMM(inst2);
+		imm -= patch_offset;
+
+		/* pick the original auipc + jalr */
+		call[0] = inst1;
+		call[1] = inst2;
+
+		/* drop the old IMMs */
+		call[0] &= ~(RV_U_IMM_31_12_MASK);
+		call[1] &= ~(RV_I_IMM_11_0_MASK << RV_I_IMM_11_0_OPOFF);
+
+		/* add the adapted IMMs */
+		call[0] |= to_auipc_imm(imm);
+		call[1] |= to_jalr_imm(imm);
+
+		/* patch the call place again */
+		patch_text_nosync(alt_ptr + i * sizeof(u32), call, 8);
+	}
+}
+
 /*
  * This is called very early in the boot process (directly after we run
  * a feature detect on the boot CPU). No need to worry about other CPUs
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 694267d1fe81..ba62a4ff5ccd 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -316,8 +316,15 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
 		}
 
 		tmp = (1U << alt->errata_id);
-		if (cpu_req_feature & tmp)
-			patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
+		if (cpu_req_feature & tmp) {
+			/* do the basic patching */
+			patch_text_nosync(alt->old_ptr, alt->alt_ptr,
+					  alt->alt_len);
+
+			riscv_alternative_fix_auipc_jalr(alt->old_ptr,
+							 alt->alt_len,
+							 alt->old_ptr - alt->alt_ptr);
+		}
 	}
 }
 #endif
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH v3 12/14] efi/riscv: libstub: mark when compiling libstub
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

We may want to runtime-optimize some core functions (str*, mem*),
but must not let this leak into libstub and cause build issues.
Instead, libstub, for the short while it's running, should just use
the generic implementation.

So, to be able to determine whether functions that are used in both
libstub and the main kernel are being compiled as part of libstub,
add a compile flag we can check via #ifdef.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 drivers/firmware/efi/libstub/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index ef5045a53ce0..777d1ab059e3 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -25,7 +25,7 @@ cflags-$(CONFIG_ARM)		:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
 				   -fno-builtin -fpic \
 				   $(call cc-option,-mno-single-pic-base)
 cflags-$(CONFIG_RISCV)		:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
-				   -fpic
+				   -fpic -DRISCV_EFISTUB
 cflags-$(CONFIG_LOONGARCH)	:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
 				   -fpie
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH v3 13/14] RISC-V: add infrastructure to allow different str* implementations
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Depending on supported extensions on specific RISC-V cores,
optimized str* functions might make sense.

This adds basic infrastructure to allow patching the function calls
via alternatives later on.

The main idea is to have the core str* functions be inline functions
which call the most optimized variant, with this call then being
replaced via alternatives.

The big advantage is that we don't need additional calls.
We do, however, need to duplicate the generic functions, as the main
code expects either itself or the architecture to provide the str*
functions.

The added *_generic functions are written in assembly (taken from
disassembling the main-kernel functions for now) to allow us to control
the registers that get used.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/Makefile             |  3 ++
 arch/riscv/include/asm/string.h | 66 +++++++++++++++++++++++++++++++++
 arch/riscv/kernel/image-vars.h  |  6 +--
 arch/riscv/lib/Makefile         |  3 ++
 arch/riscv/lib/strcmp.S         | 38 +++++++++++++++++++
 arch/riscv/lib/strlen.S         | 29 +++++++++++++++
 arch/riscv/lib/strncmp.S        | 41 ++++++++++++++++++++
 7 files changed, 183 insertions(+), 3 deletions(-)
 create mode 100644 arch/riscv/lib/strcmp.S
 create mode 100644 arch/riscv/lib/strlen.S
 create mode 100644 arch/riscv/lib/strncmp.S

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 0d13b597cb55..581e4370c2a6 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -80,6 +80,9 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
         KBUILD_CFLAGS += -fno-omit-frame-pointer
 endif
 
+# strchr is a special case, as gcc might want to call its own strlen from there
+KBUILD_CFLAGS += -fno-builtin-strlen -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strchr
+
 KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-relax)
 KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
 
diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
index 909049366555..b98481d2d154 100644
--- a/arch/riscv/include/asm/string.h
+++ b/arch/riscv/include/asm/string.h
@@ -18,6 +18,72 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t);
 #define __HAVE_ARCH_MEMMOVE
 extern asmlinkage void *memmove(void *, const void *, size_t);
 extern asmlinkage void *__memmove(void *, const void *, size_t);
+
+#define __HAVE_ARCH_STRCMP
+extern asmlinkage int __strcmp_generic(const char *cs, const char *ct);
+
+static inline int strcmp(const char *cs, const char *ct)
+{
+#ifdef RISCV_EFISTUB
+	return __strcmp_generic(cs, ct);
+#else
+	register const char *a0 asm("a0") = cs;
+	register const char *a1 asm("a1") = ct;
+	register int a0_out asm("a0");
+
+	asm volatile("call __strcmp_generic\n\t"
+		: "=r"(a0_out)
+		: "r"(a0), "r"(a1)
+		: "ra", "t0", "t1", "t2");
+
+	return a0_out;
+#endif
+}
+
+#define __HAVE_ARCH_STRNCMP
+extern asmlinkage int __strncmp_generic(const char *cs,
+					const char *ct, size_t count);
+
+static inline int strncmp(const char *cs, const char *ct, size_t count)
+{
+#ifdef RISCV_EFISTUB
+	return __strncmp_generic(cs, ct, count);
+#else
+	register const char *a0 asm("a0") = cs;
+	register const char *a1 asm("a1") = ct;
+	register size_t a2 asm("a2") = count;
+	register int a0_out asm("a0");
+
+	asm volatile("call __strncmp_generic\n\t"
+		: "=r"(a0_out)
+		: "r"(a0), "r"(a1), "r"(a2)
+		: "ra", "t0", "t1", "t2");
+
+	return a0_out;
+#endif
+}
+
+#define __HAVE_ARCH_STRLEN
+extern asmlinkage __kernel_size_t __strlen_generic(const char *);
+
+static inline __kernel_size_t strlen(const char *s)
+{
+#ifdef RISCV_EFISTUB
+	return __strlen_generic(s);
+#else
+	register const char *a0 asm("a0") = s;
+	register int a0_out asm("a0");
+
+	asm volatile(
+		"call __strlen_generic\n\t"
+		: "=r"(a0_out)
+		: "r"(a0)
+		: "ra", "t0", "t1");
+
+	return a0_out;
+#endif
+}
+
 /* For those files which don't want to check by kasan. */
 #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
diff --git a/arch/riscv/kernel/image-vars.h b/arch/riscv/kernel/image-vars.h
index d6e5f739905e..2f270b9fde63 100644
--- a/arch/riscv/kernel/image-vars.h
+++ b/arch/riscv/kernel/image-vars.h
@@ -25,10 +25,10 @@
  */
 __efistub_memcmp		= memcmp;
 __efistub_memchr		= memchr;
-__efistub_strlen		= strlen;
+__efistub___strlen_generic	= __strlen_generic;
 __efistub_strnlen		= strnlen;
-__efistub_strcmp		= strcmp;
-__efistub_strncmp		= strncmp;
+__efistub___strcmp_generic	= __strcmp_generic;
+__efistub___strncmp_generic	= __strncmp_generic;
 __efistub_strrchr		= strrchr;
 
 __efistub__start		= _start;
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 25d5c9664e57..6c74b0bedd60 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -3,6 +3,9 @@ lib-y			+= delay.o
 lib-y			+= memcpy.o
 lib-y			+= memset.o
 lib-y			+= memmove.o
+lib-y			+= strcmp.o
+lib-y			+= strlen.o
+lib-y			+= strncmp.o
 lib-$(CONFIG_MMU)	+= uaccess.o
 lib-$(CONFIG_64BIT)	+= tishift.o
 
diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
new file mode 100644
index 000000000000..f4b7f4e806f0
--- /dev/null
+++ b/arch/riscv/lib/strcmp.S
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+/* int __strcmp_generic(const char *cs, const char *ct) */
+ENTRY(__strcmp_generic)
+	/*
+	 * Returns
+	 *   a0 - comparison result, value like strcmp
+	 *
+	 * Parameters
+	 *   a0 - string1
+	 *   a1 - string2
+	 *
+	 * Clobbers
+	 *   t0, t1, t2
+	 */
+	mv	t2, a1
+1:
+	lbu	t1, 0(a0)
+	lbu	t0, 0(a1)
+	addi	a0, a0, 1
+	addi	a1, a1, 1
+	beq	t1, t0, 3f
+	li	a0, 1
+	bgeu	t1, t0, 2f
+	li	a0, -1
+2:
+	mv	a1, t2
+	ret
+3:
+	bnez	t1, 1b
+	li	a0, 0
+	j	2b
+END(__strcmp_generic)
+EXPORT_SYMBOL(__strcmp_generic)
diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
new file mode 100644
index 000000000000..e0e7440ac724
--- /dev/null
+++ b/arch/riscv/lib/strlen.S
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+/* int __strlen_generic(const char *s) */
+ENTRY(__strlen_generic)
+	/*
+	 * Returns
+	 *   a0 - string length
+	 *
+	 * Parameters
+	 *   a0 - String to measure
+	 *
+	 * Clobbers:
+	 *   t0, t1
+	 */
+	mv	t1, a0
+1:
+	lbu	t0, 0(t1)
+	bnez	t0, 2f
+	sub	a0, t1, a0
+	ret
+2:
+	addi	t1, t1, 1
+	j	1b
+END(__strlen_generic)
+EXPORT_SYMBOL(__strlen_generic)
diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
new file mode 100644
index 000000000000..7e116d942265
--- /dev/null
+++ b/arch/riscv/lib/strncmp.S
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+/* int __strncmp_generic(const char *cs, const char *ct, size_t count) */
+ENTRY(__strncmp_generic)
+	/*
+	 * Returns
+	 *   a0 - comparison result, value like strncmp
+	 *
+	 * Parameters
+	 *   a0 - string1
+	 *   a1 - string2
+	 *   a2 - number of characters to compare
+	 *
+	 * Clobbers
+	 *   t0, t1, t2
+	 */
+	li	t0, 0
+1:
+	beq	a2, t0, 4f
+	add	t1, a0, t0
+	add	t2, a1, t0
+	lbu	t1, 0(t1)
+	lbu	t2, 0(t2)
+	beq	t1, t2, 3f
+	li	a0, 1
+	bgeu	t1, t2, 2f
+	li	a0, -1
+2:
+	ret
+3:
+	addi	t0, t0, 1
+	bnez	t1, 1b
+4:
+	li	a0, 0
+	j	2b
+END(__strncmp_generic)
+EXPORT_SYMBOL(__strncmp_generic)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 84+ messages in thread

+#include <asm-generic/export.h>
+
+/* __kernel_size_t __strlen_generic(const char *s) */
+ENTRY(__strlen_generic)
+	/*
+	 * Returns
+	 *   a0 - string length
+	 *
+	 * Parameters
+	 *   a0 - String to measure
+	 *
+	 * Clobbers:
+	 *   t0, t1
+	 */
+	mv	t1, a0
+1:
+	lbu	t0, 0(t1)
+	bnez	t0, 2f
+	sub	a0, t1, a0
+	ret
+2:
+	addi	t1, t1, 1
+	j	1b
+END(__strlen_generic)
+EXPORT_SYMBOL(__strlen_generic)
diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
new file mode 100644
index 000000000000..7e116d942265
--- /dev/null
+++ b/arch/riscv/lib/strncmp.S
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+/* int __strncmp_generic(const char *cs, const char *ct, size_t count) */
+ENTRY(__strncmp_generic)
+	/*
+	 * Returns
+	 *   a0 - comparison result, value like strncmp
+	 *
+	 * Parameters
+	 *   a0 - string1
+	 *   a1 - string2
+	 *   a2 - number of characters to compare
+	 *
+	 * Clobbers
+	 *   t0, t1, t2
+	 */
+	li	t0, 0
+1:
+	beq	a2, t0, 4f
+	add	t1, a0, t0
+	add	t2, a1, t0
+	lbu	t1, 0(t1)
+	lbu	t2, 0(t2)
+	beq	t1, t2, 3f
+	li	a0, 1
+	bgeu	t1, t2, 2f
+	li	a0, -1
+2:
+	ret
+3:
+	addi	t0, t0, 1
+	bnez	t1, 1b
+4:
+	li	a0, 0
+	j	2b
+END(__strncmp_generic)
+EXPORT_SYMBOL(__strncmp_generic)
-- 
2.35.1
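[Editor's note: a minimal C sketch, not part of the patch, of the byte-at-a-time algorithm that the `__strcmp_generic` assembly above implements; the function name is hypothetical. Like the assembly (`li a0, 1` / `bgeu` / `li a0, -1`), it returns exactly -1/0/1 based on the first differing unsigned byte:]

```c
/* Hypothetical C equivalent of the __strcmp_generic loop above:
 * compare unsigned bytes until they differ or a NUL terminator
 * is reached, returning -1/0/1 like the assembly does. */
static int strcmp_bytewise(const char *cs, const char *ct)
{
	for (;;) {
		unsigned char c1 = (unsigned char)*cs++;
		unsigned char c2 = (unsigned char)*ct++;

		if (c1 != c2)
			return c1 < c2 ? -1 : 1;
		if (c1 == '\0')		/* both strings ended together */
			return 0;
	}
}
```

[The Zbb variant in patch 14/14 keeps this as its misaligned fallback path while the aligned fast path compares a full register-width word per iteration.]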



* [PATCH v3 14/14] RISC-V: add zbb support to string functions
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-11-30 22:56   ` Heiko Stuebner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Add handling for the Zbb extension and add support for using it as a
variant for optimized string functions.

Support for the Zbb string variants is limited to the GNU assembler
for now, as LLVM has not yet acquired the functionality to
selectively change the arch option in assembler code.
This is still under review at
    https://reviews.llvm.org/D123515

Co-developed-by: Christoph Muellner <christoph.muellner@vrull.eu>
Signed-off-by: Christoph Muellner <christoph.muellner@vrull.eu>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/Kconfig                   |  24 ++++++
 arch/riscv/include/asm/errata_list.h |   3 +-
 arch/riscv/include/asm/hwcap.h       |   1 +
 arch/riscv/include/asm/string.h      |  29 +++++--
 arch/riscv/kernel/cpu.c              |   1 +
 arch/riscv/kernel/cpufeature.c       |  18 +++++
 arch/riscv/lib/Makefile              |   3 +
 arch/riscv/lib/strcmp_zbb.S          |  96 ++++++++++++++++++++++
 arch/riscv/lib/strlen_zbb.S          | 115 +++++++++++++++++++++++++++
 arch/riscv/lib/strncmp_zbb.S         | 112 ++++++++++++++++++++++++++
 10 files changed, 395 insertions(+), 7 deletions(-)
 create mode 100644 arch/riscv/lib/strcmp_zbb.S
 create mode 100644 arch/riscv/lib/strlen_zbb.S
 create mode 100644 arch/riscv/lib/strncmp_zbb.S

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index fa78595a6089..5703c44da5f9 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -411,6 +411,30 @@ config RISCV_ISA_SVPBMT
 
 	   If you don't know what to do here, say Y.
 
+config TOOLCHAIN_HAS_ZBB
+	bool
+	default y
+	depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbb)
+	depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbb)
+	depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
+	depends on AS_IS_GNU
+
+config RISCV_ISA_ZBB
+	bool "Zbb extension support for bit manipulation instructions"
+	depends on TOOLCHAIN_HAS_ZBB
+	depends on !XIP_KERNEL && MMU
+	select RISCV_ALTERNATIVE
+	default y
+	help
+	   Adds support to dynamically detect the presence of the Zbb
+	   extension (basic bit manipulation) and enable its usage.
+
+	   The Zbb extension provides instructions to accelerate a number
+	   of bit-manipulation operations (population count, sign extension,
+	   bit rotation, etc).
+
+	   If you don't know what to do here, say Y.
+
 config TOOLCHAIN_HAS_ZICBOM
 	bool
 	default y
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 19a771085781..e0a49b9b7627 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -22,7 +22,8 @@
 
 #define	CPUFEATURE_SVPBMT 0
 #define	CPUFEATURE_ZICBOM 1
-#define	CPUFEATURE_NUMBER 2
+#define	CPUFEATURE_ZBB 2
+#define	CPUFEATURE_NUMBER 3
 
 #ifdef __ASSEMBLY__
 
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index b22525290073..ac5555fd9788 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -59,6 +59,7 @@ enum riscv_isa_ext_id {
 	RISCV_ISA_EXT_ZIHINTPAUSE,
 	RISCV_ISA_EXT_SSTC,
 	RISCV_ISA_EXT_SVINVAL,
+	RISCV_ISA_EXT_ZBB,
 	RISCV_ISA_EXT_ID_MAX = RISCV_ISA_EXT_MAX,
 };
 
diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
index b98481d2d154..806c402c874e 100644
--- a/arch/riscv/include/asm/string.h
+++ b/arch/riscv/include/asm/string.h
@@ -6,6 +6,8 @@
 #ifndef _ASM_RISCV_STRING_H
 #define _ASM_RISCV_STRING_H
 
+#include <asm/alternative-macros.h>
+#include <asm/errata_list.h>
 #include <linux/types.h>
 #include <linux/linkage.h>
 
@@ -21,6 +23,7 @@ extern asmlinkage void *__memmove(void *, const void *, size_t);
 
 #define __HAVE_ARCH_STRCMP
 extern asmlinkage int __strcmp_generic(const char *cs, const char *ct);
+extern asmlinkage int __strcmp_zbb(const char *cs, const char *ct);
 
 static inline int strcmp(const char *cs, const char *ct)
 {
@@ -31,10 +34,14 @@ static inline int strcmp(const char *cs, const char *ct)
 	register const char *a1 asm("a1") = ct;
 	register int a0_out asm("a0");
 
-	asm volatile("call __strcmp_generic\n\t"
+	asm volatile(
+		ALTERNATIVE(
+			"call __strcmp_generic\n\t",
+			"call __strcmp_zbb\n\t",
+			0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
 		: "=r"(a0_out)
 		: "r"(a0), "r"(a1)
-		: "ra", "t0", "t1", "t2");
+		: "ra", "t0", "t1", "t2", "t3", "t4", "t5");
 
 	return a0_out;
 #endif
@@ -43,6 +50,8 @@ static inline int strcmp(const char *cs, const char *ct)
 #define __HAVE_ARCH_STRNCMP
 extern asmlinkage int __strncmp_generic(const char *cs,
 					const char *ct, size_t count);
+extern asmlinkage int __strncmp_zbb(const char *cs,
+				    const char *ct, size_t count);
 
 static inline int strncmp(const char *cs, const char *ct, size_t count)
 {
@@ -54,10 +63,14 @@ static inline int strncmp(const char *cs, const char *ct, size_t count)
 	register size_t a2 asm("a2") = count;
 	register int a0_out asm("a0");
 
-	asm volatile("call __strncmp_generic\n\t"
+	asm volatile(
+		ALTERNATIVE(
+			"call __strncmp_generic\n\t",
+			"call __strncmp_zbb\n\t",
+			0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
 		: "=r"(a0_out)
 		: "r"(a0), "r"(a1), "r"(a2)
-		: "ra", "t0", "t1", "t2");
+		: "ra", "t0", "t1", "t2", "t3", "t4", "t5", "t6");
 
 	return a0_out;
 #endif
@@ -65,6 +78,7 @@ static inline int strncmp(const char *cs, const char *ct, size_t count)
 
 #define __HAVE_ARCH_STRLEN
 extern asmlinkage __kernel_size_t __strlen_generic(const char *);
+extern asmlinkage __kernel_size_t __strlen_zbb(const char *);
 
 static inline __kernel_size_t strlen(const char *s)
 {
@@ -75,10 +89,13 @@ static inline __kernel_size_t strlen(const char *s)
 	register int a0_out asm("a0");
 
 	asm volatile(
-		"call __strlen_generic\n\t"
+		ALTERNATIVE(
+			"call __strlen_generic\n\t",
+			"call __strlen_zbb\n\t",
+			0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
 		: "=r"(a0_out)
 		: "r"(a0)
-		: "ra", "t0", "t1");
+		: "ra", "t0", "t1", "t2", "t3");
 
 	return a0_out;
 #endif
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 852ecccd8920..e93e098f34fe 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -138,6 +138,7 @@ device_initcall(riscv_cpuinfo_init);
  *    extensions by an underscore.
  */
 static struct riscv_isa_ext_data isa_ext_arr[] = {
+	__RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB),
 	__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
 	__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
 	__RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL),
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index ba62a4ff5ccd..2ec60794293f 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -199,6 +199,7 @@ void __init riscv_fill_hwcap(void)
 				this_hwcap |= isa2hwcap[(unsigned char)(*ext)];
 				set_bit(*ext - 'a', this_isa);
 			} else {
+				SET_ISA_EXT_MAP("zbb", RISCV_ISA_EXT_ZBB);
 				SET_ISA_EXT_MAP("sscofpmf", RISCV_ISA_EXT_SSCOFPMF);
 				SET_ISA_EXT_MAP("svpbmt", RISCV_ISA_EXT_SVPBMT);
 				SET_ISA_EXT_MAP("zicbom", RISCV_ISA_EXT_ZICBOM);
@@ -278,6 +279,20 @@ static bool __init_or_module cpufeature_probe_zicbom(unsigned int stage)
 	return true;
 }
 
+static bool __init_or_module cpufeature_probe_zbb(unsigned int stage)
+{
+	if (!IS_ENABLED(CONFIG_RISCV_ISA_ZBB))
+		return false;
+
+	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
+		return false;
+
+	if (!riscv_isa_extension_available(NULL, ZBB))
+		return false;
+
+	return true;
+}
+
 /*
  * Probe presence of individual extensions.
  *
@@ -295,6 +310,9 @@ static u32 __init_or_module cpufeature_probe(unsigned int stage)
 	if (cpufeature_probe_zicbom(stage))
 		cpu_req_feature |= BIT(CPUFEATURE_ZICBOM);
 
+	if (cpufeature_probe_zbb(stage))
+		cpu_req_feature |= BIT(CPUFEATURE_ZBB);
+
 	return cpu_req_feature;
 }
 
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 6c74b0bedd60..b632483f851c 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -4,8 +4,11 @@ lib-y			+= memcpy.o
 lib-y			+= memset.o
 lib-y			+= memmove.o
 lib-y			+= strcmp.o
+lib-$(CONFIG_RISCV_ISA_ZBB) += strcmp_zbb.o
 lib-y			+= strlen.o
+lib-$(CONFIG_RISCV_ISA_ZBB) += strlen_zbb.o
 lib-y			+= strncmp.o
+lib-$(CONFIG_RISCV_ISA_ZBB) += strncmp_zbb.o
 lib-$(CONFIG_MMU)	+= uaccess.o
 lib-$(CONFIG_64BIT)	+= tishift.o
 
diff --git a/arch/riscv/lib/strcmp_zbb.S b/arch/riscv/lib/strcmp_zbb.S
new file mode 100644
index 000000000000..654f17dabd67
--- /dev/null
+++ b/arch/riscv/lib/strcmp_zbb.S
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 VRULL GmbH
+ * Author: Christoph Muellner <christoph.muellner@vrull.eu>
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+#define src1		a0
+#define result		a0
+#define src2		t5
+#define data1		t0
+#define data2		t1
+#define align		t2
+#define data1_orcb	t3
+#define m1		t4
+
+.option push
+.option arch,+zbb
+
+/* int __strcmp_zbb(const char *cs, const char *ct) */
+ENTRY(__strcmp_zbb)
+	/*
+	 * Returns
+	 *   a0 - comparison result, value like strcmp
+	 *
+	 * Parameters
+	 *   a0 - string1
+	 *   a1 - string2
+	 *
+	 * Clobbers
+	 *   t0, t1, t2, t3, t4, t5
+	 */
+	mv	src2, a1
+
+	or	align, src1, src2
+	li	m1, -1
+	and	align, align, SZREG-1
+	bnez	align, 3f
+
+	/* Main loop for aligned string.  */
+	.p2align 3
+1:
+	REG_L	data1, 0(src1)
+	REG_L	data2, 0(src2)
+	orc.b	data1_orcb, data1
+	bne	data1_orcb, m1, 2f
+	addi	src1, src1, SZREG
+	addi	src2, src2, SZREG
+	beq	data1, data2, 1b
+
+	/*
+	 * Words don't match, and no null byte in the first
+	 * word. Get bytes in big-endian order and compare.
+	 */
+#ifndef CONFIG_CPU_BIG_ENDIAN
+	rev8	data1, data1
+	rev8	data2, data2
+#endif
+
+	/* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence. */
+	sltu	result, data1, data2
+	neg	result, result
+	ori	result, result, 1
+	ret
+
+2:
+	/*
+	 * Found a null byte.
+	 * If words don't match, fall back to simple loop.
+	 */
+	bne	data1, data2, 3f
+
+	/* Otherwise, strings are equal. */
+	li	result, 0
+	ret
+
+	/* Simple loop for misaligned strings. */
+	.p2align 3
+3:
+	lbu	data1, 0(src1)
+	lbu	data2, 0(src2)
+	addi	src1, src1, 1
+	addi	src2, src2, 1
+	bne	data1, data2, 4f
+	bnez	data1, 3b
+
+4:
+	sub	result, data1, data2
+	ret
+END(__strcmp_zbb)
+EXPORT_SYMBOL(__strcmp_zbb)
+
+.option pop
diff --git a/arch/riscv/lib/strlen_zbb.S b/arch/riscv/lib/strlen_zbb.S
new file mode 100644
index 000000000000..49a396567819
--- /dev/null
+++ b/arch/riscv/lib/strlen_zbb.S
@@ -0,0 +1,115 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 VRULL GmbH
+ * Author: Christoph Muellner <christoph.muellner@vrull.eu>
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+#define src		a0
+#define result		a0
+#define addr		t0
+#define data		t1
+#define offset		t2
+#define offset_bits	t2
+#define valid_bytes	t3
+#define m1		t3
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+# define CZ	clz
+# define SHIFT	sll
+#else
+# define CZ	ctz
+# define SHIFT	srl
+#endif
+
+.option push
+.option arch,+zbb
+
+/* __kernel_size_t __strlen_zbb(const char *s) */
+ENTRY(__strlen_zbb)
+	/*
+	 * Returns
+	 *   a0 - string length
+	 *
+	 * Parameters
+	 *   a0 - String to measure
+	 *
+	 * Clobbers
+	 *   t0, t1, t2, t3
+	 */
+
+	/* Number of irrelevant bytes in the first word. */
+	andi	offset, src, SZREG-1
+
+	/* Align pointer. */
+	andi	addr, src, -SZREG
+
+	li	valid_bytes, SZREG
+	sub	valid_bytes, valid_bytes, offset
+	/* Convert the byte offset into a bit offset. */
+	slli	offset_bits, offset, 3
+
+	/* Get the first word.  */
+	REG_L	data, 0(addr)
+
+	/*
+	 * Shift away the partial data we loaded to remove the irrelevant bytes
+	 * preceding the string with the effect of adding NUL bytes at the
+	 * end of the string.
+	 */
+	SHIFT	data, data, offset_bits
+
+	/* Convert non-NUL into 0xff and NUL into 0x00. */
+	orc.b	data, data
+
+	/* Convert non-NUL into 0x00 and NUL into 0xff. */
+	not	data, data
+
+	/*
+	 * Search for the first set bit (corresponding to a NUL byte in the
+	 * original chunk).
+	 */
+	CZ	data, data
+
+	/*
+	 * The first chunk is special: compare against the number
+	 * of valid bytes in this chunk.
+	 */
+	srli	result, data, 3
+	bgtu	valid_bytes, result, 3f
+
+	/* Prepare for the word comparison loop. */
+	addi	offset, addr, SZREG
+	li	m1, -1
+
+	/*
+	 * Our critical loop is 4 instructions and processes data in
+	 * 4 byte or 8 byte chunks.
+	 */
+	.p2align 3
+1:
+	REG_L	data, SZREG(addr)
+	addi	addr, addr, SZREG
+	orc.b	data, data
+	beq	data, m1, 1b
+2:
+	not	data, data
+	CZ	data, data
+
+	/* Get number of processed words.  */
+	sub	offset, addr, offset
+
+	/* Add number of characters in the first word.  */
+	add	result, result, offset
+	srli	data, data, 3
+
+	/* Add number of characters in the last word.  */
+	add	result, result, data
+3:
+	ret
+END(__strlen_zbb)
+EXPORT_SYMBOL(__strlen_zbb)
+
+.option pop
diff --git a/arch/riscv/lib/strncmp_zbb.S b/arch/riscv/lib/strncmp_zbb.S
new file mode 100644
index 000000000000..0cd4b84b7d27
--- /dev/null
+++ b/arch/riscv/lib/strncmp_zbb.S
@@ -0,0 +1,112 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 VRULL GmbH
+ * Author: Christoph Muellner <christoph.muellner@vrull.eu>
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+#define src1		a0
+#define result		a0
+#define src2		t6
+#define len		a2
+#define data1		t0
+#define data2		t1
+#define align		t2
+#define data1_orcb	t3
+#define limit		t4
+#define m1		t5
+
+.option push
+.option arch,+zbb
+
+/* int __strncmp_zbb(const char *cs, const char *ct, size_t count) */
+ENTRY(__strncmp_zbb)
+	/*
+	 * Returns
+	 *   a0 - comparison result, like strncmp
+	 *
+	 * Parameters
+	 *   a0 - string1
+	 *   a1 - string2
+	 *   a2 - number of characters to compare
+	 *
+	 * Clobbers
+	 *   t0, t1, t2, t3, t4, t5, t6
+	 */
+	mv	src2, a1
+
+	or	align, src1, src2
+	li	m1, -1
+	and	align, align, SZREG-1
+	add	limit, src1, len
+	bnez	align, 4f
+
+	/* Adjust limit for fast-path.  */
+	addi	limit, limit, -SZREG
+
+	/* Main loop for aligned string.  */
+	.p2align 3
+1:
+	bgt	src1, limit, 3f
+	REG_L	data1, 0(src1)
+	REG_L	data2, 0(src2)
+	orc.b	data1_orcb, data1
+	bne	data1_orcb, m1, 2f
+	addi	src1, src1, SZREG
+	addi	src2, src2, SZREG
+	beq	data1, data2, 1b
+
+	/*
+	 * Words don't match, and no null byte in the first
+	 * word. Get bytes in big-endian order and compare.
+	 */
+#ifndef CONFIG_CPU_BIG_ENDIAN
+	rev8	data1, data1
+	rev8	data2, data2
+#endif
+
+	/* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence.  */
+	sltu	result, data1, data2
+	neg	result, result
+	ori	result, result, 1
+	ret
+
+2:
+	/*
+	 * Found a null byte.
+	 * If words don't match, fall back to simple loop.
+	 */
+	bne	data1, data2, 3f
+
+	/* Otherwise, strings are equal.  */
+	li	result, 0
+	ret
+
+	/* Simple loop for misaligned strings.  */
+3:
+	/* Restore limit for slow-path.  */
+	addi	limit, limit, SZREG
+	.p2align 3
+4:
+	bge	src1, limit, 6f
+	lbu	data1, 0(src1)
+	lbu	data2, 0(src2)
+	addi	src1, src1, 1
+	addi	src2, src2, 1
+	bne	data1, data2, 5f
+	bnez	data1, 4b
+
+5:
+	sub	result, data1, data2
+	ret
+
+6:
+	li	result, 0
+	ret
+END(__strncmp_zbb)
+EXPORT_SYMBOL(__strncmp_zbb)
+
+.option pop
-- 
2.35.1
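[Editor's note: the core trick in the Zbb strlen above can be emulated in portable C. This is an illustrative sketch, not part of the patch; `orc_b` and `first_nul_index` are hypothetical helper names, and the byte-index math assumes a little-endian host, matching the `#ifndef CONFIG_CPU_BIG_ENDIAN` path:]

```c
#include <stdint.h>

/* Emulate Zbb's orc.b: each zero byte -> 0x00, each nonzero byte -> 0xff. */
static uint64_t orc_b(uint64_t x)
{
	uint64_t r = 0;
	int i;

	for (i = 0; i < 8; i++)
		if ((x >> (8 * i)) & 0xff)
			r |= (uint64_t)0xff << (8 * i);
	return r;
}

/* Byte index of the first NUL in a little-endian word, or 8 if none:
 * invert the orc.b mask so NUL bytes become 0xff, then count trailing
 * zero bits and shift right by 3 to convert the bit index to a byte
 * index -- the same not/ctz/srli sequence the assembly uses. */
static int first_nul_index(uint64_t word)
{
	uint64_t mask = ~orc_b(word);

	if (!mask)
		return 8;	/* no NUL byte in this word */
	return __builtin_ctzll(mask) >> 3;
}
```

[This is why the hot loop only needs four instructions: load a word, advance the pointer, `orc.b`, and branch while the mask is still all-ones; the not/ctz work happens once, after the loop exits.]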


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH v3 14/14] RISC-V: add zbb support to string functions
@ 2022-11-30 22:56   ` Heiko Stuebner
  0 siblings, 0 replies; 84+ messages in thread
From: Heiko Stuebner @ 2022-11-30 22:56 UTC (permalink / raw)
  To: linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, heiko, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

Add handling for ZBB extension and add support for using it as a
variant for optimized string functions.

Support for the Zbb-str-variants is limited to the GNU-assembler
for now, as LLVM has not yet aquired the functionality to
selectively change the arch option in assembler code.
This is still under review at
    https://reviews.llvm.org/D123515

Co-developed-by: Christoph Muellner <christoph.muellner@vrull.eu>
Signed-off-by: Christoph Muellner <christoph.muellner@vrull.eu>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/Kconfig                   |  24 ++++++
 arch/riscv/include/asm/errata_list.h |   3 +-
 arch/riscv/include/asm/hwcap.h       |   1 +
 arch/riscv/include/asm/string.h      |  29 +++++--
 arch/riscv/kernel/cpu.c              |   1 +
 arch/riscv/kernel/cpufeature.c       |  18 +++++
 arch/riscv/lib/Makefile              |   3 +
 arch/riscv/lib/strcmp_zbb.S          |  96 ++++++++++++++++++++++
 arch/riscv/lib/strlen_zbb.S          | 115 +++++++++++++++++++++++++++
 arch/riscv/lib/strncmp_zbb.S         | 112 ++++++++++++++++++++++++++
 10 files changed, 395 insertions(+), 7 deletions(-)
 create mode 100644 arch/riscv/lib/strcmp_zbb.S
 create mode 100644 arch/riscv/lib/strlen_zbb.S
 create mode 100644 arch/riscv/lib/strncmp_zbb.S

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index fa78595a6089..5703c44da5f9 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -411,6 +411,30 @@ config RISCV_ISA_SVPBMT
 
 	   If you don't know what to do here, say Y.
 
+config TOOLCHAIN_HAS_ZBB
+	bool
+	default y
+	depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbb)
+	depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbb)
+	depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
+	depends on AS_IS_GNU
+
+config RISCV_ISA_ZBB
+	bool "Zbb extension support for bit manipulation instructions"
+	depends on TOOLCHAIN_HAS_ZBB
+	depends on !XIP_KERNEL && MMU
+	select RISCV_ALTERNATIVE
+	default y
+	help
+	   Adds support to dynamically detect the presence of the ZBB
+	   extension (basic bit manipulation) and enable its usage.
+
+	   The Zbb extension provides instructions to accelerate a number
+	   of bit-specific operations (count bit population, sign extending,
+	   bitrotation, etc).
+
+	   If you don't know what to do here, say Y.
+
 config TOOLCHAIN_HAS_ZICBOM
 	bool
 	default y
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 19a771085781..e0a49b9b7627 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -22,7 +22,8 @@
 
 #define	CPUFEATURE_SVPBMT 0
 #define	CPUFEATURE_ZICBOM 1
-#define	CPUFEATURE_NUMBER 2
+#define	CPUFEATURE_ZBB 2
+#define	CPUFEATURE_NUMBER 3
 
 #ifdef __ASSEMBLY__
 
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index b22525290073..ac5555fd9788 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -59,6 +59,7 @@ enum riscv_isa_ext_id {
 	RISCV_ISA_EXT_ZIHINTPAUSE,
 	RISCV_ISA_EXT_SSTC,
 	RISCV_ISA_EXT_SVINVAL,
+	RISCV_ISA_EXT_ZBB,
 	RISCV_ISA_EXT_ID_MAX = RISCV_ISA_EXT_MAX,
 };
 
diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
index b98481d2d154..806c402c874e 100644
--- a/arch/riscv/include/asm/string.h
+++ b/arch/riscv/include/asm/string.h
@@ -6,6 +6,8 @@
 #ifndef _ASM_RISCV_STRING_H
 #define _ASM_RISCV_STRING_H
 
+#include <asm/alternative-macros.h>
+#include <asm/errata_list.h>
 #include <linux/types.h>
 #include <linux/linkage.h>
 
@@ -21,6 +23,7 @@ extern asmlinkage void *__memmove(void *, const void *, size_t);
 
 #define __HAVE_ARCH_STRCMP
 extern asmlinkage int __strcmp_generic(const char *cs, const char *ct);
+extern asmlinkage int __strcmp_zbb(const char *cs, const char *ct);
 
 static inline int strcmp(const char *cs, const char *ct)
 {
@@ -31,10 +34,14 @@ static inline int strcmp(const char *cs, const char *ct)
 	register const char *a1 asm("a1") = ct;
 	register int a0_out asm("a0");
 
-	asm volatile("call __strcmp_generic\n\t"
+	asm volatile(
+		ALTERNATIVE(
+			"call __strcmp_generic\n\t",
+			"call __strcmp_zbb\n\t",
+			0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
 		: "=r"(a0_out)
 		: "r"(a0), "r"(a1)
-		: "ra", "t0", "t1", "t2");
+		: "ra", "t0", "t1", "t2", "t3", "t4", "t5");
 
 	return a0_out;
 #endif
@@ -43,6 +50,8 @@ static inline int strcmp(const char *cs, const char *ct)
 #define __HAVE_ARCH_STRNCMP
 extern asmlinkage int __strncmp_generic(const char *cs,
 					const char *ct, size_t count);
+extern asmlinkage int __strncmp_zbb(const char *cs,
+				    const char *ct, size_t count);
 
 static inline int strncmp(const char *cs, const char *ct, size_t count)
 {
@@ -54,10 +63,14 @@ static inline int strncmp(const char *cs, const char *ct, size_t count)
 	register size_t a2 asm("a2") = count;
 	register int a0_out asm("a0");
 
-	asm volatile("call __strncmp_generic\n\t"
+	asm volatile(
+		ALTERNATIVE(
+			"call __strncmp_generic\n\t",
+			"call __strncmp_zbb\n\t",
+			0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
 		: "=r"(a0_out)
 		: "r"(a0), "r"(a1), "r"(a2)
-		: "ra", "t0", "t1", "t2");
+		: "ra", "t0", "t1", "t2", "t3", "t4", "t5", "t6");
 
 	return a0_out;
 #endif
@@ -65,6 +78,7 @@ static inline int strncmp(const char *cs, const char *ct, size_t count)
 
 #define __HAVE_ARCH_STRLEN
 extern asmlinkage __kernel_size_t __strlen_generic(const char *);
+extern asmlinkage __kernel_size_t __strlen_zbb(const char *);
 
 static inline __kernel_size_t strlen(const char *s)
 {
@@ -75,10 +89,13 @@ static inline __kernel_size_t strlen(const char *s)
 	register int a0_out asm("a0");
 
 	asm volatile(
-		"call __strlen_generic\n\t"
+		ALTERNATIVE(
+			"call __strlen_generic\n\t",
+			"call __strlen_zbb\n\t",
+			0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
 		: "=r"(a0_out)
 		: "r"(a0)
-		: "ra", "t0", "t1");
+		: "ra", "t0", "t1", "t2", "t3");
 
 	return a0_out;
 #endif
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 852ecccd8920..e93e098f34fe 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -138,6 +138,7 @@ device_initcall(riscv_cpuinfo_init);
  *    extensions by an underscore.
  */
 static struct riscv_isa_ext_data isa_ext_arr[] = {
+	__RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB),
 	__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
 	__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
 	__RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL),
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index ba62a4ff5ccd..2ec60794293f 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -199,6 +199,7 @@ void __init riscv_fill_hwcap(void)
 				this_hwcap |= isa2hwcap[(unsigned char)(*ext)];
 				set_bit(*ext - 'a', this_isa);
 			} else {
+				SET_ISA_EXT_MAP("zbb", RISCV_ISA_EXT_ZBB);
 				SET_ISA_EXT_MAP("sscofpmf", RISCV_ISA_EXT_SSCOFPMF);
 				SET_ISA_EXT_MAP("svpbmt", RISCV_ISA_EXT_SVPBMT);
 				SET_ISA_EXT_MAP("zicbom", RISCV_ISA_EXT_ZICBOM);
@@ -278,6 +279,20 @@ static bool __init_or_module cpufeature_probe_zicbom(unsigned int stage)
 	return true;
 }
 
+static bool __init_or_module cpufeature_probe_zbb(unsigned int stage)
+{
+	if (!IS_ENABLED(CONFIG_RISCV_ISA_ZBB))
+		return false;
+
+	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
+		return false;
+
+	if (!riscv_isa_extension_available(NULL, ZBB))
+		return false;
+
+	return true;
+}
+
 /*
  * Probe presence of individual extensions.
  *
@@ -295,6 +310,9 @@ static u32 __init_or_module cpufeature_probe(unsigned int stage)
 	if (cpufeature_probe_zicbom(stage))
 		cpu_req_feature |= BIT(CPUFEATURE_ZICBOM);
 
+	if (cpufeature_probe_zbb(stage))
+		cpu_req_feature |= BIT(CPUFEATURE_ZBB);
+
 	return cpu_req_feature;
 }
 
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 6c74b0bedd60..b632483f851c 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -4,8 +4,11 @@ lib-y			+= memcpy.o
 lib-y			+= memset.o
 lib-y			+= memmove.o
 lib-y			+= strcmp.o
+lib-$(CONFIG_RISCV_ISA_ZBB) += strcmp_zbb.o
 lib-y			+= strlen.o
+lib-$(CONFIG_RISCV_ISA_ZBB) += strlen_zbb.o
 lib-y			+= strncmp.o
+lib-$(CONFIG_RISCV_ISA_ZBB) += strncmp_zbb.o
 lib-$(CONFIG_MMU)	+= uaccess.o
 lib-$(CONFIG_64BIT)	+= tishift.o
 
diff --git a/arch/riscv/lib/strcmp_zbb.S b/arch/riscv/lib/strcmp_zbb.S
new file mode 100644
index 000000000000..654f17dabd67
--- /dev/null
+++ b/arch/riscv/lib/strcmp_zbb.S
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 VRULL GmbH
+ * Author: Christoph Muellner <christoph.muellner@vrull.eu>
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+#define src1		a0
+#define result		a0
+#define src2		t5
+#define data1		t0
+#define data2		t1
+#define align		t2
+#define data1_orcb	t3
+#define m1		t4
+
+.option push
+.option arch,+zbb
+
+/* int __strcmp_zbb(const char *cs, const char *ct) */
+ENTRY(__strcmp_zbb)
+	/*
+	 * Returns
+	 *   a0 - comparison result, value like strcmp
+	 *
+	 * Parameters
+	 *   a0 - string1
+	 *   a1 - string2
+	 *
+	 * Clobbers
+	 *   t0, t1, t2, t3, t4, t5
+	 */
+	mv	src2, a1
+
+	or	align, src1, src2
+	li	m1, -1
+	and	align, align, SZREG-1
+	bnez	align, 3f
+
+	/* Main loop for aligned string.  */
+	.p2align 3
+1:
+	REG_L	data1, 0(src1)
+	REG_L	data2, 0(src2)
+	orc.b	data1_orcb, data1
+	bne	data1_orcb, m1, 2f
+	addi	src1, src1, SZREG
+	addi	src2, src2, SZREG
+	beq	data1, data2, 1b
+
+	/*
+	 * Words don't match, and no null byte in the first
+	 * word. Get bytes in big-endian order and compare.
+	 */
+#ifndef CONFIG_CPU_BIG_ENDIAN
+	rev8	data1, data1
+	rev8	data2, data2
+#endif
+
+	/* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence. */
+	sltu	result, data1, data2
+	neg	result, result
+	ori	result, result, 1
+	ret
+
+2:
+	/*
+	 * Found a null byte.
+	 * If words don't match, fall back to simple loop.
+	 */
+	bne	data1, data2, 3f
+
+	/* Otherwise, strings are equal. */
+	li	result, 0
+	ret
+
+	/* Simple loop for misaligned strings. */
+	.p2align 3
+3:
+	lbu	data1, 0(src1)
+	lbu	data2, 0(src2)
+	addi	src1, src1, 1
+	addi	src2, src2, 1
+	bne	data1, data2, 4f
+	bnez	data1, 3b
+
+4:
+	sub	result, data1, data2
+	ret
+END(__strcmp_zbb)
+EXPORT_SYMBOL(__strcmp_zbb)
+
+.option pop
diff --git a/arch/riscv/lib/strlen_zbb.S b/arch/riscv/lib/strlen_zbb.S
new file mode 100644
index 000000000000..49a396567819
--- /dev/null
+++ b/arch/riscv/lib/strlen_zbb.S
@@ -0,0 +1,115 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 VRULL GmbH
+ * Author: Christoph Muellner <christoph.muellner@vrull.eu>
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+#define src		a0
+#define result		a0
+#define addr		t0
+#define data		t1
+#define offset		t2
+#define offset_bits	t2
+#define valid_bytes	t3
+#define m1		t3
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+# define CZ	clz
+# define SHIFT	sll
+#else
+# define CZ	ctz
+# define SHIFT	srl
+#endif
+
+.option push
+.option arch,+zbb
+
+/* int __strlen_zbb(const char *s) */
+ENTRY(__strlen_zbb)
+	/*
+	 * Returns
+	 *   a0 - string length
+	 *
+	 * Parameters
+	 *   a0 - String to measure
+	 *
+	 * Clobbers
+	 *   t0, t1, t2, t3
+	 */
+
+	/* Number of irrelevant bytes in the first word. */
+	andi	offset, src, SZREG-1
+
+	/* Align pointer. */
+	andi	addr, src, -SZREG
+
+	li	valid_bytes, SZREG
+	sub	valid_bytes, valid_bytes, offset
+	slli	offset_bits, offset, RISCV_LGPTR
+
+	/* Get the first word.  */
+	REG_L	data, 0(addr)
+
+	/*
+	 * Shift away the partial data we loaded to remove the irrelevant bytes
+	 * preceding the string with the effect of adding NUL bytes at the
+	 * end of the string.
+	 */
+	SHIFT	data, data, offset_bits
+
+	/* Convert non-NUL into 0xff and NUL into 0x00. */
+	orc.b	data, data
+
+	/* Convert non-NUL into 0x00 and NUL into 0xff. */
+	not	data, data
+
+	/*
+	 * Search for the first set bit (corresponding to a NUL byte in the
+	 * original chunk).
+	 */
+	CZ	data, data
+
+	/*
+	 * The first chunk is special: commpare against the number
+	 * of valid bytes in this chunk.
+	 */
+	srli	result, data, 3
+	bgtu	valid_bytes, result, 3f
+
+	/* Prepare for the word comparison loop. */
+	addi	offset, addr, SZREG
+	li	m1, -1
+
+	/*
+	 * Our critical loop is 4 instructions and processes data in
+	 * 4 byte or 8 byte chunks.
+	 */
+	.p2align 3
+1:
+	REG_L	data, SZREG(addr)
+	addi	addr, addr, SZREG
+	orc.b	data, data
+	beq	data, m1, 1b
+2:
+	not	data, data
+	CZ	data, data
+
+	/* Get number of bytes processed in full words.  */
+	sub	offset, addr, offset
+
+	/* Add number of characters in the first word.  */
+	add	result, result, offset
+	srli	data, data, 3
+
+	/* Add number of characters in the last word.  */
+	add	result, result, data
+3:
+	ret
+END(__strlen_zbb)
+EXPORT_SYMBOL(__strlen_zbb)
+
+.option pop
diff --git a/arch/riscv/lib/strncmp_zbb.S b/arch/riscv/lib/strncmp_zbb.S
new file mode 100644
index 000000000000..0cd4b84b7d27
--- /dev/null
+++ b/arch/riscv/lib/strncmp_zbb.S
@@ -0,0 +1,112 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022 VRULL GmbH
+ * Author: Christoph Muellner <christoph.muellner@vrull.eu>
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm-generic/export.h>
+
+#define src1		a0
+#define result		a0
+#define src2		t6
+#define len		a2
+#define data1		t0
+#define data2		t1
+#define align		t2
+#define data1_orcb	t3
+#define limit		t4
+#define m1		t5
+
+.option push
+.option arch,+zbb
+
+/* int __strncmp_zbb(const char *cs, const char *ct, size_t count) */
+ENTRY(__strncmp_zbb)
+	/*
+	 * Returns
+	 *   a0 - comparison result, like strncmp
+	 *
+	 * Parameters
+	 *   a0 - string1
+	 *   a1 - string2
+	 *   a2 - number of characters to compare
+	 *
+	 * Clobbers
+	 *   t0, t1, t2, t3, t4, t5, t6
+	 */
+	mv	src2, a1
+
+	or	align, src1, src2
+	li	m1, -1
+	andi	align, align, SZREG-1
+	add	limit, src1, len
+	bnez	align, 4f
+
+	/* Adjust limit for fast-path.  */
+	addi	limit, limit, -SZREG
+
+	/* Main loop for aligned string.  */
+	.p2align 3
+1:
+	bgt	src1, limit, 3f
+	REG_L	data1, 0(src1)
+	REG_L	data2, 0(src2)
+	orc.b	data1_orcb, data1
+	bne	data1_orcb, m1, 2f
+	addi	src1, src1, SZREG
+	addi	src2, src2, SZREG
+	beq	data1, data2, 1b
+
+	/*
+	 * Words don't match, and no null byte in the first
+	 * word. Get bytes in big-endian order and compare.
+	 */
+#ifndef CONFIG_CPU_BIG_ENDIAN
+	rev8	data1, data1
+	rev8	data2, data2
+#endif
+
+	/* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence.  */
+	sltu	result, data1, data2
+	neg	result, result
+	ori	result, result, 1
+	ret
+
+2:
+	/*
+	 * Found a null byte.
+	 * If words don't match, fall back to simple loop.
+	 */
+	bne	data1, data2, 3f
+
+	/* Otherwise, strings are equal.  */
+	li	result, 0
+	ret
+
+	/* Simple loop for misaligned strings.  */
+3:
+	/* Restore limit for slow-path.  */
+	addi	limit, limit, SZREG
+	.p2align 3
+4:
+	bge	src1, limit, 6f
+	lbu	data1, 0(src1)
+	lbu	data2, 0(src2)
+	addi	src1, src1, 1
+	addi	src2, src2, 1
+	bne	data1, data2, 5f
+	bnez	data1, 4b
+
+5:
+	sub	result, data1, data2
+	ret
+
+6:
+	li	result, 0
+	ret
+END(__strncmp_zbb)
+EXPORT_SYMBOL(__strncmp_zbb)
+
+.option pop
-- 
2.35.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 0/14] Zbb string optimizations and call support in alternatives
  2022-11-30 22:56 ` Heiko Stuebner
@ 2022-12-01  0:02   ` Conor Dooley
  -1 siblings, 0 replies; 84+ messages in thread
From: Conor Dooley @ 2022-12-01  0:02 UTC (permalink / raw)
  To: Heiko Stuebner, linux-riscv, palmer
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, emil.renner.berthing, ardb, linux-efi, Heiko Stuebner

On 30/11/2022 22:56, Heiko Stuebner wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> The Zbb extension can be used to make string functions run a lot
> faster.
> 
> There are essentially two problems to solve:
> - making it possible for str* functions to replace what they do
>   in a performant way
> 
>   This is done by inlining the core functions and then
>   using alternatives to call the actual variant.
> 
>   This of course will need a more intelligent selection mechanism
>   down the road when more variants may exist using different
>   available extensions.
> 
> - actually allowing calls in alternatives
>   Function calls use auipc + jalr to reach those 32bit relative
>   addresses but when they're compiled the offset will be wrong
>   as alternatives live in a different section. So when the patch
>   gets applied the address will point to the wrong location.
> 
>   So similar to arm64 the target addresses need to be updated.
> 
>   This is probably also helpful for other things needing more
>   complex code in alternatives.
> 
> 
> In my half-scientific test-case of running the functions in question
> on a 95 character string in a loop of 10000 iterations, the Zbb
> variants shave off around 2/3 of the original runtime.
> 
> 
> For v2 I got into some sort of cleanup spree for the general instruction
> parsing that already existed. A number of places do their own
> instruction parsing and I tried consolidating some of them.
> 
> Notably, the kvm parts still do, but I had to stop somewhere :-)
> 
> The series is based on v6.1-rc7 right now.
> 
> changes since v2:
> - add patch fixing the c.jalr funct4 value
> - reword some commit messages
> - fix position of auipc addition patch (earlier)
> - fix compile errors from patch-reordering gone wrong
>   (worked at the end of v2, but compiling individual patches
>    caused issues) - patches are now tested individually
> - limit Zbb variants for GNU as for now
>   (LLVM support for .option arch is still under review)

Still no good on that front chief:
ld.lld: error: undefined symbol: __strlen_generic
>>> referenced by ctype.c
>>>               arch/riscv/purgatory/purgatory.ro:(strlcpy)
>>> referenced by ctype.c
>>>               arch/riscv/purgatory/purgatory.ro:(strlcat)
>>> referenced by ctype.c
>>>               arch/riscv/purgatory/purgatory.ro:(strlcat)
>>> referenced 3 more times
make[5]: *** [/stuff/linux/arch/riscv/purgatory/Makefile:85: arch/riscv/purgatory/purgatory.chk] Error 1
make[5]: Target 'arch/riscv/purgatory/' not remade because of errors.
make[4]: *** [/stuff/linux/scripts/Makefile.build:500: arch/riscv/purgatory] Error 2

allmodconfig, same toolchain as before.

> - prevent str-functions from getting optimized to builtin-variants
> 
> changes since v1:
> - a number of generalizations/cleanups for instruction parsing
> - use accessor function to access instructions (Emil)
> - actually patch the correct location when having more than one
>   instruction in an alternative block
> - string function cleanups (comments etc) (Conor)
> - move zbb extension above s* extensions in cpu.c lists
> 
> changes since rfc:
> - make Zbb code actually work
> - drop some unneeded patches
> - a lot of cleanups
> 
> Heiko Stuebner (14):
>   RISC-V: fix funct4 definition for c.jalr in parse_asm.h
>   RISC-V: add prefix to all constants/macros in parse_asm.h
>   RISC-V: detach funct-values from their offset
>   RISC-V: add ebreak instructions to definitions
>   RISC-V: add auipc elements to parse_asm header
>   RISC-V: Move riscv_insn_is_* macros into a common header
>   RISC-V: rename parse_asm.h to insn.h
>   RISC-V: kprobes: use central defined funct3 constants
>   RISC-V: add U-type imm parsing to insn.h header
>   RISC-V: add rd reg parsing to insn.h header
>   RISC-V: fix auipc-jalr addresses in patched alternatives
>   efi/riscv: libstub: mark when compiling libstub
>   RISC-V: add infrastructure to allow different str* implementations
>   RISC-V: add zbb support to string functions
> 
>  arch/riscv/Kconfig                       |  24 ++
>  arch/riscv/Makefile                      |   3 +
>  arch/riscv/include/asm/alternative.h     |   3 +
>  arch/riscv/include/asm/errata_list.h     |   3 +-
>  arch/riscv/include/asm/hwcap.h           |   1 +
>  arch/riscv/include/asm/insn.h            | 292 +++++++++++++++++++++++
>  arch/riscv/include/asm/parse_asm.h       | 219 -----------------
>  arch/riscv/include/asm/string.h          |  83 +++++++
>  arch/riscv/kernel/alternative.c          |  72 ++++++
>  arch/riscv/kernel/cpu.c                  |   1 +
>  arch/riscv/kernel/cpufeature.c           |  29 ++-
>  arch/riscv/kernel/image-vars.h           |   6 +-
>  arch/riscv/kernel/kgdb.c                 |  63 ++---
>  arch/riscv/kernel/probes/simulate-insn.c |  19 +-
>  arch/riscv/kernel/probes/simulate-insn.h |  26 +-
>  arch/riscv/lib/Makefile                  |   6 +
>  arch/riscv/lib/strcmp.S                  |  38 +++
>  arch/riscv/lib/strcmp_zbb.S              |  96 ++++++++
>  arch/riscv/lib/strlen.S                  |  29 +++
>  arch/riscv/lib/strlen_zbb.S              | 115 +++++++++
>  arch/riscv/lib/strncmp.S                 |  41 ++++
>  arch/riscv/lib/strncmp_zbb.S             | 112 +++++++++
>  drivers/firmware/efi/libstub/Makefile    |   2 +-
>  23 files changed, 982 insertions(+), 301 deletions(-)
>  create mode 100644 arch/riscv/include/asm/insn.h
>  delete mode 100644 arch/riscv/include/asm/parse_asm.h
>  create mode 100644 arch/riscv/lib/strcmp.S
>  create mode 100644 arch/riscv/lib/strcmp_zbb.S
>  create mode 100644 arch/riscv/lib/strlen.S
>  create mode 100644 arch/riscv/lib/strlen_zbb.S
>  create mode 100644 arch/riscv/lib/strncmp.S
>  create mode 100644 arch/riscv/lib/strncmp_zbb.S
> 
> --
> 2.35.1
> 
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 0/14] Zbb string optimizations and call support in alternatives
  2022-12-01  0:02   ` Conor Dooley
@ 2022-12-01 11:42     ` Heiko Stübner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stübner @ 2022-12-01 11:42 UTC (permalink / raw)
  To: linux-riscv, palmer, Conor Dooley
  Cc: christoph.muellner, prabhakar.csengg, conor, philipp.tomsich,
	ajones, emil.renner.berthing, ardb, linux-efi

Am Donnerstag, 1. Dezember 2022, 01:02:08 CET schrieb Conor Dooley:
> On 30/11/2022 22:56, Heiko Stuebner wrote:

> > changes since v2:
> > - add patch fixing the c.jalr funct4 value
> > - reword some commit messages
> > - fix position of auipc addition patch (earlier)
> > - fix compile errors from patch-reordering gone wrong
> >   (worked at the end of v2, but compiling individual patches
> >    caused issues) - patches are now tested individually
> > - limit Zbb variants for GNU as for now
> >   (LLVM support for .option arch is still under review)
> 
> Still no good on that front chief:
> ld.lld: error: undefined symbol: __strlen_generic
> >>> referenced by ctype.c
> >>>               arch/riscv/purgatory/purgatory.ro:(strlcpy)
> >>> referenced by ctype.c
> >>>               arch/riscv/purgatory/purgatory.ro:(strlcat)
> >>> referenced by ctype.c
> >>>               arch/riscv/purgatory/purgatory.ro:(strlcat)
> >>> referenced 3 more times
> make[5]: *** [/stuff/linux/arch/riscv/purgatory/Makefile:85: arch/riscv/purgatory/purgatory.chk] Error 1
> make[5]: Target 'arch/riscv/purgatory/' not remade because of errors.
> make[4]: *** [/stuff/linux/scripts/Makefile.build:500: arch/riscv/purgatory] Error 2

Oh interesting, there is another efistub-like thingy hidden in the tree
(and CRYPTO_SHA256 needs to be built-in, not a module, to allow the
kexec-purgatory to be built).

The following should do the trick:

---------------- 8< --------------
diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
index 806c402c874e..b99698983045 100644
--- a/arch/riscv/include/asm/string.h
+++ b/arch/riscv/include/asm/string.h
@@ -27,7 +27,7 @@ extern asmlinkage int __strcmp_zbb(const char *cs, const char *ct);
 
 static inline int strcmp(const char *cs, const char *ct)
 {
-#ifdef RISCV_EFISTUB
+#if defined(RISCV_EFISTUB) || defined(RISCV_PURGATORY)
 	return __strcmp_generic(cs, ct);
 #else
 	register const char *a0 asm("a0") = cs;
@@ -55,7 +55,7 @@ extern asmlinkage int __strncmp_zbb(const char *cs,
 
 static inline int strncmp(const char *cs, const char *ct, size_t count)
 {
-#ifdef RISCV_EFISTUB
+#if defined(RISCV_EFISTUB) || defined(RISCV_PURGATORY)
 	return __strncmp_generic(cs, ct, count);
 #else
 	register const char *a0 asm("a0") = cs;
@@ -82,7 +82,7 @@ extern asmlinkage __kernel_size_t __strlen_zbb(const char *);
 
 static inline __kernel_size_t strlen(const char *s)
 {
-#ifdef RISCV_EFISTUB
+#if defined(RISCV_EFISTUB) || defined(RISCV_PURGATORY)
 	return __strlen_generic(s);
 #else
 	register const char *a0 asm("a0") = s;
diff --git a/arch/riscv/purgatory/Makefile b/arch/riscv/purgatory/Makefile
index dd58e1d99397..1d0969722875 100644
--- a/arch/riscv/purgatory/Makefile
+++ b/arch/riscv/purgatory/Makefile
@@ -2,6 +2,7 @@
 OBJECT_FILES_NON_STANDARD := y
 
 purgatory-y := purgatory.o sha256.o entry.o string.o ctype.o memcpy.o memset.o
+purgatory-y += strcmp.o strlen.o strncmp.o
 
 targets += $(purgatory-y)
 PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
@@ -18,6 +19,15 @@ $(obj)/memcpy.o: $(srctree)/arch/riscv/lib/memcpy.S FORCE
 $(obj)/memset.o: $(srctree)/arch/riscv/lib/memset.S FORCE
 	$(call if_changed_rule,as_o_S)
 
+$(obj)/strcmp.o: $(srctree)/arch/riscv/lib/strcmp.S FORCE
+	$(call if_changed_rule,as_o_S)
+
+$(obj)/strlen.o: $(srctree)/arch/riscv/lib/strlen.S FORCE
+	$(call if_changed_rule,as_o_S)
+
+$(obj)/strncmp.o: $(srctree)/arch/riscv/lib/strncmp.S FORCE
+	$(call if_changed_rule,as_o_S)
+
 $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE
 	$(call if_changed_rule,cc_o_c)
 
@@ -46,6 +56,7 @@ PURGATORY_CFLAGS_REMOVE := -mcmodel=kernel
 PURGATORY_CFLAGS := -mcmodel=medany -ffreestanding -fno-zero-initialized-in-bss
 PURGATORY_CFLAGS += $(DISABLE_STACKLEAK_PLUGIN) -DDISABLE_BRANCH_PROFILING
 PURGATORY_CFLAGS += -fno-stack-protector -g0
+PURGATORY_CFLAGS += -DRISCV_PURGATORY
 
 # Default KBUILD_CFLAGS can have -pg option set when FTRACE is enabled. That
 # in turn leaves some undefined symbols like __fentry__ in purgatory and not
@@ -77,6 +88,9 @@ CFLAGS_ctype.o			+= $(PURGATORY_CFLAGS)
 AFLAGS_REMOVE_entry.o		+= -Wa,-gdwarf-2
 AFLAGS_REMOVE_memcpy.o		+= -Wa,-gdwarf-2
 AFLAGS_REMOVE_memset.o		+= -Wa,-gdwarf-2
+AFLAGS_REMOVE_strcmp.o		+= -Wa,-gdwarf-2
+AFLAGS_REMOVE_strlen.o		+= -Wa,-gdwarf-2
+AFLAGS_REMOVE_strncmp.o		+= -Wa,-gdwarf-2
 
 $(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
 		$(call if_changed,ld)




^ permalink raw reply related	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 01/14] RISC-V: fix funct4 definition for c.jalr in parse_asm.h
  2022-11-30 22:56   ` Heiko Stuebner
@ 2022-12-01 17:00     ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-01 17:00 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner

On Wed, Nov 30, 2022 at 11:56:01PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> The opcode definition for c.jalr is
>     c.jalr c_rs1_n0  1..0=2 15..13=4 12=1 6..2=0
> 
> This means funct4 consisting of bit [15:12] is 1001b, so the value is 0x9.

I presume kgdb c.jalr stepping was broken before this patch and, if so,
then we should add a comment stating that and also

  Fixes: edde5584c7ab ("riscv: Add SW single-step support for KDB")

and maybe to help keep me motivated :-)

  Reported-by: Andrew Jones <ajones@ventanamicro.com>

Thanks,
drew

> 
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/include/asm/parse_asm.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
> index f36368de839f..7fee806805c1 100644
> --- a/arch/riscv/include/asm/parse_asm.h
> +++ b/arch/riscv/include/asm/parse_asm.h
> @@ -125,7 +125,7 @@
>  #define FUNCT3_C_J		0xa000
>  #define FUNCT3_C_JAL		0x2000
>  #define FUNCT4_C_JR		0x8000
> -#define FUNCT4_C_JALR		0xf000
> +#define FUNCT4_C_JALR		0x9000
>  
>  #define FUNCT12_SRET		0x10200000
>  
> -- 
> 2.35.1
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 01/14] RISC-V: fix funct4 definition for c.jalr in parse_asm.h
  2022-12-01 17:00     ` Andrew Jones
@ 2022-12-01 17:05       ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-01 17:05 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner

On Thu, Dec 01, 2022 at 06:00:23PM +0100, Andrew Jones wrote:
> On Wed, Nov 30, 2022 at 11:56:01PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > The opcode definition for c.jalr is
> >     c.jalr c_rs1_n0  1..0=2 15..13=4 12=1 6..2=0
> > 
> > This means funct4 consisting of bit [15:12] is 1001b, so the value is 0x9.
> 
> I presume kgdb c.jalr stepping was broken before this patch and, if so,
> then we should add a comment stating that and also
> 
>   Fixes: edde5584c7ab ("riscv: Add SW single-step support for KDB")
> 
> and maybe to help keep me motivated :-)
> 
>   Reported-by: Andrew Jones <ajones@ventanamicro.com>
>

Oh, and,

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

> Thanks,
> drew
> 
> > 
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/include/asm/parse_asm.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
> > index f36368de839f..7fee806805c1 100644
> > --- a/arch/riscv/include/asm/parse_asm.h
> > +++ b/arch/riscv/include/asm/parse_asm.h
> > @@ -125,7 +125,7 @@
> >  #define FUNCT3_C_J		0xa000
> >  #define FUNCT3_C_JAL		0x2000
> >  #define FUNCT4_C_JR		0x8000
> > -#define FUNCT4_C_JALR		0xf000
> > +#define FUNCT4_C_JALR		0x9000
> >  
> >  #define FUNCT12_SRET		0x10200000
> >  
> > -- 
> > 2.35.1
> > 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 01/14] RISC-V: fix funct4 definition for c.jalr in parse_asm.h
  2022-12-01 17:05       ` Andrew Jones
@ 2022-12-01 17:07         ` Heiko Stübner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stübner @ 2022-12-01 17:07 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi

On Thursday, 1 December 2022, 18:05:37 CET, Andrew Jones wrote:
> On Thu, Dec 01, 2022 at 06:00:23PM +0100, Andrew Jones wrote:
> > On Wed, Nov 30, 2022 at 11:56:01PM +0100, Heiko Stuebner wrote:
> > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > 
> > > The opcode definition for c.jalr is
> > >     c.jalr c_rs1_n0  1..0=2 15..13=4 12=1 6..2=0
> > > 
> > > This means funct4, consisting of bits [15:12], is 1001b, so the value is 0x9.
> > 
> > I presume kgdb c.jalr stepping was broken before this patch and, if so,
> > then we should add a comment stating that and also
> > 
> >   Fixes: edde5584c7ab ("riscv: Add SW single-step support for KDB")
> > 
> > and maybe to help keep me motivated :-)
> > 
> >   Reported-by: Andrew Jones <ajones@ventanamicro.com>
> >
> 
> Oh, and,
> 
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

as I need another round for the purgatory thing anyway, I've added all 3
of the things above :-)

> > > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > ---
> > >  arch/riscv/include/asm/parse_asm.h | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
> > > index f36368de839f..7fee806805c1 100644
> > > --- a/arch/riscv/include/asm/parse_asm.h
> > > +++ b/arch/riscv/include/asm/parse_asm.h
> > > @@ -125,7 +125,7 @@
> > >  #define FUNCT3_C_J		0xa000
> > >  #define FUNCT3_C_JAL		0x2000
> > >  #define FUNCT4_C_JR		0x8000
> > > -#define FUNCT4_C_JALR		0xf000
> > > +#define FUNCT4_C_JALR		0x9000
> > >  
> > >  #define FUNCT12_SRET		0x10200000
> > >  
> 





^ permalink raw reply	[flat|nested] 84+ messages in thread


* Re: [PATCH v3 03/14] RISC-V: detach funct-values from their offset
  2022-11-30 22:56   ` Heiko Stuebner
@ 2022-12-01 17:36     ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-01 17:36 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Wed, Nov 30, 2022 at 11:56:03PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> Rather than defining funct3, funct4, etc values pre-shifted to their
> target-position in an instruction, define the values themselves and
> only shift them where needed.
> 
> This allows using these funct-values in other places as well, for example
> when decoding functions.
> 
> At the same time it also reduces the use of magic numbers that one would
> need a spec manual to understand.
> 
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/include/asm/parse_asm.h | 100 +++++++++++++++++------------
>  1 file changed, 59 insertions(+), 41 deletions(-)
>

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

^ permalink raw reply	[flat|nested] 84+ messages in thread


* Re: [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives
  2022-11-30 22:56   ` Heiko Stuebner
@ 2022-12-01 19:33     ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-01 19:33 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Wed, Nov 30, 2022 at 11:56:11PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> Alternatives live in a different section, so addresses used by call
> functions will point to wrong locations after the patch got applied.
> 
> Similar to arm64, adjust the location to consider that offset.
> 
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/include/asm/alternative.h |  3 ++
>  arch/riscv/kernel/alternative.c      | 72 ++++++++++++++++++++++++++++
>  arch/riscv/kernel/cpufeature.c       | 11 ++++-
>  3 files changed, 84 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> index 6511dd73e812..c58ec3cc4bc3 100644
> --- a/arch/riscv/include/asm/alternative.h
> +++ b/arch/riscv/include/asm/alternative.h
> @@ -27,6 +27,9 @@ void __init apply_boot_alternatives(void);
>  void __init apply_early_boot_alternatives(void);
>  void apply_module_alternatives(void *start, size_t length);
>  
> +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> +				      int patch_offset);
> +
>  struct alt_entry {
> >  	void *old_ptr;		 /* address of original instruction or data  */
>  	void *alt_ptr;		 /* address of replacement instruction or data */
> diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> index a7d26a00beea..292cc42dc3be 100644
> --- a/arch/riscv/kernel/alternative.c
> +++ b/arch/riscv/kernel/alternative.c
> @@ -15,6 +15,8 @@
>  #include <asm/vendorid_list.h>
>  #include <asm/sbi.h>
>  #include <asm/csr.h>
> +#include <asm/insn.h>
> +#include <asm/patch.h>
>  
>  struct cpu_manufacturer_info_t {
>  	unsigned long vendor_id;
> @@ -53,6 +55,76 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
>  	}
>  }
>  
> +static unsigned int riscv_instruction_at(void *p, unsigned int offset)

How about explicitly returning a u32?

> +{
> +	u16 *parcel = p + (offset * sizeof(u32));

nit: I realize this is just a helper function, but I think a cleaner
interface would require the caller do this math, or at least the offset
scaling, since only the caller knows it's necessary. And, the call to
patch_text_nosync() requires all the math, so it'd be consistent for
riscv_instruction_at() to only take a pointer too.

> +
> +	return (unsigned int)parcel[0] | (unsigned int)parcel[1] << 16;
> +}
> +
> +static inline bool riscv_insn_is_auipc_jalr(u32 insn1, u32 insn2)
> +{
> +	return riscv_insn_is_auipc(insn1) && riscv_insn_is_jalr(insn2);
> +}
> +
> +#define JALR_SIGN_MASK		BIT(RV_I_IMM_SIGN_OPOFF - RV_I_IMM_11_0_OPOFF)

We know the I-type IMM sign bit is bit 11, so we could just define this as BIT(11).

> +#define AUIPC_PAD		(0x00001000)
> +
> +#define to_jalr_imm(value)						\
> +	((value & RV_I_IMM_11_0_MASK) << RV_I_IMM_11_0_OPOFF)

Should put () around the macro argument, (value)

> +
> +#define to_auipc_imm(value)						\
> +	((value & JALR_SIGN_MASK) ?					\
> +	((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :	\
> +	(value & RV_U_IMM_31_12_MASK))

I know RV_U_IMM_31_12_OPOFF is 0, but it looks odd not shifting
RV_U_IMM_31_12_MASK when we do shift RV_I_IMM_11_0_MASK.

So, it looks like to_auipc_imm() is doing

   offset[31:12] + ((value & BIT(11)) ? (1 << 12) : 0)

but the spec says the auipc part of the 'call' pseudoinstruction should be

   offset[31:12] + offset[11]
 
which I think would be written as

 ((((value) & RV_U_IMM_31_12_MASK) << RV_U_IMM_31_12_OPOFF) + ((value) & BIT(11)))

or what am I missing?

> +
> +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> +				      int patch_offset)
> +{
> +	int num_instr = len / sizeof(u32);
> +	unsigned int call[2];
> +	int i;
> +	int imm;
> +	u32 rd1;
> +
> +	/*
> +	 * stop one instruction before the end, as we're checking
> +	 * for auipc + jalr
> +	 */
> +	for (i = 0; i < num_instr - 1; i++) {

If we change riscv_instruction_at() to just take a pointer then we can do
the math in the for() and actually just use pointer arithmetic.

        uint32_t *p = alt_ptr;
        for (i = 0; i < num_instr - 1; i++, p++) {

> +		u32 inst1 = riscv_instruction_at(alt_ptr, i);
                                                 p
> +		u32 inst2 = riscv_instruction_at(alt_ptr, i + 1);
                                                 p + 1
> +
> +		if (!riscv_insn_is_auipc_jalr(inst1, inst2))
> +			continue;
> +
> +		/* call will use ra register */
> +		rd1 = RV_EXTRACT_RD_REG(inst1);
> +		if (rd1 != 1)
> +			continue;

nit: rd1 is only used once, how about

 if (RV_EXTRACT_RD_REG(inst1) != 1)

> +
> +		/* get and adjust new target address */
> +		imm = RV_EXTRACT_UTYPE_IMM(inst1);

Based on my understanding of the auipc part of the 'call', it seems we
should be subtracting BIT(11) here. And, since RV_EXTRACT_* does sign-
extension for I-type, then I'm not sure we should use it. So,

        imm = (inst2 >> RV_I_IMM_11_0_OPOFF) & RV_I_IMM_11_0_MASK;
	imm += ((inst1 >> RV_U_IMM_31_12_OPOFF) & RV_U_IMM_31_12_MASK) - (imm & BIT(11));

> +		imm += RV_EXTRACT_ITYPE_IMM(inst2);
> +		imm -= patch_offset;
> +
> +		/* pick the original auipc + jalr */
> +		call[0] = inst1;
> +		call[1] = inst2;
> +
> +		/* drop the old IMMs */
> +		call[0] &= ~(RV_U_IMM_31_12_MASK);

Same comment as above about RV_U_IMM_31_12_OPOFF. IMO, this would be more
consistent with the shift, even though it's zero.

                call[0] &= ~(RV_U_IMM_31_12_MASK << RV_U_IMM_31_12_OPOFF);

> +		call[1] &= ~(RV_I_IMM_11_0_MASK << RV_I_IMM_11_0_OPOFF);
> +
> +		/* add the adapted IMMs */
> +		call[0] |= to_auipc_imm(imm);

As pointed out above, I'm not sure about to_auipc_imm().

> +		call[1] |= to_jalr_imm(imm);
> +
> +		/* patch the call place again */
> +		patch_text_nosync(alt_ptr + i * sizeof(u32), call, 8);
                                  ^                                ^
				  p		       sizeof(u32) * 2
> +	}
> +}
> +
>  /*
>   * This is called very early in the boot process (directly after we run
>   * a feature detect on the boot CPU). No need to worry about other CPUs
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 694267d1fe81..ba62a4ff5ccd 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -316,8 +316,15 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
>  		}
>  
>  		tmp = (1U << alt->errata_id);
> -		if (cpu_req_feature & tmp)
> -			patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> +		if (cpu_req_feature & tmp) {
> +			/* do the basic patching */
> +			patch_text_nosync(alt->old_ptr, alt->alt_ptr,
> +					  alt->alt_len);

nit: I'd leave this line long and only have one wrap in the line below

> +
> +			riscv_alternative_fix_auipc_jalr(alt->old_ptr,
> +							 alt->alt_len,
> +							 alt->old_ptr - alt->alt_ptr);
> +		}
>  	}
>  }
>  #endif
> -- 
> 2.35.1
>

Thanks,
drew

^ permalink raw reply	[flat|nested] 84+ messages in thread


* Re: [PATCH v3 12/14] efi/riscv: libstub: mark when compiling libstub
  2022-11-30 22:56   ` Heiko Stuebner
@ 2022-12-01 19:34     ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-01 19:34 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Wed, Nov 30, 2022 at 11:56:12PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> We may want to runtime-optimize some core functions (str*, mem*),
> but not have this leak into libstub and cause build issues.
> Instead libstub, for the short while it's running, should just use
> the generic implementation.
> 
> So, to be able to determine whether functions that are used both in
> libstub and the main kernel are getting compiled as part of libstub or
> not, add a compile-flag we can check via #ifdef.
> 
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  drivers/firmware/efi/libstub/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> index ef5045a53ce0..777d1ab059e3 100644
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -25,7 +25,7 @@ cflags-$(CONFIG_ARM)		:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
>  				   -fno-builtin -fpic \
>  				   $(call cc-option,-mno-single-pic-base)
>  cflags-$(CONFIG_RISCV)		:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> -				   -fpic
> +				   -fpic -DRISCV_EFISTUB
>  cflags-$(CONFIG_LOONGARCH)	:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
>  				   -fpie
>  
> -- 
> 2.35.1
>

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

^ permalink raw reply	[flat|nested] 84+ messages in thread


* Re: [PATCH v3 13/14] RISC-V: add infrastructure to allow different str* implementations
  2022-11-30 22:56   ` Heiko Stuebner
@ 2022-12-01 20:07     ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-01 20:07 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Wed, Nov 30, 2022 at 11:56:13PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> Depending on supported extensions on specific RISC-V cores,
> optimized str* functions might make sense.
> 
> This adds basic infrastructure to allow patching the function calls
> via alternatives later on.
> 
> The main idea is to have the core str* functions be inline functions
> which then call the most optimized variant, with this call then being
> replaced via alternatives.
> 
> The big advantage is that we don't need additional calls.
> Though we need to duplicate the generic functions as the main code
> expects either itself or the architecture to provide the str* functions.

That's a bummer, but at least the duplicated functions are simple.

> 
> The added *_generic functions are done in assembler (taken from
> disassembling the main-kernel functions for now) to allow us to control
> the used registers.
> 
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/Makefile             |  3 ++
>  arch/riscv/include/asm/string.h | 66 +++++++++++++++++++++++++++++++++
>  arch/riscv/kernel/image-vars.h  |  6 +--
>  arch/riscv/lib/Makefile         |  3 ++
>  arch/riscv/lib/strcmp.S         | 38 +++++++++++++++++++
>  arch/riscv/lib/strlen.S         | 29 +++++++++++++++
>  arch/riscv/lib/strncmp.S        | 41 ++++++++++++++++++++
>  7 files changed, 183 insertions(+), 3 deletions(-)
>  create mode 100644 arch/riscv/lib/strcmp.S
>  create mode 100644 arch/riscv/lib/strlen.S
>  create mode 100644 arch/riscv/lib/strncmp.S
> 
> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> index 0d13b597cb55..581e4370c2a6 100644
> --- a/arch/riscv/Makefile
> +++ b/arch/riscv/Makefile
> @@ -80,6 +80,9 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
>          KBUILD_CFLAGS += -fno-omit-frame-pointer
>  endif
>  
> +# strchr is a special case, as gcc might want to call its own strlen from there
> +KBUILD_CFLAGS += -fno-builtin-strlen -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strchr
> +
>  KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-relax)
>  KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
>  
> diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
> index 909049366555..b98481d2d154 100644
> --- a/arch/riscv/include/asm/string.h
> +++ b/arch/riscv/include/asm/string.h
> @@ -18,6 +18,72 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t);
>  #define __HAVE_ARCH_MEMMOVE
>  extern asmlinkage void *memmove(void *, const void *, size_t);
>  extern asmlinkage void *__memmove(void *, const void *, size_t);
> +
> +#define __HAVE_ARCH_STRCMP
> +extern asmlinkage int __strcmp_generic(const char *cs, const char *ct);
> +
> +static inline int strcmp(const char *cs, const char *ct)
> +{
> +#ifdef RISCV_EFISTUB
> +	return __strcmp_generic(cs, ct);
> +#else
> +	register const char *a0 asm("a0") = cs;
> +	register const char *a1 asm("a1") = ct;
> +	register int a0_out asm("a0");
> +
> +	asm volatile("call __strcmp_generic\n\t"
> +		: "=r"(a0_out)
> +		: "r"(a0), "r"(a1)
> +		: "ra", "t0", "t1", "t2");
> +
> +	return a0_out;
> +#endif
> +}
> +
> +#define __HAVE_ARCH_STRNCMP
> +extern asmlinkage int __strncmp_generic(const char *cs,
> +					const char *ct, size_t count);
> +
> +static inline int strncmp(const char *cs, const char *ct, size_t count)
> +{
> +#ifdef RISCV_EFISTUB
> +	return __strncmp_generic(cs, ct, count);
> +#else
> +	register const char *a0 asm("a0") = cs;
> +	register const char *a1 asm("a1") = ct;
> +	register size_t a2 asm("a2") = count;
> +	register int a0_out asm("a0");
> +
> +	asm volatile("call __strncmp_generic\n\t"
> +		: "=r"(a0_out)
> +		: "r"(a0), "r"(a1), "r"(a2)
> +		: "ra", "t0", "t1", "t2");
> +
> +	return a0_out;
> +#endif
> +}
> +
> +#define __HAVE_ARCH_STRLEN
> +extern asmlinkage __kernel_size_t __strlen_generic(const char *);
> +
> +static inline __kernel_size_t strlen(const char *s)
> +{
> +#ifdef RISCV_EFISTUB
> +	return __strlen_generic(s);
> +#else
> +	register const char *a0 asm("a0") = s;
> +	register int a0_out asm("a0");
> +
> +	asm volatile(
> +		"call __strlen_generic\n\t"
> +		: "=r"(a0_out)
> +		: "r"(a0)
> +		: "ra", "t0", "t1");
> +
> +	return a0_out;
> +#endif
> +}
> +
>  /* For those files which don't want to check by kasan. */
>  #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
>  #define memcpy(dst, src, len) __memcpy(dst, src, len)
> diff --git a/arch/riscv/kernel/image-vars.h b/arch/riscv/kernel/image-vars.h
> index d6e5f739905e..2f270b9fde63 100644
> --- a/arch/riscv/kernel/image-vars.h
> +++ b/arch/riscv/kernel/image-vars.h
> @@ -25,10 +25,10 @@
>   */
>  __efistub_memcmp		= memcmp;
>  __efistub_memchr		= memchr;
> -__efistub_strlen		= strlen;
> +__efistub___strlen_generic	= __strlen_generic;
>  __efistub_strnlen		= strnlen;
> -__efistub_strcmp		= strcmp;
> -__efistub_strncmp		= strncmp;
> +__efistub___strcmp_generic	= __strcmp_generic;
> +__efistub___strncmp_generic	= __strncmp_generic;
>  __efistub_strrchr		= strrchr;
>  
>  __efistub__start		= _start;
> diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> index 25d5c9664e57..6c74b0bedd60 100644
> --- a/arch/riscv/lib/Makefile
> +++ b/arch/riscv/lib/Makefile
> @@ -3,6 +3,9 @@ lib-y			+= delay.o
>  lib-y			+= memcpy.o
>  lib-y			+= memset.o
>  lib-y			+= memmove.o
> +lib-y			+= strcmp.o
> +lib-y			+= strlen.o
> +lib-y			+= strncmp.o

Can't we just create a single string.S file?

>  lib-$(CONFIG_MMU)	+= uaccess.o
>  lib-$(CONFIG_64BIT)	+= tishift.o
>  
> diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
> new file mode 100644
> index 000000000000..f4b7f4e806f0
> --- /dev/null
> +++ b/arch/riscv/lib/strcmp.S
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm-generic/export.h>
> +
> +/* int __strcmp_generic(const char *cs, const char *ct) */
> +ENTRY(__strcmp_generic)
> +	/*
> +	 * Returns
> +	 *   a0 - comparison result, value like strcmp
> +	 *
> +	 * Parameters
> +	 *   a0 - string1
> +	 *   a1 - string2
> +	 *
> +	 * Clobbers
> +	 *   t0, t1, t2
> +	 */
> +	mv	t2, a1
> +1:
> +	lbu	t1, 0(a0)
> +	lbu	t0, 0(a1)
> +	addi	a0, a0, 1
> +	addi	a1, a1, 1
> +	beq	t1, t0, 3f
> +	li	a0, 1
> +	bgeu	t1, t0, 2f
> +	li	a0, -1
> +2:
> +	mv	a1, t2
> +	ret
> +3:
> +	bnez	t1, 1b
> +	li	a0, 0
> +	j	2b
> +END(__strcmp_generic)
> +EXPORT_SYMBOL(__strcmp_generic)
> diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
> new file mode 100644
> index 000000000000..e0e7440ac724
> --- /dev/null
> +++ b/arch/riscv/lib/strlen.S
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm-generic/export.h>
> +
> +/* int __strlen_generic(const char *s) */
> +ENTRY(__strlen_generic)
> +	/*
> +	 * Returns
> +	 *   a0 - string length
> +	 *
> +	 * Parameters
> +	 *   a0 - String to measure
> +	 *
> +	 * Clobbers:
> +	 *   t0, t1
> +	 */
> +	mv	t1, a0
> +1:
> +	lbu	t0, 0(t1)
> +	bnez	t0, 2f
> +	sub	a0, t1, a0
> +	ret
> +2:
> +	addi	t1, t1, 1
> +	j	1b
> +END(__strlen_generic)
> +EXPORT_SYMBOL(__strlen_generic)
> diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
> new file mode 100644
> index 000000000000..7e116d942265
> --- /dev/null
> +++ b/arch/riscv/lib/strncmp.S
> @@ -0,0 +1,41 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm-generic/export.h>
> +
> +/* int __strncmp_generic(const char *cs, const char *ct, size_t count) */
> +ENTRY(__strncmp_generic)
> +	/*
> +	 * Returns
> +	 *   a0 - comparison result, value like strncmp
> +	 *
> +	 * Parameters
> +	 *   a0 - string1
> +	 *   a1 - string2
> +	 *   a2 - number of characters to compare
> +	 *
> +	 * Clobbers
> +	 *   t0, t1, t2
> +	 */
> +	li	t0, 0
> +1:
> +	beq	a2, t0, 4f
> +	add	t1, a0, t0
> +	add	t2, a1, t0
> +	lbu	t1, 0(t1)
> +	lbu	t2, 0(t2)
> +	beq	t1, t2, 3f
> +	li	a0, 1
> +	bgeu	t1, t2, 2f
> +	li	a0, -1
> +2:
> +	ret
> +3:
> +	addi	t0, t0, 1
> +	bnez	t1, 1b
> +4:
> +	li	a0, 0
> +	j	2b
> +END(__strncmp_generic)
> +EXPORT_SYMBOL(__strncmp_generic)
> -- 
> 2.35.1
>

Besides the consolidation of functions to one file request,

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

Thanks,
drew

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 13/14] RISC-V: add infrastructure to allow different str* implementations
@ 2022-12-01 20:07     ` Andrew Jones
  0 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-01 20:07 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Wed, Nov 30, 2022 at 11:56:13PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> Depending on supported extensions on specific RISC-V cores,
> optimized str* functions might make sense.
> 
> This adds basic infrastructure to allow patching the function calls
> via alternatives later on.
> 
> The main idea is to have the core str* functions be inline functions
> which then call the most optimized variant and this call then be
> replaced via alternatives.
> 
> The big advantage is that we don't need additional calls.
> Though we need to duplicate the generic functions as the main code
> expects either itself or the architecture to provide the str* functions.

That's a bummer, but at least the duplicated functions are simple.

> 
> The added *_generic functions are done in assembler (taken from
> disassembling the main-kernel functions for now) to allow us to control
> the used registers.
> 
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/Makefile             |  3 ++
>  arch/riscv/include/asm/string.h | 66 +++++++++++++++++++++++++++++++++
>  arch/riscv/kernel/image-vars.h  |  6 +--
>  arch/riscv/lib/Makefile         |  3 ++
>  arch/riscv/lib/strcmp.S         | 38 +++++++++++++++++++
>  arch/riscv/lib/strlen.S         | 29 +++++++++++++++
>  arch/riscv/lib/strncmp.S        | 41 ++++++++++++++++++++
>  7 files changed, 183 insertions(+), 3 deletions(-)
>  create mode 100644 arch/riscv/lib/strcmp.S
>  create mode 100644 arch/riscv/lib/strlen.S
>  create mode 100644 arch/riscv/lib/strncmp.S
> 
> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> index 0d13b597cb55..581e4370c2a6 100644
> --- a/arch/riscv/Makefile
> +++ b/arch/riscv/Makefile
> @@ -80,6 +80,9 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
>          KBUILD_CFLAGS += -fno-omit-frame-pointer
>  endif
>  
> +# strchr is special case, as gcc might want to call its own strlen from there
> +KBUILD_CFLAGS += -fno-builtin-strlen -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strchr
> +
>  KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-relax)
>  KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
>  
> diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
> index 909049366555..b98481d2d154 100644
> --- a/arch/riscv/include/asm/string.h
> +++ b/arch/riscv/include/asm/string.h
> @@ -18,6 +18,72 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t);
>  #define __HAVE_ARCH_MEMMOVE
>  extern asmlinkage void *memmove(void *, const void *, size_t);
>  extern asmlinkage void *__memmove(void *, const void *, size_t);
> +
> +#define __HAVE_ARCH_STRCMP
> +extern asmlinkage int __strcmp_generic(const char *cs, const char *ct);
> +
> +static inline int strcmp(const char *cs, const char *ct)
> +{
> +#ifdef RISCV_EFISTUB
> +	return __strcmp_generic(cs, ct);
> +#else
> +	register const char *a0 asm("a0") = cs;
> +	register const char *a1 asm("a1") = ct;
> +	register int a0_out asm("a0");
> +
> +	asm volatile("call __strcmp_generic\n\t"
> +		: "=r"(a0_out)
> +		: "r"(a0), "r"(a1)
> +		: "ra", "t0", "t1", "t2");
> +
> +	return a0_out;
> +#endif
> +}
> +
> +#define __HAVE_ARCH_STRNCMP
> +extern asmlinkage int __strncmp_generic(const char *cs,
> +					const char *ct, size_t count);
> +
> +static inline int strncmp(const char *cs, const char *ct, size_t count)
> +{
> +#ifdef RISCV_EFISTUB
> +	return __strncmp_generic(cs, ct, count);
> +#else
> +	register const char *a0 asm("a0") = cs;
> +	register const char *a1 asm("a1") = ct;
> +	register size_t a2 asm("a2") = count;
> +	register int a0_out asm("a0");
> +
> +	asm volatile("call __strncmp_generic\n\t"
> +		: "=r"(a0_out)
> +		: "r"(a0), "r"(a1), "r"(a2)
> +		: "ra", "t0", "t1", "t2");
> +
> +	return a0_out;
> +#endif
> +}
> +
> +#define __HAVE_ARCH_STRLEN
> +extern asmlinkage __kernel_size_t __strlen_generic(const char *);
> +
> +static inline __kernel_size_t strlen(const char *s)
> +{
> +#ifdef RISCV_EFISTUB
> +	return __strlen_generic(s);
> +#else
> +	register const char *a0 asm("a0") = s;
> +	register int a0_out asm("a0");
> +
> +	asm volatile(
> +		"call __strlen_generic\n\t"
> +		: "=r"(a0_out)
> +		: "r"(a0)
> +		: "ra", "t0", "t1");
> +
> +	return a0_out;
> +#endif
> +}
> +
>  /* For those files which don't want to check by kasan. */
>  #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
>  #define memcpy(dst, src, len) __memcpy(dst, src, len)
> diff --git a/arch/riscv/kernel/image-vars.h b/arch/riscv/kernel/image-vars.h
> index d6e5f739905e..2f270b9fde63 100644
> --- a/arch/riscv/kernel/image-vars.h
> +++ b/arch/riscv/kernel/image-vars.h
> @@ -25,10 +25,10 @@
>   */
>  __efistub_memcmp		= memcmp;
>  __efistub_memchr		= memchr;
> -__efistub_strlen		= strlen;
> +__efistub___strlen_generic	= __strlen_generic;
>  __efistub_strnlen		= strnlen;
> -__efistub_strcmp		= strcmp;
> -__efistub_strncmp		= strncmp;
> +__efistub___strcmp_generic	= __strcmp_generic;
> +__efistub___strncmp_generic	= __strncmp_generic;
>  __efistub_strrchr		= strrchr;
>  
>  __efistub__start		= _start;
> diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> index 25d5c9664e57..6c74b0bedd60 100644
> --- a/arch/riscv/lib/Makefile
> +++ b/arch/riscv/lib/Makefile
> @@ -3,6 +3,9 @@ lib-y			+= delay.o
>  lib-y			+= memcpy.o
>  lib-y			+= memset.o
>  lib-y			+= memmove.o
> +lib-y			+= strcmp.o
> +lib-y			+= strlen.o
> +lib-y			+= strncmp.o

Can't we just create a single string.S file?

>  lib-$(CONFIG_MMU)	+= uaccess.o
>  lib-$(CONFIG_64BIT)	+= tishift.o
>  
> diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
> new file mode 100644
> index 000000000000..f4b7f4e806f0
> --- /dev/null
> +++ b/arch/riscv/lib/strcmp.S
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm-generic/export.h>
> +
> +/* int __strcmp_generic(const char *cs, const char *ct) */
> +ENTRY(__strcmp_generic)
> +	/*
> +	 * Returns
> +	 *   a0 - comparison result, value like strcmp
> +	 *
> +	 * Parameters
> +	 *   a0 - string1
> +	 *   a1 - string2
> +	 *
> +	 * Clobbers
> +	 *   t0, t1, t2
> +	 */
> +	mv	t2, a1
> +1:
> +	lbu	t1, 0(a0)
> +	lbu	t0, 0(a1)
> +	addi	a0, a0, 1
> +	addi	a1, a1, 1
> +	beq	t1, t0, 3f
> +	li	a0, 1
> +	bgeu	t1, t0, 2f
> +	li	a0, -1
> +2:
> +	mv	a1, t2
> +	ret
> +3:
> +	bnez	t1, 1b
> +	li	a0, 0
> +	j	2b
> +END(__strcmp_generic)
> +EXPORT_SYMBOL(__strcmp_generic)
> diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
> new file mode 100644
> index 000000000000..e0e7440ac724
> --- /dev/null
> +++ b/arch/riscv/lib/strlen.S
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm-generic/export.h>
> +
> +/* int __strlen_generic(const char *s) */
> +ENTRY(__strlen_generic)
> +	/*
> +	 * Returns
> +	 *   a0 - string length
> +	 *
> +	 * Parameters
> +	 *   a0 - String to measure
> +	 *
> +	 * Clobbers:
> +	 *   t0, t1
> +	 */
> +	mv	t1, a0
> +1:
> +	lbu	t0, 0(t1)
> +	bnez	t0, 2f
> +	sub	a0, t1, a0
> +	ret
> +2:
> +	addi	t1, t1, 1
> +	j	1b
> +END(__strlen_generic)
> +EXPORT_SYMBOL(__strlen_generic)
> diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
> new file mode 100644
> index 000000000000..7e116d942265
> --- /dev/null
> +++ b/arch/riscv/lib/strncmp.S
> @@ -0,0 +1,41 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm-generic/export.h>
> +
> +/* int __strncmp_generic(const char *cs, const char *ct, size_t count) */
> +ENTRY(__strncmp_generic)
> +	/*
> +	 * Returns
> +	 *   a0 - comparison result, value like strncmp
> +	 *
> +	 * Parameters
> +	 *   a0 - string1
> +	 *   a1 - string2
> +	 *   a2 - number of characters to compare
> +	 *
> +	 * Clobbers
> +	 *   t0, t1, t2
> +	 */
> +	li	t0, 0
> +1:
> +	beq	a2, t0, 4f
> +	add	t1, a0, t0
> +	add	t2, a1, t0
> +	lbu	t1, 0(t1)
> +	lbu	t2, 0(t2)
> +	beq	t1, t2, 3f
> +	li	a0, 1
> +	bgeu	t1, t2, 2f
> +	li	a0, -1
> +2:
> +	ret
> +3:
> +	addi	t0, t0, 1
> +	bnez	t1, 1b
> +4:
> +	li	a0, 0
> +	j	2b
> +END(__strncmp_generic)
> +EXPORT_SYMBOL(__strncmp_generic)
> -- 
> 2.35.1
>

Besides the request to consolidate the functions into one file,

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

Thanks,
drew

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

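[Editorial note: as background for the Zbb string work discussed in this thread — and not part of the patch — extensions like Zbb speed up str* routines mainly by scanning a word at a time instead of a byte at a time. Zbb's orc.b instruction collapses the zero-byte test into a single op; the classic portable C form of the same test, and a word-at-a-time strlen built on it, can be sketched as below. Function names are illustrative, the sketch assumes little-endian, and it side-steps the alignment handling a real implementation needs.]

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/*
 * Nonzero iff the 64-bit word contains a zero byte.  This is the
 * classic bit-twiddling identity; Zbb's orc.b instruction answers
 * the same question in one instruction.
 */
static uint64_t has_zero_byte(uint64_t v)
{
	return (v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL;
}

/*
 * Word-at-a-time strlen sketch.  memcpy() is used for unaligned-safe
 * loads; a real implementation would first align the pointer so that
 * wide loads cannot cross into an unmapped page.
 */
static size_t strlen_wordwise(const char *s)
{
	const char *p = s;
	uint64_t v;

	for (;;) {
		memcpy(&v, p, sizeof(v));
		if (has_zero_byte(v))
			break;		/* NUL is somewhere in this word */
		p += sizeof(v);
	}
	while (*p)			/* locate it byte-wise */
		p++;
	return (size_t)(p - s);
}
```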

* Re: [PATCH v3 13/14] RISC-V: add infrastructure to allow different str* implementations
  2022-12-01 20:07     ` Andrew Jones
@ 2022-12-01 20:14       ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-01 20:14 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Thu, Dec 01, 2022 at 09:07:41PM +0100, Andrew Jones wrote:
> On Wed, Nov 30, 2022 at 11:56:13PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > Depending on supported extensions on specific RISC-V cores,
> > optimized str* functions might make sense.
> > 
> > This adds basic infrastructure to allow patching the function calls
> > via alternatives later on.
> > 
> > The main idea is to have the core str* functions be inline functions
> > which then call the most optimized variant and this call then be
> > replaced via alternatives.
> > 
> > The big advantage is that we don't need additional calls.
> > Though we need to duplicate the generic functions as the main code
> > expects either itself or the architecture to provide the str* functions.
> 
> That's a bummer, but at least the duplicated functions are simple.
> 
> > 
> > The added *_generic functions are done in assembler (taken from
> > disassembling the main-kernel functions for now) to allow us to control
> > the used registers.
> > 
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/Makefile             |  3 ++
> >  arch/riscv/include/asm/string.h | 66 +++++++++++++++++++++++++++++++++
> >  arch/riscv/kernel/image-vars.h  |  6 +--
> >  arch/riscv/lib/Makefile         |  3 ++
> >  arch/riscv/lib/strcmp.S         | 38 +++++++++++++++++++
> >  arch/riscv/lib/strlen.S         | 29 +++++++++++++++
> >  arch/riscv/lib/strncmp.S        | 41 ++++++++++++++++++++
> >  7 files changed, 183 insertions(+), 3 deletions(-)
> >  create mode 100644 arch/riscv/lib/strcmp.S
> >  create mode 100644 arch/riscv/lib/strlen.S
> >  create mode 100644 arch/riscv/lib/strncmp.S
> > 
> > diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> > index 0d13b597cb55..581e4370c2a6 100644
> > --- a/arch/riscv/Makefile
> > +++ b/arch/riscv/Makefile
> > @@ -80,6 +80,9 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
> >          KBUILD_CFLAGS += -fno-omit-frame-pointer
> >  endif
> >  
> > +# strchr is special case, as gcc might want to call its own strlen from there
> > +KBUILD_CFLAGS += -fno-builtin-strlen -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strchr
> > +
> >  KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-relax)
> >  KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
> >  
> > diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
> > index 909049366555..b98481d2d154 100644
> > --- a/arch/riscv/include/asm/string.h
> > +++ b/arch/riscv/include/asm/string.h
> > @@ -18,6 +18,72 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t);
> >  #define __HAVE_ARCH_MEMMOVE
> >  extern asmlinkage void *memmove(void *, const void *, size_t);
> >  extern asmlinkage void *__memmove(void *, const void *, size_t);
> > +
> > +#define __HAVE_ARCH_STRCMP
> > +extern asmlinkage int __strcmp_generic(const char *cs, const char *ct);
> > +
> > +static inline int strcmp(const char *cs, const char *ct)
> > +{
> > +#ifdef RISCV_EFISTUB
> > +	return __strcmp_generic(cs, ct);
> > +#else
> > +	register const char *a0 asm("a0") = cs;
> > +	register const char *a1 asm("a1") = ct;
> > +	register int a0_out asm("a0");
> > +
> > +	asm volatile("call __strcmp_generic\n\t"
> > +		: "=r"(a0_out)
> > +		: "r"(a0), "r"(a1)
> > +		: "ra", "t0", "t1", "t2");

Don't we need "memory" in all these clobber lists?

Thanks,
drew

> > +
> > +	return a0_out;
> > +#endif
> > +}
> > +
> > +#define __HAVE_ARCH_STRNCMP
> > +extern asmlinkage int __strncmp_generic(const char *cs,
> > +					const char *ct, size_t count);
> > +
> > +static inline int strncmp(const char *cs, const char *ct, size_t count)
> > +{
> > +#ifdef RISCV_EFISTUB
> > +	return __strncmp_generic(cs, ct, count);
> > +#else
> > +	register const char *a0 asm("a0") = cs;
> > +	register const char *a1 asm("a1") = ct;
> > +	register size_t a2 asm("a2") = count;
> > +	register int a0_out asm("a0");
> > +
> > +	asm volatile("call __strncmp_generic\n\t"
> > +		: "=r"(a0_out)
> > +		: "r"(a0), "r"(a1), "r"(a2)
> > +		: "ra", "t0", "t1", "t2");
> > +
> > +	return a0_out;
> > +#endif
> > +}
> > +
> > +#define __HAVE_ARCH_STRLEN
> > +extern asmlinkage __kernel_size_t __strlen_generic(const char *);
> > +
> > +static inline __kernel_size_t strlen(const char *s)
> > +{
> > +#ifdef RISCV_EFISTUB
> > +	return __strlen_generic(s);
> > +#else
> > +	register const char *a0 asm("a0") = s;
> > +	register int a0_out asm("a0");
> > +
> > +	asm volatile(
> > +		"call __strlen_generic\n\t"
> > +		: "=r"(a0_out)
> > +		: "r"(a0)
> > +		: "ra", "t0", "t1");
> > +
> > +	return a0_out;
> > +#endif
> > +}
> > +
> >  /* For those files which don't want to check by kasan. */
> >  #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
> >  #define memcpy(dst, src, len) __memcpy(dst, src, len)
> > diff --git a/arch/riscv/kernel/image-vars.h b/arch/riscv/kernel/image-vars.h
> > index d6e5f739905e..2f270b9fde63 100644
> > --- a/arch/riscv/kernel/image-vars.h
> > +++ b/arch/riscv/kernel/image-vars.h
> > @@ -25,10 +25,10 @@
> >   */
> >  __efistub_memcmp		= memcmp;
> >  __efistub_memchr		= memchr;
> > -__efistub_strlen		= strlen;
> > +__efistub___strlen_generic	= __strlen_generic;
> >  __efistub_strnlen		= strnlen;
> > -__efistub_strcmp		= strcmp;
> > -__efistub_strncmp		= strncmp;
> > +__efistub___strcmp_generic	= __strcmp_generic;
> > +__efistub___strncmp_generic	= __strncmp_generic;
> >  __efistub_strrchr		= strrchr;
> >  
> >  __efistub__start		= _start;
> > diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> > index 25d5c9664e57..6c74b0bedd60 100644
> > --- a/arch/riscv/lib/Makefile
> > +++ b/arch/riscv/lib/Makefile
> > @@ -3,6 +3,9 @@ lib-y			+= delay.o
> >  lib-y			+= memcpy.o
> >  lib-y			+= memset.o
> >  lib-y			+= memmove.o
> > +lib-y			+= strcmp.o
> > +lib-y			+= strlen.o
> > +lib-y			+= strncmp.o
> 
> Can't we just create a single string.S file?
> 
> >  lib-$(CONFIG_MMU)	+= uaccess.o
> >  lib-$(CONFIG_64BIT)	+= tishift.o
> >  
> > diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
> > new file mode 100644
> > index 000000000000..f4b7f4e806f0
> > --- /dev/null
> > +++ b/arch/riscv/lib/strcmp.S
> > @@ -0,0 +1,38 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int __strcmp_generic(const char *cs, const char *ct) */
> > +ENTRY(__strcmp_generic)
> > +	/*
> > +	 * Returns
> > +	 *   a0 - comparison result, value like strcmp
> > +	 *
> > +	 * Parameters
> > +	 *   a0 - string1
> > +	 *   a1 - string2
> > +	 *
> > +	 * Clobbers
> > +	 *   t0, t1, t2
> > +	 */
> > +	mv	t2, a1
> > +1:
> > +	lbu	t1, 0(a0)
> > +	lbu	t0, 0(a1)
> > +	addi	a0, a0, 1
> > +	addi	a1, a1, 1
> > +	beq	t1, t0, 3f
> > +	li	a0, 1
> > +	bgeu	t1, t0, 2f
> > +	li	a0, -1
> > +2:
> > +	mv	a1, t2
> > +	ret
> > +3:
> > +	bnez	t1, 1b
> > +	li	a0, 0
> > +	j	2b
> > +END(__strcmp_generic)
> > +EXPORT_SYMBOL(__strcmp_generic)
> > diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
> > new file mode 100644
> > index 000000000000..e0e7440ac724
> > --- /dev/null
> > +++ b/arch/riscv/lib/strlen.S
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int __strlen_generic(const char *s) */
> > +ENTRY(__strlen_generic)
> > +	/*
> > +	 * Returns
> > +	 *   a0 - string length
> > +	 *
> > +	 * Parameters
> > +	 *   a0 - String to measure
> > +	 *
> > +	 * Clobbers:
> > +	 *   t0, t1
> > +	 */
> > +	mv	t1, a0
> > +1:
> > +	lbu	t0, 0(t1)
> > +	bnez	t0, 2f
> > +	sub	a0, t1, a0
> > +	ret
> > +2:
> > +	addi	t1, t1, 1
> > +	j	1b
> > +END(__strlen_generic)
> > +EXPORT_SYMBOL(__strlen_generic)
> > diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
> > new file mode 100644
> > index 000000000000..7e116d942265
> > --- /dev/null
> > +++ b/arch/riscv/lib/strncmp.S
> > @@ -0,0 +1,41 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int __strncmp_generic(const char *cs, const char *ct, size_t count) */
> > +ENTRY(__strncmp_generic)
> > +	/*
> > +	 * Returns
> > +	 *   a0 - comparison result, value like strncmp
> > +	 *
> > +	 * Parameters
> > +	 *   a0 - string1
> > +	 *   a1 - string2
> > +	 *   a2 - number of characters to compare
> > +	 *
> > +	 * Clobbers
> > +	 *   t0, t1, t2
> > +	 */
> > +	li	t0, 0
> > +1:
> > +	beq	a2, t0, 4f
> > +	add	t1, a0, t0
> > +	add	t2, a1, t0
> > +	lbu	t1, 0(t1)
> > +	lbu	t2, 0(t2)
> > +	beq	t1, t2, 3f
> > +	li	a0, 1
> > +	bgeu	t1, t2, 2f
> > +	li	a0, -1
> > +2:
> > +	ret
> > +3:
> > +	addi	t0, t0, 1
> > +	bnez	t1, 1b
> > +4:
> > +	li	a0, 0
> > +	j	2b
> > +END(__strncmp_generic)
> > +EXPORT_SYMBOL(__strncmp_generic)
> > -- 
> > 2.35.1
> >
> 
> Besides the consolidation of functions to one file request,
> 
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> 
> Thanks,
> drew

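[Editorial note on the clobber-list question raised in this message ("Don't we need "memory" in all these clobber lists?"): as an architecture-neutral illustration — not from the patch, and the wrapper name is hypothetical — a "memory" clobber is what tells the compiler an asm statement may read or write memory behind its back, so it must not cache memory contents in registers across the statement nor reorder memory accesses around it. The degenerate case, an empty asm with only a "memory" clobber, is the classic compiler barrier the kernel spells barrier():]

```c
#include <string.h>

/* Empty asm + "memory" clobber: a pure compiler-level barrier. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical wrapper in the spirit of the inline str* wrappers
 * quoted above; the barrier models the ordering guarantee that
 * adding "memory" to their clobber lists would provide.
 */
static int strcmp_with_barrier(const char *a, const char *b)
{
	int ret = strcmp(a, b);

	compiler_barrier();	/* no memory access moves across this */
	return ret;
}
```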

* Re: [PATCH v3 13/14] RISC-V: add infrastructure to allow different str* implementations
@ 2022-12-01 20:14       ` Andrew Jones
  0 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-01 20:14 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Thu, Dec 01, 2022 at 09:07:41PM +0100, Andrew Jones wrote:
> On Wed, Nov 30, 2022 at 11:56:13PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > Depending on supported extensions on specific RISC-V cores,
> > optimized str* functions might make sense.
> > 
> > This adds basic infrastructure to allow patching the function calls
> > via alternatives later on.
> > 
> > The main idea is to have the core str* functions be inline functions
> > which then call the most optimized variant and this call then be
> > replaced via alternatives.
> > 
> > The big advantage is that we don't need additional calls.
> > Though we need to duplicate the generic functions as the main code
> > expects either itself or the architecture to provide the str* functions.
> 
> That's a bummer, but at least the duplicated functions are simple.
> 
> > 
> > The added *_generic functions are done in assembler (taken from
> > disassembling the main-kernel functions for now) to allow us to control
> > the used registers.
> > 
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/Makefile             |  3 ++
> >  arch/riscv/include/asm/string.h | 66 +++++++++++++++++++++++++++++++++
> >  arch/riscv/kernel/image-vars.h  |  6 +--
> >  arch/riscv/lib/Makefile         |  3 ++
> >  arch/riscv/lib/strcmp.S         | 38 +++++++++++++++++++
> >  arch/riscv/lib/strlen.S         | 29 +++++++++++++++
> >  arch/riscv/lib/strncmp.S        | 41 ++++++++++++++++++++
> >  7 files changed, 183 insertions(+), 3 deletions(-)
> >  create mode 100644 arch/riscv/lib/strcmp.S
> >  create mode 100644 arch/riscv/lib/strlen.S
> >  create mode 100644 arch/riscv/lib/strncmp.S
> > 
> > diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> > index 0d13b597cb55..581e4370c2a6 100644
> > --- a/arch/riscv/Makefile
> > +++ b/arch/riscv/Makefile
> > @@ -80,6 +80,9 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
> >          KBUILD_CFLAGS += -fno-omit-frame-pointer
> >  endif
> >  
> > +# strchr is special case, as gcc might want to call its own strlen from there
> > +KBUILD_CFLAGS += -fno-builtin-strlen -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strchr
> > +
> >  KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-relax)
> >  KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
> >  
> > diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
> > index 909049366555..b98481d2d154 100644
> > --- a/arch/riscv/include/asm/string.h
> > +++ b/arch/riscv/include/asm/string.h
> > @@ -18,6 +18,72 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t);
> >  #define __HAVE_ARCH_MEMMOVE
> >  extern asmlinkage void *memmove(void *, const void *, size_t);
> >  extern asmlinkage void *__memmove(void *, const void *, size_t);
> > +
> > +#define __HAVE_ARCH_STRCMP
> > +extern asmlinkage int __strcmp_generic(const char *cs, const char *ct);
> > +
> > +static inline int strcmp(const char *cs, const char *ct)
> > +{
> > +#ifdef RISCV_EFISTUB
> > +	return __strcmp_generic(cs, ct);
> > +#else
> > +	register const char *a0 asm("a0") = cs;
> > +	register const char *a1 asm("a1") = ct;
> > +	register int a0_out asm("a0");
> > +
> > +	asm volatile("call __strcmp_generic\n\t"
> > +		: "=r"(a0_out)
> > +		: "r"(a0), "r"(a1)
> > +		: "ra", "t0", "t1", "t2");

Don't we need "memory" in all these clobber lists?

Thanks,
drew

> > +
> > +	return a0_out;
> > +#endif
> > +}
> > +
> > +#define __HAVE_ARCH_STRNCMP
> > +extern asmlinkage int __strncmp_generic(const char *cs,
> > +					const char *ct, size_t count);
> > +
> > +static inline int strncmp(const char *cs, const char *ct, size_t count)
> > +{
> > +#ifdef RISCV_EFISTUB
> > +	return __strncmp_generic(cs, ct, count);
> > +#else
> > +	register const char *a0 asm("a0") = cs;
> > +	register const char *a1 asm("a1") = ct;
> > +	register size_t a2 asm("a2") = count;
> > +	register int a0_out asm("a0");
> > +
> > +	asm volatile("call __strncmp_generic\n\t"
> > +		: "=r"(a0_out)
> > +		: "r"(a0), "r"(a1), "r"(a2)
> > +		: "ra", "t0", "t1", "t2");
> > +
> > +	return a0_out;
> > +#endif
> > +}
> > +
> > +#define __HAVE_ARCH_STRLEN
> > +extern asmlinkage __kernel_size_t __strlen_generic(const char *);
> > +
> > +static inline __kernel_size_t strlen(const char *s)
> > +{
> > +#ifdef RISCV_EFISTUB
> > +	return __strlen_generic(s);
> > +#else
> > +	register const char *a0 asm("a0") = s;
> > +	register int a0_out asm("a0");
> > +
> > +	asm volatile(
> > +		"call __strlen_generic\n\t"
> > +		: "=r"(a0_out)
> > +		: "r"(a0)
> > +		: "ra", "t0", "t1");
> > +
> > +	return a0_out;
> > +#endif
> > +}
> > +
> >  /* For those files which don't want to check by kasan. */
> >  #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
> >  #define memcpy(dst, src, len) __memcpy(dst, src, len)
> > diff --git a/arch/riscv/kernel/image-vars.h b/arch/riscv/kernel/image-vars.h
> > index d6e5f739905e..2f270b9fde63 100644
> > --- a/arch/riscv/kernel/image-vars.h
> > +++ b/arch/riscv/kernel/image-vars.h
> > @@ -25,10 +25,10 @@
> >   */
> >  __efistub_memcmp		= memcmp;
> >  __efistub_memchr		= memchr;
> > -__efistub_strlen		= strlen;
> > +__efistub___strlen_generic	= __strlen_generic;
> >  __efistub_strnlen		= strnlen;
> > -__efistub_strcmp		= strcmp;
> > -__efistub_strncmp		= strncmp;
> > +__efistub___strcmp_generic	= __strcmp_generic;
> > +__efistub___strncmp_generic	= __strncmp_generic;
> >  __efistub_strrchr		= strrchr;
> >  
> >  __efistub__start		= _start;
> > diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> > index 25d5c9664e57..6c74b0bedd60 100644
> > --- a/arch/riscv/lib/Makefile
> > +++ b/arch/riscv/lib/Makefile
> > @@ -3,6 +3,9 @@ lib-y			+= delay.o
> >  lib-y			+= memcpy.o
> >  lib-y			+= memset.o
> >  lib-y			+= memmove.o
> > +lib-y			+= strcmp.o
> > +lib-y			+= strlen.o
> > +lib-y			+= strncmp.o
> 
> Can't we just create a single string.S file?
> 
> >  lib-$(CONFIG_MMU)	+= uaccess.o
> >  lib-$(CONFIG_64BIT)	+= tishift.o
> >  
> > diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
> > new file mode 100644
> > index 000000000000..f4b7f4e806f0
> > --- /dev/null
> > +++ b/arch/riscv/lib/strcmp.S
> > @@ -0,0 +1,38 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int __strcmp_generic(const char *cs, const char *ct) */
> > +ENTRY(__strcmp_generic)
> > +	/*
> > +	 * Returns
> > +	 *   a0 - comparison result, value like strcmp
> > +	 *
> > +	 * Parameters
> > +	 *   a0 - string1
> > +	 *   a1 - string2
> > +	 *
> > +	 * Clobbers
> > +	 *   t0, t1, t2
> > +	 */
> > +	mv	t2, a1
> > +1:
> > +	lbu	t1, 0(a0)
> > +	lbu	t0, 0(a1)
> > +	addi	a0, a0, 1
> > +	addi	a1, a1, 1
> > +	beq	t1, t0, 3f
> > +	li	a0, 1
> > +	bgeu	t1, t0, 2f
> > +	li	a0, -1
> > +2:
> > +	mv	a1, t2
> > +	ret
> > +3:
> > +	bnez	t1, 1b
> > +	li	a0, 0
> > +	j	2b
> > +END(__strcmp_generic)
> > +EXPORT_SYMBOL(__strcmp_generic)
> > diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
> > new file mode 100644
> > index 000000000000..e0e7440ac724
> > --- /dev/null
> > +++ b/arch/riscv/lib/strlen.S
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int __strlen_generic(const char *s) */
> > +ENTRY(__strlen_generic)
> > +	/*
> > +	 * Returns
> > +	 *   a0 - string length
> > +	 *
> > +	 * Parameters
> > +	 *   a0 - String to measure
> > +	 *
> > +	 * Clobbers:
> > +	 *   t0, t1
> > +	 */
> > +	mv	t1, a0
> > +1:
> > +	lbu	t0, 0(t1)
> > +	bnez	t0, 2f
> > +	sub	a0, t1, a0
> > +	ret
> > +2:
> > +	addi	t1, t1, 1
> > +	j	1b
> > +END(__strlen_generic)
> > +EXPORT_SYMBOL(__strlen_generic)
> > diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
> > new file mode 100644
> > index 000000000000..7e116d942265
> > --- /dev/null
> > +++ b/arch/riscv/lib/strncmp.S
> > @@ -0,0 +1,41 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int __strncmp_generic(const char *cs, const char *ct, size_t count) */
> > +ENTRY(__strncmp_generic)
> > +	/*
> > +	 * Returns
> > +	 *   a0 - comparison result, value like strncmp
> > +	 *
> > +	 * Parameters
> > +	 *   a0 - string1
> > +	 *   a1 - string2
> > +	 *   a2 - number of characters to compare
> > +	 *
> > +	 * Clobbers
> > +	 *   t0, t1, t2
> > +	 */
> > +	li	t0, 0
> > +1:
> > +	beq	a2, t0, 4f
> > +	add	t1, a0, t0
> > +	add	t2, a1, t0
> > +	lbu	t1, 0(t1)
> > +	lbu	t2, 0(t2)
> > +	beq	t1, t2, 3f
> > +	li	a0, 1
> > +	bgeu	t1, t2, 2f
> > +	li	a0, -1
> > +2:
> > +	ret
> > +3:
> > +	addi	t0, t0, 1
> > +	bnez	t1, 1b
> > +4:
> > +	li	a0, 0
> > +	j	2b
> > +END(__strncmp_generic)
> > +EXPORT_SYMBOL(__strncmp_generic)
> > -- 
> > 2.35.1
> >
> 
> Besides the request to consolidate these functions into one file,
> 
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> 
> Thanks,
> drew

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 12/14] efi/riscv: libstub: mark when compiling libstub
  2022-12-01 19:34     ` Andrew Jones
@ 2022-12-01 20:57       ` Ard Biesheuvel
  -1 siblings, 0 replies; 84+ messages in thread
From: Ard Biesheuvel @ 2022-12-01 20:57 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Heiko Stuebner, linux-riscv, palmer, christoph.muellner,
	prabhakar.csengg, conor, philipp.tomsich, emil.renner.berthing,
	linux-efi, Heiko Stuebner, Conor Dooley

On Thu, 1 Dec 2022 at 20:35, Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Wed, Nov 30, 2022 at 11:56:12PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> >
> > We may want to runtime-optimize some core functions (str*, mem*),
> > but not have this leak into libstub and cause build issues.
> > Instead libstub, for the short while it's running, should just use
> > the generic implementation.
> >
> > So, to be able to determine whether functions that are used both in
> > libstub and the main kernel are getting compiled as part of libstub or
> > not, add a compile flag we can check via #ifdef.
> >
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>

I think it would be better to update arch/riscv/kernel/image-vars.h so
that only these generic implementations are exposed to the stub in the
first place.

> > ---
> >  drivers/firmware/efi/libstub/Makefile | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > index ef5045a53ce0..777d1ab059e3 100644
> > --- a/drivers/firmware/efi/libstub/Makefile
> > +++ b/drivers/firmware/efi/libstub/Makefile
> > @@ -25,7 +25,7 @@ cflags-$(CONFIG_ARM)                := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> >                                  -fno-builtin -fpic \
> >                                  $(call cc-option,-mno-single-pic-base)
> >  cflags-$(CONFIG_RISCV)               := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > -                                -fpic
> > +                                -fpic -DRISCV_EFISTUB
> >  cflags-$(CONFIG_LOONGARCH)   := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> >                                  -fpie
> >
> > --
> > 2.35.1
> >
>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 12/14] efi/riscv: libstub: mark when compiling libstub
  2022-12-01 20:57       ` Ard Biesheuvel
@ 2022-12-01 22:39         ` Heiko Stübner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stübner @ 2022-12-01 22:39 UTC (permalink / raw)
  To: Andrew Jones, Ard Biesheuvel
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, linux-efi, Conor Dooley

Hi Ard,

Am Donnerstag, 1. Dezember 2022, 21:57:00 CET schrieb Ard Biesheuvel:
> On Thu, 1 Dec 2022 at 20:35, Andrew Jones <ajones@ventanamicro.com> wrote:
> >
> > On Wed, Nov 30, 2022 at 11:56:12PM +0100, Heiko Stuebner wrote:
> > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > >
> > > We may want to runtime-optimize some core functions (str*, mem*),
> > > but not have this leak into libstub and cause build issues.
> > > Instead libstub, for the short while it's running, should just use
> > > the generic implementation.
> > >
> > > So, to be able to determine whether functions that are used both in
> > > libstub and the main kernel are getting compiled as part of libstub or
> > > not, add a compile flag we can check via #ifdef.
> > >
> > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> I think it would be better to update arch/riscv/kernel/image-vars.h so
> that only these generic implementations are exposed to the stub in the
> first place.

The relevant code is in patch13 + patch14.

To provide more context, the actual str* function we want to run is
determined at runtime. This is due to all the possible extensions
(present and future) a RISC-V core may or may not support, which in turn
blooms into a plethora of possible implementations for them.

Of course we want to have a unified kernel image, so we check on boot
for available extensions and patch the call to the best available function.

The introduction in the mentioned patches is still simple with a
generic + bitmanipulation variant, but that is more to keep the changes
somewhat manageable and there are already more variants on the horizon.

So strlen and friends are just inline functions containing a call to
the actual implementation, which gets patched via alternatives.

So this looks then like:

--------- 8< ---------
static inline int strcmp(const char *cs, const char *ct)
{
#if defined(RISCV_EFISTUB) || defined(RISCV_PURGATORY)
        return __strcmp_generic(cs, ct);
#else
        register const char *a0 asm("a0") = cs;
        register const char *a1 asm("a1") = ct;
        register int a0_out asm("a0");

        asm volatile(
                ALTERNATIVE(
                        "call __strcmp_generic\n\t",
                        "call __strcmp_zbb\n\t",
                        0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
                : "=r"(a0_out)
                : "r"(a0), "r"(a1)
                : "ra", "t0", "t1", "t2", "t3", "t4", "t5");

        return a0_out;
#endif
}
--------- 8< ---------

When that gets pulled into libstub without that separation, libstub ends up
with references to __strcmp_generic and __strcmp_zbb.

Of course the zbb variant would never get used, but it would still need
to be present in libstub and image-vars.h just to make the build happy.

And every additional variant would mean adding more unused code to
libstub and image-vars.h, hence I came up with the flag to mark when
code gets compiled as part of libstub.


Heiko



> 
> > > ---
> > >  drivers/firmware/efi/libstub/Makefile | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
> > > index ef5045a53ce0..777d1ab059e3 100644
> > > --- a/drivers/firmware/efi/libstub/Makefile
> > > +++ b/drivers/firmware/efi/libstub/Makefile
> > > @@ -25,7 +25,7 @@ cflags-$(CONFIG_ARM)                := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > >                                  -fno-builtin -fpic \
> > >                                  $(call cc-option,-mno-single-pic-base)
> > >  cflags-$(CONFIG_RISCV)               := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > > -                                -fpic
> > > +                                -fpic -DRISCV_EFISTUB
> > >  cflags-$(CONFIG_LOONGARCH)   := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
> > >                                  -fpie
> > >
> > > --
> > > 2.35.1
> > >
> >
> > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> 





^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 13/14] RISC-V: add infrastructure to allow different str* implementations
  2022-11-30 22:56   ` Heiko Stuebner
@ 2022-12-02  4:08     ` Palmer Dabbelt
  -1 siblings, 0 replies; 84+ messages in thread
From: Palmer Dabbelt @ 2022-12-02  4:08 UTC (permalink / raw)
  To: heiko, Ard Biesheuvel
  Cc: linux-riscv, christoph.muellner, prabhakar.csengg, Conor Dooley,
	philipp.tomsich, ajones, heiko, emil.renner.berthing, ardb,
	linux-efi, heiko.stuebner, Conor Dooley

On Wed, 30 Nov 2022 14:56:13 PST (-0800), heiko@sntech.de wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> Depending on supported extensions on specific RISC-V cores,
> optimized str* functions might make sense.
>
> This adds basic infrastructure to allow patching the function calls
> via alternatives later on.
>
> The main idea is to have the core str* functions be inline functions
> which then call the most optimized variant, with this call then being
> replaced via alternatives.
>
> The big advantage is that we don't need additional calls.
> Though we need to duplicate the generic functions as the main code
> expects either itself or the architecture to provide the str* functions.
>
> The added *_generic functions are done in assembler (taken from
> disassembling the main-kernel functions for now) to allow us to control
> the used registers.
>
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/Makefile             |  3 ++
>  arch/riscv/include/asm/string.h | 66 +++++++++++++++++++++++++++++++++
>  arch/riscv/kernel/image-vars.h  |  6 +--
>  arch/riscv/lib/Makefile         |  3 ++
>  arch/riscv/lib/strcmp.S         | 38 +++++++++++++++++++
>  arch/riscv/lib/strlen.S         | 29 +++++++++++++++
>  arch/riscv/lib/strncmp.S        | 41 ++++++++++++++++++++
>  7 files changed, 183 insertions(+), 3 deletions(-)
>  create mode 100644 arch/riscv/lib/strcmp.S
>  create mode 100644 arch/riscv/lib/strlen.S
>  create mode 100644 arch/riscv/lib/strncmp.S
>
> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> index 0d13b597cb55..581e4370c2a6 100644
> --- a/arch/riscv/Makefile
> +++ b/arch/riscv/Makefile
> @@ -80,6 +80,9 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
>          KBUILD_CFLAGS += -fno-omit-frame-pointer
>  endif
>
> +# strchr is special case, as gcc might want to call its own strlen from there
> +KBUILD_CFLAGS += -fno-builtin-strlen -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strchr

I was poking around Ard's comment on that -DRISCV_EFISTUB to try and
figure out what it was doing, but I think this is the bigger issue.  I
haven't benchmarked anything, but my guess is that the cost of turning
off these builtin routines will outweigh the benefit of the faster
out-of-line implementations.  I don't have any benchmarks to prove that,
but in general compilers are pretty smart about handling these builtin
routines in the common cases, and deferring that to runtime is probably
the wrong option.

I haven't looked all that closely at this patch set.  Is there some 
reason it's necessary to disable the builtin handling?  If not then my 
guess is it's better to leave that enabled unless some benchmarks show 
otherwise (and I don't know of any Zb* hardware to test against).

> +
>  KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-relax)
>  KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
>
> diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
> index 909049366555..b98481d2d154 100644
> --- a/arch/riscv/include/asm/string.h
> +++ b/arch/riscv/include/asm/string.h
> @@ -18,6 +18,72 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t);
>  #define __HAVE_ARCH_MEMMOVE
>  extern asmlinkage void *memmove(void *, const void *, size_t);
>  extern asmlinkage void *__memmove(void *, const void *, size_t);
> +
> +#define __HAVE_ARCH_STRCMP
> +extern asmlinkage int __strcmp_generic(const char *cs, const char *ct);
> +
> +static inline int strcmp(const char *cs, const char *ct)
> +{
> +#ifdef RISCV_EFISTUB
> +	return __strcmp_generic(cs, ct);
> +#else
> +	register const char *a0 asm("a0") = cs;
> +	register const char *a1 asm("a1") = ct;
> +	register int a0_out asm("a0");
> +
> +	asm volatile("call __strcmp_generic\n\t"
> +		: "=r"(a0_out)
> +		: "r"(a0), "r"(a1)
> +		: "ra", "t0", "t1", "t2");
> +
> +	return a0_out;
> +#endif
> +}
> +
> +#define __HAVE_ARCH_STRNCMP
> +extern asmlinkage int __strncmp_generic(const char *cs,
> +					const char *ct, size_t count);
> +
> +static inline int strncmp(const char *cs, const char *ct, size_t count)
> +{
> +#ifdef RISCV_EFISTUB
> +	return __strncmp_generic(cs, ct, count);
> +#else
> +	register const char *a0 asm("a0") = cs;
> +	register const char *a1 asm("a1") = ct;
> +	register size_t a2 asm("a2") = count;
> +	register int a0_out asm("a0");
> +
> +	asm volatile("call __strncmp_generic\n\t"
> +		: "=r"(a0_out)
> +		: "r"(a0), "r"(a1), "r"(a2)
> +		: "ra", "t0", "t1", "t2");
> +
> +	return a0_out;
> +#endif
> +}
> +
> +#define __HAVE_ARCH_STRLEN
> +extern asmlinkage __kernel_size_t __strlen_generic(const char *);
> +
> +static inline __kernel_size_t strlen(const char *s)
> +{
> +#ifdef RISCV_EFISTUB
> +	return __strlen_generic(s);
> +#else
> +	register const char *a0 asm("a0") = s;
> +	register int a0_out asm("a0");
> +
> +	asm volatile(
> +		"call __strlen_generic\n\t"
> +		: "=r"(a0_out)
> +		: "r"(a0)
> +		: "ra", "t0", "t1");
> +
> +	return a0_out;
> +#endif
> +}
> +
>  /* For those files which don't want to check by kasan. */
>  #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
>  #define memcpy(dst, src, len) __memcpy(dst, src, len)
> diff --git a/arch/riscv/kernel/image-vars.h b/arch/riscv/kernel/image-vars.h
> index d6e5f739905e..2f270b9fde63 100644
> --- a/arch/riscv/kernel/image-vars.h
> +++ b/arch/riscv/kernel/image-vars.h
> @@ -25,10 +25,10 @@
>   */
>  __efistub_memcmp		= memcmp;
>  __efistub_memchr		= memchr;
> -__efistub_strlen		= strlen;
> +__efistub___strlen_generic	= __strlen_generic;
>  __efistub_strnlen		= strnlen;
> -__efistub_strcmp		= strcmp;
> -__efistub_strncmp		= strncmp;
> +__efistub___strcmp_generic	= __strcmp_generic;
> +__efistub___strncmp_generic	= __strncmp_generic;
>  __efistub_strrchr		= strrchr;
>
>  __efistub__start		= _start;
> diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> index 25d5c9664e57..6c74b0bedd60 100644
> --- a/arch/riscv/lib/Makefile
> +++ b/arch/riscv/lib/Makefile
> @@ -3,6 +3,9 @@ lib-y			+= delay.o
>  lib-y			+= memcpy.o
>  lib-y			+= memset.o
>  lib-y			+= memmove.o
> +lib-y			+= strcmp.o
> +lib-y			+= strlen.o
> +lib-y			+= strncmp.o
>  lib-$(CONFIG_MMU)	+= uaccess.o
>  lib-$(CONFIG_64BIT)	+= tishift.o
>
> diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
> new file mode 100644
> index 000000000000..f4b7f4e806f0
> --- /dev/null
> +++ b/arch/riscv/lib/strcmp.S
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm-generic/export.h>
> +
> +/* int __strcmp_generic(const char *cs, const char *ct) */
> +ENTRY(__strcmp_generic)
> +	/*
> +	 * Returns
> +	 *   a0 - comparison result, value like strcmp
> +	 *
> +	 * Parameters
> +	 *   a0 - string1
> +	 *   a1 - string2
> +	 *
> +	 * Clobbers
> +	 *   t0, t1, t2
> +	 */
> +	mv	t2, a1
> +1:
> +	lbu	t1, 0(a0)
> +	lbu	t0, 0(a1)
> +	addi	a0, a0, 1
> +	addi	a1, a1, 1
> +	beq	t1, t0, 3f
> +	li	a0, 1
> +	bgeu	t1, t0, 2f
> +	li	a0, -1
> +2:
> +	mv	a1, t2
> +	ret
> +3:
> +	bnez	t1, 1b
> +	li	a0, 0
> +	j	2b
> +END(__strcmp_generic)
> +EXPORT_SYMBOL(__strcmp_generic)
> diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
> new file mode 100644
> index 000000000000..e0e7440ac724
> --- /dev/null
> +++ b/arch/riscv/lib/strlen.S
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm-generic/export.h>
> +
> +/* int __strlen_generic(const char *s) */
> +ENTRY(__strlen_generic)
> +	/*
> +	 * Returns
> +	 *   a0 - string length
> +	 *
> +	 * Parameters
> +	 *   a0 - String to measure
> +	 *
> +	 * Clobbers:
> +	 *   t0, t1
> +	 */
> +	mv	t1, a0
> +1:
> +	lbu	t0, 0(t1)
> +	bnez	t0, 2f
> +	sub	a0, t1, a0
> +	ret
> +2:
> +	addi	t1, t1, 1
> +	j	1b
> +END(__strlen_generic)
> +EXPORT_SYMBOL(__strlen_generic)
> diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
> new file mode 100644
> index 000000000000..7e116d942265
> --- /dev/null
> +++ b/arch/riscv/lib/strncmp.S
> @@ -0,0 +1,41 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm-generic/export.h>
> +
> +/* int __strncmp_generic(const char *cs, const char *ct, size_t count) */
> +ENTRY(__strncmp_generic)
> +	/*
> +	 * Returns
> +	 *   a0 - comparison result, value like strncmp
> +	 *
> +	 * Parameters
> +	 *   a0 - string1
> +	 *   a1 - string2
> +	 *   a2 - number of characters to compare
> +	 *
> +	 * Clobbers
> +	 *   t0, t1, t2
> +	 */
> +	li	t0, 0
> +1:
> +	beq	a2, t0, 4f
> +	add	t1, a0, t0
> +	add	t2, a1, t0
> +	lbu	t1, 0(t1)
> +	lbu	t2, 0(t2)
> +	beq	t1, t2, 3f
> +	li	a0, 1
> +	bgeu	t1, t2, 2f
> +	li	a0, -1
> +2:
> +	ret
> +3:
> +	addi	t0, t0, 1
> +	bnez	t1, 1b
> +4:
> +	li	a0, 0
> +	j	2b
> +END(__strncmp_generic)
> +EXPORT_SYMBOL(__strncmp_generic)

^ permalink raw reply	[flat|nested] 84+ messages in thread

> +	 *   t0, t1, t2
> +	 */
> +	li	t0, 0
> +1:
> +	beq	a2, t0, 4f
> +	add	t1, a0, t0
> +	add	t2, a1, t0
> +	lbu	t1, 0(t1)
> +	lbu	t2, 0(t2)
> +	beq	t1, t2, 3f
> +	li	a0, 1
> +	bgeu	t1, t2, 2f
> +	li	a0, -1
> +2:
> +	ret
> +3:
> +	addi	t0, t0, 1
> +	bnez	t1, 1b
> +4:
> +	li	a0, 0
> +	j	2b
> +END(__strncmp_generic)
> +EXPORT_SYMBOL(__strncmp_generic)

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 13/14] RISC-V: add infrastructure to allow different str* implementations
  2022-12-02  4:08     ` Palmer Dabbelt
@ 2022-12-02  8:19       ` Ard Biesheuvel
  -1 siblings, 0 replies; 84+ messages in thread
From: Ard Biesheuvel @ 2022-12-02  8:19 UTC (permalink / raw)
  To: Palmer Dabbelt, Nick Desaulniers, clang-built-linux
  Cc: heiko, Ard Biesheuvel, linux-riscv, christoph.muellner,
	prabhakar.csengg, Conor Dooley, philipp.tomsich, ajones,
	emil.renner.berthing, linux-efi, heiko.stuebner, Conor Dooley

On Fri, 2 Dec 2022 at 05:08, Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Wed, 30 Nov 2022 14:56:13 PST (-0800), heiko@sntech.de wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> >
> > Depending on supported extensions on specific RISC-V cores,
> > optimized str* functions might make sense.
> >
> > This adds basic infrastructure to allow patching the function calls
> > via alternatives later on.
> >
> > The main idea is to have the core str* functions be inline functions
> > which then call the most optimized variant and this call then be
> > replaced via alternatives.
> >
> > The big advantage is that we don't need additional calls.
> > Though we need to duplicate the generic functions as the main code
> > expects either itself or the architecture to provide the str* functions.
> >
> > The added *_generic functions are done in assembler (taken from
> > disassembling the main-kernel functions for now) to allow us to control
> > the used registers.
> >
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/Makefile             |  3 ++
> >  arch/riscv/include/asm/string.h | 66 +++++++++++++++++++++++++++++++++
> >  arch/riscv/kernel/image-vars.h  |  6 +--
> >  arch/riscv/lib/Makefile         |  3 ++
> >  arch/riscv/lib/strcmp.S         | 38 +++++++++++++++++++
> >  arch/riscv/lib/strlen.S         | 29 +++++++++++++++
> >  arch/riscv/lib/strncmp.S        | 41 ++++++++++++++++++++
> >  7 files changed, 183 insertions(+), 3 deletions(-)
> >  create mode 100644 arch/riscv/lib/strcmp.S
> >  create mode 100644 arch/riscv/lib/strlen.S
> >  create mode 100644 arch/riscv/lib/strncmp.S
> >
> > diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> > index 0d13b597cb55..581e4370c2a6 100644
> > --- a/arch/riscv/Makefile
> > +++ b/arch/riscv/Makefile
> > @@ -80,6 +80,9 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
> >          KBUILD_CFLAGS += -fno-omit-frame-pointer
> >  endif
> >
> > +# strchr is special case, as gcc might want to call its own strlen from there
> > +KBUILD_CFLAGS += -fno-builtin-strlen -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strchr
>
> I was poking around Ard's comment on that -DRISCV_EFI to try and figure
> out what it was doing, but I think this is the bigger issue.  I haven't
> benchmarked anything, but my guess is that turning off support for these
> builtin routines will outweigh the faster outline implementations of
> these routines.  I don't have any benchmarks to prove that, but in
> general compilers are pretty smart about handling these builtin routines
> in the common cases and deferring that to runtime is probably the wrong
> option.
>

Indeed. Case in point:

riscv64-linux-gnu-gcc -O -S -o - -xc - <<<"int foo() { return strlen(\"bar\"); }"

gives me

li a0,3
ret

whereas this

riscv64-linux-gnu-gcc -fno-builtin-strlen -O -S -o - -xc - <<<"int foo() { return strlen(\"bar\"); }"

gives me

.LC0:
  .string "bar"
  .text
  .align 1
  .globl foo
  .type foo, @function
foo:
  addi sp,sp,-16
  sd ra,8(sp)
  lla a0,.LC0
  call strlen@plt
  ld ra,8(sp)
  addi sp,sp,16
  jr ra

Other str* and mem* routines are optimized in similar ways when
dealing with small sizes or compile time constants.

Also, I'd recommend avoiding redefining these prototypes to static
inline as it deviates from the official prototypes in the C library.

> I haven't looked all that closely at this patch set.  Is there some
> reason it's necessary to disable the builtin handling?  If not then my
> guess is it's better to leave that enabled unless some benchmarks show
> otherwise (and I don't know of any Zb* hardware to test against).
>

Why not have a single generic version of each in the binary with a
patchable NOP at the start, and patch that to an optimized version
when available? That way, all of the early code can remain as is.


^ permalink raw reply	[flat|nested] 84+ messages in thread


* Re: [PATCH v3 12/14] efi/riscv: libstub: mark when compiling libstub
  2022-12-01 22:39         ` Heiko Stübner
@ 2022-12-02 16:37           ` Ard Biesheuvel
  -1 siblings, 0 replies; 84+ messages in thread
From: Ard Biesheuvel @ 2022-12-02 16:37 UTC (permalink / raw)
  To: Heiko Stübner
  Cc: Andrew Jones, linux-riscv, palmer, christoph.muellner,
	prabhakar.csengg, conor, philipp.tomsich, emil.renner.berthing,
	linux-efi, Conor Dooley

On Thu, 1 Dec 2022 at 23:39, Heiko Stübner <heiko@sntech.de> wrote:
>
> Hi Ard,
>
> Am Donnerstag, 1. Dezember 2022, 21:57:00 CET schrieb Ard Biesheuvel:
> > On Thu, 1 Dec 2022 at 20:35, Andrew Jones <ajones@ventanamicro.com> wrote:
> > >
> > > On Wed, Nov 30, 2022 at 11:56:12PM +0100, Heiko Stuebner wrote:
> > > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > >
> > > > We may want to runtime-optimize some core functions (str*, mem*),
> > > > but not have this leak into libstub and cause build issues.
> > > > Instead libstub, for the short while it's running, should just use
> > > > the generic implementation.
> > > >
> > > > So, to be able to determine whether functions, that are used both in
> > > > libstub and the main kernel, are getting compiled as part of libstub or
> > > > not, add a compile-flag we can check via #ifdef.
> > > >
> > > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > > > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> >
> > I think it would be better to update arch/riscv/kernel/image-vars.h so
> > that only these generic implementations are exposed to the stub in the
> > first place.
>

Actually, all references to string and memory functions are going away
from the stub. This is already in -next.

EFI now has zboot support, which means you can create a EFI bootable
kernel image that carries the actual kernel in compressed form rather
than as a hybrid EFI/bare metal image.

^ permalink raw reply	[flat|nested] 84+ messages in thread


* Re: [PATCH v3 02/14] RISC-V: add prefix to all constants/macros in parse_asm.h
  2022-11-30 22:56   ` Heiko Stuebner
@ 2022-12-05 14:00     ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-05 14:00 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Wed, Nov 30, 2022 at 11:56:02PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> Some of the constants and macros already have suitable RV_, RVG_ or
> RVC_ prefixes.
> 
> Extend this to the rest of the file as well, as we want to use these
> things in a broader scope soon.
> 
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/include/asm/parse_asm.h | 194 ++++++++++++++---------------
>  arch/riscv/kernel/kgdb.c           |  40 +++---
>  2 files changed, 117 insertions(+), 117 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
> index 7fee806805c1..ea51542e0c65 100644
> --- a/arch/riscv/include/asm/parse_asm.h
> +++ b/arch/riscv/include/asm/parse_asm.h
> @@ -6,37 +6,37 @@
>  #include <linux/bits.h>
>  
>  /* The bit field of immediate value in I-type instruction */
> -#define I_IMM_SIGN_OPOFF	31
> -#define I_IMM_11_0_OPOFF	20
> -#define I_IMM_SIGN_OFF		12
> -#define I_IMM_11_0_OFF		0
> -#define I_IMM_11_0_MASK		GENMASK(11, 0)
> +#define RV_I_IMM_SIGN_OPOFF	31
> +#define RV_I_IMM_11_0_OPOFF	20
> +#define RV_I_IMM_SIGN_OFF	12
> +#define RV_I_IMM_11_0_OFF	0
> +#define RV_I_IMM_11_0_MASK	GENMASK(11, 0)
>  
>  /* The bit field of immediate value in J-type instruction */
> -#define J_IMM_SIGN_OPOFF	31
> -#define J_IMM_10_1_OPOFF	21
> -#define J_IMM_11_OPOFF		20
> -#define J_IMM_19_12_OPOFF	12
> -#define J_IMM_SIGN_OFF		20
> -#define J_IMM_10_1_OFF		1
> -#define J_IMM_11_OFF		11
> -#define J_IMM_19_12_OFF		12
> -#define J_IMM_10_1_MASK		GENMASK(9, 0)
> -#define J_IMM_11_MASK		GENMASK(0, 0)
> -#define J_IMM_19_12_MASK	GENMASK(7, 0)
> +#define RV_J_IMM_SIGN_OPOFF	31
> +#define RV_J_IMM_10_1_OPOFF	21
> +#define RV_J_IMM_11_OPOFF	20
> +#define RV_J_IMM_19_12_OPOFF	12
> +#define RV_J_IMM_SIGN_OFF	20
> +#define RV_J_IMM_10_1_OFF	1
> +#define RV_J_IMM_11_OFF		11
> +#define RV_J_IMM_19_12_OFF	12
> +#define RV_J_IMM_10_1_MASK	GENMASK(9, 0)
> +#define RV_J_IMM_11_MASK	GENMASK(0, 0)
> +#define RV_J_IMM_19_12_MASK	GENMASK(7, 0)

Looking at the macros below which shift down, mask, and then shift
back up, and how these macros are used in the later alternative patching
patches, both in this series and Jisheng's, makes me think it'd be better
to define them already shifted, e.g.

#define RV_J_IMM_10_1_MASK	GENMASK(30, 21)
#define RV_J_IMM_11_MASK	GENMASK(20, 20)
#define RV_J_IMM_19_12_MASK 	GENMASK(19, 12)

Thanks,
drew

>  
>  /* The bit field of immediate value in B-type instruction */
> -#define B_IMM_SIGN_OPOFF	31
> -#define B_IMM_10_5_OPOFF	25
> -#define B_IMM_4_1_OPOFF		8
> -#define B_IMM_11_OPOFF		7
> -#define B_IMM_SIGN_OFF		12
> -#define B_IMM_10_5_OFF		5
> -#define B_IMM_4_1_OFF		1
> -#define B_IMM_11_OFF		11
> -#define B_IMM_10_5_MASK		GENMASK(5, 0)
> -#define B_IMM_4_1_MASK		GENMASK(3, 0)
> -#define B_IMM_11_MASK		GENMASK(0, 0)
> +#define RV_B_IMM_SIGN_OPOFF	31
> +#define RV_B_IMM_10_5_OPOFF	25
> +#define RV_B_IMM_4_1_OPOFF	8
> +#define RV_B_IMM_11_OPOFF	7
> +#define RV_B_IMM_SIGN_OFF	12
> +#define RV_B_IMM_10_5_OFF	5
> +#define RV_B_IMM_4_1_OFF	1
> +#define RV_B_IMM_11_OFF		11
> +#define RV_B_IMM_10_5_MASK	GENMASK(5, 0)
> +#define RV_B_IMM_4_1_MASK	GENMASK(3, 0)
> +#define RV_B_IMM_11_MASK	GENMASK(0, 0)
>  
>  /* The register offset in RVG instruction */
>  #define RVG_RS1_OPOFF		15
> @@ -100,71 +100,71 @@
>  #define RVC_C2_RD_OPOFF		7
>  
>  /* parts of opcode for RVG*/
> -#define OPCODE_BRANCH		0x63
> -#define OPCODE_JALR		0x67
> -#define OPCODE_JAL		0x6f
> -#define OPCODE_SYSTEM		0x73
> +#define RVG_OPCODE_BRANCH	0x63
> +#define RVG_OPCODE_JALR		0x67
> +#define RVG_OPCODE_JAL		0x6f
> +#define RVG_OPCODE_SYSTEM	0x73
>  
>  /* parts of opcode for RVC*/
> -#define OPCODE_C_0		0x0
> -#define OPCODE_C_1		0x1
> -#define OPCODE_C_2		0x2
> +#define RVC_OPCODE_C0		0x0
> +#define RVC_OPCODE_C1		0x1
> +#define RVC_OPCODE_C2		0x2
>  
>  /* parts of funct3 code for I, M, A extension*/
> -#define FUNCT3_JALR		0x0
> -#define FUNCT3_BEQ		0x0
> -#define FUNCT3_BNE		0x1000
> -#define FUNCT3_BLT		0x4000
> -#define FUNCT3_BGE		0x5000
> -#define FUNCT3_BLTU		0x6000
> -#define FUNCT3_BGEU		0x7000
> +#define RVG_FUNCT3_JALR		0x0
> +#define RVG_FUNCT3_BEQ		0x0
> +#define RVG_FUNCT3_BNE		0x1000
> +#define RVG_FUNCT3_BLT		0x4000
> +#define RVG_FUNCT3_BGE		0x5000
> +#define RVG_FUNCT3_BLTU		0x6000
> +#define RVG_FUNCT3_BGEU		0x7000
>  
>  /* parts of funct3 code for C extension*/
> -#define FUNCT3_C_BEQZ		0xc000
> -#define FUNCT3_C_BNEZ		0xe000
> -#define FUNCT3_C_J		0xa000
> -#define FUNCT3_C_JAL		0x2000
> -#define FUNCT4_C_JR		0x8000
> -#define FUNCT4_C_JALR		0x9000
> -
> -#define FUNCT12_SRET		0x10200000
> -
> -#define MATCH_JALR		(FUNCT3_JALR | OPCODE_JALR)
> -#define MATCH_JAL		(OPCODE_JAL)
> -#define MATCH_BEQ		(FUNCT3_BEQ | OPCODE_BRANCH)
> -#define MATCH_BNE		(FUNCT3_BNE | OPCODE_BRANCH)
> -#define MATCH_BLT		(FUNCT3_BLT | OPCODE_BRANCH)
> -#define MATCH_BGE		(FUNCT3_BGE | OPCODE_BRANCH)
> -#define MATCH_BLTU		(FUNCT3_BLTU | OPCODE_BRANCH)
> -#define MATCH_BGEU		(FUNCT3_BGEU | OPCODE_BRANCH)
> -#define MATCH_SRET		(FUNCT12_SRET | OPCODE_SYSTEM)
> -#define MATCH_C_BEQZ		(FUNCT3_C_BEQZ | OPCODE_C_1)
> -#define MATCH_C_BNEZ		(FUNCT3_C_BNEZ | OPCODE_C_1)
> -#define MATCH_C_J		(FUNCT3_C_J | OPCODE_C_1)
> -#define MATCH_C_JAL		(FUNCT3_C_JAL | OPCODE_C_1)
> -#define MATCH_C_JR		(FUNCT4_C_JR | OPCODE_C_2)
> -#define MATCH_C_JALR		(FUNCT4_C_JALR | OPCODE_C_2)
> -
> -#define MASK_JALR		0x707f
> -#define MASK_JAL		0x7f
> -#define MASK_C_JALR		0xf07f
> -#define MASK_C_JR		0xf07f
> -#define MASK_C_JAL		0xe003
> -#define MASK_C_J		0xe003
> -#define MASK_BEQ		0x707f
> -#define MASK_BNE		0x707f
> -#define MASK_BLT		0x707f
> -#define MASK_BGE		0x707f
> -#define MASK_BLTU		0x707f
> -#define MASK_BGEU		0x707f
> -#define MASK_C_BEQZ		0xe003
> -#define MASK_C_BNEZ		0xe003
> -#define MASK_SRET		0xffffffff
> +#define RVC_FUNCT3_C_BEQZ	0xc000
> +#define RVC_FUNCT3_C_BNEZ	0xe000
> +#define RVC_FUNCT3_C_J		0xa000
> +#define RVC_FUNCT3_C_JAL	0x2000
> +#define RVC_FUNCT4_C_JR		0x8000
> +#define RVC_FUNCT4_C_JALR	0x9000
> +
> +#define RVG_FUNCT12_SRET	0x10200000
> +
> +#define RVG_MATCH_JALR		(RVG_FUNCT3_JALR | RVG_OPCODE_JALR)
> +#define RVG_MATCH_JAL		(RVG_OPCODE_JAL)
> +#define RVG_MATCH_BEQ		(RVG_FUNCT3_BEQ | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BNE		(RVG_FUNCT3_BNE | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BLT		(RVG_FUNCT3_BLT | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BGE		(RVG_FUNCT3_BGE | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BLTU		(RVG_FUNCT3_BLTU | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BGEU		(RVG_FUNCT3_BGEU | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_SRET		(RVG_FUNCT12_SRET | RVG_OPCODE_SYSTEM)
> +#define RVC_MATCH_C_BEQZ	(RVC_FUNCT3_C_BEQZ | RVC_OPCODE_C1)
> +#define RVC_MATCH_C_BNEZ	(RVC_FUNCT3_C_BNEZ | RVC_OPCODE_C1)
> +#define RVC_MATCH_C_J		(RVC_FUNCT3_C_J | RVC_OPCODE_C1)
> +#define RVC_MATCH_C_JAL		(RVC_FUNCT3_C_JAL | RVC_OPCODE_C1)
> +#define RVC_MATCH_C_JR		(RVC_FUNCT4_C_JR | RVC_OPCODE_C2)
> +#define RVC_MATCH_C_JALR	(RVC_FUNCT4_C_JALR | RVC_OPCODE_C2)
> +
> +#define RVG_MASK_JALR		0x707f
> +#define RVG_MASK_JAL		0x7f
> +#define RVC_MASK_C_JALR		0xf07f
> +#define RVC_MASK_C_JR		0xf07f
> +#define RVC_MASK_C_JAL		0xe003
> +#define RVC_MASK_C_J		0xe003
> +#define RVG_MASK_BEQ		0x707f
> +#define RVG_MASK_BNE		0x707f
> +#define RVG_MASK_BLT		0x707f
> +#define RVG_MASK_BGE		0x707f
> +#define RVG_MASK_BLTU		0x707f
> +#define RVG_MASK_BGEU		0x707f
> +#define RVC_MASK_C_BEQZ		0xe003
> +#define RVC_MASK_C_BNEZ		0xe003
> +#define RVG_MASK_SRET		0xffffffff
>  
>  #define __INSN_LENGTH_MASK	_UL(0x3)
>  #define __INSN_LENGTH_GE_32	_UL(0x3)
>  #define __INSN_OPCODE_MASK	_UL(0x7F)
> -#define __INSN_BRANCH_OPCODE	_UL(OPCODE_BRANCH)
> +#define __INSN_BRANCH_OPCODE	_UL(RVG_OPCODE_BRANCH)
>  
>  /* Define a series of is_XXX_insn functions to check if the value INSN
>   * is an instance of instruction XXX.
> @@ -180,26 +180,26 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
>  #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
>  #define RVC_X(X, s, mask) RV_X(X, s, mask)
>  
> -#define EXTRACT_JTYPE_IMM(x) \
> +#define RV_EXTRACT_JTYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
> -	(RV_X(x_, J_IMM_10_1_OPOFF, J_IMM_10_1_MASK) << J_IMM_10_1_OFF) | \
> -	(RV_X(x_, J_IMM_11_OPOFF, J_IMM_11_MASK) << J_IMM_11_OFF) | \
> -	(RV_X(x_, J_IMM_19_12_OPOFF, J_IMM_19_12_MASK) << J_IMM_19_12_OFF) | \
> -	(RV_IMM_SIGN(x_) << J_IMM_SIGN_OFF); })
> +	(RV_X(x_, RV_J_IMM_10_1_OPOFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OFF) | \
> +	(RV_X(x_, RV_J_IMM_11_OPOFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OFF) | \
> +	(RV_X(x_, RV_J_IMM_19_12_OPOFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OFF) | \
> +	(RV_IMM_SIGN(x_) << RV_J_IMM_SIGN_OFF); })
>  
> -#define EXTRACT_ITYPE_IMM(x) \
> +#define RV_EXTRACT_ITYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
> -	(RV_X(x_, I_IMM_11_0_OPOFF, I_IMM_11_0_MASK)) | \
> -	(RV_IMM_SIGN(x_) << I_IMM_SIGN_OFF); })
> +	(RV_X(x_, RV_I_IMM_11_0_OPOFF, RV_I_IMM_11_0_MASK)) | \
> +	(RV_IMM_SIGN(x_) << RV_I_IMM_SIGN_OFF); })
>  
> -#define EXTRACT_BTYPE_IMM(x) \
> +#define RV_EXTRACT_BTYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
> -	(RV_X(x_, B_IMM_4_1_OPOFF, B_IMM_4_1_MASK) << B_IMM_4_1_OFF) | \
> -	(RV_X(x_, B_IMM_10_5_OPOFF, B_IMM_10_5_MASK) << B_IMM_10_5_OFF) | \
> -	(RV_X(x_, B_IMM_11_OPOFF, B_IMM_11_MASK) << B_IMM_11_OFF) | \
> -	(RV_IMM_SIGN(x_) << B_IMM_SIGN_OFF); })
> +	(RV_X(x_, RV_B_IMM_4_1_OPOFF, RV_B_IMM_4_1_MASK) << RV_B_IMM_4_1_OFF) | \
> +	(RV_X(x_, RV_B_IMM_10_5_OPOFF, RV_B_IMM_10_5_MASK) << RV_B_IMM_10_5_OFF) | \
> +	(RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \
> +	(RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); })
>  
> -#define EXTRACT_RVC_J_IMM(x) \
> +#define RVC_EXTRACT_JTYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
>  	(RVC_X(x_, RVC_J_IMM_3_1_OPOFF, RVC_J_IMM_3_1_MASK) << RVC_J_IMM_3_1_OFF) | \
>  	(RVC_X(x_, RVC_J_IMM_4_OPOFF, RVC_J_IMM_4_MASK) << RVC_J_IMM_4_OFF) | \
> @@ -210,7 +210,7 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
>  	(RVC_X(x_, RVC_J_IMM_10_OPOFF, RVC_J_IMM_10_MASK) << RVC_J_IMM_10_OFF) | \
>  	(RVC_IMM_SIGN(x_) << RVC_J_IMM_SIGN_OFF); })
>  
> -#define EXTRACT_RVC_B_IMM(x) \
> +#define RVC_EXTRACT_BTYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
>  	(RVC_X(x_, RVC_B_IMM_2_1_OPOFF, RVC_B_IMM_2_1_MASK) << RVC_B_IMM_2_1_OFF) | \
>  	(RVC_X(x_, RVC_B_IMM_4_3_OPOFF, RVC_B_IMM_4_3_MASK) << RVC_B_IMM_4_3_OFF) | \
> diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
> index 963ed7edcff2..137c6d870d58 100644
> --- a/arch/riscv/kernel/kgdb.c
> +++ b/arch/riscv/kernel/kgdb.c
> @@ -29,20 +29,20 @@ DECLARE_INSN(c_jal, MATCH_C_JAL, MASK_C_JAL)
>  #else
>  #define is_c_jal_insn(opcode) 0
>  #endif
> -DECLARE_INSN(jalr, MATCH_JALR, MASK_JALR)
> -DECLARE_INSN(jal, MATCH_JAL, MASK_JAL)
> -DECLARE_INSN(c_jr, MATCH_C_JR, MASK_C_JR)
> -DECLARE_INSN(c_jalr, MATCH_C_JALR, MASK_C_JALR)
> -DECLARE_INSN(c_j, MATCH_C_J, MASK_C_J)
> -DECLARE_INSN(beq, MATCH_BEQ, MASK_BEQ)
> -DECLARE_INSN(bne, MATCH_BNE, MASK_BNE)
> -DECLARE_INSN(blt, MATCH_BLT, MASK_BLT)
> -DECLARE_INSN(bge, MATCH_BGE, MASK_BGE)
> -DECLARE_INSN(bltu, MATCH_BLTU, MASK_BLTU)
> -DECLARE_INSN(bgeu, MATCH_BGEU, MASK_BGEU)
> -DECLARE_INSN(c_beqz, MATCH_C_BEQZ, MASK_C_BEQZ)
> -DECLARE_INSN(c_bnez, MATCH_C_BNEZ, MASK_C_BNEZ)
> -DECLARE_INSN(sret, MATCH_SRET, MASK_SRET)
> +DECLARE_INSN(jalr, RVG_MATCH_JALR, RVG_MASK_JALR)
> +DECLARE_INSN(jal, RVG_MATCH_JAL, RVG_MASK_JAL)
> +DECLARE_INSN(c_jr, RVC_MATCH_C_JR, RVC_MASK_C_JR)
> +DECLARE_INSN(c_jalr, RVC_MATCH_C_JALR, RVC_MASK_C_JALR)
> +DECLARE_INSN(c_j, RVC_MATCH_C_J, RVC_MASK_C_J)
> +DECLARE_INSN(beq, RVG_MATCH_BEQ, RVG_MASK_BEQ)
> +DECLARE_INSN(bne, RVG_MATCH_BNE, RVG_MASK_BNE)
> +DECLARE_INSN(blt, RVG_MATCH_BLT, RVG_MASK_BLT)
> +DECLARE_INSN(bge, RVG_MATCH_BGE, RVG_MASK_BGE)
> +DECLARE_INSN(bltu, RVG_MATCH_BLTU, RVG_MASK_BLTU)
> +DECLARE_INSN(bgeu, RVG_MATCH_BGEU, RVG_MASK_BGEU)
> +DECLARE_INSN(c_beqz, RVC_MATCH_C_BEQZ, RVC_MASK_C_BEQZ)
> +DECLARE_INSN(c_bnez, RVC_MATCH_C_BNEZ, RVC_MASK_C_BNEZ)
> +DECLARE_INSN(sret, RVG_MATCH_SRET, RVG_MASK_SRET)
>  
>  static int decode_register_index(unsigned long opcode, int offset)
>  {
> @@ -69,19 +69,19 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
>  			rs1_num = decode_register_index(op_code, RVC_C2_RS1_OPOFF);
>  			*next_addr = regs_ptr[rs1_num];
>  		} else if (is_c_j_insn(op_code) || is_c_jal_insn(op_code)) {
> -			*next_addr = EXTRACT_RVC_J_IMM(op_code) + pc;
> +			*next_addr = RVC_EXTRACT_JTYPE_IMM(op_code) + pc;
>  		} else if (is_c_beqz_insn(op_code)) {
>  			rs1_num = decode_register_index_short(op_code,
>  							      RVC_C1_RS1_OPOFF);
>  			if (!rs1_num || regs_ptr[rs1_num] == 0)
> -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
>  			else
>  				*next_addr = pc + 2;
>  		} else if (is_c_bnez_insn(op_code)) {
>  			rs1_num =
>  			    decode_register_index_short(op_code, RVC_C1_RS1_OPOFF);
>  			if (rs1_num && regs_ptr[rs1_num] != 0)
> -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
>  			else
>  				*next_addr = pc + 2;
>  		} else {
> @@ -90,7 +90,7 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
>  	} else {
>  		if ((op_code & __INSN_OPCODE_MASK) == __INSN_BRANCH_OPCODE) {
>  			bool result = false;
> -			long imm = EXTRACT_BTYPE_IMM(op_code);
> +			long imm = RV_EXTRACT_BTYPE_IMM(op_code);
>  			unsigned long rs1_val = 0, rs2_val = 0;
>  
>  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> @@ -121,12 +121,12 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
>  			else
>  				*next_addr = pc + 4;
>  		} else if (is_jal_insn(op_code)) {
> -			*next_addr = EXTRACT_JTYPE_IMM(op_code) + pc;
> +			*next_addr = RV_EXTRACT_JTYPE_IMM(op_code) + pc;
>  		} else if (is_jalr_insn(op_code)) {
>  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
>  			if (rs1_num)
>  				*next_addr = ((unsigned long *)regs)[rs1_num];
> -			*next_addr += EXTRACT_ITYPE_IMM(op_code);
> +			*next_addr += RV_EXTRACT_ITYPE_IMM(op_code);
>  		} else if (is_sret_insn(op_code)) {
>  			*next_addr = pc;
>  		} else {
> -- 
> 2.35.1
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 02/14] RISC-V: add prefix to all constants/macros in parse_asm.h
@ 2022-12-05 14:00     ` Andrew Jones
  0 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-05 14:00 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Wed, Nov 30, 2022 at 11:56:02PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> Some of the constants and macros already have suitable RV_, RVG_ or
> RVC_ prefixes.
> 
> Extend this to the rest of the file as well, as we want to use these
> things in a broader scope soon.
> 
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/include/asm/parse_asm.h | 194 ++++++++++++++---------------
>  arch/riscv/kernel/kgdb.c           |  40 +++---
>  2 files changed, 117 insertions(+), 117 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
> index 7fee806805c1..ea51542e0c65 100644
> --- a/arch/riscv/include/asm/parse_asm.h
> +++ b/arch/riscv/include/asm/parse_asm.h
> @@ -6,37 +6,37 @@
>  #include <linux/bits.h>
>  
>  /* The bit field of immediate value in I-type instruction */
> -#define I_IMM_SIGN_OPOFF	31
> -#define I_IMM_11_0_OPOFF	20
> -#define I_IMM_SIGN_OFF		12
> -#define I_IMM_11_0_OFF		0
> -#define I_IMM_11_0_MASK		GENMASK(11, 0)
> +#define RV_I_IMM_SIGN_OPOFF	31
> +#define RV_I_IMM_11_0_OPOFF	20
> +#define RV_I_IMM_SIGN_OFF	12
> +#define RV_I_IMM_11_0_OFF	0
> +#define RV_I_IMM_11_0_MASK	GENMASK(11, 0)
>  
>  /* The bit field of immediate value in J-type instruction */
> -#define J_IMM_SIGN_OPOFF	31
> -#define J_IMM_10_1_OPOFF	21
> -#define J_IMM_11_OPOFF		20
> -#define J_IMM_19_12_OPOFF	12
> -#define J_IMM_SIGN_OFF		20
> -#define J_IMM_10_1_OFF		1
> -#define J_IMM_11_OFF		11
> -#define J_IMM_19_12_OFF		12
> -#define J_IMM_10_1_MASK		GENMASK(9, 0)
> -#define J_IMM_11_MASK		GENMASK(0, 0)
> -#define J_IMM_19_12_MASK	GENMASK(7, 0)
> +#define RV_J_IMM_SIGN_OPOFF	31
> +#define RV_J_IMM_10_1_OPOFF	21
> +#define RV_J_IMM_11_OPOFF	20
> +#define RV_J_IMM_19_12_OPOFF	12
> +#define RV_J_IMM_SIGN_OFF	20
> +#define RV_J_IMM_10_1_OFF	1
> +#define RV_J_IMM_11_OFF		11
> +#define RV_J_IMM_19_12_OFF	12
> +#define RV_J_IMM_10_1_MASK	GENMASK(9, 0)
> +#define RV_J_IMM_11_MASK	GENMASK(0, 0)
> +#define RV_J_IMM_19_12_MASK	GENMASK(7, 0)

Looking at the macros below, which shift down, mask, and then shift
back up, and at how these macros are used in the later alternative-patching
patches, both in this series and Jisheng's, makes me think it'd be better
to define them already shifted, e.g.

#define RV_J_IMM_10_1_MASK	GENMASK(30, 21)
#define RV_J_IMM_11_MASK	GENMASK(20, 20)
#define RV_J_IMM_19_12_MASK 	GENMASK(19, 12)

Thanks,
drew

>  
>  /* The bit field of immediate value in B-type instruction */
> -#define B_IMM_SIGN_OPOFF	31
> -#define B_IMM_10_5_OPOFF	25
> -#define B_IMM_4_1_OPOFF		8
> -#define B_IMM_11_OPOFF		7
> -#define B_IMM_SIGN_OFF		12
> -#define B_IMM_10_5_OFF		5
> -#define B_IMM_4_1_OFF		1
> -#define B_IMM_11_OFF		11
> -#define B_IMM_10_5_MASK		GENMASK(5, 0)
> -#define B_IMM_4_1_MASK		GENMASK(3, 0)
> -#define B_IMM_11_MASK		GENMASK(0, 0)
> +#define RV_B_IMM_SIGN_OPOFF	31
> +#define RV_B_IMM_10_5_OPOFF	25
> +#define RV_B_IMM_4_1_OPOFF	8
> +#define RV_B_IMM_11_OPOFF	7
> +#define RV_B_IMM_SIGN_OFF	12
> +#define RV_B_IMM_10_5_OFF	5
> +#define RV_B_IMM_4_1_OFF	1
> +#define RV_B_IMM_11_OFF		11
> +#define RV_B_IMM_10_5_MASK	GENMASK(5, 0)
> +#define RV_B_IMM_4_1_MASK	GENMASK(3, 0)
> +#define RV_B_IMM_11_MASK	GENMASK(0, 0)
>  
>  /* The register offset in RVG instruction */
>  #define RVG_RS1_OPOFF		15
> @@ -100,71 +100,71 @@
>  #define RVC_C2_RD_OPOFF		7
>  
>  /* parts of opcode for RVG*/
> -#define OPCODE_BRANCH		0x63
> -#define OPCODE_JALR		0x67
> -#define OPCODE_JAL		0x6f
> -#define OPCODE_SYSTEM		0x73
> +#define RVG_OPCODE_BRANCH	0x63
> +#define RVG_OPCODE_JALR		0x67
> +#define RVG_OPCODE_JAL		0x6f
> +#define RVG_OPCODE_SYSTEM	0x73
>  
>  /* parts of opcode for RVC*/
> -#define OPCODE_C_0		0x0
> -#define OPCODE_C_1		0x1
> -#define OPCODE_C_2		0x2
> +#define RVC_OPCODE_C0		0x0
> +#define RVC_OPCODE_C1		0x1
> +#define RVC_OPCODE_C2		0x2
>  
>  /* parts of funct3 code for I, M, A extension*/
> -#define FUNCT3_JALR		0x0
> -#define FUNCT3_BEQ		0x0
> -#define FUNCT3_BNE		0x1000
> -#define FUNCT3_BLT		0x4000
> -#define FUNCT3_BGE		0x5000
> -#define FUNCT3_BLTU		0x6000
> -#define FUNCT3_BGEU		0x7000
> +#define RVG_FUNCT3_JALR		0x0
> +#define RVG_FUNCT3_BEQ		0x0
> +#define RVG_FUNCT3_BNE		0x1000
> +#define RVG_FUNCT3_BLT		0x4000
> +#define RVG_FUNCT3_BGE		0x5000
> +#define RVG_FUNCT3_BLTU		0x6000
> +#define RVG_FUNCT3_BGEU		0x7000
>  
>  /* parts of funct3 code for C extension*/
> -#define FUNCT3_C_BEQZ		0xc000
> -#define FUNCT3_C_BNEZ		0xe000
> -#define FUNCT3_C_J		0xa000
> -#define FUNCT3_C_JAL		0x2000
> -#define FUNCT4_C_JR		0x8000
> -#define FUNCT4_C_JALR		0x9000
> -
> -#define FUNCT12_SRET		0x10200000
> -
> -#define MATCH_JALR		(FUNCT3_JALR | OPCODE_JALR)
> -#define MATCH_JAL		(OPCODE_JAL)
> -#define MATCH_BEQ		(FUNCT3_BEQ | OPCODE_BRANCH)
> -#define MATCH_BNE		(FUNCT3_BNE | OPCODE_BRANCH)
> -#define MATCH_BLT		(FUNCT3_BLT | OPCODE_BRANCH)
> -#define MATCH_BGE		(FUNCT3_BGE | OPCODE_BRANCH)
> -#define MATCH_BLTU		(FUNCT3_BLTU | OPCODE_BRANCH)
> -#define MATCH_BGEU		(FUNCT3_BGEU | OPCODE_BRANCH)
> -#define MATCH_SRET		(FUNCT12_SRET | OPCODE_SYSTEM)
> -#define MATCH_C_BEQZ		(FUNCT3_C_BEQZ | OPCODE_C_1)
> -#define MATCH_C_BNEZ		(FUNCT3_C_BNEZ | OPCODE_C_1)
> -#define MATCH_C_J		(FUNCT3_C_J | OPCODE_C_1)
> -#define MATCH_C_JAL		(FUNCT3_C_JAL | OPCODE_C_1)
> -#define MATCH_C_JR		(FUNCT4_C_JR | OPCODE_C_2)
> -#define MATCH_C_JALR		(FUNCT4_C_JALR | OPCODE_C_2)
> -
> -#define MASK_JALR		0x707f
> -#define MASK_JAL		0x7f
> -#define MASK_C_JALR		0xf07f
> -#define MASK_C_JR		0xf07f
> -#define MASK_C_JAL		0xe003
> -#define MASK_C_J		0xe003
> -#define MASK_BEQ		0x707f
> -#define MASK_BNE		0x707f
> -#define MASK_BLT		0x707f
> -#define MASK_BGE		0x707f
> -#define MASK_BLTU		0x707f
> -#define MASK_BGEU		0x707f
> -#define MASK_C_BEQZ		0xe003
> -#define MASK_C_BNEZ		0xe003
> -#define MASK_SRET		0xffffffff
> +#define RVC_FUNCT3_C_BEQZ	0xc000
> +#define RVC_FUNCT3_C_BNEZ	0xe000
> +#define RVC_FUNCT3_C_J		0xa000
> +#define RVC_FUNCT3_C_JAL	0x2000
> +#define RVC_FUNCT4_C_JR		0x8000
> +#define RVC_FUNCT4_C_JALR	0x9000
> +
> +#define RVG_FUNCT12_SRET	0x10200000
> +
> +#define RVG_MATCH_JALR		(RVG_FUNCT3_JALR | RVG_OPCODE_JALR)
> +#define RVG_MATCH_JAL		(RVG_OPCODE_JAL)
> +#define RVG_MATCH_BEQ		(RVG_FUNCT3_BEQ | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BNE		(RVG_FUNCT3_BNE | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BLT		(RVG_FUNCT3_BLT | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BGE		(RVG_FUNCT3_BGE | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BLTU		(RVG_FUNCT3_BLTU | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_BGEU		(RVG_FUNCT3_BGEU | RVG_OPCODE_BRANCH)
> +#define RVG_MATCH_SRET		(RVG_FUNCT12_SRET | RVG_OPCODE_SYSTEM)
> +#define RVC_MATCH_C_BEQZ	(RVC_FUNCT3_C_BEQZ | RVC_OPCODE_C1)
> +#define RVC_MATCH_C_BNEZ	(RVC_FUNCT3_C_BNEZ | RVC_OPCODE_C1)
> +#define RVC_MATCH_C_J		(RVC_FUNCT3_C_J | RVC_OPCODE_C1)
> +#define RVC_MATCH_C_JAL		(RVC_FUNCT3_C_JAL | RVC_OPCODE_C1)
> +#define RVC_MATCH_C_JR		(RVC_FUNCT4_C_JR | RVC_OPCODE_C2)
> +#define RVC_MATCH_C_JALR	(RVC_FUNCT4_C_JALR | RVC_OPCODE_C2)
> +
> +#define RVG_MASK_JALR		0x707f
> +#define RVG_MASK_JAL		0x7f
> +#define RVC_MASK_C_JALR		0xf07f
> +#define RVC_MASK_C_JR		0xf07f
> +#define RVC_MASK_C_JAL		0xe003
> +#define RVC_MASK_C_J		0xe003
> +#define RVG_MASK_BEQ		0x707f
> +#define RVG_MASK_BNE		0x707f
> +#define RVG_MASK_BLT		0x707f
> +#define RVG_MASK_BGE		0x707f
> +#define RVG_MASK_BLTU		0x707f
> +#define RVG_MASK_BGEU		0x707f
> +#define RVC_MASK_C_BEQZ		0xe003
> +#define RVC_MASK_C_BNEZ		0xe003
> +#define RVG_MASK_SRET		0xffffffff
>  
>  #define __INSN_LENGTH_MASK	_UL(0x3)
>  #define __INSN_LENGTH_GE_32	_UL(0x3)
>  #define __INSN_OPCODE_MASK	_UL(0x7F)
> -#define __INSN_BRANCH_OPCODE	_UL(OPCODE_BRANCH)
> +#define __INSN_BRANCH_OPCODE	_UL(RVG_OPCODE_BRANCH)
>  
>  /* Define a series of is_XXX_insn functions to check if the value INSN
>   * is an instance of instruction XXX.
> @@ -180,26 +180,26 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
>  #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
>  #define RVC_X(X, s, mask) RV_X(X, s, mask)
>  
> -#define EXTRACT_JTYPE_IMM(x) \
> +#define RV_EXTRACT_JTYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
> -	(RV_X(x_, J_IMM_10_1_OPOFF, J_IMM_10_1_MASK) << J_IMM_10_1_OFF) | \
> -	(RV_X(x_, J_IMM_11_OPOFF, J_IMM_11_MASK) << J_IMM_11_OFF) | \
> -	(RV_X(x_, J_IMM_19_12_OPOFF, J_IMM_19_12_MASK) << J_IMM_19_12_OFF) | \
> -	(RV_IMM_SIGN(x_) << J_IMM_SIGN_OFF); })
> +	(RV_X(x_, RV_J_IMM_10_1_OPOFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OFF) | \
> +	(RV_X(x_, RV_J_IMM_11_OPOFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OFF) | \
> +	(RV_X(x_, RV_J_IMM_19_12_OPOFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OFF) | \
> +	(RV_IMM_SIGN(x_) << RV_J_IMM_SIGN_OFF); })
>  
> -#define EXTRACT_ITYPE_IMM(x) \
> +#define RV_EXTRACT_ITYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
> -	(RV_X(x_, I_IMM_11_0_OPOFF, I_IMM_11_0_MASK)) | \
> -	(RV_IMM_SIGN(x_) << I_IMM_SIGN_OFF); })
> +	(RV_X(x_, RV_I_IMM_11_0_OPOFF, RV_I_IMM_11_0_MASK)) | \
> +	(RV_IMM_SIGN(x_) << RV_I_IMM_SIGN_OFF); })
>  
> -#define EXTRACT_BTYPE_IMM(x) \
> +#define RV_EXTRACT_BTYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
> -	(RV_X(x_, B_IMM_4_1_OPOFF, B_IMM_4_1_MASK) << B_IMM_4_1_OFF) | \
> -	(RV_X(x_, B_IMM_10_5_OPOFF, B_IMM_10_5_MASK) << B_IMM_10_5_OFF) | \
> -	(RV_X(x_, B_IMM_11_OPOFF, B_IMM_11_MASK) << B_IMM_11_OFF) | \
> -	(RV_IMM_SIGN(x_) << B_IMM_SIGN_OFF); })
> +	(RV_X(x_, RV_B_IMM_4_1_OPOFF, RV_B_IMM_4_1_MASK) << RV_B_IMM_4_1_OFF) | \
> +	(RV_X(x_, RV_B_IMM_10_5_OPOFF, RV_B_IMM_10_5_MASK) << RV_B_IMM_10_5_OFF) | \
> +	(RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \
> +	(RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); })
>  
> -#define EXTRACT_RVC_J_IMM(x) \
> +#define RVC_EXTRACT_JTYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
>  	(RVC_X(x_, RVC_J_IMM_3_1_OPOFF, RVC_J_IMM_3_1_MASK) << RVC_J_IMM_3_1_OFF) | \
>  	(RVC_X(x_, RVC_J_IMM_4_OPOFF, RVC_J_IMM_4_MASK) << RVC_J_IMM_4_OFF) | \
> @@ -210,7 +210,7 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
>  	(RVC_X(x_, RVC_J_IMM_10_OPOFF, RVC_J_IMM_10_MASK) << RVC_J_IMM_10_OFF) | \
>  	(RVC_IMM_SIGN(x_) << RVC_J_IMM_SIGN_OFF); })
>  
> -#define EXTRACT_RVC_B_IMM(x) \
> +#define RVC_EXTRACT_BTYPE_IMM(x) \
>  	({typeof(x) x_ = (x); \
>  	(RVC_X(x_, RVC_B_IMM_2_1_OPOFF, RVC_B_IMM_2_1_MASK) << RVC_B_IMM_2_1_OFF) | \
>  	(RVC_X(x_, RVC_B_IMM_4_3_OPOFF, RVC_B_IMM_4_3_MASK) << RVC_B_IMM_4_3_OFF) | \
> diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
> index 963ed7edcff2..137c6d870d58 100644
> --- a/arch/riscv/kernel/kgdb.c
> +++ b/arch/riscv/kernel/kgdb.c
> @@ -29,20 +29,20 @@ DECLARE_INSN(c_jal, MATCH_C_JAL, MASK_C_JAL)
>  #else
>  #define is_c_jal_insn(opcode) 0
>  #endif
> -DECLARE_INSN(jalr, MATCH_JALR, MASK_JALR)
> -DECLARE_INSN(jal, MATCH_JAL, MASK_JAL)
> -DECLARE_INSN(c_jr, MATCH_C_JR, MASK_C_JR)
> -DECLARE_INSN(c_jalr, MATCH_C_JALR, MASK_C_JALR)
> -DECLARE_INSN(c_j, MATCH_C_J, MASK_C_J)
> -DECLARE_INSN(beq, MATCH_BEQ, MASK_BEQ)
> -DECLARE_INSN(bne, MATCH_BNE, MASK_BNE)
> -DECLARE_INSN(blt, MATCH_BLT, MASK_BLT)
> -DECLARE_INSN(bge, MATCH_BGE, MASK_BGE)
> -DECLARE_INSN(bltu, MATCH_BLTU, MASK_BLTU)
> -DECLARE_INSN(bgeu, MATCH_BGEU, MASK_BGEU)
> -DECLARE_INSN(c_beqz, MATCH_C_BEQZ, MASK_C_BEQZ)
> -DECLARE_INSN(c_bnez, MATCH_C_BNEZ, MASK_C_BNEZ)
> -DECLARE_INSN(sret, MATCH_SRET, MASK_SRET)
> +DECLARE_INSN(jalr, RVG_MATCH_JALR, RVG_MASK_JALR)
> +DECLARE_INSN(jal, RVG_MATCH_JAL, RVG_MASK_JAL)
> +DECLARE_INSN(c_jr, RVC_MATCH_C_JR, RVC_MASK_C_JR)
> +DECLARE_INSN(c_jalr, RVC_MATCH_C_JALR, RVC_MASK_C_JALR)
> +DECLARE_INSN(c_j, RVC_MATCH_C_J, RVC_MASK_C_J)
> +DECLARE_INSN(beq, RVG_MATCH_BEQ, RVG_MASK_BEQ)
> +DECLARE_INSN(bne, RVG_MATCH_BNE, RVG_MASK_BNE)
> +DECLARE_INSN(blt, RVG_MATCH_BLT, RVG_MASK_BLT)
> +DECLARE_INSN(bge, RVG_MATCH_BGE, RVG_MASK_BGE)
> +DECLARE_INSN(bltu, RVG_MATCH_BLTU, RVG_MASK_BLTU)
> +DECLARE_INSN(bgeu, RVG_MATCH_BGEU, RVG_MASK_BGEU)
> +DECLARE_INSN(c_beqz, RVC_MATCH_C_BEQZ, RVC_MASK_C_BEQZ)
> +DECLARE_INSN(c_bnez, RVC_MATCH_C_BNEZ, RVC_MASK_C_BNEZ)
> +DECLARE_INSN(sret, RVG_MATCH_SRET, RVG_MASK_SRET)
>  
>  static int decode_register_index(unsigned long opcode, int offset)
>  {
> @@ -69,19 +69,19 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
>  			rs1_num = decode_register_index(op_code, RVC_C2_RS1_OPOFF);
>  			*next_addr = regs_ptr[rs1_num];
>  		} else if (is_c_j_insn(op_code) || is_c_jal_insn(op_code)) {
> -			*next_addr = EXTRACT_RVC_J_IMM(op_code) + pc;
> +			*next_addr = RVC_EXTRACT_JTYPE_IMM(op_code) + pc;
>  		} else if (is_c_beqz_insn(op_code)) {
>  			rs1_num = decode_register_index_short(op_code,
>  							      RVC_C1_RS1_OPOFF);
>  			if (!rs1_num || regs_ptr[rs1_num] == 0)
> -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
>  			else
>  				*next_addr = pc + 2;
>  		} else if (is_c_bnez_insn(op_code)) {
>  			rs1_num =
>  			    decode_register_index_short(op_code, RVC_C1_RS1_OPOFF);
>  			if (rs1_num && regs_ptr[rs1_num] != 0)
> -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
>  			else
>  				*next_addr = pc + 2;
>  		} else {
> @@ -90,7 +90,7 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
>  	} else {
>  		if ((op_code & __INSN_OPCODE_MASK) == __INSN_BRANCH_OPCODE) {
>  			bool result = false;
> -			long imm = EXTRACT_BTYPE_IMM(op_code);
> +			long imm = RV_EXTRACT_BTYPE_IMM(op_code);
>  			unsigned long rs1_val = 0, rs2_val = 0;
>  
>  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> @@ -121,12 +121,12 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
>  			else
>  				*next_addr = pc + 4;
>  		} else if (is_jal_insn(op_code)) {
> -			*next_addr = EXTRACT_JTYPE_IMM(op_code) + pc;
> +			*next_addr = RV_EXTRACT_JTYPE_IMM(op_code) + pc;
>  		} else if (is_jalr_insn(op_code)) {
>  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
>  			if (rs1_num)
>  				*next_addr = ((unsigned long *)regs)[rs1_num];
> -			*next_addr += EXTRACT_ITYPE_IMM(op_code);
> +			*next_addr += RV_EXTRACT_ITYPE_IMM(op_code);
>  		} else if (is_sret_insn(op_code)) {
>  			*next_addr = pc;
>  		} else {
> -- 
> 2.35.1
> 

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: [PATCH v3 02/14] RISC-V: add prefix to all constants/macros in parse_asm.h
  2022-12-05 14:00     ` Andrew Jones
@ 2022-12-05 14:53       ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-05 14:53 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Mon, Dec 05, 2022 at 03:00:57PM +0100, Andrew Jones wrote:
> On Wed, Nov 30, 2022 at 11:56:02PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > Some of the constants and macros already have suitable RV_, RVG_ or
> > RVC_ prefixes.
> > 
> > Extend this to the rest of the file as well, as we want to use these
> > things in a broader scope soon.
> > 
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/include/asm/parse_asm.h | 194 ++++++++++++++---------------
> >  arch/riscv/kernel/kgdb.c           |  40 +++---
> >  2 files changed, 117 insertions(+), 117 deletions(-)
> > 
> > diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
> > index 7fee806805c1..ea51542e0c65 100644
> > --- a/arch/riscv/include/asm/parse_asm.h
> > +++ b/arch/riscv/include/asm/parse_asm.h
> > @@ -6,37 +6,37 @@
> >  #include <linux/bits.h>
> >  
> >  /* The bit field of immediate value in I-type instruction */
> > -#define I_IMM_SIGN_OPOFF	31
> > -#define I_IMM_11_0_OPOFF	20
> > -#define I_IMM_SIGN_OFF		12
> > -#define I_IMM_11_0_OFF		0
> > -#define I_IMM_11_0_MASK		GENMASK(11, 0)
> > +#define RV_I_IMM_SIGN_OPOFF	31
> > +#define RV_I_IMM_11_0_OPOFF	20
> > +#define RV_I_IMM_SIGN_OFF	12
> > +#define RV_I_IMM_11_0_OFF	0
> > +#define RV_I_IMM_11_0_MASK	GENMASK(11, 0)
> >  
> >  /* The bit field of immediate value in J-type instruction */
> > -#define J_IMM_SIGN_OPOFF	31
> > -#define J_IMM_10_1_OPOFF	21
> > -#define J_IMM_11_OPOFF		20
> > -#define J_IMM_19_12_OPOFF	12
> > -#define J_IMM_SIGN_OFF		20
> > -#define J_IMM_10_1_OFF		1
> > -#define J_IMM_11_OFF		11
> > -#define J_IMM_19_12_OFF		12
> > -#define J_IMM_10_1_MASK		GENMASK(9, 0)
> > -#define J_IMM_11_MASK		GENMASK(0, 0)
> > -#define J_IMM_19_12_MASK	GENMASK(7, 0)
> > +#define RV_J_IMM_SIGN_OPOFF	31
> > +#define RV_J_IMM_10_1_OPOFF	21
> > +#define RV_J_IMM_11_OPOFF	20
> > +#define RV_J_IMM_19_12_OPOFF	12
> > +#define RV_J_IMM_SIGN_OFF	20
> > +#define RV_J_IMM_10_1_OFF	1
> > +#define RV_J_IMM_11_OFF		11
> > +#define RV_J_IMM_19_12_OFF	12
> > +#define RV_J_IMM_10_1_MASK	GENMASK(9, 0)
> > +#define RV_J_IMM_11_MASK	GENMASK(0, 0)
> > +#define RV_J_IMM_19_12_MASK	GENMASK(7, 0)
> 
> Looking at the macros below, which shift down, mask, and then shift
> back up, and at how these macros are used in the later alternative-patching
> patches, both in this series and Jisheng's, makes me think it'd be better
> to define them already shifted, e.g.
> 
> #define RV_J_IMM_10_1_MASK	GENMASK(30, 21)
> #define RV_J_IMM_11_MASK	GENMASK(20, 20)
> #define RV_J_IMM_19_12_MASK 	GENMASK(19, 12)

Eh, never mind, I momentarily lost track of _OFF vs. _OPOFF. These
macro names are hard to read/track. Maybe what we really need are
extract() (which we already have, but as macros) and deposit()
helpers, with the shifts and masks put directly in the extract and
deposit functions. Taking J-type for example,

static inline s32 rv_insn_imm_sign(u32 x)
{
	return -((x >> 31) & 1);
}

static inline s32 rv_insn_extract_j_imm(u32 insn)
{
	return (rv_insn_imm_sign(insn & BIT(31)) << 20)	|	/* imm[20], sign-extended */
	       (insn & GENMASK(19, 12))			|	/* imm[19:12] */
	       ((insn & BIT(20)) >> 9)			|	/* imm[11]    */
	       ((insn & GENMASK(30, 25)) >> 20)		|	/* imm[10:5]  */
	       ((insn & GENMASK(24, 21)) >> 20);	/* imm[4:1]   */
}
  
static inline void rv_insn_deposit_j_imm(u32 *insn, s32 imm)
{
	*insn &= GENMASK(11, 7) | GENMASK(6, 0);

	*insn |= ((imm & BIT(20)) << 11)	|	/* imm[20]    -> insn[31]    */
		 ((imm & GENMASK(10, 1)) << 20)	|	/* imm[10:1]  -> insn[30:21] */
		 ((imm & BIT(11)) << 9)		|	/* imm[11]    -> insn[20]    */
		 (imm & GENMASK(19, 12));	/* imm[19:12] -> insn[19:12] */
}

A pair of static inlines like those for each type would be easier to
read and use, IMHO.

Thanks,
drew


> 
> Thanks,
> drew
> 
> >  
> >  /* The bit field of immediate value in B-type instruction */
> > -#define B_IMM_SIGN_OPOFF	31
> > -#define B_IMM_10_5_OPOFF	25
> > -#define B_IMM_4_1_OPOFF		8
> > -#define B_IMM_11_OPOFF		7
> > -#define B_IMM_SIGN_OFF		12
> > -#define B_IMM_10_5_OFF		5
> > -#define B_IMM_4_1_OFF		1
> > -#define B_IMM_11_OFF		11
> > -#define B_IMM_10_5_MASK		GENMASK(5, 0)
> > -#define B_IMM_4_1_MASK		GENMASK(3, 0)
> > -#define B_IMM_11_MASK		GENMASK(0, 0)
> > +#define RV_B_IMM_SIGN_OPOFF	31
> > +#define RV_B_IMM_10_5_OPOFF	25
> > +#define RV_B_IMM_4_1_OPOFF	8
> > +#define RV_B_IMM_11_OPOFF	7
> > +#define RV_B_IMM_SIGN_OFF	12
> > +#define RV_B_IMM_10_5_OFF	5
> > +#define RV_B_IMM_4_1_OFF	1
> > +#define RV_B_IMM_11_OFF		11
> > +#define RV_B_IMM_10_5_MASK	GENMASK(5, 0)
> > +#define RV_B_IMM_4_1_MASK	GENMASK(3, 0)
> > +#define RV_B_IMM_11_MASK	GENMASK(0, 0)
> >  
> >  /* The register offset in RVG instruction */
> >  #define RVG_RS1_OPOFF		15
> > @@ -100,71 +100,71 @@
> >  #define RVC_C2_RD_OPOFF		7
> >  
> >  /* parts of opcode for RVG*/
> > -#define OPCODE_BRANCH		0x63
> > -#define OPCODE_JALR		0x67
> > -#define OPCODE_JAL		0x6f
> > -#define OPCODE_SYSTEM		0x73
> > +#define RVG_OPCODE_BRANCH	0x63
> > +#define RVG_OPCODE_JALR		0x67
> > +#define RVG_OPCODE_JAL		0x6f
> > +#define RVG_OPCODE_SYSTEM	0x73
> >  
> >  /* parts of opcode for RVC*/
> > -#define OPCODE_C_0		0x0
> > -#define OPCODE_C_1		0x1
> > -#define OPCODE_C_2		0x2
> > +#define RVC_OPCODE_C0		0x0
> > +#define RVC_OPCODE_C1		0x1
> > +#define RVC_OPCODE_C2		0x2
> >  
> >  /* parts of funct3 code for I, M, A extension*/
> > -#define FUNCT3_JALR		0x0
> > -#define FUNCT3_BEQ		0x0
> > -#define FUNCT3_BNE		0x1000
> > -#define FUNCT3_BLT		0x4000
> > -#define FUNCT3_BGE		0x5000
> > -#define FUNCT3_BLTU		0x6000
> > -#define FUNCT3_BGEU		0x7000
> > +#define RVG_FUNCT3_JALR		0x0
> > +#define RVG_FUNCT3_BEQ		0x0
> > +#define RVG_FUNCT3_BNE		0x1000
> > +#define RVG_FUNCT3_BLT		0x4000
> > +#define RVG_FUNCT3_BGE		0x5000
> > +#define RVG_FUNCT3_BLTU		0x6000
> > +#define RVG_FUNCT3_BGEU		0x7000
> >  
> >  /* parts of funct3 code for C extension*/
> > -#define FUNCT3_C_BEQZ		0xc000
> > -#define FUNCT3_C_BNEZ		0xe000
> > -#define FUNCT3_C_J		0xa000
> > -#define FUNCT3_C_JAL		0x2000
> > -#define FUNCT4_C_JR		0x8000
> > -#define FUNCT4_C_JALR		0x9000
> > -
> > -#define FUNCT12_SRET		0x10200000
> > -
> > -#define MATCH_JALR		(FUNCT3_JALR | OPCODE_JALR)
> > -#define MATCH_JAL		(OPCODE_JAL)
> > -#define MATCH_BEQ		(FUNCT3_BEQ | OPCODE_BRANCH)
> > -#define MATCH_BNE		(FUNCT3_BNE | OPCODE_BRANCH)
> > -#define MATCH_BLT		(FUNCT3_BLT | OPCODE_BRANCH)
> > -#define MATCH_BGE		(FUNCT3_BGE | OPCODE_BRANCH)
> > -#define MATCH_BLTU		(FUNCT3_BLTU | OPCODE_BRANCH)
> > -#define MATCH_BGEU		(FUNCT3_BGEU | OPCODE_BRANCH)
> > -#define MATCH_SRET		(FUNCT12_SRET | OPCODE_SYSTEM)
> > -#define MATCH_C_BEQZ		(FUNCT3_C_BEQZ | OPCODE_C_1)
> > -#define MATCH_C_BNEZ		(FUNCT3_C_BNEZ | OPCODE_C_1)
> > -#define MATCH_C_J		(FUNCT3_C_J | OPCODE_C_1)
> > -#define MATCH_C_JAL		(FUNCT3_C_JAL | OPCODE_C_1)
> > -#define MATCH_C_JR		(FUNCT4_C_JR | OPCODE_C_2)
> > -#define MATCH_C_JALR		(FUNCT4_C_JALR | OPCODE_C_2)
> > -
> > -#define MASK_JALR		0x707f
> > -#define MASK_JAL		0x7f
> > -#define MASK_C_JALR		0xf07f
> > -#define MASK_C_JR		0xf07f
> > -#define MASK_C_JAL		0xe003
> > -#define MASK_C_J		0xe003
> > -#define MASK_BEQ		0x707f
> > -#define MASK_BNE		0x707f
> > -#define MASK_BLT		0x707f
> > -#define MASK_BGE		0x707f
> > -#define MASK_BLTU		0x707f
> > -#define MASK_BGEU		0x707f
> > -#define MASK_C_BEQZ		0xe003
> > -#define MASK_C_BNEZ		0xe003
> > -#define MASK_SRET		0xffffffff
> > +#define RVC_FUNCT3_C_BEQZ	0xc000
> > +#define RVC_FUNCT3_C_BNEZ	0xe000
> > +#define RVC_FUNCT3_C_J		0xa000
> > +#define RVC_FUNCT3_C_JAL	0x2000
> > +#define RVC_FUNCT4_C_JR		0x8000
> > +#define RVC_FUNCT4_C_JALR	0x9000
> > +
> > +#define RVG_FUNCT12_SRET	0x10200000
> > +
> > +#define RVG_MATCH_JALR		(RVG_FUNCT3_JALR | RVG_OPCODE_JALR)
> > +#define RVG_MATCH_JAL		(RVG_OPCODE_JAL)
> > +#define RVG_MATCH_BEQ		(RVG_FUNCT3_BEQ | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BNE		(RVG_FUNCT3_BNE | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BLT		(RVG_FUNCT3_BLT | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BGE		(RVG_FUNCT3_BGE | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BLTU		(RVG_FUNCT3_BLTU | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BGEU		(RVG_FUNCT3_BGEU | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_SRET		(RVG_FUNCT12_SRET | RVG_OPCODE_SYSTEM)
> > +#define RVC_MATCH_C_BEQZ	(RVC_FUNCT3_C_BEQZ | RVC_OPCODE_C1)
> > +#define RVC_MATCH_C_BNEZ	(RVC_FUNCT3_C_BNEZ | RVC_OPCODE_C1)
> > +#define RVC_MATCH_C_J		(RVC_FUNCT3_C_J | RVC_OPCODE_C1)
> > +#define RVC_MATCH_C_JAL		(RVC_FUNCT3_C_JAL | RVC_OPCODE_C1)
> > +#define RVC_MATCH_C_JR		(RVC_FUNCT4_C_JR | RVC_OPCODE_C2)
> > +#define RVC_MATCH_C_JALR	(RVC_FUNCT4_C_JALR | RVC_OPCODE_C2)
> > +
> > +#define RVG_MASK_JALR		0x707f
> > +#define RVG_MASK_JAL		0x7f
> > +#define RVC_MASK_C_JALR		0xf07f
> > +#define RVC_MASK_C_JR		0xf07f
> > +#define RVC_MASK_C_JAL		0xe003
> > +#define RVC_MASK_C_J		0xe003
> > +#define RVG_MASK_BEQ		0x707f
> > +#define RVG_MASK_BNE		0x707f
> > +#define RVG_MASK_BLT		0x707f
> > +#define RVG_MASK_BGE		0x707f
> > +#define RVG_MASK_BLTU		0x707f
> > +#define RVG_MASK_BGEU		0x707f
> > +#define RVC_MASK_C_BEQZ		0xe003
> > +#define RVC_MASK_C_BNEZ		0xe003
> > +#define RVG_MASK_SRET		0xffffffff
> >  
> >  #define __INSN_LENGTH_MASK	_UL(0x3)
> >  #define __INSN_LENGTH_GE_32	_UL(0x3)
> >  #define __INSN_OPCODE_MASK	_UL(0x7F)
> > -#define __INSN_BRANCH_OPCODE	_UL(OPCODE_BRANCH)
> > +#define __INSN_BRANCH_OPCODE	_UL(RVG_OPCODE_BRANCH)
> >  
> >  /* Define a series of is_XXX_insn functions to check if the value INSN
> >   * is an instance of instruction XXX.
> > @@ -180,26 +180,26 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
> >  #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
> >  #define RVC_X(X, s, mask) RV_X(X, s, mask)
> >  
> > -#define EXTRACT_JTYPE_IMM(x) \
> > +#define RV_EXTRACT_JTYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> > -	(RV_X(x_, J_IMM_10_1_OPOFF, J_IMM_10_1_MASK) << J_IMM_10_1_OFF) | \
> > -	(RV_X(x_, J_IMM_11_OPOFF, J_IMM_11_MASK) << J_IMM_11_OFF) | \
> > -	(RV_X(x_, J_IMM_19_12_OPOFF, J_IMM_19_12_MASK) << J_IMM_19_12_OFF) | \
> > -	(RV_IMM_SIGN(x_) << J_IMM_SIGN_OFF); })
> > +	(RV_X(x_, RV_J_IMM_10_1_OPOFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OFF) | \
> > +	(RV_X(x_, RV_J_IMM_11_OPOFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OFF) | \
> > +	(RV_X(x_, RV_J_IMM_19_12_OPOFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OFF) | \
> > +	(RV_IMM_SIGN(x_) << RV_J_IMM_SIGN_OFF); })
> >  
> > -#define EXTRACT_ITYPE_IMM(x) \
> > +#define RV_EXTRACT_ITYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> > -	(RV_X(x_, I_IMM_11_0_OPOFF, I_IMM_11_0_MASK)) | \
> > -	(RV_IMM_SIGN(x_) << I_IMM_SIGN_OFF); })
> > +	(RV_X(x_, RV_I_IMM_11_0_OPOFF, RV_I_IMM_11_0_MASK)) | \
> > +	(RV_IMM_SIGN(x_) << RV_I_IMM_SIGN_OFF); })
> >  
> > -#define EXTRACT_BTYPE_IMM(x) \
> > +#define RV_EXTRACT_BTYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> > -	(RV_X(x_, B_IMM_4_1_OPOFF, B_IMM_4_1_MASK) << B_IMM_4_1_OFF) | \
> > -	(RV_X(x_, B_IMM_10_5_OPOFF, B_IMM_10_5_MASK) << B_IMM_10_5_OFF) | \
> > -	(RV_X(x_, B_IMM_11_OPOFF, B_IMM_11_MASK) << B_IMM_11_OFF) | \
> > -	(RV_IMM_SIGN(x_) << B_IMM_SIGN_OFF); })
> > +	(RV_X(x_, RV_B_IMM_4_1_OPOFF, RV_B_IMM_4_1_MASK) << RV_B_IMM_4_1_OFF) | \
> > +	(RV_X(x_, RV_B_IMM_10_5_OPOFF, RV_B_IMM_10_5_MASK) << RV_B_IMM_10_5_OFF) | \
> > +	(RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \
> > +	(RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); })
> >  
> > -#define EXTRACT_RVC_J_IMM(x) \
> > +#define RVC_EXTRACT_JTYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> >  	(RVC_X(x_, RVC_J_IMM_3_1_OPOFF, RVC_J_IMM_3_1_MASK) << RVC_J_IMM_3_1_OFF) | \
> >  	(RVC_X(x_, RVC_J_IMM_4_OPOFF, RVC_J_IMM_4_MASK) << RVC_J_IMM_4_OFF) | \
> > @@ -210,7 +210,7 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
> >  	(RVC_X(x_, RVC_J_IMM_10_OPOFF, RVC_J_IMM_10_MASK) << RVC_J_IMM_10_OFF) | \
> >  	(RVC_IMM_SIGN(x_) << RVC_J_IMM_SIGN_OFF); })
> >  
> > -#define EXTRACT_RVC_B_IMM(x) \
> > +#define RVC_EXTRACT_BTYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> >  	(RVC_X(x_, RVC_B_IMM_2_1_OPOFF, RVC_B_IMM_2_1_MASK) << RVC_B_IMM_2_1_OFF) | \
> >  	(RVC_X(x_, RVC_B_IMM_4_3_OPOFF, RVC_B_IMM_4_3_MASK) << RVC_B_IMM_4_3_OFF) | \
> > diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
> > index 963ed7edcff2..137c6d870d58 100644
> > --- a/arch/riscv/kernel/kgdb.c
> > +++ b/arch/riscv/kernel/kgdb.c
> > @@ -29,20 +29,20 @@ DECLARE_INSN(c_jal, MATCH_C_JAL, MASK_C_JAL)
> >  #else
> >  #define is_c_jal_insn(opcode) 0
> >  #endif
> > -DECLARE_INSN(jalr, MATCH_JALR, MASK_JALR)
> > -DECLARE_INSN(jal, MATCH_JAL, MASK_JAL)
> > -DECLARE_INSN(c_jr, MATCH_C_JR, MASK_C_JR)
> > -DECLARE_INSN(c_jalr, MATCH_C_JALR, MASK_C_JALR)
> > -DECLARE_INSN(c_j, MATCH_C_J, MASK_C_J)
> > -DECLARE_INSN(beq, MATCH_BEQ, MASK_BEQ)
> > -DECLARE_INSN(bne, MATCH_BNE, MASK_BNE)
> > -DECLARE_INSN(blt, MATCH_BLT, MASK_BLT)
> > -DECLARE_INSN(bge, MATCH_BGE, MASK_BGE)
> > -DECLARE_INSN(bltu, MATCH_BLTU, MASK_BLTU)
> > -DECLARE_INSN(bgeu, MATCH_BGEU, MASK_BGEU)
> > -DECLARE_INSN(c_beqz, MATCH_C_BEQZ, MASK_C_BEQZ)
> > -DECLARE_INSN(c_bnez, MATCH_C_BNEZ, MASK_C_BNEZ)
> > -DECLARE_INSN(sret, MATCH_SRET, MASK_SRET)
> > +DECLARE_INSN(jalr, RVG_MATCH_JALR, RVG_MASK_JALR)
> > +DECLARE_INSN(jal, RVG_MATCH_JAL, RVG_MASK_JAL)
> > +DECLARE_INSN(c_jr, RVC_MATCH_C_JR, RVC_MASK_C_JR)
> > +DECLARE_INSN(c_jalr, RVC_MATCH_C_JALR, RVC_MASK_C_JALR)
> > +DECLARE_INSN(c_j, RVC_MATCH_C_J, RVC_MASK_C_J)
> > +DECLARE_INSN(beq, RVG_MATCH_BEQ, RVG_MASK_BEQ)
> > +DECLARE_INSN(bne, RVG_MATCH_BNE, RVG_MASK_BNE)
> > +DECLARE_INSN(blt, RVG_MATCH_BLT, RVG_MASK_BLT)
> > +DECLARE_INSN(bge, RVG_MATCH_BGE, RVG_MASK_BGE)
> > +DECLARE_INSN(bltu, RVG_MATCH_BLTU, RVG_MASK_BLTU)
> > +DECLARE_INSN(bgeu, RVG_MATCH_BGEU, RVG_MASK_BGEU)
> > +DECLARE_INSN(c_beqz, RVC_MATCH_C_BEQZ, RVC_MASK_C_BEQZ)
> > +DECLARE_INSN(c_bnez, RVC_MATCH_C_BNEZ, RVC_MASK_C_BNEZ)
> > +DECLARE_INSN(sret, RVG_MATCH_SRET, RVG_MASK_SRET)
> >  
> >  static int decode_register_index(unsigned long opcode, int offset)
> >  {
> > @@ -69,19 +69,19 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> >  			rs1_num = decode_register_index(op_code, RVC_C2_RS1_OPOFF);
> >  			*next_addr = regs_ptr[rs1_num];
> >  		} else if (is_c_j_insn(op_code) || is_c_jal_insn(op_code)) {
> > -			*next_addr = EXTRACT_RVC_J_IMM(op_code) + pc;
> > +			*next_addr = RVC_EXTRACT_JTYPE_IMM(op_code) + pc;
> >  		} else if (is_c_beqz_insn(op_code)) {
> >  			rs1_num = decode_register_index_short(op_code,
> >  							      RVC_C1_RS1_OPOFF);
> >  			if (!rs1_num || regs_ptr[rs1_num] == 0)
> > -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> > +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
> >  			else
> >  				*next_addr = pc + 2;
> >  		} else if (is_c_bnez_insn(op_code)) {
> >  			rs1_num =
> >  			    decode_register_index_short(op_code, RVC_C1_RS1_OPOFF);
> >  			if (rs1_num && regs_ptr[rs1_num] != 0)
> > -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> > +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
> >  			else
> >  				*next_addr = pc + 2;
> >  		} else {
> > @@ -90,7 +90,7 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> >  	} else {
> >  		if ((op_code & __INSN_OPCODE_MASK) == __INSN_BRANCH_OPCODE) {
> >  			bool result = false;
> > -			long imm = EXTRACT_BTYPE_IMM(op_code);
> > +			long imm = RV_EXTRACT_BTYPE_IMM(op_code);
> >  			unsigned long rs1_val = 0, rs2_val = 0;
> >  
> >  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> > @@ -121,12 +121,12 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> >  			else
> >  				*next_addr = pc + 4;
> >  		} else if (is_jal_insn(op_code)) {
> > -			*next_addr = EXTRACT_JTYPE_IMM(op_code) + pc;
> > +			*next_addr = RV_EXTRACT_JTYPE_IMM(op_code) + pc;
> >  		} else if (is_jalr_insn(op_code)) {
> >  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> >  			if (rs1_num)
> >  				*next_addr = ((unsigned long *)regs)[rs1_num];
> > -			*next_addr += EXTRACT_ITYPE_IMM(op_code);
> > +			*next_addr += RV_EXTRACT_ITYPE_IMM(op_code);
> >  		} else if (is_sret_insn(op_code)) {
> >  			*next_addr = pc;
> >  		} else {
> > -- 
> > 2.35.1
> > 


* Re: [PATCH v3 02/14] RISC-V: add prefix to all constants/macros in parse_asm.h
@ 2022-12-05 14:53       ` Andrew Jones
  0 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-05 14:53 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Mon, Dec 05, 2022 at 03:00:57PM +0100, Andrew Jones wrote:
> On Wed, Nov 30, 2022 at 11:56:02PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > Some of the constants and macros already have suitable RV_, RVG_ or
> > RVC_ prefixes.
> > 
> > Extend this to the rest of the file as well, as we want to use these
> > things in a broader scope soon.
> > 
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/include/asm/parse_asm.h | 194 ++++++++++++++---------------
> >  arch/riscv/kernel/kgdb.c           |  40 +++---
> >  2 files changed, 117 insertions(+), 117 deletions(-)
> > 
> > diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
> > index 7fee806805c1..ea51542e0c65 100644
> > --- a/arch/riscv/include/asm/parse_asm.h
> > +++ b/arch/riscv/include/asm/parse_asm.h
> > @@ -6,37 +6,37 @@
> >  #include <linux/bits.h>
> >  
> >  /* The bit field of immediate value in I-type instruction */
> > -#define I_IMM_SIGN_OPOFF	31
> > -#define I_IMM_11_0_OPOFF	20
> > -#define I_IMM_SIGN_OFF		12
> > -#define I_IMM_11_0_OFF		0
> > -#define I_IMM_11_0_MASK		GENMASK(11, 0)
> > +#define RV_I_IMM_SIGN_OPOFF	31
> > +#define RV_I_IMM_11_0_OPOFF	20
> > +#define RV_I_IMM_SIGN_OFF	12
> > +#define RV_I_IMM_11_0_OFF	0
> > +#define RV_I_IMM_11_0_MASK	GENMASK(11, 0)
> >  
> >  /* The bit field of immediate value in J-type instruction */
> > -#define J_IMM_SIGN_OPOFF	31
> > -#define J_IMM_10_1_OPOFF	21
> > -#define J_IMM_11_OPOFF		20
> > -#define J_IMM_19_12_OPOFF	12
> > -#define J_IMM_SIGN_OFF		20
> > -#define J_IMM_10_1_OFF		1
> > -#define J_IMM_11_OFF		11
> > -#define J_IMM_19_12_OFF		12
> > -#define J_IMM_10_1_MASK		GENMASK(9, 0)
> > -#define J_IMM_11_MASK		GENMASK(0, 0)
> > -#define J_IMM_19_12_MASK	GENMASK(7, 0)
> > +#define RV_J_IMM_SIGN_OPOFF	31
> > +#define RV_J_IMM_10_1_OPOFF	21
> > +#define RV_J_IMM_11_OPOFF	20
> > +#define RV_J_IMM_19_12_OPOFF	12
> > +#define RV_J_IMM_SIGN_OFF	20
> > +#define RV_J_IMM_10_1_OFF	1
> > +#define RV_J_IMM_11_OFF		11
> > +#define RV_J_IMM_19_12_OFF	12
> > +#define RV_J_IMM_10_1_MASK	GENMASK(9, 0)
> > +#define RV_J_IMM_11_MASK	GENMASK(0, 0)
> > +#define RV_J_IMM_19_12_MASK	GENMASK(7, 0)
> 
> Looking at the macros below which shift down, mask, and then shift
> back up, and how these macros are used in the later alternative patching
> patches, both in this series and Jisheng's, makes me think it'd be better
> to define them already shifted, e.g.
> 
> #define RV_J_IMM_10_1_MASK	GENMASK(30, 21)
> #define RV_J_IMM_11_MASK	GENMASK(20, 20)
> #define RV_J_IMM_19_12_MASK 	GENMASK(19, 12)

Eh, never mind, I momentarily lost track of _OFF vs. _OPOFF. These
macro names are hard to read/track. Maybe what we really need are
extract() (which we already have, but as macros) and deposit().
And just put the shifts and masks directly in the extract and deposit
functions. Taking J-type for example,

static inline s32 rv_insn_imm_sign(u32 x)
{
	return -((x >> 31) & 1);
}

static inline s32 rv_insn_extract_j_imm(u32 insn)
{
	return (rv_insn_imm_sign(insn & BIT(31)) << 20)	|
	       (insn & GENMASK(19, 12))			|
	       ((insn & BIT(20)) >> 9)			|
	       ((insn & GENMASK(30, 25)) >> 20)		|
	       ((insn & GENMASK(24, 21)) >> 20);
}

static inline void rv_insn_deposit_j_imm(u32 *insn, s32 imm)
{
	*insn &= GENMASK(11, 7) | GENMASK(6, 0);

	*insn |= ((imm & BIT(20)) << 11)	|
		 ((imm & GENMASK(10, 1)) << 20)	|
		 ((imm & BIT(11)) << 9)		|
		 (imm & GENMASK(19, 12));
}

A pair of static inlines like those for each type would be easier to
read and use, IMHO.

Thanks,
drew


> 
> Thanks,
> drew
> 
> >  
> >  /* The bit field of immediate value in B-type instruction */
> > -#define B_IMM_SIGN_OPOFF	31
> > -#define B_IMM_10_5_OPOFF	25
> > -#define B_IMM_4_1_OPOFF		8
> > -#define B_IMM_11_OPOFF		7
> > -#define B_IMM_SIGN_OFF		12
> > -#define B_IMM_10_5_OFF		5
> > -#define B_IMM_4_1_OFF		1
> > -#define B_IMM_11_OFF		11
> > -#define B_IMM_10_5_MASK		GENMASK(5, 0)
> > -#define B_IMM_4_1_MASK		GENMASK(3, 0)
> > -#define B_IMM_11_MASK		GENMASK(0, 0)
> > +#define RV_B_IMM_SIGN_OPOFF	31
> > +#define RV_B_IMM_10_5_OPOFF	25
> > +#define RV_B_IMM_4_1_OPOFF	8
> > +#define RV_B_IMM_11_OPOFF	7
> > +#define RV_B_IMM_SIGN_OFF	12
> > +#define RV_B_IMM_10_5_OFF	5
> > +#define RV_B_IMM_4_1_OFF	1
> > +#define RV_B_IMM_11_OFF		11
> > +#define RV_B_IMM_10_5_MASK	GENMASK(5, 0)
> > +#define RV_B_IMM_4_1_MASK	GENMASK(3, 0)
> > +#define RV_B_IMM_11_MASK	GENMASK(0, 0)
> >  
> >  /* The register offset in RVG instruction */
> >  #define RVG_RS1_OPOFF		15
> > @@ -100,71 +100,71 @@
> >  #define RVC_C2_RD_OPOFF		7
> >  
> >  /* parts of opcode for RVG*/
> > -#define OPCODE_BRANCH		0x63
> > -#define OPCODE_JALR		0x67
> > -#define OPCODE_JAL		0x6f
> > -#define OPCODE_SYSTEM		0x73
> > +#define RVG_OPCODE_BRANCH	0x63
> > +#define RVG_OPCODE_JALR		0x67
> > +#define RVG_OPCODE_JAL		0x6f
> > +#define RVG_OPCODE_SYSTEM	0x73
> >  
> >  /* parts of opcode for RVC*/
> > -#define OPCODE_C_0		0x0
> > -#define OPCODE_C_1		0x1
> > -#define OPCODE_C_2		0x2
> > +#define RVC_OPCODE_C0		0x0
> > +#define RVC_OPCODE_C1		0x1
> > +#define RVC_OPCODE_C2		0x2
> >  
> >  /* parts of funct3 code for I, M, A extension*/
> > -#define FUNCT3_JALR		0x0
> > -#define FUNCT3_BEQ		0x0
> > -#define FUNCT3_BNE		0x1000
> > -#define FUNCT3_BLT		0x4000
> > -#define FUNCT3_BGE		0x5000
> > -#define FUNCT3_BLTU		0x6000
> > -#define FUNCT3_BGEU		0x7000
> > +#define RVG_FUNCT3_JALR		0x0
> > +#define RVG_FUNCT3_BEQ		0x0
> > +#define RVG_FUNCT3_BNE		0x1000
> > +#define RVG_FUNCT3_BLT		0x4000
> > +#define RVG_FUNCT3_BGE		0x5000
> > +#define RVG_FUNCT3_BLTU		0x6000
> > +#define RVG_FUNCT3_BGEU		0x7000
> >  
> >  /* parts of funct3 code for C extension*/
> > -#define FUNCT3_C_BEQZ		0xc000
> > -#define FUNCT3_C_BNEZ		0xe000
> > -#define FUNCT3_C_J		0xa000
> > -#define FUNCT3_C_JAL		0x2000
> > -#define FUNCT4_C_JR		0x8000
> > -#define FUNCT4_C_JALR		0x9000
> > -
> > -#define FUNCT12_SRET		0x10200000
> > -
> > -#define MATCH_JALR		(FUNCT3_JALR | OPCODE_JALR)
> > -#define MATCH_JAL		(OPCODE_JAL)
> > -#define MATCH_BEQ		(FUNCT3_BEQ | OPCODE_BRANCH)
> > -#define MATCH_BNE		(FUNCT3_BNE | OPCODE_BRANCH)
> > -#define MATCH_BLT		(FUNCT3_BLT | OPCODE_BRANCH)
> > -#define MATCH_BGE		(FUNCT3_BGE | OPCODE_BRANCH)
> > -#define MATCH_BLTU		(FUNCT3_BLTU | OPCODE_BRANCH)
> > -#define MATCH_BGEU		(FUNCT3_BGEU | OPCODE_BRANCH)
> > -#define MATCH_SRET		(FUNCT12_SRET | OPCODE_SYSTEM)
> > -#define MATCH_C_BEQZ		(FUNCT3_C_BEQZ | OPCODE_C_1)
> > -#define MATCH_C_BNEZ		(FUNCT3_C_BNEZ | OPCODE_C_1)
> > -#define MATCH_C_J		(FUNCT3_C_J | OPCODE_C_1)
> > -#define MATCH_C_JAL		(FUNCT3_C_JAL | OPCODE_C_1)
> > -#define MATCH_C_JR		(FUNCT4_C_JR | OPCODE_C_2)
> > -#define MATCH_C_JALR		(FUNCT4_C_JALR | OPCODE_C_2)
> > -
> > -#define MASK_JALR		0x707f
> > -#define MASK_JAL		0x7f
> > -#define MASK_C_JALR		0xf07f
> > -#define MASK_C_JR		0xf07f
> > -#define MASK_C_JAL		0xe003
> > -#define MASK_C_J		0xe003
> > -#define MASK_BEQ		0x707f
> > -#define MASK_BNE		0x707f
> > -#define MASK_BLT		0x707f
> > -#define MASK_BGE		0x707f
> > -#define MASK_BLTU		0x707f
> > -#define MASK_BGEU		0x707f
> > -#define MASK_C_BEQZ		0xe003
> > -#define MASK_C_BNEZ		0xe003
> > -#define MASK_SRET		0xffffffff
> > +#define RVC_FUNCT3_C_BEQZ	0xc000
> > +#define RVC_FUNCT3_C_BNEZ	0xe000
> > +#define RVC_FUNCT3_C_J		0xa000
> > +#define RVC_FUNCT3_C_JAL	0x2000
> > +#define RVC_FUNCT4_C_JR		0x8000
> > +#define RVC_FUNCT4_C_JALR	0x9000
> > +
> > +#define RVG_FUNCT12_SRET	0x10200000
> > +
> > +#define RVG_MATCH_JALR		(RVG_FUNCT3_JALR | RVG_OPCODE_JALR)
> > +#define RVG_MATCH_JAL		(RVG_OPCODE_JAL)
> > +#define RVG_MATCH_BEQ		(RVG_FUNCT3_BEQ | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BNE		(RVG_FUNCT3_BNE | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BLT		(RVG_FUNCT3_BLT | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BGE		(RVG_FUNCT3_BGE | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BLTU		(RVG_FUNCT3_BLTU | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_BGEU		(RVG_FUNCT3_BGEU | RVG_OPCODE_BRANCH)
> > +#define RVG_MATCH_SRET		(RVG_FUNCT12_SRET | RVG_OPCODE_SYSTEM)
> > +#define RVC_MATCH_C_BEQZ	(RVC_FUNCT3_C_BEQZ | RVC_OPCODE_C1)
> > +#define RVC_MATCH_C_BNEZ	(RVC_FUNCT3_C_BNEZ | RVC_OPCODE_C1)
> > +#define RVC_MATCH_C_J		(RVC_FUNCT3_C_J | RVC_OPCODE_C1)
> > +#define RVC_MATCH_C_JAL		(RVC_FUNCT3_C_JAL | RVC_OPCODE_C1)
> > +#define RVC_MATCH_C_JR		(RVC_FUNCT4_C_JR | RVC_OPCODE_C2)
> > +#define RVC_MATCH_C_JALR	(RVC_FUNCT4_C_JALR | RVC_OPCODE_C2)
> > +
> > +#define RVG_MASK_JALR		0x707f
> > +#define RVG_MASK_JAL		0x7f
> > +#define RVC_MASK_C_JALR		0xf07f
> > +#define RVC_MASK_C_JR		0xf07f
> > +#define RVC_MASK_C_JAL		0xe003
> > +#define RVC_MASK_C_J		0xe003
> > +#define RVG_MASK_BEQ		0x707f
> > +#define RVG_MASK_BNE		0x707f
> > +#define RVG_MASK_BLT		0x707f
> > +#define RVG_MASK_BGE		0x707f
> > +#define RVG_MASK_BLTU		0x707f
> > +#define RVG_MASK_BGEU		0x707f
> > +#define RVC_MASK_C_BEQZ		0xe003
> > +#define RVC_MASK_C_BNEZ		0xe003
> > +#define RVG_MASK_SRET		0xffffffff
> >  
> >  #define __INSN_LENGTH_MASK	_UL(0x3)
> >  #define __INSN_LENGTH_GE_32	_UL(0x3)
> >  #define __INSN_OPCODE_MASK	_UL(0x7F)
> > -#define __INSN_BRANCH_OPCODE	_UL(OPCODE_BRANCH)
> > +#define __INSN_BRANCH_OPCODE	_UL(RVG_OPCODE_BRANCH)
> >  
> >  /* Define a series of is_XXX_insn functions to check if the value INSN
> >   * is an instance of instruction XXX.
> > @@ -180,26 +180,26 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
> >  #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
> >  #define RVC_X(X, s, mask) RV_X(X, s, mask)
> >  
> > -#define EXTRACT_JTYPE_IMM(x) \
> > +#define RV_EXTRACT_JTYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> > -	(RV_X(x_, J_IMM_10_1_OPOFF, J_IMM_10_1_MASK) << J_IMM_10_1_OFF) | \
> > -	(RV_X(x_, J_IMM_11_OPOFF, J_IMM_11_MASK) << J_IMM_11_OFF) | \
> > -	(RV_X(x_, J_IMM_19_12_OPOFF, J_IMM_19_12_MASK) << J_IMM_19_12_OFF) | \
> > -	(RV_IMM_SIGN(x_) << J_IMM_SIGN_OFF); })
> > +	(RV_X(x_, RV_J_IMM_10_1_OPOFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OFF) | \
> > +	(RV_X(x_, RV_J_IMM_11_OPOFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OFF) | \
> > +	(RV_X(x_, RV_J_IMM_19_12_OPOFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OFF) | \
> > +	(RV_IMM_SIGN(x_) << RV_J_IMM_SIGN_OFF); })
> >  
> > -#define EXTRACT_ITYPE_IMM(x) \
> > +#define RV_EXTRACT_ITYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> > -	(RV_X(x_, I_IMM_11_0_OPOFF, I_IMM_11_0_MASK)) | \
> > -	(RV_IMM_SIGN(x_) << I_IMM_SIGN_OFF); })
> > +	(RV_X(x_, RV_I_IMM_11_0_OPOFF, RV_I_IMM_11_0_MASK)) | \
> > +	(RV_IMM_SIGN(x_) << RV_I_IMM_SIGN_OFF); })
> >  
> > -#define EXTRACT_BTYPE_IMM(x) \
> > +#define RV_EXTRACT_BTYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> > -	(RV_X(x_, B_IMM_4_1_OPOFF, B_IMM_4_1_MASK) << B_IMM_4_1_OFF) | \
> > -	(RV_X(x_, B_IMM_10_5_OPOFF, B_IMM_10_5_MASK) << B_IMM_10_5_OFF) | \
> > -	(RV_X(x_, B_IMM_11_OPOFF, B_IMM_11_MASK) << B_IMM_11_OFF) | \
> > -	(RV_IMM_SIGN(x_) << B_IMM_SIGN_OFF); })
> > +	(RV_X(x_, RV_B_IMM_4_1_OPOFF, RV_B_IMM_4_1_MASK) << RV_B_IMM_4_1_OFF) | \
> > +	(RV_X(x_, RV_B_IMM_10_5_OPOFF, RV_B_IMM_10_5_MASK) << RV_B_IMM_10_5_OFF) | \
> > +	(RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \
> > +	(RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); })
> >  
> > -#define EXTRACT_RVC_J_IMM(x) \
> > +#define RVC_EXTRACT_JTYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> >  	(RVC_X(x_, RVC_J_IMM_3_1_OPOFF, RVC_J_IMM_3_1_MASK) << RVC_J_IMM_3_1_OFF) | \
> >  	(RVC_X(x_, RVC_J_IMM_4_OPOFF, RVC_J_IMM_4_MASK) << RVC_J_IMM_4_OFF) | \
> > @@ -210,7 +210,7 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
> >  	(RVC_X(x_, RVC_J_IMM_10_OPOFF, RVC_J_IMM_10_MASK) << RVC_J_IMM_10_OFF) | \
> >  	(RVC_IMM_SIGN(x_) << RVC_J_IMM_SIGN_OFF); })
> >  
> > -#define EXTRACT_RVC_B_IMM(x) \
> > +#define RVC_EXTRACT_BTYPE_IMM(x) \
> >  	({typeof(x) x_ = (x); \
> >  	(RVC_X(x_, RVC_B_IMM_2_1_OPOFF, RVC_B_IMM_2_1_MASK) << RVC_B_IMM_2_1_OFF) | \
> >  	(RVC_X(x_, RVC_B_IMM_4_3_OPOFF, RVC_B_IMM_4_3_MASK) << RVC_B_IMM_4_3_OFF) | \
> > diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
> > index 963ed7edcff2..137c6d870d58 100644
> > --- a/arch/riscv/kernel/kgdb.c
> > +++ b/arch/riscv/kernel/kgdb.c
> > @@ -29,20 +29,20 @@ DECLARE_INSN(c_jal, MATCH_C_JAL, MASK_C_JAL)
> >  #else
> >  #define is_c_jal_insn(opcode) 0
> >  #endif
> > -DECLARE_INSN(jalr, MATCH_JALR, MASK_JALR)
> > -DECLARE_INSN(jal, MATCH_JAL, MASK_JAL)
> > -DECLARE_INSN(c_jr, MATCH_C_JR, MASK_C_JR)
> > -DECLARE_INSN(c_jalr, MATCH_C_JALR, MASK_C_JALR)
> > -DECLARE_INSN(c_j, MATCH_C_J, MASK_C_J)
> > -DECLARE_INSN(beq, MATCH_BEQ, MASK_BEQ)
> > -DECLARE_INSN(bne, MATCH_BNE, MASK_BNE)
> > -DECLARE_INSN(blt, MATCH_BLT, MASK_BLT)
> > -DECLARE_INSN(bge, MATCH_BGE, MASK_BGE)
> > -DECLARE_INSN(bltu, MATCH_BLTU, MASK_BLTU)
> > -DECLARE_INSN(bgeu, MATCH_BGEU, MASK_BGEU)
> > -DECLARE_INSN(c_beqz, MATCH_C_BEQZ, MASK_C_BEQZ)
> > -DECLARE_INSN(c_bnez, MATCH_C_BNEZ, MASK_C_BNEZ)
> > -DECLARE_INSN(sret, MATCH_SRET, MASK_SRET)
> > +DECLARE_INSN(jalr, RVG_MATCH_JALR, RVG_MASK_JALR)
> > +DECLARE_INSN(jal, RVG_MATCH_JAL, RVG_MASK_JAL)
> > +DECLARE_INSN(c_jr, RVC_MATCH_C_JR, RVC_MASK_C_JR)
> > +DECLARE_INSN(c_jalr, RVC_MATCH_C_JALR, RVC_MASK_C_JALR)
> > +DECLARE_INSN(c_j, RVC_MATCH_C_J, RVC_MASK_C_J)
> > +DECLARE_INSN(beq, RVG_MATCH_BEQ, RVG_MASK_BEQ)
> > +DECLARE_INSN(bne, RVG_MATCH_BNE, RVG_MASK_BNE)
> > +DECLARE_INSN(blt, RVG_MATCH_BLT, RVG_MASK_BLT)
> > +DECLARE_INSN(bge, RVG_MATCH_BGE, RVG_MASK_BGE)
> > +DECLARE_INSN(bltu, RVG_MATCH_BLTU, RVG_MASK_BLTU)
> > +DECLARE_INSN(bgeu, RVG_MATCH_BGEU, RVG_MASK_BGEU)
> > +DECLARE_INSN(c_beqz, RVC_MATCH_C_BEQZ, RVC_MASK_C_BEQZ)
> > +DECLARE_INSN(c_bnez, RVC_MATCH_C_BNEZ, RVC_MASK_C_BNEZ)
> > +DECLARE_INSN(sret, RVG_MATCH_SRET, RVG_MASK_SRET)
> >  
> >  static int decode_register_index(unsigned long opcode, int offset)
> >  {
> > @@ -69,19 +69,19 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> >  			rs1_num = decode_register_index(op_code, RVC_C2_RS1_OPOFF);
> >  			*next_addr = regs_ptr[rs1_num];
> >  		} else if (is_c_j_insn(op_code) || is_c_jal_insn(op_code)) {
> > -			*next_addr = EXTRACT_RVC_J_IMM(op_code) + pc;
> > +			*next_addr = RVC_EXTRACT_JTYPE_IMM(op_code) + pc;
> >  		} else if (is_c_beqz_insn(op_code)) {
> >  			rs1_num = decode_register_index_short(op_code,
> >  							      RVC_C1_RS1_OPOFF);
> >  			if (!rs1_num || regs_ptr[rs1_num] == 0)
> > -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> > +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
> >  			else
> >  				*next_addr = pc + 2;
> >  		} else if (is_c_bnez_insn(op_code)) {
> >  			rs1_num =
> >  			    decode_register_index_short(op_code, RVC_C1_RS1_OPOFF);
> >  			if (rs1_num && regs_ptr[rs1_num] != 0)
> > -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> > +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
> >  			else
> >  				*next_addr = pc + 2;
> >  		} else {
> > @@ -90,7 +90,7 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> >  	} else {
> >  		if ((op_code & __INSN_OPCODE_MASK) == __INSN_BRANCH_OPCODE) {
> >  			bool result = false;
> > -			long imm = EXTRACT_BTYPE_IMM(op_code);
> > +			long imm = RV_EXTRACT_BTYPE_IMM(op_code);
> >  			unsigned long rs1_val = 0, rs2_val = 0;
> >  
> >  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> > @@ -121,12 +121,12 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> >  			else
> >  				*next_addr = pc + 4;
> >  		} else if (is_jal_insn(op_code)) {
> > -			*next_addr = EXTRACT_JTYPE_IMM(op_code) + pc;
> > +			*next_addr = RV_EXTRACT_JTYPE_IMM(op_code) + pc;
> >  		} else if (is_jalr_insn(op_code)) {
> >  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> >  			if (rs1_num)
> >  				*next_addr = ((unsigned long *)regs)[rs1_num];
> > -			*next_addr += EXTRACT_ITYPE_IMM(op_code);
> > +			*next_addr += RV_EXTRACT_ITYPE_IMM(op_code);
> >  		} else if (is_sret_insn(op_code)) {
> >  			*next_addr = pc;
> >  		} else {
> > -- 
> > 2.35.1
> > 

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: [PATCH v3 02/14] RISC-V: add prefix to all constants/macros in parse_asm.h
  2022-12-05 14:53       ` Andrew Jones
@ 2022-12-05 15:30         ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-05 15:30 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Mon, Dec 05, 2022 at 03:53:23PM +0100, Andrew Jones wrote:
> On Mon, Dec 05, 2022 at 03:00:57PM +0100, Andrew Jones wrote:
> > On Wed, Nov 30, 2022 at 11:56:02PM +0100, Heiko Stuebner wrote:
> > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > 
> > > Some of the constants and macros already have suitable RV_, RVG_ or
> > > RVC_ prefixes.
> > > 
> > > Extend this to the rest of the file as well, as we want to use these
> > > things in a broader scope soon.
> > > 
> > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > ---
> > >  arch/riscv/include/asm/parse_asm.h | 194 ++++++++++++++---------------
> > >  arch/riscv/kernel/kgdb.c           |  40 +++---
> > >  2 files changed, 117 insertions(+), 117 deletions(-)
> > > 
> > > diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
> > > index 7fee806805c1..ea51542e0c65 100644
> > > --- a/arch/riscv/include/asm/parse_asm.h
> > > +++ b/arch/riscv/include/asm/parse_asm.h
> > > @@ -6,37 +6,37 @@
> > >  #include <linux/bits.h>
> > >  
> > >  /* The bit field of immediate value in I-type instruction */
> > > -#define I_IMM_SIGN_OPOFF	31
> > > -#define I_IMM_11_0_OPOFF	20
> > > -#define I_IMM_SIGN_OFF		12
> > > -#define I_IMM_11_0_OFF		0
> > > -#define I_IMM_11_0_MASK		GENMASK(11, 0)
> > > +#define RV_I_IMM_SIGN_OPOFF	31
> > > +#define RV_I_IMM_11_0_OPOFF	20
> > > +#define RV_I_IMM_SIGN_OFF	12
> > > +#define RV_I_IMM_11_0_OFF	0
> > > +#define RV_I_IMM_11_0_MASK	GENMASK(11, 0)
> > >  
> > >  /* The bit field of immediate value in J-type instruction */
> > > -#define J_IMM_SIGN_OPOFF	31
> > > -#define J_IMM_10_1_OPOFF	21
> > > -#define J_IMM_11_OPOFF		20
> > > -#define J_IMM_19_12_OPOFF	12
> > > -#define J_IMM_SIGN_OFF		20
> > > -#define J_IMM_10_1_OFF		1
> > > -#define J_IMM_11_OFF		11
> > > -#define J_IMM_19_12_OFF		12
> > > -#define J_IMM_10_1_MASK		GENMASK(9, 0)
> > > -#define J_IMM_11_MASK		GENMASK(0, 0)
> > > -#define J_IMM_19_12_MASK	GENMASK(7, 0)
> > > +#define RV_J_IMM_SIGN_OPOFF	31
> > > +#define RV_J_IMM_10_1_OPOFF	21
> > > +#define RV_J_IMM_11_OPOFF	20
> > > +#define RV_J_IMM_19_12_OPOFF	12
> > > +#define RV_J_IMM_SIGN_OFF	20
> > > +#define RV_J_IMM_10_1_OFF	1
> > > +#define RV_J_IMM_11_OFF		11
> > > +#define RV_J_IMM_19_12_OFF	12
> > > +#define RV_J_IMM_10_1_MASK	GENMASK(9, 0)
> > > +#define RV_J_IMM_11_MASK	GENMASK(0, 0)
> > > +#define RV_J_IMM_19_12_MASK	GENMASK(7, 0)
> > 
> > Looking at the macros below which shift down, mask, and then shift
> > back up, and how these macros are used in the later alternative patching
> > patches, both in this series and Jisheng's, makes me think it'd be better
> > to define them already shifted, e.g.
> > 
> > #define RV_J_IMM_10_1_MASK	GENMASK(30, 21)
> > #define RV_J_IMM_11_MASK	GENMASK(20, 20)
> > #define RV_J_IMM_19_12_MASK 	GENMASK(19, 12)
> 
> Eh, never mind, I momentarily lost track of _OFF vs. _OPOFF. These
> macro names are hard to read/track. Maybe what we really need are
> extract() (which we already have, but as macros) and deposit().
> And just put the shifts and masks directly in the extract and deposit
> functions. Taking J-type for example,
> 
> static inline s32 rv_insn_imm_sign(u32 x)
> {
> 	return -((x >> 31) & 1);
> }
> 
> static inline s32 rv_insn_extract_j_imm(u32 insn)
> {
> 	return (rv_insn_imm_sign(insn & BIT(31)) << 20)	|
> 	       (insn & GENMASK(19, 12))			|
> 	       ((insn & BIT(20)) >> 9)			|
> 	       ((insn & GENMASK(30, 25)) >> 20)		|
> 	       ((insn & GENMASK(24, 21)) >> 20);
> }
> 
> static inline void rv_insn_deposit_j_imm(u32 *insn, s32 imm)
> {
> 	*insn &= GENMASK(11, 7) | GENMASK(6, 0);
> 
> 	*insn |= ((imm & BIT(20)) << 11)	|
> 		 ((imm & GENMASK(10, 1)) << 20)	|
> 		 ((imm & BIT(11)) << 9)		|
> 		 (imm & GENMASK(19, 12));
> }
> 
> A pair of static inlines like those for each type would be easier to
> read and use, IMHO.

BTW, it looks like arch/riscv/kernel/module.c might be another place
we could apply deposit functions.

Thanks,
drew

> 
> Thanks,
> drew
> 
> 
> > 
> > Thanks,
> > drew
> > 
> > >  
> > >  /* The bit field of immediate value in B-type instruction */
> > > -#define B_IMM_SIGN_OPOFF	31
> > > -#define B_IMM_10_5_OPOFF	25
> > > -#define B_IMM_4_1_OPOFF		8
> > > -#define B_IMM_11_OPOFF		7
> > > -#define B_IMM_SIGN_OFF		12
> > > -#define B_IMM_10_5_OFF		5
> > > -#define B_IMM_4_1_OFF		1
> > > -#define B_IMM_11_OFF		11
> > > -#define B_IMM_10_5_MASK		GENMASK(5, 0)
> > > -#define B_IMM_4_1_MASK		GENMASK(3, 0)
> > > -#define B_IMM_11_MASK		GENMASK(0, 0)
> > > +#define RV_B_IMM_SIGN_OPOFF	31
> > > +#define RV_B_IMM_10_5_OPOFF	25
> > > +#define RV_B_IMM_4_1_OPOFF	8
> > > +#define RV_B_IMM_11_OPOFF	7
> > > +#define RV_B_IMM_SIGN_OFF	12
> > > +#define RV_B_IMM_10_5_OFF	5
> > > +#define RV_B_IMM_4_1_OFF	1
> > > +#define RV_B_IMM_11_OFF		11
> > > +#define RV_B_IMM_10_5_MASK	GENMASK(5, 0)
> > > +#define RV_B_IMM_4_1_MASK	GENMASK(3, 0)
> > > +#define RV_B_IMM_11_MASK	GENMASK(0, 0)
> > >  
> > >  /* The register offset in RVG instruction */
> > >  #define RVG_RS1_OPOFF		15
> > > @@ -100,71 +100,71 @@
> > >  #define RVC_C2_RD_OPOFF		7
> > >  
> > >  /* parts of opcode for RVG*/
> > > -#define OPCODE_BRANCH		0x63
> > > -#define OPCODE_JALR		0x67
> > > -#define OPCODE_JAL		0x6f
> > > -#define OPCODE_SYSTEM		0x73
> > > +#define RVG_OPCODE_BRANCH	0x63
> > > +#define RVG_OPCODE_JALR		0x67
> > > +#define RVG_OPCODE_JAL		0x6f
> > > +#define RVG_OPCODE_SYSTEM	0x73
> > >  
> > >  /* parts of opcode for RVC*/
> > > -#define OPCODE_C_0		0x0
> > > -#define OPCODE_C_1		0x1
> > > -#define OPCODE_C_2		0x2
> > > +#define RVC_OPCODE_C0		0x0
> > > +#define RVC_OPCODE_C1		0x1
> > > +#define RVC_OPCODE_C2		0x2
> > >  
> > >  /* parts of funct3 code for I, M, A extension*/
> > > -#define FUNCT3_JALR		0x0
> > > -#define FUNCT3_BEQ		0x0
> > > -#define FUNCT3_BNE		0x1000
> > > -#define FUNCT3_BLT		0x4000
> > > -#define FUNCT3_BGE		0x5000
> > > -#define FUNCT3_BLTU		0x6000
> > > -#define FUNCT3_BGEU		0x7000
> > > +#define RVG_FUNCT3_JALR		0x0
> > > +#define RVG_FUNCT3_BEQ		0x0
> > > +#define RVG_FUNCT3_BNE		0x1000
> > > +#define RVG_FUNCT3_BLT		0x4000
> > > +#define RVG_FUNCT3_BGE		0x5000
> > > +#define RVG_FUNCT3_BLTU		0x6000
> > > +#define RVG_FUNCT3_BGEU		0x7000
> > >  
> > >  /* parts of funct3 code for C extension*/
> > > -#define FUNCT3_C_BEQZ		0xc000
> > > -#define FUNCT3_C_BNEZ		0xe000
> > > -#define FUNCT3_C_J		0xa000
> > > -#define FUNCT3_C_JAL		0x2000
> > > -#define FUNCT4_C_JR		0x8000
> > > -#define FUNCT4_C_JALR		0x9000
> > > -
> > > -#define FUNCT12_SRET		0x10200000
> > > -
> > > -#define MATCH_JALR		(FUNCT3_JALR | OPCODE_JALR)
> > > -#define MATCH_JAL		(OPCODE_JAL)
> > > -#define MATCH_BEQ		(FUNCT3_BEQ | OPCODE_BRANCH)
> > > -#define MATCH_BNE		(FUNCT3_BNE | OPCODE_BRANCH)
> > > -#define MATCH_BLT		(FUNCT3_BLT | OPCODE_BRANCH)
> > > -#define MATCH_BGE		(FUNCT3_BGE | OPCODE_BRANCH)
> > > -#define MATCH_BLTU		(FUNCT3_BLTU | OPCODE_BRANCH)
> > > -#define MATCH_BGEU		(FUNCT3_BGEU | OPCODE_BRANCH)
> > > -#define MATCH_SRET		(FUNCT12_SRET | OPCODE_SYSTEM)
> > > -#define MATCH_C_BEQZ		(FUNCT3_C_BEQZ | OPCODE_C_1)
> > > -#define MATCH_C_BNEZ		(FUNCT3_C_BNEZ | OPCODE_C_1)
> > > -#define MATCH_C_J		(FUNCT3_C_J | OPCODE_C_1)
> > > -#define MATCH_C_JAL		(FUNCT3_C_JAL | OPCODE_C_1)
> > > -#define MATCH_C_JR		(FUNCT4_C_JR | OPCODE_C_2)
> > > -#define MATCH_C_JALR		(FUNCT4_C_JALR | OPCODE_C_2)
> > > -
> > > -#define MASK_JALR		0x707f
> > > -#define MASK_JAL		0x7f
> > > -#define MASK_C_JALR		0xf07f
> > > -#define MASK_C_JR		0xf07f
> > > -#define MASK_C_JAL		0xe003
> > > -#define MASK_C_J		0xe003
> > > -#define MASK_BEQ		0x707f
> > > -#define MASK_BNE		0x707f
> > > -#define MASK_BLT		0x707f
> > > -#define MASK_BGE		0x707f
> > > -#define MASK_BLTU		0x707f
> > > -#define MASK_BGEU		0x707f
> > > -#define MASK_C_BEQZ		0xe003
> > > -#define MASK_C_BNEZ		0xe003
> > > -#define MASK_SRET		0xffffffff
> > > +#define RVC_FUNCT3_C_BEQZ	0xc000
> > > +#define RVC_FUNCT3_C_BNEZ	0xe000
> > > +#define RVC_FUNCT3_C_J		0xa000
> > > +#define RVC_FUNCT3_C_JAL	0x2000
> > > +#define RVC_FUNCT4_C_JR		0x8000
> > > +#define RVC_FUNCT4_C_JALR	0x9000
> > > +
> > > +#define RVG_FUNCT12_SRET	0x10200000
> > > +
> > > +#define RVG_MATCH_JALR		(RVG_FUNCT3_JALR | RVG_OPCODE_JALR)
> > > +#define RVG_MATCH_JAL		(RVG_OPCODE_JAL)
> > > +#define RVG_MATCH_BEQ		(RVG_FUNCT3_BEQ | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BNE		(RVG_FUNCT3_BNE | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BLT		(RVG_FUNCT3_BLT | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BGE		(RVG_FUNCT3_BGE | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BLTU		(RVG_FUNCT3_BLTU | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BGEU		(RVG_FUNCT3_BGEU | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_SRET		(RVG_FUNCT12_SRET | RVG_OPCODE_SYSTEM)
> > > +#define RVC_MATCH_C_BEQZ	(RVC_FUNCT3_C_BEQZ | RVC_OPCODE_C1)
> > > +#define RVC_MATCH_C_BNEZ	(RVC_FUNCT3_C_BNEZ | RVC_OPCODE_C1)
> > > +#define RVC_MATCH_C_J		(RVC_FUNCT3_C_J | RVC_OPCODE_C1)
> > > +#define RVC_MATCH_C_JAL		(RVC_FUNCT3_C_JAL | RVC_OPCODE_C1)
> > > +#define RVC_MATCH_C_JR		(RVC_FUNCT4_C_JR | RVC_OPCODE_C2)
> > > +#define RVC_MATCH_C_JALR	(RVC_FUNCT4_C_JALR | RVC_OPCODE_C2)
> > > +
> > > +#define RVG_MASK_JALR		0x707f
> > > +#define RVG_MASK_JAL		0x7f
> > > +#define RVC_MASK_C_JALR		0xf07f
> > > +#define RVC_MASK_C_JR		0xf07f
> > > +#define RVC_MASK_C_JAL		0xe003
> > > +#define RVC_MASK_C_J		0xe003
> > > +#define RVG_MASK_BEQ		0x707f
> > > +#define RVG_MASK_BNE		0x707f
> > > +#define RVG_MASK_BLT		0x707f
> > > +#define RVG_MASK_BGE		0x707f
> > > +#define RVG_MASK_BLTU		0x707f
> > > +#define RVG_MASK_BGEU		0x707f
> > > +#define RVC_MASK_C_BEQZ		0xe003
> > > +#define RVC_MASK_C_BNEZ		0xe003
> > > +#define RVG_MASK_SRET		0xffffffff
> > >  
> > >  #define __INSN_LENGTH_MASK	_UL(0x3)
> > >  #define __INSN_LENGTH_GE_32	_UL(0x3)
> > >  #define __INSN_OPCODE_MASK	_UL(0x7F)
> > > -#define __INSN_BRANCH_OPCODE	_UL(OPCODE_BRANCH)
> > > +#define __INSN_BRANCH_OPCODE	_UL(RVG_OPCODE_BRANCH)
> > >  
> > >  /* Define a series of is_XXX_insn functions to check if the value INSN
> > >   * is an instance of instruction XXX.
> > > @@ -180,26 +180,26 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
> > >  #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
> > >  #define RVC_X(X, s, mask) RV_X(X, s, mask)
> > >  
> > > -#define EXTRACT_JTYPE_IMM(x) \
> > > +#define RV_EXTRACT_JTYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > > -	(RV_X(x_, J_IMM_10_1_OPOFF, J_IMM_10_1_MASK) << J_IMM_10_1_OFF) | \
> > > -	(RV_X(x_, J_IMM_11_OPOFF, J_IMM_11_MASK) << J_IMM_11_OFF) | \
> > > -	(RV_X(x_, J_IMM_19_12_OPOFF, J_IMM_19_12_MASK) << J_IMM_19_12_OFF) | \
> > > -	(RV_IMM_SIGN(x_) << J_IMM_SIGN_OFF); })
> > > +	(RV_X(x_, RV_J_IMM_10_1_OPOFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OFF) | \
> > > +	(RV_X(x_, RV_J_IMM_11_OPOFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OFF) | \
> > > +	(RV_X(x_, RV_J_IMM_19_12_OPOFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OFF) | \
> > > +	(RV_IMM_SIGN(x_) << RV_J_IMM_SIGN_OFF); })
> > >  
> > > -#define EXTRACT_ITYPE_IMM(x) \
> > > +#define RV_EXTRACT_ITYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > > -	(RV_X(x_, I_IMM_11_0_OPOFF, I_IMM_11_0_MASK)) | \
> > > -	(RV_IMM_SIGN(x_) << I_IMM_SIGN_OFF); })
> > > +	(RV_X(x_, RV_I_IMM_11_0_OPOFF, RV_I_IMM_11_0_MASK)) | \
> > > +	(RV_IMM_SIGN(x_) << RV_I_IMM_SIGN_OFF); })
> > >  
> > > -#define EXTRACT_BTYPE_IMM(x) \
> > > +#define RV_EXTRACT_BTYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > > -	(RV_X(x_, B_IMM_4_1_OPOFF, B_IMM_4_1_MASK) << B_IMM_4_1_OFF) | \
> > > -	(RV_X(x_, B_IMM_10_5_OPOFF, B_IMM_10_5_MASK) << B_IMM_10_5_OFF) | \
> > > -	(RV_X(x_, B_IMM_11_OPOFF, B_IMM_11_MASK) << B_IMM_11_OFF) | \
> > > -	(RV_IMM_SIGN(x_) << B_IMM_SIGN_OFF); })
> > > +	(RV_X(x_, RV_B_IMM_4_1_OPOFF, RV_B_IMM_4_1_MASK) << RV_B_IMM_4_1_OFF) | \
> > > +	(RV_X(x_, RV_B_IMM_10_5_OPOFF, RV_B_IMM_10_5_MASK) << RV_B_IMM_10_5_OFF) | \
> > > +	(RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \
> > > +	(RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); })
> > >  
> > > -#define EXTRACT_RVC_J_IMM(x) \
> > > +#define RVC_EXTRACT_JTYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > >  	(RVC_X(x_, RVC_J_IMM_3_1_OPOFF, RVC_J_IMM_3_1_MASK) << RVC_J_IMM_3_1_OFF) | \
> > >  	(RVC_X(x_, RVC_J_IMM_4_OPOFF, RVC_J_IMM_4_MASK) << RVC_J_IMM_4_OFF) | \
> > > @@ -210,7 +210,7 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
> > >  	(RVC_X(x_, RVC_J_IMM_10_OPOFF, RVC_J_IMM_10_MASK) << RVC_J_IMM_10_OFF) | \
> > >  	(RVC_IMM_SIGN(x_) << RVC_J_IMM_SIGN_OFF); })
> > >  
> > > -#define EXTRACT_RVC_B_IMM(x) \
> > > +#define RVC_EXTRACT_BTYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > >  	(RVC_X(x_, RVC_B_IMM_2_1_OPOFF, RVC_B_IMM_2_1_MASK) << RVC_B_IMM_2_1_OFF) | \
> > >  	(RVC_X(x_, RVC_B_IMM_4_3_OPOFF, RVC_B_IMM_4_3_MASK) << RVC_B_IMM_4_3_OFF) | \
> > > diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
> > > index 963ed7edcff2..137c6d870d58 100644
> > > --- a/arch/riscv/kernel/kgdb.c
> > > +++ b/arch/riscv/kernel/kgdb.c
> > > @@ -29,20 +29,20 @@ DECLARE_INSN(c_jal, MATCH_C_JAL, MASK_C_JAL)
> > >  #else
> > >  #define is_c_jal_insn(opcode) 0
> > >  #endif
> > > -DECLARE_INSN(jalr, MATCH_JALR, MASK_JALR)
> > > -DECLARE_INSN(jal, MATCH_JAL, MASK_JAL)
> > > -DECLARE_INSN(c_jr, MATCH_C_JR, MASK_C_JR)
> > > -DECLARE_INSN(c_jalr, MATCH_C_JALR, MASK_C_JALR)
> > > -DECLARE_INSN(c_j, MATCH_C_J, MASK_C_J)
> > > -DECLARE_INSN(beq, MATCH_BEQ, MASK_BEQ)
> > > -DECLARE_INSN(bne, MATCH_BNE, MASK_BNE)
> > > -DECLARE_INSN(blt, MATCH_BLT, MASK_BLT)
> > > -DECLARE_INSN(bge, MATCH_BGE, MASK_BGE)
> > > -DECLARE_INSN(bltu, MATCH_BLTU, MASK_BLTU)
> > > -DECLARE_INSN(bgeu, MATCH_BGEU, MASK_BGEU)
> > > -DECLARE_INSN(c_beqz, MATCH_C_BEQZ, MASK_C_BEQZ)
> > > -DECLARE_INSN(c_bnez, MATCH_C_BNEZ, MASK_C_BNEZ)
> > > -DECLARE_INSN(sret, MATCH_SRET, MASK_SRET)
> > > +DECLARE_INSN(jalr, RVG_MATCH_JALR, RVG_MASK_JALR)
> > > +DECLARE_INSN(jal, RVG_MATCH_JAL, RVG_MASK_JAL)
> > > +DECLARE_INSN(c_jr, RVC_MATCH_C_JR, RVC_MASK_C_JR)
> > > +DECLARE_INSN(c_jalr, RVC_MATCH_C_JALR, RVC_MASK_C_JALR)
> > > +DECLARE_INSN(c_j, RVC_MATCH_C_J, RVC_MASK_C_J)
> > > +DECLARE_INSN(beq, RVG_MATCH_BEQ, RVG_MASK_BEQ)
> > > +DECLARE_INSN(bne, RVG_MATCH_BNE, RVG_MASK_BNE)
> > > +DECLARE_INSN(blt, RVG_MATCH_BLT, RVG_MASK_BLT)
> > > +DECLARE_INSN(bge, RVG_MATCH_BGE, RVG_MASK_BGE)
> > > +DECLARE_INSN(bltu, RVG_MATCH_BLTU, RVG_MASK_BLTU)
> > > +DECLARE_INSN(bgeu, RVG_MATCH_BGEU, RVG_MASK_BGEU)
> > > +DECLARE_INSN(c_beqz, RVC_MATCH_C_BEQZ, RVC_MASK_C_BEQZ)
> > > +DECLARE_INSN(c_bnez, RVC_MATCH_C_BNEZ, RVC_MASK_C_BNEZ)
> > > +DECLARE_INSN(sret, RVG_MATCH_SRET, RVG_MASK_SRET)
> > >  
> > >  static int decode_register_index(unsigned long opcode, int offset)
> > >  {
> > > @@ -69,19 +69,19 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> > >  			rs1_num = decode_register_index(op_code, RVC_C2_RS1_OPOFF);
> > >  			*next_addr = regs_ptr[rs1_num];
> > >  		} else if (is_c_j_insn(op_code) || is_c_jal_insn(op_code)) {
> > > -			*next_addr = EXTRACT_RVC_J_IMM(op_code) + pc;
> > > +			*next_addr = RVC_EXTRACT_JTYPE_IMM(op_code) + pc;
> > >  		} else if (is_c_beqz_insn(op_code)) {
> > >  			rs1_num = decode_register_index_short(op_code,
> > >  							      RVC_C1_RS1_OPOFF);
> > >  			if (!rs1_num || regs_ptr[rs1_num] == 0)
> > > -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> > > +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
> > >  			else
> > >  				*next_addr = pc + 2;
> > >  		} else if (is_c_bnez_insn(op_code)) {
> > >  			rs1_num =
> > >  			    decode_register_index_short(op_code, RVC_C1_RS1_OPOFF);
> > >  			if (rs1_num && regs_ptr[rs1_num] != 0)
> > > -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> > > +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
> > >  			else
> > >  				*next_addr = pc + 2;
> > >  		} else {
> > > @@ -90,7 +90,7 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> > >  	} else {
> > >  		if ((op_code & __INSN_OPCODE_MASK) == __INSN_BRANCH_OPCODE) {
> > >  			bool result = false;
> > > -			long imm = EXTRACT_BTYPE_IMM(op_code);
> > > +			long imm = RV_EXTRACT_BTYPE_IMM(op_code);
> > >  			unsigned long rs1_val = 0, rs2_val = 0;
> > >  
> > >  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> > > @@ -121,12 +121,12 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> > >  			else
> > >  				*next_addr = pc + 4;
> > >  		} else if (is_jal_insn(op_code)) {
> > > -			*next_addr = EXTRACT_JTYPE_IMM(op_code) + pc;
> > > +			*next_addr = RV_EXTRACT_JTYPE_IMM(op_code) + pc;
> > >  		} else if (is_jalr_insn(op_code)) {
> > >  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> > >  			if (rs1_num)
> > >  				*next_addr = ((unsigned long *)regs)[rs1_num];
> > > -			*next_addr += EXTRACT_ITYPE_IMM(op_code);
> > > +			*next_addr += RV_EXTRACT_ITYPE_IMM(op_code);
> > >  		} else if (is_sret_insn(op_code)) {
> > >  			*next_addr = pc;
> > >  		} else {
> > > -- 
> > > 2.35.1
> > > 


* Re: [PATCH v3 02/14] RISC-V: add prefix to all constants/macros in parse_asm.h
@ 2022-12-05 15:30         ` Andrew Jones
  0 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-05 15:30 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Mon, Dec 05, 2022 at 03:53:23PM +0100, Andrew Jones wrote:
> On Mon, Dec 05, 2022 at 03:00:57PM +0100, Andrew Jones wrote:
> > On Wed, Nov 30, 2022 at 11:56:02PM +0100, Heiko Stuebner wrote:
> > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > 
> > > Some of the constants and macros already have suitable RV_, RVG_ or
> > > RVC_ prefixes.
> > > 
> > > Extend this to the rest of the file as well, as we want to use these
> > > things in a broader scope soon.
> > > 
> > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > ---
> > >  arch/riscv/include/asm/parse_asm.h | 194 ++++++++++++++---------------
> > >  arch/riscv/kernel/kgdb.c           |  40 +++---
> > >  2 files changed, 117 insertions(+), 117 deletions(-)
> > > 
> > > diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h
> > > index 7fee806805c1..ea51542e0c65 100644
> > > --- a/arch/riscv/include/asm/parse_asm.h
> > > +++ b/arch/riscv/include/asm/parse_asm.h
> > > @@ -6,37 +6,37 @@
> > >  #include <linux/bits.h>
> > >  
> > >  /* The bit field of immediate value in I-type instruction */
> > > -#define I_IMM_SIGN_OPOFF	31
> > > -#define I_IMM_11_0_OPOFF	20
> > > -#define I_IMM_SIGN_OFF		12
> > > -#define I_IMM_11_0_OFF		0
> > > -#define I_IMM_11_0_MASK		GENMASK(11, 0)
> > > +#define RV_I_IMM_SIGN_OPOFF	31
> > > +#define RV_I_IMM_11_0_OPOFF	20
> > > +#define RV_I_IMM_SIGN_OFF	12
> > > +#define RV_I_IMM_11_0_OFF	0
> > > +#define RV_I_IMM_11_0_MASK	GENMASK(11, 0)
> > >  
> > >  /* The bit field of immediate value in J-type instruction */
> > > -#define J_IMM_SIGN_OPOFF	31
> > > -#define J_IMM_10_1_OPOFF	21
> > > -#define J_IMM_11_OPOFF		20
> > > -#define J_IMM_19_12_OPOFF	12
> > > -#define J_IMM_SIGN_OFF		20
> > > -#define J_IMM_10_1_OFF		1
> > > -#define J_IMM_11_OFF		11
> > > -#define J_IMM_19_12_OFF		12
> > > -#define J_IMM_10_1_MASK		GENMASK(9, 0)
> > > -#define J_IMM_11_MASK		GENMASK(0, 0)
> > > -#define J_IMM_19_12_MASK	GENMASK(7, 0)
> > > +#define RV_J_IMM_SIGN_OPOFF	31
> > > +#define RV_J_IMM_10_1_OPOFF	21
> > > +#define RV_J_IMM_11_OPOFF	20
> > > +#define RV_J_IMM_19_12_OPOFF	12
> > > +#define RV_J_IMM_SIGN_OFF	20
> > > +#define RV_J_IMM_10_1_OFF	1
> > > +#define RV_J_IMM_11_OFF		11
> > > +#define RV_J_IMM_19_12_OFF	12
> > > +#define RV_J_IMM_10_1_MASK	GENMASK(9, 0)
> > > +#define RV_J_IMM_11_MASK	GENMASK(0, 0)
> > > +#define RV_J_IMM_19_12_MASK	GENMASK(7, 0)
> > 
> > Looking at the macros below which shift down, mask, and then shift
> > back up, and how these macros are used in the later alternative patching
> > patches, both in this series and Jisheng's, makes me think it'd be better
> > to define them already shifted, e.g.
> > 
> > #define RV_J_IMM_10_1_MASK	GENMASK(30, 21)
> > #define RV_J_IMM_11_MASK	GENMASK(20, 20)
> > #define RV_J_IMM_19_12_MASK 	GENMASK(19, 12)
> 
> Eh, never mind, I momentarily lost track of _OFF vs. _OPOFF. These
> macro names are hard to read/track. Maybe what we really need are
> extract() (which we already have, but as macros) and deposit().
> And just put the shifts and masks directly in the extract and deposit
> functions. Taking J-type for example,
> 
> static inline s32 rv_insn_imm_sign(u32 x)
> {
> 	return -((x >> 31) & 1);
> }
> 
> static inline s32 rv_insn_extract_j_imm(u32 insn)
> {
> 	return (rv_insn_imm_sign(insn & BIT(31)) << 20)	|
> 	       (insn & GENMASK(19, 12))			|
> 	       ((insn >> 9) & BIT(11))			|
> 	       ((insn >> 20) & GENMASK(10, 1));
> }
> 
> static inline void rv_insn_deposit_j_imm(u32 *insn, s32 imm)
> {
> 	*insn &= GENMASK(11, 7) | GENMASK(6, 0);
> 
> 	*insn |= ((imm & BIT(20)) << 11)	|
> 		 ((imm & GENMASK(10, 1)) << 20)	|
> 		 ((imm & BIT(11)) << 9)		|
> 		 (imm & GENMASK(19, 12));
> }
> 
> A pair of static inlines like those for each type would be easier to
> read and use, IMHO.

BTW, it looks like arch/riscv/kernel/module.c might be another place
we could apply deposit functions.

Thanks,
drew

> 
> Thanks,
> drew
> 
> 
> > 
> > Thanks,
> > drew
> > 
> > >  
> > >  /* The bit field of immediate value in B-type instruction */
> > > -#define B_IMM_SIGN_OPOFF	31
> > > -#define B_IMM_10_5_OPOFF	25
> > > -#define B_IMM_4_1_OPOFF		8
> > > -#define B_IMM_11_OPOFF		7
> > > -#define B_IMM_SIGN_OFF		12
> > > -#define B_IMM_10_5_OFF		5
> > > -#define B_IMM_4_1_OFF		1
> > > -#define B_IMM_11_OFF		11
> > > -#define B_IMM_10_5_MASK		GENMASK(5, 0)
> > > -#define B_IMM_4_1_MASK		GENMASK(3, 0)
> > > -#define B_IMM_11_MASK		GENMASK(0, 0)
> > > +#define RV_B_IMM_SIGN_OPOFF	31
> > > +#define RV_B_IMM_10_5_OPOFF	25
> > > +#define RV_B_IMM_4_1_OPOFF	8
> > > +#define RV_B_IMM_11_OPOFF	7
> > > +#define RV_B_IMM_SIGN_OFF	12
> > > +#define RV_B_IMM_10_5_OFF	5
> > > +#define RV_B_IMM_4_1_OFF	1
> > > +#define RV_B_IMM_11_OFF		11
> > > +#define RV_B_IMM_10_5_MASK	GENMASK(5, 0)
> > > +#define RV_B_IMM_4_1_MASK	GENMASK(3, 0)
> > > +#define RV_B_IMM_11_MASK	GENMASK(0, 0)
> > >  
> > >  /* The register offset in RVG instruction */
> > >  #define RVG_RS1_OPOFF		15
> > > @@ -100,71 +100,71 @@
> > >  #define RVC_C2_RD_OPOFF		7
> > >  
> > >  /* parts of opcode for RVG*/
> > > -#define OPCODE_BRANCH		0x63
> > > -#define OPCODE_JALR		0x67
> > > -#define OPCODE_JAL		0x6f
> > > -#define OPCODE_SYSTEM		0x73
> > > +#define RVG_OPCODE_BRANCH	0x63
> > > +#define RVG_OPCODE_JALR		0x67
> > > +#define RVG_OPCODE_JAL		0x6f
> > > +#define RVG_OPCODE_SYSTEM	0x73
> > >  
> > >  /* parts of opcode for RVC*/
> > > -#define OPCODE_C_0		0x0
> > > -#define OPCODE_C_1		0x1
> > > -#define OPCODE_C_2		0x2
> > > +#define RVC_OPCODE_C0		0x0
> > > +#define RVC_OPCODE_C1		0x1
> > > +#define RVC_OPCODE_C2		0x2
> > >  
> > >  /* parts of funct3 code for I, M, A extension*/
> > > -#define FUNCT3_JALR		0x0
> > > -#define FUNCT3_BEQ		0x0
> > > -#define FUNCT3_BNE		0x1000
> > > -#define FUNCT3_BLT		0x4000
> > > -#define FUNCT3_BGE		0x5000
> > > -#define FUNCT3_BLTU		0x6000
> > > -#define FUNCT3_BGEU		0x7000
> > > +#define RVG_FUNCT3_JALR		0x0
> > > +#define RVG_FUNCT3_BEQ		0x0
> > > +#define RVG_FUNCT3_BNE		0x1000
> > > +#define RVG_FUNCT3_BLT		0x4000
> > > +#define RVG_FUNCT3_BGE		0x5000
> > > +#define RVG_FUNCT3_BLTU		0x6000
> > > +#define RVG_FUNCT3_BGEU		0x7000
> > >  
> > >  /* parts of funct3 code for C extension*/
> > > -#define FUNCT3_C_BEQZ		0xc000
> > > -#define FUNCT3_C_BNEZ		0xe000
> > > -#define FUNCT3_C_J		0xa000
> > > -#define FUNCT3_C_JAL		0x2000
> > > -#define FUNCT4_C_JR		0x8000
> > > -#define FUNCT4_C_JALR		0x9000
> > > -
> > > -#define FUNCT12_SRET		0x10200000
> > > -
> > > -#define MATCH_JALR		(FUNCT3_JALR | OPCODE_JALR)
> > > -#define MATCH_JAL		(OPCODE_JAL)
> > > -#define MATCH_BEQ		(FUNCT3_BEQ | OPCODE_BRANCH)
> > > -#define MATCH_BNE		(FUNCT3_BNE | OPCODE_BRANCH)
> > > -#define MATCH_BLT		(FUNCT3_BLT | OPCODE_BRANCH)
> > > -#define MATCH_BGE		(FUNCT3_BGE | OPCODE_BRANCH)
> > > -#define MATCH_BLTU		(FUNCT3_BLTU | OPCODE_BRANCH)
> > > -#define MATCH_BGEU		(FUNCT3_BGEU | OPCODE_BRANCH)
> > > -#define MATCH_SRET		(FUNCT12_SRET | OPCODE_SYSTEM)
> > > -#define MATCH_C_BEQZ		(FUNCT3_C_BEQZ | OPCODE_C_1)
> > > -#define MATCH_C_BNEZ		(FUNCT3_C_BNEZ | OPCODE_C_1)
> > > -#define MATCH_C_J		(FUNCT3_C_J | OPCODE_C_1)
> > > -#define MATCH_C_JAL		(FUNCT3_C_JAL | OPCODE_C_1)
> > > -#define MATCH_C_JR		(FUNCT4_C_JR | OPCODE_C_2)
> > > -#define MATCH_C_JALR		(FUNCT4_C_JALR | OPCODE_C_2)
> > > -
> > > -#define MASK_JALR		0x707f
> > > -#define MASK_JAL		0x7f
> > > -#define MASK_C_JALR		0xf07f
> > > -#define MASK_C_JR		0xf07f
> > > -#define MASK_C_JAL		0xe003
> > > -#define MASK_C_J		0xe003
> > > -#define MASK_BEQ		0x707f
> > > -#define MASK_BNE		0x707f
> > > -#define MASK_BLT		0x707f
> > > -#define MASK_BGE		0x707f
> > > -#define MASK_BLTU		0x707f
> > > -#define MASK_BGEU		0x707f
> > > -#define MASK_C_BEQZ		0xe003
> > > -#define MASK_C_BNEZ		0xe003
> > > -#define MASK_SRET		0xffffffff
> > > +#define RVC_FUNCT3_C_BEQZ	0xc000
> > > +#define RVC_FUNCT3_C_BNEZ	0xe000
> > > +#define RVC_FUNCT3_C_J		0xa000
> > > +#define RVC_FUNCT3_C_JAL	0x2000
> > > +#define RVC_FUNCT4_C_JR		0x8000
> > > +#define RVC_FUNCT4_C_JALR	0x9000
> > > +
> > > +#define RVG_FUNCT12_SRET	0x10200000
> > > +
> > > +#define RVG_MATCH_JALR		(RVG_FUNCT3_JALR | RVG_OPCODE_JALR)
> > > +#define RVG_MATCH_JAL		(RVG_OPCODE_JAL)
> > > +#define RVG_MATCH_BEQ		(RVG_FUNCT3_BEQ | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BNE		(RVG_FUNCT3_BNE | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BLT		(RVG_FUNCT3_BLT | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BGE		(RVG_FUNCT3_BGE | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BLTU		(RVG_FUNCT3_BLTU | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_BGEU		(RVG_FUNCT3_BGEU | RVG_OPCODE_BRANCH)
> > > +#define RVG_MATCH_SRET		(RVG_FUNCT12_SRET | RVG_OPCODE_SYSTEM)
> > > +#define RVC_MATCH_C_BEQZ	(RVC_FUNCT3_C_BEQZ | RVC_OPCODE_C1)
> > > +#define RVC_MATCH_C_BNEZ	(RVC_FUNCT3_C_BNEZ | RVC_OPCODE_C1)
> > > +#define RVC_MATCH_C_J		(RVC_FUNCT3_C_J | RVC_OPCODE_C1)
> > > +#define RVC_MATCH_C_JAL		(RVC_FUNCT3_C_JAL | RVC_OPCODE_C1)
> > > +#define RVC_MATCH_C_JR		(RVC_FUNCT4_C_JR | RVC_OPCODE_C2)
> > > +#define RVC_MATCH_C_JALR	(RVC_FUNCT4_C_JALR | RVC_OPCODE_C2)
> > > +
> > > +#define RVG_MASK_JALR		0x707f
> > > +#define RVG_MASK_JAL		0x7f
> > > +#define RVC_MASK_C_JALR		0xf07f
> > > +#define RVC_MASK_C_JR		0xf07f
> > > +#define RVC_MASK_C_JAL		0xe003
> > > +#define RVC_MASK_C_J		0xe003
> > > +#define RVG_MASK_BEQ		0x707f
> > > +#define RVG_MASK_BNE		0x707f
> > > +#define RVG_MASK_BLT		0x707f
> > > +#define RVG_MASK_BGE		0x707f
> > > +#define RVG_MASK_BLTU		0x707f
> > > +#define RVG_MASK_BGEU		0x707f
> > > +#define RVC_MASK_C_BEQZ		0xe003
> > > +#define RVC_MASK_C_BNEZ		0xe003
> > > +#define RVG_MASK_SRET		0xffffffff
> > >  
> > >  #define __INSN_LENGTH_MASK	_UL(0x3)
> > >  #define __INSN_LENGTH_GE_32	_UL(0x3)
> > >  #define __INSN_OPCODE_MASK	_UL(0x7F)
> > > -#define __INSN_BRANCH_OPCODE	_UL(OPCODE_BRANCH)
> > > +#define __INSN_BRANCH_OPCODE	_UL(RVG_OPCODE_BRANCH)
> > >  
> > >  /* Define a series of is_XXX_insn functions to check if the value INSN
> > >   * is an instance of instruction XXX.
> > > @@ -180,26 +180,26 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
> > >  #define RV_X(X, s, mask)  (((X) >> (s)) & (mask))
> > >  #define RVC_X(X, s, mask) RV_X(X, s, mask)
> > >  
> > > -#define EXTRACT_JTYPE_IMM(x) \
> > > +#define RV_EXTRACT_JTYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > > -	(RV_X(x_, J_IMM_10_1_OPOFF, J_IMM_10_1_MASK) << J_IMM_10_1_OFF) | \
> > > -	(RV_X(x_, J_IMM_11_OPOFF, J_IMM_11_MASK) << J_IMM_11_OFF) | \
> > > -	(RV_X(x_, J_IMM_19_12_OPOFF, J_IMM_19_12_MASK) << J_IMM_19_12_OFF) | \
> > > -	(RV_IMM_SIGN(x_) << J_IMM_SIGN_OFF); })
> > > +	(RV_X(x_, RV_J_IMM_10_1_OPOFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OFF) | \
> > > +	(RV_X(x_, RV_J_IMM_11_OPOFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OFF) | \
> > > +	(RV_X(x_, RV_J_IMM_19_12_OPOFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OFF) | \
> > > +	(RV_IMM_SIGN(x_) << RV_J_IMM_SIGN_OFF); })
> > >  
> > > -#define EXTRACT_ITYPE_IMM(x) \
> > > +#define RV_EXTRACT_ITYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > > -	(RV_X(x_, I_IMM_11_0_OPOFF, I_IMM_11_0_MASK)) | \
> > > -	(RV_IMM_SIGN(x_) << I_IMM_SIGN_OFF); })
> > > +	(RV_X(x_, RV_I_IMM_11_0_OPOFF, RV_I_IMM_11_0_MASK)) | \
> > > +	(RV_IMM_SIGN(x_) << RV_I_IMM_SIGN_OFF); })
> > >  
> > > -#define EXTRACT_BTYPE_IMM(x) \
> > > +#define RV_EXTRACT_BTYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > > -	(RV_X(x_, B_IMM_4_1_OPOFF, B_IMM_4_1_MASK) << B_IMM_4_1_OFF) | \
> > > -	(RV_X(x_, B_IMM_10_5_OPOFF, B_IMM_10_5_MASK) << B_IMM_10_5_OFF) | \
> > > -	(RV_X(x_, B_IMM_11_OPOFF, B_IMM_11_MASK) << B_IMM_11_OFF) | \
> > > -	(RV_IMM_SIGN(x_) << B_IMM_SIGN_OFF); })
> > > +	(RV_X(x_, RV_B_IMM_4_1_OPOFF, RV_B_IMM_4_1_MASK) << RV_B_IMM_4_1_OFF) | \
> > > +	(RV_X(x_, RV_B_IMM_10_5_OPOFF, RV_B_IMM_10_5_MASK) << RV_B_IMM_10_5_OFF) | \
> > > +	(RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \
> > > +	(RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); })
> > >  
> > > -#define EXTRACT_RVC_J_IMM(x) \
> > > +#define RVC_EXTRACT_JTYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > >  	(RVC_X(x_, RVC_J_IMM_3_1_OPOFF, RVC_J_IMM_3_1_MASK) << RVC_J_IMM_3_1_OFF) | \
> > >  	(RVC_X(x_, RVC_J_IMM_4_OPOFF, RVC_J_IMM_4_MASK) << RVC_J_IMM_4_OFF) | \
> > > @@ -210,7 +210,7 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \
> > >  	(RVC_X(x_, RVC_J_IMM_10_OPOFF, RVC_J_IMM_10_MASK) << RVC_J_IMM_10_OFF) | \
> > >  	(RVC_IMM_SIGN(x_) << RVC_J_IMM_SIGN_OFF); })
> > >  
> > > -#define EXTRACT_RVC_B_IMM(x) \
> > > +#define RVC_EXTRACT_BTYPE_IMM(x) \
> > >  	({typeof(x) x_ = (x); \
> > >  	(RVC_X(x_, RVC_B_IMM_2_1_OPOFF, RVC_B_IMM_2_1_MASK) << RVC_B_IMM_2_1_OFF) | \
> > >  	(RVC_X(x_, RVC_B_IMM_4_3_OPOFF, RVC_B_IMM_4_3_MASK) << RVC_B_IMM_4_3_OFF) | \
> > > diff --git a/arch/riscv/kernel/kgdb.c b/arch/riscv/kernel/kgdb.c
> > > index 963ed7edcff2..137c6d870d58 100644
> > > --- a/arch/riscv/kernel/kgdb.c
> > > +++ b/arch/riscv/kernel/kgdb.c
> > > @@ -29,20 +29,20 @@ DECLARE_INSN(c_jal, MATCH_C_JAL, MASK_C_JAL)
> > >  #else
> > >  #define is_c_jal_insn(opcode) 0
> > >  #endif
> > > -DECLARE_INSN(jalr, MATCH_JALR, MASK_JALR)
> > > -DECLARE_INSN(jal, MATCH_JAL, MASK_JAL)
> > > -DECLARE_INSN(c_jr, MATCH_C_JR, MASK_C_JR)
> > > -DECLARE_INSN(c_jalr, MATCH_C_JALR, MASK_C_JALR)
> > > -DECLARE_INSN(c_j, MATCH_C_J, MASK_C_J)
> > > -DECLARE_INSN(beq, MATCH_BEQ, MASK_BEQ)
> > > -DECLARE_INSN(bne, MATCH_BNE, MASK_BNE)
> > > -DECLARE_INSN(blt, MATCH_BLT, MASK_BLT)
> > > -DECLARE_INSN(bge, MATCH_BGE, MASK_BGE)
> > > -DECLARE_INSN(bltu, MATCH_BLTU, MASK_BLTU)
> > > -DECLARE_INSN(bgeu, MATCH_BGEU, MASK_BGEU)
> > > -DECLARE_INSN(c_beqz, MATCH_C_BEQZ, MASK_C_BEQZ)
> > > -DECLARE_INSN(c_bnez, MATCH_C_BNEZ, MASK_C_BNEZ)
> > > -DECLARE_INSN(sret, MATCH_SRET, MASK_SRET)
> > > +DECLARE_INSN(jalr, RVG_MATCH_JALR, RVG_MASK_JALR)
> > > +DECLARE_INSN(jal, RVG_MATCH_JAL, RVG_MASK_JAL)
> > > +DECLARE_INSN(c_jr, RVC_MATCH_C_JR, RVC_MASK_C_JR)
> > > +DECLARE_INSN(c_jalr, RVC_MATCH_C_JALR, RVC_MASK_C_JALR)
> > > +DECLARE_INSN(c_j, RVC_MATCH_C_J, RVC_MASK_C_J)
> > > +DECLARE_INSN(beq, RVG_MATCH_BEQ, RVG_MASK_BEQ)
> > > +DECLARE_INSN(bne, RVG_MATCH_BNE, RVG_MASK_BNE)
> > > +DECLARE_INSN(blt, RVG_MATCH_BLT, RVG_MASK_BLT)
> > > +DECLARE_INSN(bge, RVG_MATCH_BGE, RVG_MASK_BGE)
> > > +DECLARE_INSN(bltu, RVG_MATCH_BLTU, RVG_MASK_BLTU)
> > > +DECLARE_INSN(bgeu, RVG_MATCH_BGEU, RVG_MASK_BGEU)
> > > +DECLARE_INSN(c_beqz, RVC_MATCH_C_BEQZ, RVC_MASK_C_BEQZ)
> > > +DECLARE_INSN(c_bnez, RVC_MATCH_C_BNEZ, RVC_MASK_C_BNEZ)
> > > +DECLARE_INSN(sret, RVG_MATCH_SRET, RVG_MASK_SRET)
> > >  
> > >  static int decode_register_index(unsigned long opcode, int offset)
> > >  {
> > > @@ -69,19 +69,19 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> > >  			rs1_num = decode_register_index(op_code, RVC_C2_RS1_OPOFF);
> > >  			*next_addr = regs_ptr[rs1_num];
> > >  		} else if (is_c_j_insn(op_code) || is_c_jal_insn(op_code)) {
> > > -			*next_addr = EXTRACT_RVC_J_IMM(op_code) + pc;
> > > +			*next_addr = RVC_EXTRACT_JTYPE_IMM(op_code) + pc;
> > >  		} else if (is_c_beqz_insn(op_code)) {
> > >  			rs1_num = decode_register_index_short(op_code,
> > >  							      RVC_C1_RS1_OPOFF);
> > >  			if (!rs1_num || regs_ptr[rs1_num] == 0)
> > > -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> > > +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
> > >  			else
> > >  				*next_addr = pc + 2;
> > >  		} else if (is_c_bnez_insn(op_code)) {
> > >  			rs1_num =
> > >  			    decode_register_index_short(op_code, RVC_C1_RS1_OPOFF);
> > >  			if (rs1_num && regs_ptr[rs1_num] != 0)
> > > -				*next_addr = EXTRACT_RVC_B_IMM(op_code) + pc;
> > > +				*next_addr = RVC_EXTRACT_BTYPE_IMM(op_code) + pc;
> > >  			else
> > >  				*next_addr = pc + 2;
> > >  		} else {
> > > @@ -90,7 +90,7 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> > >  	} else {
> > >  		if ((op_code & __INSN_OPCODE_MASK) == __INSN_BRANCH_OPCODE) {
> > >  			bool result = false;
> > > -			long imm = EXTRACT_BTYPE_IMM(op_code);
> > > +			long imm = RV_EXTRACT_BTYPE_IMM(op_code);
> > >  			unsigned long rs1_val = 0, rs2_val = 0;
> > >  
> > >  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> > > @@ -121,12 +121,12 @@ static int get_step_address(struct pt_regs *regs, unsigned long *next_addr)
> > >  			else
> > >  				*next_addr = pc + 4;
> > >  		} else if (is_jal_insn(op_code)) {
> > > -			*next_addr = EXTRACT_JTYPE_IMM(op_code) + pc;
> > > +			*next_addr = RV_EXTRACT_JTYPE_IMM(op_code) + pc;
> > >  		} else if (is_jalr_insn(op_code)) {
> > >  			rs1_num = decode_register_index(op_code, RVG_RS1_OPOFF);
> > >  			if (rs1_num)
> > >  				*next_addr = ((unsigned long *)regs)[rs1_num];
> > > -			*next_addr += EXTRACT_ITYPE_IMM(op_code);
> > > +			*next_addr += RV_EXTRACT_ITYPE_IMM(op_code);
> > >  		} else if (is_sret_insn(op_code)) {
> > >  			*next_addr = pc;
> > >  		} else {
> > > -- 
> > > 2.35.1
> > > 


* Re: [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives
  2022-12-01 19:33     ` Andrew Jones
@ 2022-12-06 15:56       ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-06 15:56 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Heiko Stuebner, Conor Dooley

On Thu, Dec 01, 2022 at 08:33:53PM +0100, Andrew Jones wrote:
> On Wed, Nov 30, 2022 at 11:56:11PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > Alternatives live in a different section, so addresses used by call
> > functions will point to wrong locations after the patch got applied.
> > 
> > Similar to arm64, adjust the location to consider that offset.
> > 
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/include/asm/alternative.h |  3 ++
> >  arch/riscv/kernel/alternative.c      | 72 ++++++++++++++++++++++++++++
> >  arch/riscv/kernel/cpufeature.c       | 11 ++++-
> >  3 files changed, 84 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> > index 6511dd73e812..c58ec3cc4bc3 100644
> > --- a/arch/riscv/include/asm/alternative.h
> > +++ b/arch/riscv/include/asm/alternative.h
> > @@ -27,6 +27,9 @@ void __init apply_boot_alternatives(void);
> >  void __init apply_early_boot_alternatives(void);
> >  void apply_module_alternatives(void *start, size_t length);
> >  
> > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > +				      int patch_offset);
> > +
> >  struct alt_entry {
> >  	void *old_ptr;		 /* address of original instruction or data  */
> >  	void *alt_ptr;		 /* address of replacement instruction or data */
> > diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> > index a7d26a00beea..292cc42dc3be 100644
> > --- a/arch/riscv/kernel/alternative.c
> > +++ b/arch/riscv/kernel/alternative.c
> > @@ -15,6 +15,8 @@
> >  #include <asm/vendorid_list.h>
> >  #include <asm/sbi.h>
> >  #include <asm/csr.h>
> > +#include <asm/insn.h>
> > +#include <asm/patch.h>
> >  
> >  struct cpu_manufacturer_info_t {
> >  	unsigned long vendor_id;
> > @@ -53,6 +55,76 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
> >  	}
> >  }
> >  
> > +static unsigned int riscv_instruction_at(void *p, unsigned int offset)
> 
> How about explicitly returning a u32?
> 
> > +{
> > +	u16 *parcel = p + (offset * sizeof(u32));
> 
> nit: I realize this is just a helper function, but I think a cleaner
> interface would require the caller do this math, or at least the offset
> scaling, since only the caller knows it's necessary. And, the call to
> patch_text_nosync() requires all the math, so it'd be consistent for
> riscv_instruction_at() to only take a pointer too.
> 
> > +
> > +	return (unsigned int)parcel[0] | (unsigned int)parcel[1] << 16;
> > +}
> > +
> > +static inline bool riscv_insn_is_auipc_jalr(u32 insn1, u32 insn2)
> > +{
> > +	return riscv_insn_is_auipc(insn1) && riscv_insn_is_jalr(insn2);
> > +}
> > +
> > +#define JALR_SIGN_MASK		BIT(RV_I_IMM_SIGN_OPOFF - RV_I_IMM_11_0_OPOFF)
> 
> We know I-type IMM is 11 bits, so we could just define this as BIT(11).
> 
> > +#define AUIPC_PAD		(0x00001000)
> > +
> > +#define to_jalr_imm(value)						\
> > +	((value & RV_I_IMM_11_0_MASK) << RV_I_IMM_11_0_OPOFF)
> 
> Should put () around the macro argument, (value)
> 
> > +
> > +#define to_auipc_imm(value)						\
> > +	((value & JALR_SIGN_MASK) ?					\
> > +	((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :	\
> > +	(value & RV_U_IMM_31_12_MASK))
> 
> I know RV_U_IMM_31_12_OPOFF is 0, but it looks odd not shifting
> RV_U_IMM_31_12_MASK when we do shift RV_I_IMM_11_0_MASK.
> 
> So, it looks like to_auipc_imm() is doing
> 
>    offset[31:12] + ((value & BIT(11)) ? (1 << 12) : 0)
> 
> but the spec says the auipc part of the 'call' pseudoinstruction should be
> 
>    offset[31:12] + offset[11]
>  
> which I think would be written as
> 
>  ((((value) & RV_U_IMM_31_12_MASK) << RV_U_IMM_31_12_OPOFF) + ((value) & BIT(11)))
> 
> or what am I missing?

I figured out what I was missing. The way you have it makes sense to me
now and my way was wrong. But we could simplify it like this

#define to_auipc_imm(value) \
 (((value) + BIT(11)) & RV_U_IMM_31_12_MASK)
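[Editor's sketch: the identity behind that simplification can be checked standalone. Splitting an offset into the auipc upper immediate and the sign-extended jalr lower immediate must recompose to the original offset; RV_U_IMM_31_12_MASK is assumed to be GENMASK(31, 12) as in parse_asm.h, and the helper names mirror the macros under discussion without being kernel API.]

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)              (1L << (n))
#define RV_U_IMM_31_12_MASK 0xfffff000L	/* assumed GENMASK(31, 12) */

/* upper 20 bits for auipc; adding BIT(11) compensates for the jalr
 * immediate being sign-extended when its bit 11 is set */
static int32_t to_auipc_imm(int32_t value)
{
	return (int32_t)((value + BIT(11)) & RV_U_IMM_31_12_MASK);
}

/* low 12 bits, sign-extended exactly as jalr will interpret them */
static int32_t to_jalr_imm(int32_t value)
{
	return (int32_t)((uint32_t)value << 20) >> 20;
}
```

[For any in-range offset, auipc part plus jalr part gives back the offset, including the tricky cases where bit 11 is set.]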

> 
> > +
> > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > +				      int patch_offset)
> > +{
> > +	int num_instr = len / sizeof(u32);
> > +	unsigned int call[2];
> > +	int i;
> > +	int imm;
> > +	u32 rd1;
> > +
> > +	/*
> > +	 * stop one instruction before the end, as we're checking
> > +	 * for auipc + jalr
> > +	 */
> > +	for (i = 0; i < num_instr - 1; i++) {
> 
> If we change riscv_instruction_at() to just take a pointer then we can do
> the math in the for() and actually just use pointer arithmetic.
> 
>         uint32_t *p = alt_ptr;
>         for (i = 0; i < num_instr - 1; i++, p++) {
> 
> > +		u32 inst1 = riscv_instruction_at(alt_ptr, i);
>                                                  p
> > +		u32 inst2 = riscv_instruction_at(alt_ptr, i + 1);
>                                                  p + 1
> > +
> > +		if (!riscv_insn_is_auipc_jalr(inst1, inst2))
> > +			continue;
> > +
> > +		/* call will use ra register */
> > +		rd1 = RV_EXTRACT_RD_REG(inst1);
> > +		if (rd1 != 1)
> > +			continue;
> 
> nit: rd1 is only used once, how about
> 
>  if (RV_EXTRACT_RD_REG(inst1) != 1)
> 
> > +
> > +		/* get and adjust new target address */
> > +		imm = RV_EXTRACT_UTYPE_IMM(inst1);
> 
> Based on my understanding of a auipc part of the 'call', it seems we
> should be subtracting BIT(11) here. And, since RV_EXTRACT_* does sign-
> extension for I-type, then I'm not sure we should use it. So,
> 
>         imm = (inst2 >> RV_I_IMM_11_0_OPOFF) & RV_I_IMM_11_0_MASK;
> 	imm += ((inst1 >> RV_U_IMM_31_12_OPOFF) & RV_U_IMM_31_12_MASK) - (imm & BIT(11));

I still think we need a subtraction here, but it should be ' - (imm >> 11)'

Thanks,
drew

> 
> > +		imm += RV_EXTRACT_ITYPE_IMM(inst2);
> > +		imm -= patch_offset;
> > +
> > +		/* pick the original auipc + jalr */
> > +		call[0] = inst1;
> > +		call[1] = inst2;
> > +
> > +		/* drop the old IMMs */
> > +		call[0] &= ~(RV_U_IMM_31_12_MASK);
> 
> Same comment as above about RV_U_IMM_31_12_OPOFF. IMO, this would be more
> consistent with the shift, even though it's zero.
> 
>                 call[0] &= ~(RV_U_IMM_31_12_MASK << RV_U_IMM_31_12_OPOFF);
> 
> > +		call[1] &= ~(RV_I_IMM_11_0_MASK << RV_I_IMM_11_0_OPOFF);
> > +
> > +		/* add the adapted IMMs */
> > +		call[0] |= to_auipc_imm(imm);
> 
> As pointed out above, I'm not sure about to_auipc_imm().
> 
> > +		call[1] |= to_jalr_imm(imm);
> > +
> > +		/* patch the call place again */
> > +		patch_text_nosync(alt_ptr + i * sizeof(u32), call, 8);
>                                   ^                                ^
> 				  p		       sizeof(u32) * 2
> > +	}
> > +}
> > +
> >  /*
> >   * This is called very early in the boot process (directly after we run
> >   * a feature detect on the boot CPU). No need to worry about other CPUs
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index 694267d1fe81..ba62a4ff5ccd 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -316,8 +316,15 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> >  		}
> >  
> >  		tmp = (1U << alt->errata_id);
> > -		if (cpu_req_feature & tmp)
> > -			patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> > +		if (cpu_req_feature & tmp) {
> > +			/* do the basic patching */
> > +			patch_text_nosync(alt->old_ptr, alt->alt_ptr,
> > +					  alt->alt_len);
> 
> nit: I'd leave this line long and only have one wrap in the line below
> 
> > +
> > +			riscv_alternative_fix_auipc_jalr(alt->old_ptr,
> > +							 alt->alt_len,
> > +							 alt->old_ptr - alt->alt_ptr);
> > +		}
> >  	}
> >  }
> >  #endif
> > -- 
> > 2.35.1
> >
> 
> Thanks,
> drew

> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index 694267d1fe81..ba62a4ff5ccd 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -316,8 +316,15 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> >  		}
> >  
> >  		tmp = (1U << alt->errata_id);
> > -		if (cpu_req_feature & tmp)
> > -			patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> > +		if (cpu_req_feature & tmp) {
> > +			/* do the basic patching */
> > +			patch_text_nosync(alt->old_ptr, alt->alt_ptr,
> > +					  alt->alt_len);
> 
> nit: I'd leave this line long and only have one wrap in the line below
> 
> > +
> > +			riscv_alternative_fix_auipc_jalr(alt->old_ptr,
> > +							 alt->alt_len,
> > +							 alt->old_ptr - alt->alt_ptr);
> > +		}
> >  	}
> >  }
> >  #endif
> > -- 
> > 2.35.1
> >
> 
> Thanks,
> drew

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives
  2022-12-01 19:33     ` Andrew Jones
@ 2022-12-06 22:10       ` Heiko Stübner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stübner @ 2022-12-06 22:10 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Conor Dooley

Am Donnerstag, 1. Dezember 2022, 20:33:53 CET schrieb Andrew Jones:
> On Wed, Nov 30, 2022 at 11:56:11PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > Alternatives live in a different section, so addresses used by call
> > functions will point to wrong locations after the patch got applied.
> > 
> > Similar to arm64, adjust the location to consider that offset.
> > 
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/include/asm/alternative.h |  3 ++
> >  arch/riscv/kernel/alternative.c      | 72 ++++++++++++++++++++++++++++
> >  arch/riscv/kernel/cpufeature.c       | 11 ++++-
> >  3 files changed, 84 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> > index 6511dd73e812..c58ec3cc4bc3 100644
> > --- a/arch/riscv/include/asm/alternative.h
> > +++ b/arch/riscv/include/asm/alternative.h
> > @@ -27,6 +27,9 @@ void __init apply_boot_alternatives(void);
> >  void __init apply_early_boot_alternatives(void);
> >  void apply_module_alternatives(void *start, size_t length);
> >  
> > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > +				      int patch_offset);
> > +
> >  struct alt_entry {
> >  	void *old_ptr;		 /* address of original instruction or data  */
> >  	void *alt_ptr;		 /* address of replacement instruction or data */
> > diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> > index a7d26a00beea..292cc42dc3be 100644
> > --- a/arch/riscv/kernel/alternative.c
> > +++ b/arch/riscv/kernel/alternative.c
> > @@ -15,6 +15,8 @@
> >  #include <asm/vendorid_list.h>
> >  #include <asm/sbi.h>
> >  #include <asm/csr.h>
> > +#include <asm/insn.h>
> > +#include <asm/patch.h>
> >  
> >  struct cpu_manufacturer_info_t {
> >  	unsigned long vendor_id;
> > @@ -53,6 +55,76 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
> >  	}
> >  }
> >  
> > +static unsigned int riscv_instruction_at(void *p, unsigned int offset)
> 
> How about explicitly returning a u32?

ok

> > +{
> > +	u16 *parcel = p + (offset * sizeof(u32));
> 
> nit: I realize this is just a helper function, but I think a cleaner
> interface would require the caller do this math, or at least the offset
> scaling, since only the caller knows it's necessary. And, the call to
> patch_text_nosync() requires all the math, so it'd be consistent for
> riscv_instruction_at() to only take a pointer too.

ok

> 
> > +
> > +	return (unsigned int)parcel[0] | (unsigned int)parcel[1] << 16;
> > +}
> > +
> > +static inline bool riscv_insn_is_auipc_jalr(u32 insn1, u32 insn2)
> > +{
> > +	return riscv_insn_is_auipc(insn1) && riscv_insn_is_jalr(insn2);
> > +}
> > +
> > +#define JALR_SIGN_MASK		BIT(RV_I_IMM_SIGN_OPOFF - RV_I_IMM_11_0_OPOFF)
> 
> We know I-type IMM is 11 bits, so we could just define this as BIT(11).
> 
> > +#define AUIPC_PAD		(0x00001000)
> > +
> > +#define to_jalr_imm(value)						\
> > +	((value & RV_I_IMM_11_0_MASK) << RV_I_IMM_11_0_OPOFF)
> 
> Should put () around the macro argument, (value)
> 
> > +
> > +#define to_auipc_imm(value)						\
> > +	((value & JALR_SIGN_MASK) ?					\
> > +	((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :	\
> > +	(value & RV_U_IMM_31_12_MASK))
> 
> I know RV_U_IMM_31_12_OPOFF is 0, but it looks odd not shifting
> RV_U_IMM_31_12_MASK when we do shift RV_I_IMM_11_0_MASK.
> 
> So, it looks like to_auipc_imm() is doing
> 
>    offset[31:12] + ((value & BIT(11)) ? (1 << 12) : 0)
> 
> but the spec says the auipc part of the 'call' pseudoinstruction should be

Can you point me to that part of the spec?

For educational purposes: in the main riscv-spec I only found the
auipc + jalr descriptions themselves, but nothing about the actual
"call" pseudoinstruction.

[and I'm probably just looking at the wrong document]


>    offset[31:12] + offset[11]
>  
> which I think would be written as
> 
>  ((((value) & RV_U_IMM_31_12_MASK) << RV_U_IMM_31_12_OPOFF) + ((value) & BIT(11)))
> 
> or what am I missing?

That whole thing came from the ftrace parts, which also do call-fixup
voodoo, and I can't really say that I understand every nook and cranny
of it.

But for practical purposes, the addresses generated now work, and they
also seem to work for the ftrace counterpart (see include/asm/ftrace.h).

[another place that will profit from more generalization :-) ]


> > +
> > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > +				      int patch_offset)
> > +{
> > +	int num_instr = len / sizeof(u32);
> > +	unsigned int call[2];
> > +	int i;
> > +	int imm;
> > +	u32 rd1;
> > +
> > +	/*
> > +	 * stop one instruction before the end, as we're checking
> > +	 * for auipc + jalr
> > +	 */
> > +	for (i = 0; i < num_instr - 1; i++) {
> 
> If we change riscv_instruction_at() to just take a pointer then we can do
> the math in the for() and actually just use pointer arithmetic.
> 
>         uint32_t *p = alt_ptr;
>         for (i = 0; i < num_instr - 1; i++, p++) {

Wasn't avoiding u32 pointers the whole point of going with the accessor?


> > +		u32 inst1 = riscv_instruction_at(alt_ptr, i);
>                                                  p
> > +		u32 inst2 = riscv_instruction_at(alt_ptr, i + 1);
>                                                  p + 1
> > +
> > +		if (!riscv_insn_is_auipc_jalr(inst1, inst2))
> > +			continue;
> > +
> > +		/* call will use ra register */
> > +		rd1 = RV_EXTRACT_RD_REG(inst1);
> > +		if (rd1 != 1)
> > +			continue;
> 
> nit: rd1 is only used once, how about
> 
>  if (RV_EXTRACT_RD_REG(inst1) != 1)

ok


Need to look at the rest tomorrow
Heiko


> > +
> > +		/* get and adjust new target address */
> > +		imm = RV_EXTRACT_UTYPE_IMM(inst1);
> 
> Based on my understanding of a auipc part of the 'call', it seems we
> should be subtracting BIT(11) here. And, since RV_EXTRACT_* does sign-
> extension for I-type, then I'm not sure we should use it. So,
> 
>         imm = (inst2 >> RV_I_IMM_11_0_OPOFF) & RV_I_IMM_11_0_MASK;
> 	imm += ((inst1 >> RV_U_IMM_31_12_OPOFF) & RV_U_IMM_31_12_MASK) - (imm & BIT(11));
> 
> > +		imm += RV_EXTRACT_ITYPE_IMM(inst2);
> > +		imm -= patch_offset;
> > +
> > +		/* pick the original auipc + jalr */
> > +		call[0] = inst1;
> > +		call[1] = inst2;
> > +
> > +		/* drop the old IMMs */
> > +		call[0] &= ~(RV_U_IMM_31_12_MASK);
> 
> Same comment as above about RV_U_IMM_31_12_OPOFF. IMO, this would be more
> consistent with the shift, even though it's zero.
> 
>                 call[0] &= ~(RV_U_IMM_31_12_MASK << RV_U_IMM_31_12_OPOFF);
> 
> > +		call[1] &= ~(RV_I_IMM_11_0_MASK << RV_I_IMM_11_0_OPOFF);
> > +
> > +		/* add the adapted IMMs */
> > +		call[0] |= to_auipc_imm(imm);
> 
> As pointed out above, I'm not sure about to_auipc_imm().
> 
> > +		call[1] |= to_jalr_imm(imm);
> > +
> > +		/* patch the call place again */
> > +		patch_text_nosync(alt_ptr + i * sizeof(u32), call, 8);
>                                   ^                                ^
> 				  p		       sizeof(u32) * 2
> > +	}
> > +}
> > +
> >  /*
> >   * This is called very early in the boot process (directly after we run
> >   * a feature detect on the boot CPU). No need to worry about other CPUs
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index 694267d1fe81..ba62a4ff5ccd 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -316,8 +316,15 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> >  		}
> >  
> >  		tmp = (1U << alt->errata_id);
> > -		if (cpu_req_feature & tmp)
> > -			patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> > +		if (cpu_req_feature & tmp) {
> > +			/* do the basic patching */
> > +			patch_text_nosync(alt->old_ptr, alt->alt_ptr,
> > +					  alt->alt_len);
> 
> nit: I'd leave this line long and only have one wrap in the line below
> 
> > +
> > +			riscv_alternative_fix_auipc_jalr(alt->old_ptr,
> > +							 alt->alt_len,
> > +							 alt->old_ptr - alt->alt_ptr);
> > +		}
> >  	}
> >  }
> >  #endif
> >
> 
> Thanks,
> drew
> 







* Re: [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives
  2022-12-06 22:10       ` Heiko Stübner
@ 2022-12-07  9:35         ` Heiko Stübner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stübner @ 2022-12-07  9:35 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Conor Dooley

Am Dienstag, 6. Dezember 2022, 23:10:01 CET schrieb Heiko Stübner:
> Am Donnerstag, 1. Dezember 2022, 20:33:53 CET schrieb Andrew Jones:
> > On Wed, Nov 30, 2022 at 11:56:11PM +0100, Heiko Stuebner wrote:
> > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > 
> > > Alternatives live in a different section, so addresses used by call
> > > functions will point to wrong locations after the patch got applied.
> > > 
> > > Similar to arm64, adjust the location to consider that offset.
> > > 
> > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > ---
> > >  arch/riscv/include/asm/alternative.h |  3 ++
> > >  arch/riscv/kernel/alternative.c      | 72 ++++++++++++++++++++++++++++
> > >  arch/riscv/kernel/cpufeature.c       | 11 ++++-
> > >  3 files changed, 84 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> > > index 6511dd73e812..c58ec3cc4bc3 100644
> > > --- a/arch/riscv/include/asm/alternative.h
> > > +++ b/arch/riscv/include/asm/alternative.h
> > > @@ -27,6 +27,9 @@ void __init apply_boot_alternatives(void);
> > >  void __init apply_early_boot_alternatives(void);
> > >  void apply_module_alternatives(void *start, size_t length);
> > >  
> > > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > > +				      int patch_offset);
> > > +
> > >  struct alt_entry {
> > >  	void *old_ptr;		 /* address of original instruction or data  */
> > >  	void *alt_ptr;		 /* address of replacement instruction or data */
> > > diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> > > index a7d26a00beea..292cc42dc3be 100644
> > > --- a/arch/riscv/kernel/alternative.c
> > > +++ b/arch/riscv/kernel/alternative.c
> > > @@ -15,6 +15,8 @@
> > >  #include <asm/vendorid_list.h>
> > >  #include <asm/sbi.h>
> > >  #include <asm/csr.h>
> > > +#include <asm/insn.h>
> > > +#include <asm/patch.h>
> > >  
> > >  struct cpu_manufacturer_info_t {
> > >  	unsigned long vendor_id;
> > > @@ -53,6 +55,76 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
> > >  	}
> > >  }
> > >  
> > > +static unsigned int riscv_instruction_at(void *p, unsigned int offset)
> > 
> > How about explicitly returning a u32?
> 
> ok
> 
> > > +{
> > > +	u16 *parcel = p + (offset * sizeof(u32));
> > 
> > nit: I realize this is just a helper function, but I think a cleaner
> > interface would require the caller do this math, or at least the offset
> > scaling, since only the caller knows it's necessary. And, the call to
> > patch_text_nosync() requires all the math, so it'd be consistent for
> > riscv_instruction_at() to only take a pointer too.
> 
> ok
> 
> > 
> > > +
> > > +	return (unsigned int)parcel[0] | (unsigned int)parcel[1] << 16;
> > > +}
> > > +
> > > +static inline bool riscv_insn_is_auipc_jalr(u32 insn1, u32 insn2)
> > > +{
> > > +	return riscv_insn_is_auipc(insn1) && riscv_insn_is_jalr(insn2);
> > > +}
> > > +
> > > +#define JALR_SIGN_MASK		BIT(RV_I_IMM_SIGN_OPOFF - RV_I_IMM_11_0_OPOFF)
> > 
> > We know I-type IMM is 11 bits, so we could just define this as BIT(11).
> > 
> > > +#define AUIPC_PAD		(0x00001000)
> > > +
> > > +#define to_jalr_imm(value)						\
> > > +	((value & RV_I_IMM_11_0_MASK) << RV_I_IMM_11_0_OPOFF)
> > 
> > Should put () around the macro argument, (value)
> > 
> > > +
> > > +#define to_auipc_imm(value)						\
> > > +	((value & JALR_SIGN_MASK) ?					\
> > > +	((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :	\
> > > +	(value & RV_U_IMM_31_12_MASK))
> > 
> > I know RV_U_IMM_31_12_OPOFF is 0, but it looks odd not shifting
> > RV_U_IMM_31_12_MASK when we do shift RV_I_IMM_11_0_MASK.
> > 
> > So, it looks like to_auipc_imm() is doing
> > 
> >    offset[31:12] + ((value & BIT(11)) ? (1 << 12) : 0)
> > 
> > but the spec says the auipc part of the 'call' pseudoinstruction should be
> 
> can you point me to that part of the spec?
> 
> For educational purposes, because in the main riscv-spec I only found
> the main auipc + jalr descriptions, but nothing about the actual
> "call" pseudoinstruction.
> 
> [and I'm probably just looking at the wrong document]
> 
> 
> >    offset[31:12] + offset[11]
> >  
> > which I think would be written as
> > 
> >  ((((value) & RV_U_IMM_31_12_MASK) << RV_U_IMM_31_12_OPOFF) + ((value) & BIT(11)))
> > 
> > or what am I missing?

So far I've found the riscv-asm-manual [0], which only states for call
	auipc x1, offset[31:12]
	jalr x1, x1, offset[11:0]

and the psABI spec [1]; neither mentions your "offset[31:12] + offset[11]".

But [1] does contain that tiny sentence
"The successive instruction has a signed 12-bit immediate so the value
 of the preceding high 20-bit relocation may have 1 added to it."


I.e. the lower 12 bits themselves become a signed number in [-2048, 2047].

Take an example:
- the address is 1862116 (0b 111000110 100111100100)
- address[31:12] becomes 1859584 (0b 111000110 000000000000)
- address[11:0] is 2532 as part of the full number
- but as a lone 12-bit signed IMM it becomes 2532 - 4096 = -1564
- so you need to add 4096 to the auipc IMM to compensate


Heiko


[0] https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md
[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/v1.0

> 
> that whole thing came from the ftrace parts, also doing call fixup voodoo
> And I can't really say that I understand every nook and cranny of it.
> 
> But for practical purposes, the addresses generated now work,
> and also seem to work for the ftrace counterpart (see include/asm/ftrace.h)
> 
> [another place that will profit from more generalization :-) ]
> 
> 
> > > +
> > > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > > +				      int patch_offset)
> > > +{
> > > +	int num_instr = len / sizeof(u32);
> > > +	unsigned int call[2];
> > > +	int i;
> > > +	int imm;
> > > +	u32 rd1;
> > > +
> > > +	/*
> > > +	 * stop one instruction before the end, as we're checking
> > > +	 * for auipc + jalr
> > > +	 */
> > > +	for (i = 0; i < num_instr - 1; i++) {
> > 
> > If we change riscv_instruction_at() to just take a pointer then we can do
> > the math in the for() and actually just use pointer arithmetic.
> > 
> >         uint32_t *p = alt_ptr;
> >         for (i = 0; i < num_instr - 1; i++, p++) {
> 
> Wasn't not using uint32 pointers the whole point of going with the accessor?
> 
> 
> > > +		u32 inst1 = riscv_instruction_at(alt_ptr, i);
> >                                                  p
> > > +		u32 inst2 = riscv_instruction_at(alt_ptr, i + 1);
> >                                                  p + 1
> > > +
> > > +		if (!riscv_insn_is_auipc_jalr(inst1, inst2))
> > > +			continue;
> > > +
> > > +		/* call will use ra register */
> > > +		rd1 = RV_EXTRACT_RD_REG(inst1);
> > > +		if (rd1 != 1)
> > > +			continue;
> > 
> > nit: rd1 is only used once, how about
> > 
> >  if (RV_EXTRACT_RD_REG(inst1) != 1)
> 
> ok
> 
> 
> Need to look at the rest tomorrow
> Heiko
> 
> 
> > > +
> > > +		/* get and adjust new target address */
> > > +		imm = RV_EXTRACT_UTYPE_IMM(inst1);
> > 
> > Based on my understanding of a auipc part of the 'call', it seems we
> > should be subtracting BIT(11) here. And, since RV_EXTRACT_* does sign-
> > extension for I-type, then I'm not sure we should use it. So,
> > 
> >         imm = (inst2 >> RV_I_IMM_11_0_OPOFF) & RV_I_IMM_11_0_MASK;
> > 	imm += ((inst1 >> RV_U_IMM_31_12_OPOFF) & RV_U_IMM_31_12_MASK) - (imm & BIT(11));
> > 
> > > +		imm += RV_EXTRACT_ITYPE_IMM(inst2);
> > > +		imm -= patch_offset;
> > > +
> > > +		/* pick the original auipc + jalr */
> > > +		call[0] = inst1;
> > > +		call[1] = inst2;
> > > +
> > > +		/* drop the old IMMs */
> > > +		call[0] &= ~(RV_U_IMM_31_12_MASK);
> > 
> > Same comment as above about RV_U_IMM_31_12_OPOFF. IMO, this would be more
> > consistent with the shift, even though it's zero.
> > 
> >                 call[0] &= ~(RV_U_IMM_31_12_MASK << RV_U_IMM_31_12_OPOFF);
> > 
> > > +		call[1] &= ~(RV_I_IMM_11_0_MASK << RV_I_IMM_11_0_OPOFF);
> > > +
> > > +		/* add the adapted IMMs */
> > > +		call[0] |= to_auipc_imm(imm);
> > 
> > As pointed out above, I'm not sure about to_auipc_imm().
> > 
> > > +		call[1] |= to_jalr_imm(imm);
> > > +
> > > +		/* patch the call place again */
> > > +		patch_text_nosync(alt_ptr + i * sizeof(u32), call, 8);
> >                                   ^                                ^
> > 				  p		       sizeof(u32) * 2
> > > +	}
> > > +}
> > > +
> > >  /*
> > >   * This is called very early in the boot process (directly after we run
> > >   * a feature detect on the boot CPU). No need to worry about other CPUs
> > > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > > index 694267d1fe81..ba62a4ff5ccd 100644
> > > --- a/arch/riscv/kernel/cpufeature.c
> > > +++ b/arch/riscv/kernel/cpufeature.c
> > > @@ -316,8 +316,15 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> > >  		}
> > >  
> > >  		tmp = (1U << alt->errata_id);
> > > -		if (cpu_req_feature & tmp)
> > > -			patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> > > +		if (cpu_req_feature & tmp) {
> > > +			/* do the basic patching */
> > > +			patch_text_nosync(alt->old_ptr, alt->alt_ptr,
> > > +					  alt->alt_len);
> > 
> > nit: I'd leave this line long and only have one wrap in the line below
> > 
> > > +
> > > +			riscv_alternative_fix_auipc_jalr(alt->old_ptr,
> > > +							 alt->alt_len,
> > > +							 alt->old_ptr - alt->alt_ptr);
> > > +		}
> > >  	}
> > >  }
> > >  #endif
> > >
> > 
> > Thanks,
> > drew
> > 
> 
> 






> > > +}
> > > +
> > > +static inline bool riscv_insn_is_auipc_jalr(u32 insn1, u32 insn2)
> > > +{
> > > +	return riscv_insn_is_auipc(insn1) && riscv_insn_is_jalr(insn2);
> > > +}
> > > +
> > > +#define JALR_SIGN_MASK		BIT(RV_I_IMM_SIGN_OPOFF - RV_I_IMM_11_0_OPOFF)
> > 
> > We know I-type IMM is 11 bits, so we could just define this as BIT(11).
> > 
> > > +#define AUIPC_PAD		(0x00001000)
> > > +
> > > +#define to_jalr_imm(value)						\
> > > +	((value & RV_I_IMM_11_0_MASK) << RV_I_IMM_11_0_OPOFF)
> > 
> > Should put () around the macro argument, (value)
> > 
> > > +
> > > +#define to_auipc_imm(value)						\
> > > +	((value & JALR_SIGN_MASK) ?					\
> > > +	((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :	\
> > > +	(value & RV_U_IMM_31_12_MASK))
> > 
> > I know RV_U_IMM_31_12_OPOFF is 0, but it looks odd not shifting
> > RV_U_IMM_31_12_MASK when we do shift RV_I_IMM_11_0_MASK.
> > 
> > So, it looks like to_auipc_imm() is doing
> > 
> >    offset[31:12] + ((value & BIT(11)) ? (1 << 12) : 0)
> > 
> > but the spec says the auipc part of the 'call' pseudoinstruction should be
> 
> can you point me to that part of the spec?
> 
> For educational purposes, because in the main riscv-spec I only found
> the main auipc + jalr descriptions, but nothing about the actual
> "call" pseudoinstruction.
> 
> [and I'm probably just looking at the wrong document]
> 
> 
> >    offset[31:12] + offset[11]
> >  
> > which I think would be written as
> > 
> >  ((((value) & RV_U_IMM_31_12_MASK) << RV_U_IMM_31_12_OPOFF) + ((value) & BIT(11)))
> > 
> > or what am I missing?

So far I've found the riscv-asm-manual [0], which only states for call
	auipc x1, offset[31:12]
	jalr x1, x1, offset[11:0]

and the psABI spec [1], and neither mentions your "offset[31:12] + offset[11]"?

But [1] does contain that tiny sentence
"The successive instruction has a signed 12-bit immediate so the value
 of the preceding high 20-bit relocation may have 1 added to it."


I.e. the lower 12 bits themselves become a signed number [-2048:2047]

Take an example:
- the address is 1862116 ( 0b 111000110 100111100100 )
- address[31:12] becomes 1859584 (as 0b 111000110 000000000000)
- while address[11:0] is 2532 as part of the bigger number
- as a lone 12-bit signed IMM it becomes -1564
- so you need to add 4096 to the auipc IMM to compensate


Heiko


[0] https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md
[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/v1.0

> 
> that whole thing came from the ftrace parts, also doing call fixup voodoo
> And I can't really say that I understand every nook and cranny of it.
> 
> But for practical purposes, the addresses generated now work,
> and also seem to work for the ftrace counterpart (see include/asm/ftrace.h)
> 
> [another place that will profit from more generalization :-) ]
> 
> 
> > > +
> > > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > > +				      int patch_offset)
> > > +{
> > > +	int num_instr = len / sizeof(u32);
> > > +	unsigned int call[2];
> > > +	int i;
> > > +	int imm;
> > > +	u32 rd1;
> > > +
> > > +	/*
> > > +	 * stop one instruction before the end, as we're checking
> > > +	 * for auipc + jalr
> > > +	 */
> > > +	for (i = 0; i < num_instr - 1; i++) {
> > 
> > If we change riscv_instruction_at() to just take a pointer then we can do
> > the math in the for() and actually just use pointer arithmetic.
> > 
> >         uint32_t *p = alt_ptr;
> >         for (i = 0; i < num_instr - 1; i++, p++) {
> 
> Wasn't not using uint32 pointers the whole point of going with the accessor?
> 
> 
> > > +		u32 inst1 = riscv_instruction_at(alt_ptr, i);
> >                                                  p
> > > +		u32 inst2 = riscv_instruction_at(alt_ptr, i + 1);
> >                                                  p + 1
> > > +
> > > +		if (!riscv_insn_is_auipc_jalr(inst1, inst2))
> > > +			continue;
> > > +
> > > +		/* call will use ra register */
> > > +		rd1 = RV_EXTRACT_RD_REG(inst1);
> > > +		if (rd1 != 1)
> > > +			continue;
> > 
> > nit: rd1 is only used once, how about
> > 
> >  if (RV_EXTRACT_RD_REG(inst1) != 1)
> 
> ok
> 
> 
> Need to look at the rest tomorrow
> Heiko
> 
> 
> > > +
> > > +		/* get and adjust new target address */
> > > +		imm = RV_EXTRACT_UTYPE_IMM(inst1);
> > 
> > Based on my understanding of a auipc part of the 'call', it seems we
> > should be subtracting BIT(11) here. And, since RV_EXTRACT_* does sign-
> > extension for I-type, then I'm not sure we should use it. So,
> > 
> >         imm = (inst2 >> RV_I_IMM_11_0_OPOFF) & RV_I_IMM_11_0_MASK;
> > 	imm += ((inst1 >> RV_U_IMM_31_12_OPOFF) & RV_U_IMM_31_12_MASK) - (imm & BIT(11));
> > 
> > > +		imm += RV_EXTRACT_ITYPE_IMM(inst2);
> > > +		imm -= patch_offset;
> > > +
> > > +		/* pick the original auipc + jalr */
> > > +		call[0] = inst1;
> > > +		call[1] = inst2;
> > > +
> > > +		/* drop the old IMMs */
> > > +		call[0] &= ~(RV_U_IMM_31_12_MASK);
> > 
> > Same comment as above about RV_U_IMM_31_12_OPOFF. IMO, this would be more
> > consistent with the shift, even though it's zero.
> > 
> >                 call[0] &= ~(RV_U_IMM_31_12_MASK << RV_U_IMM_31_12_OPOFF);
> > 
> > > +		call[1] &= ~(RV_I_IMM_11_0_MASK << RV_I_IMM_11_0_OPOFF);
> > > +
> > > +		/* add the adapted IMMs */
> > > +		call[0] |= to_auipc_imm(imm);
> > 
> > As pointed out above, I'm not sure about to_auipc_imm().
> > 
> > > +		call[1] |= to_jalr_imm(imm);
> > > +
> > > +		/* patch the call place again */
> > > +		patch_text_nosync(alt_ptr + i * sizeof(u32), call, 8);
> >                                   ^                                ^
> > 				  p		       sizeof(u32) * 2
> > > +	}
> > > +}
> > > +
> > >  /*
> > >   * This is called very early in the boot process (directly after we run
> > >   * a feature detect on the boot CPU). No need to worry about other CPUs
> > > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > > index 694267d1fe81..ba62a4ff5ccd 100644
> > > --- a/arch/riscv/kernel/cpufeature.c
> > > +++ b/arch/riscv/kernel/cpufeature.c
> > > @@ -316,8 +316,15 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> > >  		}
> > >  
> > >  		tmp = (1U << alt->errata_id);
> > > -		if (cpu_req_feature & tmp)
> > > -			patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> > > +		if (cpu_req_feature & tmp) {
> > > +			/* do the basic patching */
> > > +			patch_text_nosync(alt->old_ptr, alt->alt_ptr,
> > > +					  alt->alt_len);
> > 
> > nit: I'd leave this line long and only have one wrap in the line below
> > 
> > > +
> > > +			riscv_alternative_fix_auipc_jalr(alt->old_ptr,
> > > +							 alt->alt_len,
> > > +							 alt->old_ptr - alt->alt_ptr);
> > > +		}
> > >  	}
> > >  }
> > >  #endif
> > >
> > 
> > Thanks,
> > drew
> > 
> 
> 





_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives
  2022-12-07  9:35         ` Heiko Stübner
@ 2022-12-07 10:44           ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2022-12-07 10:44 UTC (permalink / raw)
  To: Heiko Stübner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, ardb, linux-efi,
	Conor Dooley

On Wed, Dec 07, 2022 at 10:35:50AM +0100, Heiko Stübner wrote:
> Am Dienstag, 6. Dezember 2022, 23:10:01 CET schrieb Heiko Stübner:
> > Am Donnerstag, 1. Dezember 2022, 20:33:53 CET schrieb Andrew Jones:
> > > On Wed, Nov 30, 2022 at 11:56:11PM +0100, Heiko Stuebner wrote:
> > > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
...
> > > > +#define to_auipc_imm(value)						\
> > > > +	((value & JALR_SIGN_MASK) ?					\
> > > > +	((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :	\
> > > > +	(value & RV_U_IMM_31_12_MASK))
> > > 
> > > I know RV_U_IMM_31_12_OPOFF is 0, but it looks odd not shifting
> > > RV_U_IMM_31_12_MASK when we do shift RV_I_IMM_11_0_MASK.
> > > 
> > > So, it looks like to_auipc_imm() is doing
> > > 
> > >    offset[31:12] + ((value & BIT(11)) ? (1 << 12) : 0)
> > > 
> > > but the spec says the auipc part of the 'call' pseudoinstruction should be
> > 
> > can you point me to that part of the spec?
> > 
> > For educational purposes, because in the main riscv-spec I only found
> > the main auipc + jalr descriptions, but nothing about the actual
> > "call" pseudoinstruction.
> > 
> > [and I'm probably just looking at the wrong document]
> > 
> > 
> > >    offset[31:12] + offset[11]
> > >  
> > > which I think would be written as
> > > 
> > >  ((((value) & RV_U_IMM_31_12_MASK) << RV_U_IMM_31_12_OPOFF) + ((value) & BIT(11)))
> > > 
> > > or what am I missing?
> 
> So far I've found the riscv-asm-manual [0], which only states for call
> 	auipc x1, offset[31:12]
> 	jalr x1, x1, offset[11:0]
> 
> and the psABI spec [1], neither mention your "offset[31:12] + offset[11]" ?
> 
> But [1] does contain that tiny sentence
> "The successive instruction has a signed 12-bit immediate so the value
>  of the preceding high 20-bit relocation may have 1 added to it."
> 
> 
> I.e. the lower 12 bits become themself a signed number [-2048:2047]
> 
> Take an example:
> - address is 1862116 ( 0b 111000110 100111100100 )
> - address[31:12] becomes 1859584 (as 0b 111000110 000000000000)
> - while address[11:0] is 2532 as part of the bigger number
> - as lone 12bit signed IMM it becomes -1564
> - so you need to add this 4096 to the auipc IMM to compensate
> 
> 
> Heiko
> 
> 
> [0] https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md
> [1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/v1.0

Yeah, I got this figured out yesterday and sent another mail confirming
the way you had it was right. Did you receive that mail?

Where I see 

 call offset     auipc x1, offset[31 : 12] + offset[11]
                 jalr x1, offset[11:0](x1)

is in chapter 25 of the unprivileged ISA.

> 
> > 
> > that whole thing came from the ftrace parts, also doing call fixup voodoo
> > And I can't really say that I understand every nook and cranny of it.
> > 
> > But for practical purposes, the addresses generated now work,
> > and also seem to work for the ftrace counterpart (see include/asm/ftrace.h)
> > 
> > [another place that will profit from more generalization :-) ]
> > 
> > 
> > > > +
> > > > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > > > +				      int patch_offset)
> > > > +{
> > > > +	int num_instr = len / sizeof(u32);
> > > > +	unsigned int call[2];
> > > > +	int i;
> > > > +	int imm;
> > > > +	u32 rd1;
> > > > +
> > > > +	/*
> > > > +	 * stop one instruction before the end, as we're checking
> > > > +	 * for auipc + jalr
> > > > +	 */
> > > > +	for (i = 0; i < num_instr - 1; i++) {
> > > 
> > > If we change riscv_instruction_at() to just take a pointer then we can do
> > > the math in the for() and actually just use pointer arithmetic.
> > > 
> > >         uint32_t *p = alt_ptr;
> > >         for (i = 0; i < num_instr - 1; i++, p++) {
> > 
> > Wasn't not using uint32 pointers the whole point of going with the accessor?

Oh, right. So maybe stick to the offsetting, but still do the math in the
caller.

Thanks,
drew

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives
  2022-12-07 10:44           ` Andrew Jones
@ 2022-12-07 11:19             ` Emil Renner Berthing
  -1 siblings, 0 replies; 84+ messages in thread
From: Emil Renner Berthing @ 2022-12-07 11:19 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Heiko Stübner, linux-riscv, palmer, christoph.muellner,
	prabhakar.csengg, conor, philipp.tomsich, ardb, linux-efi,
	Conor Dooley

On Wed, 7 Dec 2022 at 11:44, Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Wed, Dec 07, 2022 at 10:35:50AM +0100, Heiko Stübner wrote:
> > Am Dienstag, 6. Dezember 2022, 23:10:01 CET schrieb Heiko Stübner:
> > > Am Donnerstag, 1. Dezember 2022, 20:33:53 CET schrieb Andrew Jones:
> > > > On Wed, Nov 30, 2022 at 11:56:11PM +0100, Heiko Stuebner wrote:
> > > > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ...
> > > > > +#define to_auipc_imm(value)                                            \
> > > > > +       ((value & JALR_SIGN_MASK) ?                                     \
> > > > > +       ((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :   \
> > > > > +       (value & RV_U_IMM_31_12_MASK))
> > > >
> > > > I know RV_U_IMM_31_12_OPOFF is 0, but it looks odd not shifting
> > > > RV_U_IMM_31_12_MASK when we do shift RV_I_IMM_11_0_MASK.
> > > >
> > > > So, it looks like to_auipc_imm() is doing
> > > >
> > > >    offset[31:12] + ((value & BIT(11)) ? (1 << 12) : 0)
> > > >
> > > > but the spec says the auipc part of the 'call' pseudoinstruction should be
> > >
> > > can you point me to that part of the spec?
> > >
> > > For educational purposes, because in the main riscv-spec I only found
> > > the main auipc + jalr descriptions, but nothing about the actual
> > > "call" pseudoinstruction.
> > >
> > > [and I'm probably just looking at the wrong document]
> > >
> > >
> > > >    offset[31:12] + offset[11]
> > > >
> > > > which I think would be written as
> > > >
> > > >  ((((value) & RV_U_IMM_31_12_MASK) << RV_U_IMM_31_12_OPOFF) + ((value) & BIT(11)))
> > > >
> > > > or what am I missing?
> >
> > So far I've found the riscv-asm-manual [0], which only states for call
> >       auipc x1, offset[31:12]
> >       jalr x1, x1, offset[11:0]
> >
> > and the psABI spec [1], neither mention your "offset[31:12] + offset[11]" ?
> >
> > But [1] does contain that tiny sentence
> > "The successive instruction has a signed 12-bit immediate so the value
> >  of the preceding high 20-bit relocation may have 1 added to it."
> >
> >
> > I.e. the lower 12 bits become themself a signed number [-2048:2047]
> >
> > Take an example:
> > - address is 1862116 ( 0b 111000110 100111100100 )
> > - address[31:12] becomes 1859584 (as 0b 111000110 000000000000)
> > - while address[11:0] is 2532 as part of the bigger number
> > - as lone 12bit signed IMM it becomes -1564
> > - so you need to add this 4096 to the auipc IMM to compensate
> >
> >
> > Heiko
> >
> >
> > [0] https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md
> > [1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/v1.0
>
> Yeah, I got this figured out yesterday and sent another mail confirming
> the way you had it was right. Did you receive that mail?
>
> Where I see
>
>  call offset     auipc x1, offset[31 : 12] + offset[11]
>                  jalr x1, offset[11:0](x1)
>
> is in chapter 25 of the unprivileged ISA.

Sorry for being late to the conversation. We've fixed confusion about
this auipc/jalr offset at least once before. Maybe the explanation
there is helpful to someone:
https://git.kernel.org/torvalds/c/0966d385830de3470b7131db8e86c0c5bc9c52dc

> >
> > >
> > > that whole thing came from the ftrace parts, also doing call fixup voodoo
> > > And I can't really say that I understand every nook and cranny of it.
> > >
> > > But for practical purposes, the addresses generated now work,
> > > and also seem to work for the ftrace counterpart (see include/asm/ftrace.h)
> > >
> > > [another place that will profit from more generalization :-) ]
> > >
> > >
> > > > > +
> > > > > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > > > > +                                     int patch_offset)
> > > > > +{
> > > > > +       int num_instr = len / sizeof(u32);
> > > > > +       unsigned int call[2];
> > > > > +       int i;
> > > > > +       int imm;
> > > > > +       u32 rd1;
> > > > > +
> > > > > +       /*
> > > > > +        * stop one instruction before the end, as we're checking
> > > > > +        * for auipc + jalr
> > > > > +        */
> > > > > +       for (i = 0; i < num_instr - 1; i++) {
> > > >
> > > > If we change riscv_instruction_at() to just take a pointer then we can do
> > > > the math in the for() and actually just use pointer arithmetic.
> > > >
> > > >         uint32_t *p = alt_ptr;
> > > >         for (i = 0; i < num_instr - 1; i++, p++) {
> > >
> > > Wasn't not using uint32 pointers the whole point of going with the accessor?
>
> Oh, right. So maybe stick to the offsetting, but still do the math in the
> caller.
>
> Thanks,
> drew


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives
  2022-11-30 22:56   ` Heiko Stuebner
@ 2022-12-07 11:29     ` Emil Renner Berthing
  -1 siblings, 0 replies; 84+ messages in thread
From: Emil Renner Berthing @ 2022-12-07 11:29 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, ajones, ardb, linux-efi, Heiko Stuebner,
	Conor Dooley

On Wed, 30 Nov 2022 at 23:57, Heiko Stuebner <heiko@sntech.de> wrote:
>
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> Alternatives live in a different section, so addresses used by call
> functions will point to wrong locations after the patch got applied.
>
> Similar to arm64, adjust the location to consider that offset.
>
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/include/asm/alternative.h |  3 ++
>  arch/riscv/kernel/alternative.c      | 72 ++++++++++++++++++++++++++++
>  arch/riscv/kernel/cpufeature.c       | 11 ++++-
>  3 files changed, 84 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> index 6511dd73e812..c58ec3cc4bc3 100644
> --- a/arch/riscv/include/asm/alternative.h
> +++ b/arch/riscv/include/asm/alternative.h
> @@ -27,6 +27,9 @@ void __init apply_boot_alternatives(void);
>  void __init apply_early_boot_alternatives(void);
>  void apply_module_alternatives(void *start, size_t length);
>
> +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> +                                     int patch_offset);
> +
>  struct alt_entry {
>         void *old_ptr;           /* address of original instruciton or data  */
>         void *alt_ptr;           /* address of replacement instruction or data */
> diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> index a7d26a00beea..292cc42dc3be 100644
> --- a/arch/riscv/kernel/alternative.c
> +++ b/arch/riscv/kernel/alternative.c
> @@ -15,6 +15,8 @@
>  #include <asm/vendorid_list.h>
>  #include <asm/sbi.h>
>  #include <asm/csr.h>
> +#include <asm/insn.h>
> +#include <asm/patch.h>
>
>  struct cpu_manufacturer_info_t {
>         unsigned long vendor_id;
> @@ -53,6 +55,76 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
>         }
>  }
>
> +static unsigned int riscv_instruction_at(void *p, unsigned int offset)
> +{
> +       u16 *parcel = p + (offset * sizeof(u32));
> +
> +       return (unsigned int)parcel[0] | (unsigned int)parcel[1] << 16;
> +}
> +
> +static inline bool riscv_insn_is_auipc_jalr(u32 insn1, u32 insn2)
> +{
> +       return riscv_insn_is_auipc(insn1) && riscv_insn_is_jalr(insn2);
> +}
> +
> +#define JALR_SIGN_MASK         BIT(RV_I_IMM_SIGN_OPOFF - RV_I_IMM_11_0_OPOFF)
> +#define AUIPC_PAD              (0x00001000)
> +
> +#define to_jalr_imm(value)                                             \
> +       ((value & RV_I_IMM_11_0_MASK) << RV_I_IMM_11_0_OPOFF)
> +
> +#define to_auipc_imm(value)                                            \
> +       ((value & JALR_SIGN_MASK) ?                                     \
> +       ((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :   \
> +       (value & RV_U_IMM_31_12_MASK))
> +
> +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> +                                     int patch_offset)
> +{
> +       int num_instr = len / sizeof(u32);
> +       unsigned int call[2];
> +       int i;
> +       int imm;
> +       u32 rd1;
> +
> +       /*
> +        * stop one instruction before the end, as we're checking
> +        * for auipc + jalr
> +        */
> +       for (i = 0; i < num_instr - 1; i++) {
> +               u32 inst1 = riscv_instruction_at(alt_ptr, i);
> +               u32 inst2 = riscv_instruction_at(alt_ptr, i + 1);
> +
> +               if (!riscv_insn_is_auipc_jalr(inst1, inst2))
> +                       continue;
> +
> +               /* call will use ra register */
> +               rd1 = RV_EXTRACT_RD_REG(inst1);
> +               if (rd1 != 1)
> +                       continue;
> +
> +               /* get and adjust new target address */
> +               imm = RV_EXTRACT_UTYPE_IMM(inst1);
> +               imm += RV_EXTRACT_ITYPE_IMM(inst2);
> +               imm -= patch_offset;
> +
> +               /* pick the original auipc + jalr */
> +               call[0] = inst1;
> +               call[1] = inst2;
> +
> +               /* drop the old IMMs */
> +               call[0] &= ~(RV_U_IMM_31_12_MASK);
> +               call[1] &= ~(RV_I_IMM_11_0_MASK << RV_I_IMM_11_0_OPOFF);
> +
> +               /* add the adapted IMMs */
> +               call[0] |= to_auipc_imm(imm);
> +               call[1] |= to_jalr_imm(imm);

It's fine like this, but as explained in
https://git.kernel.org/torvalds/c/0966d385830de3470b7131db8e86c0c5bc9c52dc
we can get rid of the conditional by doing

call[0] |= (imm + 0x800) >> 12;
call[1] |= imm & 0xfff;

> +
> +               /* patch the call place again */
> +               patch_text_nosync(alt_ptr + i * sizeof(u32), call, 8);
> +       }
> +}
> +
>  /*
>   * This is called very early in the boot process (directly after we run
>   * a feature detect on the boot CPU). No need to worry about other CPUs
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 694267d1fe81..ba62a4ff5ccd 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -316,8 +316,15 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
>                 }
>
>                 tmp = (1U << alt->errata_id);
> -               if (cpu_req_feature & tmp)
> -                       patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> +               if (cpu_req_feature & tmp) {
> +                       /* do the basic patching */
> +                       patch_text_nosync(alt->old_ptr, alt->alt_ptr,
> +                                         alt->alt_len);
> +
> +                       riscv_alternative_fix_auipc_jalr(alt->old_ptr,
> +                                                        alt->alt_len,
> +                                                        alt->old_ptr - alt->alt_ptr);
> +               }
>         }
>  }
>  #endif
> --
> 2.35.1
>

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives
  2022-12-07 11:29     ` Emil Renner Berthing
@ 2022-12-07 11:32       ` Emil Renner Berthing
  -1 siblings, 0 replies; 84+ messages in thread
From: Emil Renner Berthing @ 2022-12-07 11:32 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, ajones, ardb, linux-efi, Heiko Stuebner,
	Conor Dooley

On Wed, 7 Dec 2022 at 12:29, Emil Renner Berthing
<emil.renner.berthing@canonical.com> wrote:
>
> On Wed, 30 Nov 2022 at 23:57, Heiko Stuebner <heiko@sntech.de> wrote:
> >
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> >
> > Alternatives live in a different section, so addresses used by call
> > functions will point to wrong locations after the patch got applied.
> >
> > Similar to arm64, adjust the location to consider that offset.
> >
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> >  arch/riscv/include/asm/alternative.h |  3 ++
> >  arch/riscv/kernel/alternative.c      | 72 ++++++++++++++++++++++++++++
> >  arch/riscv/kernel/cpufeature.c       | 11 ++++-
> >  3 files changed, 84 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> > index 6511dd73e812..c58ec3cc4bc3 100644
> > --- a/arch/riscv/include/asm/alternative.h
> > +++ b/arch/riscv/include/asm/alternative.h
> > @@ -27,6 +27,9 @@ void __init apply_boot_alternatives(void);
> >  void __init apply_early_boot_alternatives(void);
> >  void apply_module_alternatives(void *start, size_t length);
> >
> > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > +                                     int patch_offset);
> > +
> >  struct alt_entry {
> >         void *old_ptr;           /* address of original instruciton or data  */
> >         void *alt_ptr;           /* address of replacement instruction or data */
> > diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> > index a7d26a00beea..292cc42dc3be 100644
> > --- a/arch/riscv/kernel/alternative.c
> > +++ b/arch/riscv/kernel/alternative.c
> > @@ -15,6 +15,8 @@
> >  #include <asm/vendorid_list.h>
> >  #include <asm/sbi.h>
> >  #include <asm/csr.h>
> > +#include <asm/insn.h>
> > +#include <asm/patch.h>
> >
> >  struct cpu_manufacturer_info_t {
> >         unsigned long vendor_id;
> > @@ -53,6 +55,76 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
> >         }
> >  }
> >
> > +static unsigned int riscv_instruction_at(void *p, unsigned int offset)
> > +{
> > +       u16 *parcel = p + (offset * sizeof(u32));
> > +
> > +       return (unsigned int)parcel[0] | (unsigned int)parcel[1] << 16;
> > +}
> > +
> > +static inline bool riscv_insn_is_auipc_jalr(u32 insn1, u32 insn2)
> > +{
> > +       return riscv_insn_is_auipc(insn1) && riscv_insn_is_jalr(insn2);
> > +}
> > +
> > +#define JALR_SIGN_MASK         BIT(RV_I_IMM_SIGN_OPOFF - RV_I_IMM_11_0_OPOFF)
> > +#define AUIPC_PAD              (0x00001000)
> > +
> > +#define to_jalr_imm(value)                                             \
> > +       ((value & RV_I_IMM_11_0_MASK) << RV_I_IMM_11_0_OPOFF)
> > +
> > +#define to_auipc_imm(value)                                            \
> > +       ((value & JALR_SIGN_MASK) ?                                     \
> > +       ((value & RV_U_IMM_31_12_MASK) + AUIPC_PAD) :   \
> > +       (value & RV_U_IMM_31_12_MASK))
> > +
> > +void riscv_alternative_fix_auipc_jalr(void *alt_ptr, unsigned int len,
> > +                                     int patch_offset)
> > +{
> > +       int num_instr = len / sizeof(u32);
> > +       unsigned int call[2];
> > +       int i;
> > +       int imm;
> > +       u32 rd1;
> > +
> > +       /*
> > +        * stop one instruction before the end, as we're checking
> > +        * for auipc + jalr
> > +        */
> > +       for (i = 0; i < num_instr - 1; i++) {
> > +               u32 inst1 = riscv_instruction_at(alt_ptr, i);
> > +               u32 inst2 = riscv_instruction_at(alt_ptr, i + 1);
> > +
> > +               if (!riscv_insn_is_auipc_jalr(inst1, inst2))
> > +                       continue;
> > +
> > +               /* call will use ra register */
> > +               rd1 = RV_EXTRACT_RD_REG(inst1);
> > +               if (rd1 != 1)
> > +                       continue;
> > +
> > +               /* get and adjust new target address */
> > +               imm = RV_EXTRACT_UTYPE_IMM(inst1);
> > +               imm += RV_EXTRACT_ITYPE_IMM(inst2);
> > +               imm -= patch_offset;
> > +
> > +               /* pick the original auipc + jalr */
> > +               call[0] = inst1;
> > +               call[1] = inst2;
> > +
> > +               /* drop the old IMMs */
> > +               call[0] &= ~(RV_U_IMM_31_12_MASK);
> > +               call[1] &= ~(RV_I_IMM_11_0_MASK << RV_I_IMM_11_0_OPOFF);
> > +
> > +               /* add the adapted IMMs */
> > +               call[0] |= to_auipc_imm(imm);
> > +               call[1] |= to_jalr_imm(imm);
>
> It's fine like this, but as explained in
> https://git.kernel.org/torvalds/c/0966d385830de3470b7131db8e86c0c5bc9c52dc
> we can get rid of the conditional by doing
>
> call[0] |= (imm + 0x800) >> 12;
> call[1] |= imm & 0xfff;

Ugh, sorry this is of course wrong. It still needs to take into
account how the immediates are encoded in the instruction, but
calculating the immediates can still be done without a conditional.

> > +
> > +               /* patch the call place again */
> > +               patch_text_nosync(alt_ptr + i * sizeof(u32), call, 8);
> > +       }
> > +}
> > +
> >  /*
> >   * This is called very early in the boot process (directly after we run
> >   * a feature detect on the boot CPU). No need to worry about other CPUs
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index 694267d1fe81..ba62a4ff5ccd 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -316,8 +316,15 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> >                 }
> >
> >                 tmp = (1U << alt->errata_id);
> > -               if (cpu_req_feature & tmp)
> > -                       patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> > +               if (cpu_req_feature & tmp) {
> > +                       /* do the basic patching */
> > +                       patch_text_nosync(alt->old_ptr, alt->alt_ptr,
> > +                                         alt->alt_len);
> > +
> > +                       riscv_alternative_fix_auipc_jalr(alt->old_ptr,
> > +                                                        alt->alt_len,
> > +                                                        alt->old_ptr - alt->alt_ptr);
> > +               }
> >         }
> >  }
> >  #endif
> > --
> > 2.35.1
> >

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 12/14] efi/riscv: libstub: mark when compiling libstub
  2022-12-02 16:37           ` Ard Biesheuvel
@ 2023-01-04 15:21             ` Andrew Jones
  -1 siblings, 0 replies; 84+ messages in thread
From: Andrew Jones @ 2023-01-04 15:21 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Heiko Stübner, linux-riscv, palmer, christoph.muellner,
	prabhakar.csengg, conor, philipp.tomsich, emil.renner.berthing,
	linux-efi, Conor Dooley, alexghiti

On Fri, Dec 02, 2022 at 05:37:16PM +0100, Ard Biesheuvel wrote:
> On Thu, 1 Dec 2022 at 23:39, Heiko Stübner <heiko@sntech.de> wrote:
> >
> > Hi Ard,
> >
> > Am Donnerstag, 1. Dezember 2022, 21:57:00 CET schrieb Ard Biesheuvel:
> > > On Thu, 1 Dec 2022 at 20:35, Andrew Jones <ajones@ventanamicro.com> wrote:
> > > >
> > > > On Wed, Nov 30, 2022 at 11:56:12PM +0100, Heiko Stuebner wrote:
> > > > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > > >
> > > > > We may want to runtime-optimize some core functions (str*, mem*),
> > > > > but not have this leak into libstub and cause build issues.
> > > > > Instead libstub, for the short while it's running, should just use
> > > > > the generic implementation.
> > > > >
> > > > > So, to be able to determine whether functions, that are used both in
> > > > > libstub and the main kernel, are getting compiled as part of libstub or
> > > > > not, add a compile-flag we can check via #ifdef.
> > > > >
> > > > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > > > > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > >
> > > I think it would be better to update arch/riscv/kernel/image-vars.h so
> > > that only these generic implementations are exposed to the stub in the
> > > first place.
> >
> 
> Actually, all references to string and memory functions are going away
> from the stub. This is already in -next.
> 
> EFI now has zboot support, which means you can create a EFI bootable
> kernel image that carries the actual kernel in compressed form rather
> than as a hybrid EFI/bare metal image.

While chatting about EFI stub string functions again in the context of [1]
we recalled this comment that states the references should be going away
anyway. I'm just replying here with interested parties on CC to try and
bring it back to the forefront.

[1] https://lore.kernel.org/all/20221216162141.1701255-5-alexghiti@rivosinc.com/

Thanks,
drew

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 12/14] efi/riscv: libstub: mark when compiling libstub
  2023-01-04 15:21             ` Andrew Jones
@ 2023-01-04 15:32               ` Heiko Stübner
  -1 siblings, 0 replies; 84+ messages in thread
From: Heiko Stübner @ 2023-01-04 15:32 UTC (permalink / raw)
  To: Ard Biesheuvel, Andrew Jones
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, linux-efi, Conor Dooley,
	alexghiti

Am Mittwoch, 4. Januar 2023, 16:21:53 CET schrieb Andrew Jones:
> On Fri, Dec 02, 2022 at 05:37:16PM +0100, Ard Biesheuvel wrote:
> > On Thu, 1 Dec 2022 at 23:39, Heiko Stübner <heiko@sntech.de> wrote:
> > >
> > > Hi Ard,
> > >
> > > Am Donnerstag, 1. Dezember 2022, 21:57:00 CET schrieb Ard Biesheuvel:
> > > > On Thu, 1 Dec 2022 at 20:35, Andrew Jones <ajones@ventanamicro.com> wrote:
> > > > >
> > > > > On Wed, Nov 30, 2022 at 11:56:12PM +0100, Heiko Stuebner wrote:
> > > > > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > > > >
> > > > > > We may want to runtime-optimize some core functions (str*, mem*),
> > > > > > but not have this leak into libstub and cause build issues.
> > > > > > Instead libstub, for the short while it's running, should just use
> > > > > > the generic implementation.
> > > > > >
> > > > > > So, to be able to determine whether functions, that are used both in
> > > > > > libstub and the main kernel, are getting compiled as part of libstub or
> > > > > > not, add a compile-flag we can check via #ifdef.
> > > > > >
> > > > > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > > > > > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > >
> > > > I think it would be better to update arch/riscv/kernel/image-vars.h so
> > > > that only these generic implementations are exposed to the stub in the
> > > > first place.
> > >
> > 
> > Actually, all references to string and memory functions are going away
> > from the stub. This is already in -next.
> > 
> > EFI now has zboot support, which means you can create a EFI bootable
> > kernel image that carries the actual kernel in compressed form rather
> > than as a hybrid EFI/bare metal image.
> 
> While chatting about EFI stub string functions again in the context of [1]
> we recalled this comment that states the references should be going away
> anyway. I'm just replying here with interested parties on CC to try and
> bring it back to the forefront.
> 
> [1] https://lore.kernel.org/all/20221216162141.1701255-5-alexghiti@rivosinc.com/

I'm currently following Ard's other suggestion and am using a more
traditional model for str* functions (aka non-inline).

One thing I found in the new EFI parts is that while it provides its own
implementations, it does not provide its own prototypes.

On the real kernel-side all the HAVE_ARCH_* blocks do wrap around
both the generic implementation as well as the prototype in string.h,
and there _are_ other implementations of str* or mem* functions
done as inline on other arches. So that may be something that needs
fixing on the EFI side.


Heiko



^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH v3 12/14] efi/riscv: libstub: mark when compiling libstub
@ 2023-01-04 15:32               ` Heiko Stübner
  0 siblings, 0 replies; 84+ messages in thread
From: Heiko Stübner @ 2023-01-04 15:32 UTC (permalink / raw)
  To: Ard Biesheuvel, Andrew Jones
  Cc: linux-riscv, palmer, christoph.muellner, prabhakar.csengg, conor,
	philipp.tomsich, emil.renner.berthing, linux-efi, Conor Dooley,
	alexghiti

On Wednesday, 4 January 2023 at 16:21:53 CET, Andrew Jones wrote:
> On Fri, Dec 02, 2022 at 05:37:16PM +0100, Ard Biesheuvel wrote:
> > On Thu, 1 Dec 2022 at 23:39, Heiko Stübner <heiko@sntech.de> wrote:
> > >
> > > Hi Ard,
> > >
> > > > On Thursday, 1 December 2022 at 21:57:00 CET, Ard Biesheuvel wrote:
> > > > On Thu, 1 Dec 2022 at 20:35, Andrew Jones <ajones@ventanamicro.com> wrote:
> > > > >
> > > > > On Wed, Nov 30, 2022 at 11:56:12PM +0100, Heiko Stuebner wrote:
> > > > > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > > > >
> > > > > > We may want to runtime-optimize some core functions (str*, mem*),
> > > > > > but not have this leak into libstub and cause build issues.
> > > > > > Instead libstub, for the short while it's running, should just use
> > > > > > the generic implementation.
> > > > > >
> > > > > > So, to be able to determine whether functions that are used in both
> > > > > > libstub and the main kernel are getting compiled as part of libstub
> > > > > > or not, add a compile-flag we can check via #ifdef.
> > > > > >
> > > > > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > > > > > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > >
> > > > I think it would be better to update arch/riscv/kernel/image-vars.h so
> > > > that only these generic implementations are exposed to the stub in the
> > > > first place.
> > >
> > 
> > Actually, all references to string and memory functions are going away
> > from the stub. This is already in -next.
> > 
> > EFI now has zboot support, which means you can create an EFI bootable
> > kernel image that carries the actual kernel in compressed form rather
> > than as a hybrid EFI/bare metal image.
> 
> While chatting about EFI stub string functions again in the context of [1]
> we recalled this comment that states the references should be going away
> anyway. I'm just replying here with interested parties on CC to try and
> bring it back to the forefront.
> 
> [1] https://lore.kernel.org/all/20221216162141.1701255-5-alexghiti@rivosinc.com/

I'm currently following Ard's other suggestion and am using a more
traditional model for str* functions (aka non-inline).

One thing I found in the new EFI parts is that while the stub provides
its own implementations, it does not provide its own prototypes.

On the kernel side, all the HAVE_ARCH_* blocks wrap both the generic
implementation and the prototype in string.h, and there _are_ other
architectures that implement some str* or mem* functions as inlines.
So this may be something that needs fixing on the EFI side.


Heiko



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 84+ messages in thread

end of thread, other threads:[~2023-01-04 15:48 UTC | newest]

Thread overview: 84+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-30 22:56 [PATCH v3 0/14] Zbb string optimizations and call support in alternatives Heiko Stuebner
2022-11-30 22:56 ` Heiko Stuebner
2022-11-30 22:56 ` [PATCH v3 01/14] RISC-V: fix funct4 definition for c.jalr in parse_asm.h Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-12-01 17:00   ` Andrew Jones
2022-12-01 17:00     ` Andrew Jones
2022-12-01 17:05     ` Andrew Jones
2022-12-01 17:05       ` Andrew Jones
2022-12-01 17:07       ` Heiko Stübner
2022-12-01 17:07         ` Heiko Stübner
2022-11-30 22:56 ` [PATCH v3 02/14] RISC-V: add prefix to all constants/macros " Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-12-05 14:00   ` Andrew Jones
2022-12-05 14:00     ` Andrew Jones
2022-12-05 14:53     ` Andrew Jones
2022-12-05 14:53       ` Andrew Jones
2022-12-05 15:30       ` Andrew Jones
2022-12-05 15:30         ` Andrew Jones
2022-11-30 22:56 ` [PATCH v3 03/14] RISC-V: detach funct-values from their offset Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-12-01 17:36   ` Andrew Jones
2022-12-01 17:36     ` Andrew Jones
2022-11-30 22:56 ` [PATCH v3 04/14] RISC-V: add ebreak instructions to definitions Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-11-30 22:56 ` [PATCH v3 05/14] RISC-V: add auipc elements to parse_asm header Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-11-30 22:56 ` [PATCH v3 06/14] RISC-V: Move riscv_insn_is_* macros into a common header Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-11-30 22:56 ` [PATCH v3 07/14] RISC-V: rename parse_asm.h to insn.h Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-11-30 22:56 ` [PATCH v3 08/14] RISC-V: kprobes: use central defined funct3 constants Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-11-30 22:56 ` [PATCH v3 09/14] RISC-V: add U-type imm parsing to insn.h header Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-11-30 22:56 ` [PATCH v3 10/14] RISC-V: add rd reg " Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-11-30 22:56 ` [PATCH v3 11/14] RISC-V: fix auipc-jalr addresses in patched alternatives Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-12-01 19:33   ` Andrew Jones
2022-12-01 19:33     ` Andrew Jones
2022-12-06 15:56     ` Andrew Jones
2022-12-06 15:56       ` Andrew Jones
2022-12-06 22:10     ` Heiko Stübner
2022-12-06 22:10       ` Heiko Stübner
2022-12-07  9:35       ` Heiko Stübner
2022-12-07  9:35         ` Heiko Stübner
2022-12-07 10:44         ` Andrew Jones
2022-12-07 10:44           ` Andrew Jones
2022-12-07 11:19           ` Emil Renner Berthing
2022-12-07 11:19             ` Emil Renner Berthing
2022-12-07 11:29   ` Emil Renner Berthing
2022-12-07 11:29     ` Emil Renner Berthing
2022-12-07 11:32     ` Emil Renner Berthing
2022-12-07 11:32       ` Emil Renner Berthing
2022-11-30 22:56 ` [PATCH v3 12/14] efi/riscv: libstub: mark when compiling libstub Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-12-01 19:34   ` Andrew Jones
2022-12-01 19:34     ` Andrew Jones
2022-12-01 20:57     ` Ard Biesheuvel
2022-12-01 20:57       ` Ard Biesheuvel
2022-12-01 22:39       ` Heiko Stübner
2022-12-01 22:39         ` Heiko Stübner
2022-12-02 16:37         ` Ard Biesheuvel
2022-12-02 16:37           ` Ard Biesheuvel
2023-01-04 15:21           ` Andrew Jones
2023-01-04 15:21             ` Andrew Jones
2023-01-04 15:32             ` Heiko Stübner
2023-01-04 15:32               ` Heiko Stübner
2022-11-30 22:56 ` [PATCH v3 13/14] RISC-V: add infrastructure to allow different str* implementations Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-12-01 20:07   ` Andrew Jones
2022-12-01 20:07     ` Andrew Jones
2022-12-01 20:14     ` Andrew Jones
2022-12-01 20:14       ` Andrew Jones
2022-12-02  4:08   ` Palmer Dabbelt
2022-12-02  4:08     ` Palmer Dabbelt
2022-12-02  8:19     ` Ard Biesheuvel
2022-12-02  8:19       ` Ard Biesheuvel
2022-11-30 22:56 ` [PATCH v3 14/14] RISC-V: add zbb support to string functions Heiko Stuebner
2022-11-30 22:56   ` Heiko Stuebner
2022-12-01  0:02 ` [PATCH v3 0/14] Zbb string optimizations and call support in alternatives Conor Dooley
2022-12-01  0:02   ` Conor Dooley
2022-12-01 11:42   ` Heiko Stübner
2022-12-01 11:42     ` Heiko Stübner
