linux-riscv.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/2] RISC-V: T-Head vector handling
@ 2023-02-28 21:54 Heiko Stuebner
  2023-02-28 21:54 ` [PATCH RFC 1/2] RISC-V: define the elements of the VCSR vector CSR Heiko Stuebner
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Heiko Stuebner @ 2023-02-28 21:54 UTC (permalink / raw)
  To: palmer
  Cc: linux-riscv, samuel, guoren, christoph.muellner, heiko,
	conor.dooley, linux-kernel, Heiko Stuebner

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

As is widely known the T-Head C9xx cores used for example in the
Allwinner D1 implement an older non-ratified variant of the vector spec.

While userspace will probably have a lot more problems implementing
support for both, on the kernel side the needed changes are actually
somewhat small'ish and can be handled via alternatives somewhat nicely.

With this patchset I could run the same userspace program (picked from
some riscv-vector-test repository) that does some vector additions on
both qemu and a d1-nezha board. On both platforms it ran sucessfully and
even produced the same results.


As can be seen in the todo list, there are 2 places where the changed
SR_VS location still needs to be handled in the next revision
(assembly + ALTERNATIVES + constants + probably stringify resulted in
 some grey hair so far already)


ToDo:
- follow along with the base vector patchset
- handle SR_VS access in _save_context and _secondary_start_sbi


Heiko Stuebner (2):
  RISC-V: define the elements of the VCSR vector CSR
  RISC-V: add T-Head vector errata handling

 arch/riscv/Kconfig.erratas           |  13 +++
 arch/riscv/errata/thead/errata.c     |  32 ++++++
 arch/riscv/include/asm/csr.h         |  31 +++++-
 arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
 arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
 5 files changed, 261 insertions(+), 16 deletions(-)

-- 
2.39.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC 1/2] RISC-V: define the elements of the VCSR vector CSR
  2023-02-28 21:54 [PATCH RFC 0/2] RISC-V: T-Head vector handling Heiko Stuebner
@ 2023-02-28 21:54 ` Heiko Stuebner
  2023-03-01  2:22   ` Guo Ren
  2023-03-15 18:31   ` Conor Dooley
  2023-02-28 21:54 ` [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling Heiko Stuebner
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 21+ messages in thread
From: Heiko Stuebner @ 2023-02-28 21:54 UTC (permalink / raw)
  To: palmer
  Cc: linux-riscv, samuel, guoren, christoph.muellner, heiko,
	conor.dooley, linux-kernel, Heiko Stuebner

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

The VCSR CSR contains two elements VXRM[2:1] and VXSAT[0].

Define constants for those to access the elements in a readable way.

Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/include/asm/csr.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index add51662b7c3..8b06f2472915 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -176,6 +176,11 @@
 #define ENVCFG_CBIE_INV			_AC(0x3, UL)
 #define ENVCFG_FIOM			_AC(0x1, UL)
 
+/* VCSR flags */
+#define VCSR_VXRM_MASK			3
+#define VCSR_VXRM_SHIFT			1
+#define VCSR_VXSAT_MASK			1
+
 /* symbolic CSR names: */
 #define CSR_CYCLE		0xc00
 #define CSR_TIME		0xc01
-- 
2.39.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-02-28 21:54 [PATCH RFC 0/2] RISC-V: T-Head vector handling Heiko Stuebner
  2023-02-28 21:54 ` [PATCH RFC 1/2] RISC-V: define the elements of the VCSR vector CSR Heiko Stuebner
@ 2023-02-28 21:54 ` Heiko Stuebner
  2023-03-01  2:12   ` Guo Ren
                     ` (3 more replies)
  2023-03-01  2:21 ` [PATCH RFC 0/2] RISC-V: T-Head vector handling Guo Ren
  2023-03-15  5:29 ` Palmer Dabbelt
  3 siblings, 4 replies; 21+ messages in thread
From: Heiko Stuebner @ 2023-02-28 21:54 UTC (permalink / raw)
  To: palmer
  Cc: linux-riscv, samuel, guoren, christoph.muellner, heiko,
	conor.dooley, linux-kernel, Heiko Stuebner

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

T-Head C9xx cores implement an older version (0.7.1) of the vector
specification.

Relevant changes concerning the kernel are:
- different placement of the SR_VS bit for the vector unit status
- different encoding of the vsetvli instruction
- different instructions for loads and stores

And a fixed VLEN of 128.

The in-kernel access to vector instances is limited to the save and
restore of process states so the above mentioned areas can simply be
handled via the alternatives framework, similar to other T-Head specific
issues.

Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
---
 arch/riscv/Kconfig.erratas           |  13 +++
 arch/riscv/errata/thead/errata.c     |  32 ++++++
 arch/riscv/include/asm/csr.h         |  26 ++++-
 arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
 arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
 5 files changed, 256 insertions(+), 16 deletions(-)

diff --git a/arch/riscv/Kconfig.erratas b/arch/riscv/Kconfig.erratas
index 69621ae6d647..624cefc9fcd7 100644
--- a/arch/riscv/Kconfig.erratas
+++ b/arch/riscv/Kconfig.erratas
@@ -79,4 +79,17 @@ config ERRATA_THEAD_PMU
 
 	  If you don't know what to do here, say "Y".
 
+config ERRATA_THEAD_VECTOR
+	bool "Apply T-Head Vector errata"
+	depends on ERRATA_THEAD && RISCV_ISA_V
+	default y
+	help
+	  The T-Head C9xx cores implement an earlier version 0.7.1
+	  of the vector extensions.
+
+	  This will apply the necessary errata to handle the non-standard
+	  behaviour via when switch to and from vector mode for processes.
+
+	  If you don't know what to do here, say "Y".
+
 endmenu # "CPU errata selection"
diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
index fac5742d1c1e..55b3aaa2468a 100644
--- a/arch/riscv/errata/thead/errata.c
+++ b/arch/riscv/errata/thead/errata.c
@@ -12,6 +12,7 @@
 #include <asm/cacheflush.h>
 #include <asm/errata_list.h>
 #include <asm/patch.h>
+#include <asm/vector.h>
 #include <asm/vendorid_list.h>
 
 static bool errata_probe_pbmt(unsigned int stage,
@@ -63,6 +64,34 @@ static bool errata_probe_pmu(unsigned int stage,
 	return true;
 }
 
+static bool errata_probe_vector(unsigned int stage,
+				unsigned long arch_id, unsigned long impid)
+{
+	if (!IS_ENABLED(CONFIG_ERRATA_THEAD_VECTOR))
+		return false;
+
+	/* target-c9xx cores report arch_id and impid as 0 */
+	if (arch_id != 0 || impid != 0)
+		return false;
+
+	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT) {
+		/*
+		 * Disable VECTOR to detect illegal usage of vector in kernel.
+		 * This is normally done in _start_kernel but with the
+		 * vector-1.0 SR_VS bits. VS is using [24:23] on T-Head's
+		 * vector-0.7.1 and the vector-1.0-bits are unused there.
+		 */
+		csr_clear(CSR_STATUS, SR_VS_THEAD);
+		return false;
+	}
+
+	/* let has_vector() return true and set the static vlen */
+	static_branch_enable(&riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_VECTOR]);
+	riscv_vsize = 128 / 8 * 32;
+
+	return true;
+}
+
 static u32 thead_errata_probe(unsigned int stage,
 			      unsigned long archid, unsigned long impid)
 {
@@ -77,6 +106,9 @@ static u32 thead_errata_probe(unsigned int stage,
 	if (errata_probe_pmu(stage, archid, impid))
 		cpu_req_errata |= BIT(ERRATA_THEAD_PMU);
 
+	if (errata_probe_vector(stage, archid, impid))
+		cpu_req_errata |= BIT(ERRATA_THEAD_VECTOR);
+
 	return cpu_req_errata;
 }
 
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 8b06f2472915..8d16c11487aa 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -24,11 +24,27 @@
 #define SR_FS_CLEAN	_AC(0x00004000, UL)
 #define SR_FS_DIRTY	_AC(0x00006000, UL)
 
-#define SR_VS           _AC(0x00000600, UL) /* Vector Status */
-#define SR_VS_OFF       _AC(0x00000000, UL)
-#define SR_VS_INITIAL   _AC(0x00000200, UL)
-#define SR_VS_CLEAN     _AC(0x00000400, UL)
-#define SR_VS_DIRTY     _AC(0x00000600, UL)
+#define SR_VS_OFF		_AC(0x00000000, UL)
+
+#define SR_VS_1_0		_AC(0x00000600, UL) /* Vector Status */
+#define SR_VS_INITIAL_1_0	_AC(0x00000200, UL)
+#define SR_VS_CLEAN_1_0		_AC(0x00000400, UL)
+#define SR_VS_DIRTY_1_0		_AC(0x00000600, UL)
+
+#define SR_VS_THEAD		_AC(0x01800000, UL) /* Vector Status */
+#define SR_VS_INITIAL_THEAD	_AC(0x00800000, UL)
+#define SR_VS_CLEAN_THEAD	_AC(0x01000000, UL)
+#define SR_VS_DIRTY_THEAD	_AC(0x01800000, UL)
+
+/*
+ * Always default to vector-1.0 handling in assembly and let the broken
+ * implementations handle their case separately.
+ */
+#ifdef __ASSEMBLY__
+#define SR_VS			SR_VS_1_0
+#else
+
+#endif
 
 #define SR_XS		_AC(0x00018000, UL) /* Extension Status */
 #define SR_XS_OFF	_AC(0x00000000, UL)
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 95e626b7281e..3f93cdd1599f 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -19,7 +19,8 @@
 #define	ERRATA_THEAD_PBMT 0
 #define	ERRATA_THEAD_CMO 1
 #define	ERRATA_THEAD_PMU 2
-#define	ERRATA_THEAD_NUMBER 3
+#define	ERRATA_THEAD_VECTOR 3
+#define	ERRATA_THEAD_NUMBER 4
 #endif
 
 #define	CPUFEATURE_SVPBMT 0
@@ -157,6 +158,65 @@ asm volatile(ALTERNATIVE(						\
 	: "=r" (__ovl) :						\
 	: "memory")
 
+#ifdef CONFIG_ERRATA_THEAD_PBMT
+
+#define ALT_VS_SHIFT 61
+#define ALT_THEAD_VS_SHIFT 59
+#define ALT_THEAD_SR_VS(_val, _vs)					\
+asm(ALTERNATIVE(  "li %0, %1\t\nslli %0,%0,%3",				\
+		  "li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID,	\
+			ERRATA_THEAD_PBMT, CONFIG_ERRATA_THEAD_PBMT)	\
+		: "=r"(_val)						\
+		: "I"(prot##_MAIN >> ALT_SVPBMT_SHIFT),		\
+		  "I"(prot##_THEAD >> ALT_THEAD_PBMT_SHIFT),		\
+		  "I"(ALT_SVPBMT_SHIFT),				\
+		  "I"(ALT_THEAD_PBMT_SHIFT))
+#else
+#define ALT_THEAD_SR_VS(_val ## _MAIN)
+#endif
+
+#ifdef CONFIG_ERRATA_THEAD_VECTOR
+
+#define THEAD_C9XX_CSR_VXSAT			0x9
+#define THEAD_C9XX_CSR_VXRM			0xa
+
+/*
+ * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older
+ * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for
+ * vsetvli	t4, x0, e8, m8, d1
+ */
+#define THEAD_VSETVLI_T4X0E8M8D1	".long	0x00307ed7\n\t"
+
+/*
+ * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same
+ * encoding as the standard vse8.v and vle8.v, compilers seem to optimize
+ * the call resulting in a different encoding and then using a value for
+ * the "mop" field that is not part of vector-0.7.1
+ * So encode specific variants for vstate_save and _restore.
+ */
+#define THEAD_VSB_V_V0T0		".long	0x02028027\n\t"
+#define THEAD_VSB_V_V8T0		".long	0x02028427\n\t"
+#define THEAD_VSB_V_V16T0		".long	0x02028827\n\t"
+#define THEAD_VSB_V_V24T0		".long	0x02028c27\n\t"
+#define THEAD_VLB_V_V0T0		".long	0x012028007\n\t"
+#define THEAD_VLB_V_V8T0		".long	0x012028407\n\t"
+#define THEAD_VLB_V_V16T0		".long	0x012028807\n\t"
+#define THEAD_VLB_V_V24T0		".long	0x012028c07\n\t"
+
+#define ALT_SR_VS_VECTOR_1_0_SHIFT	9
+#define ALT_SR_VS_THEAD_SHIFT		23
+
+#define ALT_SR_VS(_val, prot)						\
+asm(ALTERNATIVE("li %0, %1\t\nslli %0,%0,%3",				\
+		"li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID,		\
+		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)	\
+		: "=r"(_val)						\
+		: "I"(prot##_1_0 >> ALT_SR_VS_VECTOR_1_0_SHIFT),	\
+		  "I"(prot##_THEAD >> ALT_SR_VS_THEAD_SHIFT),		\
+		  "I"(ALT_SR_VS_VECTOR_1_0_SHIFT),			\
+		  "I"(ALT_SR_VS_THEAD_SHIFT))
+#endif /* CONFIG_ERRATA_THEAD_VECTOR */
+
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index ad9e6161dd89..ad91f783316e 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -15,6 +15,55 @@
 #include <asm/hwcap.h>
 #include <asm/csr.h>
 #include <asm/asm.h>
+#include <asm/errata_list.h>
+
+#ifdef CONFIG_ERRATA_THEAD_VECTOR
+
+static inline u32 riscv_sr_vs(void)
+{
+	u32 val;
+
+	ALT_SR_VS(val, SR_VS);
+	return val;
+}
+
+static inline u32 riscv_sr_vs_initial(void)
+{
+	u32 val;
+
+	ALT_SR_VS(val, SR_VS_INITIAL);
+	return val;
+}
+
+static inline u32 riscv_sr_vs_clean(void)
+{
+	u32 val;
+
+	ALT_SR_VS(val, SR_VS_CLEAN);
+	return val;
+}
+
+static inline u32 riscv_sr_vs_dirty(void)
+{
+	u32 val;
+
+	ALT_SR_VS(val, SR_VS_DIRTY);
+	return val;
+}
+
+#define SR_VS		riscv_sr_vs()
+#define SR_VS_INITIAL	riscv_sr_vs_initial()
+#define SR_VS_CLEAN	riscv_sr_vs_clean()
+#define SR_VS_DIRTY	riscv_sr_vs_dirty()
+
+#else /* CONFIG_ERRATA_THEAD_VECTOR */
+
+#define SR_VS		SR_VS_1_0
+#define SR_VS_INITIAL	SR_VS_INITIAL_1_0
+#define SR_VS_CLEAN	SR_VS_CLEAN_1_0
+#define SR_VS_DIRTY	SR_VS_DIRTY_1_0
+
+#endif /* CONFIG_ERRATA_THEAD_VECTOR */
 
 #define CSR_STR(x) __ASM_STR(x)
 
@@ -29,6 +78,7 @@ static __always_inline bool has_vector(void)
 static inline void __vstate_clean(struct pt_regs *regs)
 {
 	regs->status = (regs->status & ~(SR_VS)) | SR_VS_CLEAN;
+
 }
 
 static inline void vstate_off(struct pt_regs *regs)
@@ -58,30 +108,75 @@ static __always_inline void rvv_disable(void)
 
 static __always_inline void __vstate_csr_save(struct __riscv_v_state *dest)
 {
-	asm volatile (
+	register u32 t1 asm("t1") = (SR_FS);
+
+	/*
+	 * CSR_VCSR is defined as
+	 * [2:1] - vxrm[1:0]
+	 * [0] - vxsat
+	 * The earlier vector spec implemented by T-Head uses separate
+	 * registers for the same bit-elements, so just combine those
+	 * into the existing output field.
+	 *
+	 * Additionally T-Head cores need FS to be enabled when accessing
+	 * the VXRM and VXSAT CSRs, otherwise ending in illegal instructions.
+	 */
+	asm volatile (ALTERNATIVE(
 		"csrr	%0, " CSR_STR(CSR_VSTART) "\n\t"
 		"csrr	%1, " CSR_STR(CSR_VTYPE) "\n\t"
 		"csrr	%2, " CSR_STR(CSR_VL) "\n\t"
 		"csrr	%3, " CSR_STR(CSR_VCSR) "\n\t"
+		__nops(5),
+		"csrs	sstatus, t1\n\t"
+		"csrr	%0, " CSR_STR(CSR_VSTART) "\n\t"
+		"csrr	%1, " CSR_STR(CSR_VTYPE) "\n\t"
+		"csrr	%2, " CSR_STR(CSR_VL) "\n\t"
+		"csrr	%3, " CSR_STR(THEAD_C9XX_CSR_VXRM) "\n\t"
+		"slliw	%3, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
+		"csrr	t4, " CSR_STR(THEAD_C9XX_CSR_VXSAT) "\n\t"
+		"or	%3, %3, t4\n\t"
+		"csrc	sstatus, t1\n\t",
+		THEAD_VENDOR_ID,
+		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
 		: "=r" (dest->vstart), "=r" (dest->vtype), "=r" (dest->vl),
-		  "=r" (dest->vcsr) : :);
+		  "=r" (dest->vcsr) : "r"(t1) : "t4");
 }
 
 static __always_inline void __vstate_csr_restore(struct __riscv_v_state *src)
 {
-	asm volatile (
+	register u32 t1 asm("t1") = (SR_FS);
+
+	/*
+	 * Similar to __vstate_csr_save above, restore values for the
+	 * separate VXRM and VXSAT CSRs from the vcsr variable.
+	 */
+	asm volatile (ALTERNATIVE(
 		"vsetvl	 x0, %2, %1\n\t"
 		"csrw	" CSR_STR(CSR_VSTART) ", %0\n\t"
 		"csrw	" CSR_STR(CSR_VCSR) ", %3\n\t"
+		__nops(6),
+		"csrs	sstatus, t1\n\t"
+		"vsetvl	 x0, %2, %1\n\t"
+		"csrw	" CSR_STR(CSR_VSTART) ", %0\n\t"
+		"srliw	t4, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
+		"andi	t4, t4, " CSR_STR(VCSR_VXRM_MASK) "\n\t"
+		"csrw	" CSR_STR(THEAD_C9XX_CSR_VXRM) ", t4\n\t"
+		"andi	%3, %3, " CSR_STR(VCSR_VXSAT_MASK) "\n\t"
+		"csrw	" CSR_STR(THEAD_C9XX_CSR_VXSAT) ", %3\n\t"
+		"csrc	sstatus, t1\n\t",
+		THEAD_VENDOR_ID,
+		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
 		: : "r" (src->vstart), "r" (src->vtype), "r" (src->vl),
-		    "r" (src->vcsr) :);
+		    "r" (src->vcsr), "r"(t1): "t4");
 }
 
 static inline void __vstate_save(struct __riscv_v_state *save_to, void *datap)
 {
 	rvv_enable();
 	__vstate_csr_save(save_to);
-	asm volatile (
+
+	asm volatile (ALTERNATIVE(
+		"nop\n\t"
 		"vsetvli	t4, x0, e8, m8, ta, ma\n\t"
 		"vse8.v		v0, (%0)\n\t"
 		"add		%0, %0, t4\n\t"
@@ -89,8 +184,18 @@ static inline void __vstate_save(struct __riscv_v_state *save_to, void *datap)
 		"add		%0, %0, t4\n\t"
 		"vse8.v		v16, (%0)\n\t"
 		"add		%0, %0, t4\n\t"
-		"vse8.v		v24, (%0)\n\t"
-		: : "r" (datap) : "t4", "memory");
+		"vse8.v		v24, (%0)\n\t",
+		"mv		t0, %0\n\t"
+		THEAD_VSETVLI_T4X0E8M8D1
+		THEAD_VSB_V_V0T0
+		"addi		t0, t0, 128\n\t"
+		THEAD_VSB_V_V8T0
+		"addi		t0, t0, 128\n\t"
+		THEAD_VSB_V_V16T0
+		"addi		t0, t0, 128\n\t"
+		THEAD_VSB_V_V24T0, THEAD_VENDOR_ID,
+		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
+		: : "r" (datap) : "t0", "t4", "memory");
 	rvv_disable();
 }
 
@@ -98,7 +203,9 @@ static inline void __vstate_restore(struct __riscv_v_state *restore_from,
 				    void *datap)
 {
 	rvv_enable();
-	asm volatile (
+
+	asm volatile (ALTERNATIVE(
+		"nop\n\t"
 		"vsetvli	t4, x0, e8, m8, ta, ma\n\t"
 		"vle8.v		v0, (%0)\n\t"
 		"add		%0, %0, t4\n\t"
@@ -106,8 +213,20 @@ static inline void __vstate_restore(struct __riscv_v_state *restore_from,
 		"add		%0, %0, t4\n\t"
 		"vle8.v		v16, (%0)\n\t"
 		"add		%0, %0, t4\n\t"
-		"vle8.v		v24, (%0)\n\t"
-		: : "r" (datap) : "t4");
+		"vle8.v		v24, (%0)\n\t",
+
+		"mv		t0, %0\n\t"
+		THEAD_VSETVLI_T4X0E8M8D1
+		THEAD_VLB_V_V0T0
+		"addi		t0, t0, 128\n\t"
+		THEAD_VLB_V_V8T0
+		"addi		%0, %0, 128\n\t"
+		THEAD_VLB_V_V16T0
+		"addi		%0, %0, 128\n\t"
+		THEAD_VLB_V_V24T0, THEAD_VENDOR_ID,
+		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
+		: : "r" (datap) : "t0", "t4");
+
 	__vstate_csr_restore(restore_from);
 	rvv_disable();
 }
-- 
2.39.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-02-28 21:54 ` [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling Heiko Stuebner
@ 2023-03-01  2:12   ` Guo Ren
  2023-03-15 18:56   ` Conor Dooley
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 21+ messages in thread
From: Guo Ren @ 2023-03-01  2:12 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: palmer, linux-riscv, samuel, christoph.muellner, conor.dooley,
	linux-kernel, Heiko Stuebner

On Wed, Mar 1, 2023 at 5:54 AM Heiko Stuebner <heiko@sntech.de> wrote:
>
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> T-Head C9xx cores implement an older version (0.7.1) of the vector
> specification.
>
> Relevant changes concerning the kernel are:
> - different placement of the SR_VS bit for the vector unit status
> - different encoding of the vsetvli instruction
> - different instructions for loads and stores
>
> And a fixed VLEN of 128.
>
> The in-kernel access to vector instances is limited to the save and
> restore of process states so the above mentioned areas can simply be
> handled via the alternatives framework, similar to other T-Head specific
> issues.
>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/Kconfig.erratas           |  13 +++
>  arch/riscv/errata/thead/errata.c     |  32 ++++++
>  arch/riscv/include/asm/csr.h         |  26 ++++-
>  arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
>  arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
>  5 files changed, 256 insertions(+), 16 deletions(-)
>
> diff --git a/arch/riscv/Kconfig.erratas b/arch/riscv/Kconfig.erratas
> index 69621ae6d647..624cefc9fcd7 100644
> --- a/arch/riscv/Kconfig.erratas
> +++ b/arch/riscv/Kconfig.erratas
> @@ -79,4 +79,17 @@ config ERRATA_THEAD_PMU
>
>           If you don't know what to do here, say "Y".
>
> +config ERRATA_THEAD_VECTOR
> +       bool "Apply T-Head Vector errata"
> +       depends on ERRATA_THEAD && RISCV_ISA_V
> +       default y
> +       help
> +         The T-Head C9xx cores implement an earlier version 0.7.1
> +         of the vector extensions.
> +
> +         This will apply the necessary errata to handle the non-standard
> +         behaviour via when switch to and from vector mode for processes.
> +
> +         If you don't know what to do here, say "Y".
> +
>  endmenu # "CPU errata selection"
> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> index fac5742d1c1e..55b3aaa2468a 100644
> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c
> @@ -12,6 +12,7 @@
>  #include <asm/cacheflush.h>
>  #include <asm/errata_list.h>
>  #include <asm/patch.h>
> +#include <asm/vector.h>
>  #include <asm/vendorid_list.h>
>
>  static bool errata_probe_pbmt(unsigned int stage,
> @@ -63,6 +64,34 @@ static bool errata_probe_pmu(unsigned int stage,
>         return true;
>  }
>
> +static bool errata_probe_vector(unsigned int stage,
> +                               unsigned long arch_id, unsigned long impid)
> +{
> +       if (!IS_ENABLED(CONFIG_ERRATA_THEAD_VECTOR))
> +               return false;
> +
> +       /* target-c9xx cores report arch_id and impid as 0 */
> +       if (arch_id != 0 || impid != 0)
> +               return false;
> +
> +       if (stage == RISCV_ALTERNATIVES_EARLY_BOOT) {
> +               /*
> +                * Disable VECTOR to detect illegal usage of vector in kernel.
> +                * This is normally done in _start_kernel but with the
> +                * vector-1.0 SR_VS bits. VS is using [24:23] on T-Head's
> +                * vector-0.7.1 and the vector-1.0-bits are unused there.
> +                */
> +               csr_clear(CSR_STATUS, SR_VS_THEAD);
> +               return false;
> +       }
> +
> +       /* let has_vector() return true and set the static vlen */
> +       static_branch_enable(&riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_VECTOR]);
> +       riscv_vsize = 128 / 8 * 32;
> +
> +       return true;
> +}
> +
>  static u32 thead_errata_probe(unsigned int stage,
>                               unsigned long archid, unsigned long impid)
>  {
> @@ -77,6 +106,9 @@ static u32 thead_errata_probe(unsigned int stage,
>         if (errata_probe_pmu(stage, archid, impid))
>                 cpu_req_errata |= BIT(ERRATA_THEAD_PMU);
>
> +       if (errata_probe_vector(stage, archid, impid))
> +               cpu_req_errata |= BIT(ERRATA_THEAD_VECTOR);
> +
>         return cpu_req_errata;
>  }
>
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index 8b06f2472915..8d16c11487aa 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -24,11 +24,27 @@
>  #define SR_FS_CLEAN    _AC(0x00004000, UL)
>  #define SR_FS_DIRTY    _AC(0x00006000, UL)
>
> -#define SR_VS           _AC(0x00000600, UL) /* Vector Status */
> -#define SR_VS_OFF       _AC(0x00000000, UL)
> -#define SR_VS_INITIAL   _AC(0x00000200, UL)
> -#define SR_VS_CLEAN     _AC(0x00000400, UL)
> -#define SR_VS_DIRTY     _AC(0x00000600, UL)
> +#define SR_VS_OFF              _AC(0x00000000, UL)
> +
> +#define SR_VS_1_0              _AC(0x00000600, UL) /* Vector Status */
How about keep SR_VS?

> +#define SR_VS_INITIAL_1_0      _AC(0x00000200, UL)
> +#define SR_VS_CLEAN_1_0                _AC(0x00000400, UL)
> +#define SR_VS_DIRTY_1_0                _AC(0x00000600, UL)
> +
> +#define SR_VS_THEAD            _AC(0x01800000, UL) /* Vector Status */
How about?

#ifdef ERRATA_THEAD_VECTOR
#define SR_VS_0_7

> +#define SR_VS_INITIAL_THEAD    _AC(0x00800000, UL)
> +#define SR_VS_CLEAN_THEAD      _AC(0x01000000, UL)
> +#define SR_VS_DIRTY_THEAD      _AC(0x01800000, UL)
> +
> +/*
> + * Always default to vector-1.0 handling in assembly and let the broken
> + * implementations handle their case separately.
> + */
> +#ifdef __ASSEMBLY__
> +#define SR_VS                  SR_VS_1_0
> +#else
> +
> +#endif
>
>  #define SR_XS          _AC(0x00018000, UL) /* Extension Status */
>  #define SR_XS_OFF      _AC(0x00000000, UL)
> diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> index 95e626b7281e..3f93cdd1599f 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -19,7 +19,8 @@
>  #define        ERRATA_THEAD_PBMT 0
>  #define        ERRATA_THEAD_CMO 1
>  #define        ERRATA_THEAD_PMU 2
> -#define        ERRATA_THEAD_NUMBER 3
> +#define        ERRATA_THEAD_VECTOR 3
> +#define        ERRATA_THEAD_NUMBER 4
>  #endif
>
>  #define        CPUFEATURE_SVPBMT 0
> @@ -157,6 +158,65 @@ asm volatile(ALTERNATIVE(                                          \
>         : "=r" (__ovl) :                                                \
>         : "memory")
>
> +#ifdef CONFIG_ERRATA_THEAD_PBMT
> +
> +#define ALT_VS_SHIFT 61
> +#define ALT_THEAD_VS_SHIFT 59
> +#define ALT_THEAD_SR_VS(_val, _vs)                                     \
> +asm(ALTERNATIVE(  "li %0, %1\t\nslli %0,%0,%3",                                \
> +                 "li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID,        \
> +                       ERRATA_THEAD_PBMT, CONFIG_ERRATA_THEAD_PBMT)    \
> +               : "=r"(_val)                                            \
> +               : "I"(prot##_MAIN >> ALT_SVPBMT_SHIFT),         \
> +                 "I"(prot##_THEAD >> ALT_THEAD_PBMT_SHIFT),            \
> +                 "I"(ALT_SVPBMT_SHIFT),                                \
> +                 "I"(ALT_THEAD_PBMT_SHIFT))
> +#else
> +#define ALT_THEAD_SR_VS(_val ## _MAIN)
> +#endif
> +
> +#ifdef CONFIG_ERRATA_THEAD_VECTOR
> +
> +#define THEAD_C9XX_CSR_VXSAT                   0x9
> +#define THEAD_C9XX_CSR_VXRM                    0xa
> +
> +/*
> + * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older
> + * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for
> + * vsetvli     t4, x0, e8, m8, d1
> + */
> +#define THEAD_VSETVLI_T4X0E8M8D1       ".long  0x00307ed7\n\t"
> +
> +/*
> + * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same
> + * encoding as the standard vse8.v and vle8.v, compilers seem to optimize
> + * the call resulting in a different encoding and then using a value for
> + * the "mop" field that is not part of vector-0.7.1
> + * So encode specific variants for vstate_save and _restore.
> + */
> +#define THEAD_VSB_V_V0T0               ".long  0x02028027\n\t"
> +#define THEAD_VSB_V_V8T0               ".long  0x02028427\n\t"
> +#define THEAD_VSB_V_V16T0              ".long  0x02028827\n\t"
> +#define THEAD_VSB_V_V24T0              ".long  0x02028c27\n\t"
> +#define THEAD_VLB_V_V0T0               ".long  0x012028007\n\t"
> +#define THEAD_VLB_V_V8T0               ".long  0x012028407\n\t"
> +#define THEAD_VLB_V_V16T0              ".long  0x012028807\n\t"
> +#define THEAD_VLB_V_V24T0              ".long  0x012028c07\n\t"
> +
> +#define ALT_SR_VS_VECTOR_1_0_SHIFT     9
> +#define ALT_SR_VS_THEAD_SHIFT          23
> +
> +#define ALT_SR_VS(_val, prot)                                          \
> +asm(ALTERNATIVE("li %0, %1\t\nslli %0,%0,%3",                          \
> +               "li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID,          \
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)        \
> +               : "=r"(_val)                                            \
> +               : "I"(prot##_1_0 >> ALT_SR_VS_VECTOR_1_0_SHIFT),        \
> +                 "I"(prot##_THEAD >> ALT_SR_VS_THEAD_SHIFT),           \
> +                 "I"(ALT_SR_VS_VECTOR_1_0_SHIFT),                      \
> +                 "I"(ALT_SR_VS_THEAD_SHIFT))
> +#endif /* CONFIG_ERRATA_THEAD_VECTOR */
> +
>  #endif /* __ASSEMBLY__ */
>
>  #endif
> diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> index ad9e6161dd89..ad91f783316e 100644
> --- a/arch/riscv/include/asm/vector.h
> +++ b/arch/riscv/include/asm/vector.h
> @@ -15,6 +15,55 @@
>  #include <asm/hwcap.h>
>  #include <asm/csr.h>
>  #include <asm/asm.h>
> +#include <asm/errata_list.h>
> +
> +#ifdef CONFIG_ERRATA_THEAD_VECTOR
> +
> +static inline u32 riscv_sr_vs(void)
> +{
> +       u32 val;
> +
> +       ALT_SR_VS(val, SR_VS);
> +       return val;
> +}
> +
> +static inline u32 riscv_sr_vs_initial(void)
> +{
> +       u32 val;
> +
> +       ALT_SR_VS(val, SR_VS_INITIAL);
> +       return val;
> +}
> +
> +static inline u32 riscv_sr_vs_clean(void)
> +{
> +       u32 val;
> +
> +       ALT_SR_VS(val, SR_VS_CLEAN);
> +       return val;
> +}
> +
> +static inline u32 riscv_sr_vs_dirty(void)
> +{
> +       u32 val;
> +
> +       ALT_SR_VS(val, SR_VS_DIRTY);
> +       return val;
> +}
> +
> +#define SR_VS          riscv_sr_vs()
> +#define SR_VS_INITIAL  riscv_sr_vs_initial()
> +#define SR_VS_CLEAN    riscv_sr_vs_clean()
> +#define SR_VS_DIRTY    riscv_sr_vs_dirty()
> +
> +#else /* CONFIG_ERRATA_THEAD_VECTOR */
> +
> +#define SR_VS          SR_VS_1_0
> +#define SR_VS_INITIAL  SR_VS_INITIAL_1_0
> +#define SR_VS_CLEAN    SR_VS_CLEAN_1_0
> +#define SR_VS_DIRTY    SR_VS_DIRTY_1_0
> +
> +#endif /* CONFIG_ERRATA_THEAD_VECTOR */
>
>  #define CSR_STR(x) __ASM_STR(x)
>
> @@ -29,6 +78,7 @@ static __always_inline bool has_vector(void)
>  static inline void __vstate_clean(struct pt_regs *regs)
>  {
>         regs->status = (regs->status & ~(SR_VS)) | SR_VS_CLEAN;
> +
>  }
>
>  static inline void vstate_off(struct pt_regs *regs)
> @@ -58,30 +108,75 @@ static __always_inline void rvv_disable(void)
>
>  static __always_inline void __vstate_csr_save(struct __riscv_v_state *dest)
>  {
> -       asm volatile (
> +       register u32 t1 asm("t1") = (SR_FS);
> +
> +       /*
> +        * CSR_VCSR is defined as
> +        * [2:1] - vxrm[1:0]
> +        * [0] - vxsat
> +        * The earlier vector spec implemented by T-Head uses separate
> +        * registers for the same bit-elements, so just combine those
> +        * into the existing output field.
> +        *
> +        * Additionally T-Head cores need FS to be enabled when accessing
> +        * the VXRM and VXSAT CSRs, otherwise ending in illegal instructions.
> +        */
> +       asm volatile (ALTERNATIVE(
>                 "csrr   %0, " CSR_STR(CSR_VSTART) "\n\t"
>                 "csrr   %1, " CSR_STR(CSR_VTYPE) "\n\t"
>                 "csrr   %2, " CSR_STR(CSR_VL) "\n\t"
>                 "csrr   %3, " CSR_STR(CSR_VCSR) "\n\t"
> +               __nops(5),
> +               "csrs   sstatus, t1\n\t"
> +               "csrr   %0, " CSR_STR(CSR_VSTART) "\n\t"
> +               "csrr   %1, " CSR_STR(CSR_VTYPE) "\n\t"
> +               "csrr   %2, " CSR_STR(CSR_VL) "\n\t"
> +               "csrr   %3, " CSR_STR(THEAD_C9XX_CSR_VXRM) "\n\t"
> +               "slliw  %3, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> +               "csrr   t4, " CSR_STR(THEAD_C9XX_CSR_VXSAT) "\n\t"
> +               "or     %3, %3, t4\n\t"
> +               "csrc   sstatus, t1\n\t",
> +               THEAD_VENDOR_ID,
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
>                 : "=r" (dest->vstart), "=r" (dest->vtype), "=r" (dest->vl),
> -                 "=r" (dest->vcsr) : :);
> +                 "=r" (dest->vcsr) : "r"(t1) : "t4");
>  }
>
>  static __always_inline void __vstate_csr_restore(struct __riscv_v_state *src)
>  {
> -       asm volatile (
> +       register u32 t1 asm("t1") = (SR_FS);
> +
> +       /*
> +        * Similar to __vstate_csr_save above, restore values for the
> +        * separate VXRM and VXSAT CSRs from the vcsr variable.
> +        */
> +       asm volatile (ALTERNATIVE(
>                 "vsetvl  x0, %2, %1\n\t"
>                 "csrw   " CSR_STR(CSR_VSTART) ", %0\n\t"
>                 "csrw   " CSR_STR(CSR_VCSR) ", %3\n\t"
> +               __nops(6),
> +               "csrs   sstatus, t1\n\t"
> +               "vsetvl  x0, %2, %1\n\t"
> +               "csrw   " CSR_STR(CSR_VSTART) ", %0\n\t"
> +               "srliw  t4, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> +               "andi   t4, t4, " CSR_STR(VCSR_VXRM_MASK) "\n\t"
> +               "csrw   " CSR_STR(THEAD_C9XX_CSR_VXRM) ", t4\n\t"
> +               "andi   %3, %3, " CSR_STR(VCSR_VXSAT_MASK) "\n\t"
> +               "csrw   " CSR_STR(THEAD_C9XX_CSR_VXSAT) ", %3\n\t"
> +               "csrc   sstatus, t1\n\t",
> +               THEAD_VENDOR_ID,
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
>                 : : "r" (src->vstart), "r" (src->vtype), "r" (src->vl),
> -                   "r" (src->vcsr) :);
> +                   "r" (src->vcsr), "r"(t1): "t4");
>  }
>
>  static inline void __vstate_save(struct __riscv_v_state *save_to, void *datap)
>  {
>         rvv_enable();
>         __vstate_csr_save(save_to);
> -       asm volatile (
> +
> +       asm volatile (ALTERNATIVE(
> +               "nop\n\t"
>                 "vsetvli        t4, x0, e8, m8, ta, ma\n\t"
>                 "vse8.v         v0, (%0)\n\t"
>                 "add            %0, %0, t4\n\t"
> @@ -89,8 +184,18 @@ static inline void __vstate_save(struct __riscv_v_state *save_to, void *datap)
>                 "add            %0, %0, t4\n\t"
>                 "vse8.v         v16, (%0)\n\t"
>                 "add            %0, %0, t4\n\t"
> -               "vse8.v         v24, (%0)\n\t"
> -               : : "r" (datap) : "t4", "memory");
> +               "vse8.v         v24, (%0)\n\t",
> +               "mv             t0, %0\n\t"
> +               THEAD_VSETVLI_T4X0E8M8D1
> +               THEAD_VSB_V_V0T0
> +               "addi           t0, t0, 128\n\t"
> +               THEAD_VSB_V_V8T0
> +               "addi           t0, t0, 128\n\t"
> +               THEAD_VSB_V_V16T0
> +               "addi           t0, t0, 128\n\t"
> +               THEAD_VSB_V_V24T0, THEAD_VENDOR_ID,
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
> +               : : "r" (datap) : "t0", "t4", "memory");
>         rvv_disable();
>  }
>
> @@ -98,7 +203,9 @@ static inline void __vstate_restore(struct __riscv_v_state *restore_from,
>                                     void *datap)
>  {
>         rvv_enable();
> -       asm volatile (
> +
> +       asm volatile (ALTERNATIVE(
> +               "nop\n\t"
>                 "vsetvli        t4, x0, e8, m8, ta, ma\n\t"
>                 "vle8.v         v0, (%0)\n\t"
>                 "add            %0, %0, t4\n\t"
> @@ -106,8 +213,20 @@ static inline void __vstate_restore(struct __riscv_v_state *restore_from,
>                 "add            %0, %0, t4\n\t"
>                 "vle8.v         v16, (%0)\n\t"
>                 "add            %0, %0, t4\n\t"
> -               "vle8.v         v24, (%0)\n\t"
> -               : : "r" (datap) : "t4");
> +               "vle8.v         v24, (%0)\n\t",
> +
> +               "mv             t0, %0\n\t"
> +               THEAD_VSETVLI_T4X0E8M8D1
> +               THEAD_VLB_V_V0T0
> +               "addi           t0, t0, 128\n\t"
> +               THEAD_VLB_V_V8T0
> +               "addi           %0, %0, 128\n\t"
> +               THEAD_VLB_V_V16T0
> +               "addi           %0, %0, 128\n\t"
> +               THEAD_VLB_V_V24T0, THEAD_VENDOR_ID,
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
> +               : : "r" (datap) : "t0", "t4");
> +
>         __vstate_csr_restore(restore_from);
>         rvv_disable();
>  }
> --
> 2.39.0
>


-- 
Best Regards
 Guo Ren

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/2] RISC-V: T-Head vector handling
  2023-02-28 21:54 [PATCH RFC 0/2] RISC-V: T-Head vector handling Heiko Stuebner
  2023-02-28 21:54 ` [PATCH RFC 1/2] RISC-V: define the elements of the VCSR vector CSR Heiko Stuebner
  2023-02-28 21:54 ` [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling Heiko Stuebner
@ 2023-03-01  2:21 ` Guo Ren
  2023-03-15  5:29 ` Palmer Dabbelt
  3 siblings, 0 replies; 21+ messages in thread
From: Guo Ren @ 2023-03-01  2:21 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: palmer, linux-riscv, samuel, christoph.muellner, conor.dooley,
	linux-kernel, Heiko Stuebner

On Wed, Mar 1, 2023 at 5:54 AM Heiko Stuebner <heiko@sntech.de> wrote:
>
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> As is widely known the T-Head C9xx cores used for example in the
> Allwinner D1 implement an older non-ratified variant of the vector spec.
>
> While userspace will probably have a lot more problems implementing
> support for both, on the kernel side the needed changes are actually
> somewhat small'ish and can be handled via alternatives somewhat nicely.
>
> With this patchset I could run the same userspace program (picked from
> some riscv-vector-test repository) that does some vector additions on
> both qemu and a d1-nezha board. On both platforms it ran sucessfully and
> even produced the same results.
Great! Thx.

>
>
> As can be seen in the todo list, there are 2 places where the changed
> SR_VS location still needs to be handled in the next revision
> (assembly + ALTERNATIVES + constants + probably stringify resulted in
>  some grey hair so far already)
>
>
> ToDo:
> - follow along with the base vector patchset
> - handle SR_VS access in _save_context and _secondary_start_sbi
>
>
> Heiko Stuebner (2):
>   RISC-V: define the elements of the VCSR vector CSR
>   RISC-V: add T-Head vector errata handling
>
>  arch/riscv/Kconfig.erratas           |  13 +++
>  arch/riscv/errata/thead/errata.c     |  32 ++++++
>  arch/riscv/include/asm/csr.h         |  31 +++++-
>  arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
>  arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
>  5 files changed, 261 insertions(+), 16 deletions(-)
>
> --
> 2.39.0
>


-- 
Best Regards
 Guo Ren

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 1/2] RISC-V: define the elements of the VCSR vector CSR
  2023-02-28 21:54 ` [PATCH RFC 1/2] RISC-V: define the elements of the VCSR vector CSR Heiko Stuebner
@ 2023-03-01  2:22   ` Guo Ren
  2023-03-15 18:31   ` Conor Dooley
  1 sibling, 0 replies; 21+ messages in thread
From: Guo Ren @ 2023-03-01  2:22 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: palmer, linux-riscv, samuel, christoph.muellner, conor.dooley,
	linux-kernel, Heiko Stuebner

Acked-by: Guo Ren <guoren@kernel.org>

On Wed, Mar 1, 2023 at 5:54 AM Heiko Stuebner <heiko@sntech.de> wrote:
>
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> The VCSR CSR contains two elements VXRM[2:1] and VXSAT[0].
>
> Define constants for those to access the elements in a readable way.
>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/include/asm/csr.h | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index add51662b7c3..8b06f2472915 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -176,6 +176,11 @@
>  #define ENVCFG_CBIE_INV                        _AC(0x3, UL)
>  #define ENVCFG_FIOM                    _AC(0x1, UL)
>
> +/* VCSR flags */
> +#define VCSR_VXRM_MASK                 3
> +#define VCSR_VXRM_SHIFT                        1
> +#define VCSR_VXSAT_MASK                        1
> +
>  /* symbolic CSR names: */
>  #define CSR_CYCLE              0xc00
>  #define CSR_TIME               0xc01
> --
> 2.39.0
>


-- 
Best Regards
 Guo Ren

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/2] RISC-V: T-Head vector handling
  2023-02-28 21:54 [PATCH RFC 0/2] RISC-V: T-Head vector handling Heiko Stuebner
                   ` (2 preceding siblings ...)
  2023-03-01  2:21 ` [PATCH RFC 0/2] RISC-V: T-Head vector handling Guo Ren
@ 2023-03-15  5:29 ` Palmer Dabbelt
  2023-03-15  6:31   ` Heiko Stuebner
  2023-06-12 15:29   ` Palmer Dabbelt
  3 siblings, 2 replies; 21+ messages in thread
From: Palmer Dabbelt @ 2023-03-15  5:29 UTC (permalink / raw)
  To: heiko
  Cc: linux-riscv, samuel, guoren, christoph.muellner, heiko,
	Conor Dooley, linux-kernel, heiko.stuebner

On Tue, 28 Feb 2023 13:54:33 PST (-0800), heiko@sntech.de wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> As is widely known the T-Head C9xx cores used for example in the
> Allwinner D1 implement an older non-ratified variant of the vector spec.
>
> While userspace will probably have a lot more problems implementing
> support for both, on the kernel side the needed changes are actually
> somewhat small'ish and can be handled via alternatives somewhat nicely.
>
> With this patchset I could run the same userspace program (picked from
> some riscv-vector-test repository) that does some vector additions on
> both qemu and a d1-nezha board. On both platforms it ran sucessfully and
> even produced the same results.
>
>
> As can be seen in the todo list, there are 2 places where the changed
> SR_VS location still needs to be handled in the next revision
> (assembly + ALTERNATIVES + constants + probably stringify resulted in
>  some grey hair so far already)
>
>
> ToDo:
> - follow along with the base vector patchset
> - handle SR_VS access in _save_context and _secondary_start_sbi
>
>
> Heiko Stuebner (2):
>   RISC-V: define the elements of the VCSR vector CSR
>   RISC-V: add T-Head vector errata handling
>
>  arch/riscv/Kconfig.erratas           |  13 +++
>  arch/riscv/errata/thead/errata.c     |  32 ++++++
>  arch/riscv/include/asm/csr.h         |  31 +++++-
>  arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
>  arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
>  5 files changed, 261 insertions(+), 16 deletions(-)

I have no opposition to calling the T-Head vector stuff an errata 
against V, the RISC-V folks have already made it quite apparent that 
anything goes here.  I would like to get the standard V uABI sorted out 
first, though, as there's still a lot of moving pieces there.  It's kind 
of hard here as T-Head got thrown under the bus, but I'm not sure what 
else to do about it.

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/2] RISC-V: T-Head vector handling
  2023-03-15  5:29 ` Palmer Dabbelt
@ 2023-03-15  6:31   ` Heiko Stuebner
  2023-06-12 15:29   ` Palmer Dabbelt
  1 sibling, 0 replies; 21+ messages in thread
From: Heiko Stuebner @ 2023-03-15  6:31 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: linux-riscv, samuel, guoren, christoph.muellner, Conor Dooley,
	linux-kernel

Hi Palmer,

Am Mittwoch, 15. März 2023, 06:29:41 CET schrieb Palmer Dabbelt:
> On Tue, 28 Feb 2023 13:54:33 PST (-0800), heiko@sntech.de wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> >
> > As is widely known the T-Head C9xx cores used for example in the
> > Allwinner D1 implement an older non-ratified variant of the vector spec.
> >
> > While userspace will probably have a lot more problems implementing
> > support for both, on the kernel side the needed changes are actually
> > somewhat small'ish and can be handled via alternatives somewhat nicely.
> >
> > With this patchset I could run the same userspace program (picked from
> > some riscv-vector-test repository) that does some vector additions on
> > both qemu and a d1-nezha board. On both platforms it ran sucessfully and
> > even produced the same results.
> >
> >
> > As can be seen in the todo list, there are 2 places where the changed
> > SR_VS location still needs to be handled in the next revision
> > (assembly + ALTERNATIVES + constants + probably stringify resulted in
> >  some grey hair so far already)
> >
> >
> > ToDo:
> > - follow along with the base vector patchset
> > - handle SR_VS access in _save_context and _secondary_start_sbi
> >
> >
> > Heiko Stuebner (2):
> >   RISC-V: define the elements of the VCSR vector CSR
> >   RISC-V: add T-Head vector errata handling
> >
> >  arch/riscv/Kconfig.erratas           |  13 +++
> >  arch/riscv/errata/thead/errata.c     |  32 ++++++
> >  arch/riscv/include/asm/csr.h         |  31 +++++-
> >  arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
> >  arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
> >  5 files changed, 261 insertions(+), 16 deletions(-)
> 
> I have no opposition to calling the T-Head vector stuff an errata 
> against V, the RISC-V folks have already made it quite apparent that 
> anything goes here.  I would like to get the standard V uABI sorted out 
> first, though, as there's still a lot of moving pieces there.

yeah, that's the reason the series is an RFC and is based on the main
vector series and I fully expect the main support to land first :-) .


> It's kind 
> of hard here as T-Head got thrown under the bus, but I'm not sure what 
> else to do about it.

Thankfully on the kernel-side the differences to implemeent both "at the
same time" are not that huge - userspace of course will need to figure
out their own solution.


Heiko



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 1/2] RISC-V: define the elements of the VCSR vector CSR
  2023-02-28 21:54 ` [PATCH RFC 1/2] RISC-V: define the elements of the VCSR vector CSR Heiko Stuebner
  2023-03-01  2:22   ` Guo Ren
@ 2023-03-15 18:31   ` Conor Dooley
  1 sibling, 0 replies; 21+ messages in thread
From: Conor Dooley @ 2023-03-15 18:31 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: palmer, linux-riscv, samuel, guoren, christoph.muellner,
	conor.dooley, linux-kernel, Heiko Stuebner


[-- Attachment #1.1: Type: text/plain, Size: 1005 bytes --]

On Tue, Feb 28, 2023 at 10:54:34PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> The VCSR CSR contains two elements VXRM[2:1] and VXSAT[0].
> 
> Define constants for those to access the elements in a readable way.
> 
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

> ---
>  arch/riscv/include/asm/csr.h | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index add51662b7c3..8b06f2472915 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -176,6 +176,11 @@
>  #define ENVCFG_CBIE_INV			_AC(0x3, UL)
>  #define ENVCFG_FIOM			_AC(0x1, UL)
>  
> +/* VCSR flags */
> +#define VCSR_VXRM_MASK			3
> +#define VCSR_VXRM_SHIFT			1
> +#define VCSR_VXSAT_MASK			1
> +
>  /* symbolic CSR names: */
>  #define CSR_CYCLE		0xc00
>  #define CSR_TIME		0xc01
> -- 
> 2.39.0
> 

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-02-28 21:54 ` [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling Heiko Stuebner
  2023-03-01  2:12   ` Guo Ren
@ 2023-03-15 18:56   ` Conor Dooley
  2023-06-13  6:35   ` Stefan O'Rear
  2023-06-23  9:12   ` Emil Renner Berthing
  3 siblings, 0 replies; 21+ messages in thread
From: Conor Dooley @ 2023-03-15 18:56 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: palmer, linux-riscv, samuel, guoren, christoph.muellner,
	conor.dooley, linux-kernel, Heiko Stuebner


[-- Attachment #1.1: Type: text/plain, Size: 14794 bytes --]

On Tue, Feb 28, 2023 at 10:54:35PM +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> T-Head C9xx cores implement an older version (0.7.1) of the vector
> specification.
> 
> Relevant changes concerning the kernel are:
> - different placement of the SR_VS bit for the vector unit status
> - different encoding of the vsetvli instruction
> - different instructions for loads and stores
> 
> And a fixed VLEN of 128.
> 
> The in-kernel access to vector instances is limited to the save and
> restore of process states so the above mentioned areas can simply be
> handled via the alternatives framework, similar to other T-Head specific
> issues.
> 

Apart from the "simple" form of the alternatives throwing me for a
second (too used to the _2 variant), this looks grand.
I think this needs a rebase as things stand based on what Andy's already
submitted, plus I know he's working on another revision - so I suppose I
shall withold leaving an R-b until this is closer to a final form!

Cheers,
Conor.

> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/Kconfig.erratas           |  13 +++
>  arch/riscv/errata/thead/errata.c     |  32 ++++++
>  arch/riscv/include/asm/csr.h         |  26 ++++-
>  arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
>  arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
>  5 files changed, 256 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/riscv/Kconfig.erratas b/arch/riscv/Kconfig.erratas
> index 69621ae6d647..624cefc9fcd7 100644
> --- a/arch/riscv/Kconfig.erratas
> +++ b/arch/riscv/Kconfig.erratas
> @@ -79,4 +79,17 @@ config ERRATA_THEAD_PMU
>  
>  	  If you don't know what to do here, say "Y".
>  
> +config ERRATA_THEAD_VECTOR
> +	bool "Apply T-Head Vector errata"
> +	depends on ERRATA_THEAD && RISCV_ISA_V
> +	default y
> +	help
> +	  The T-Head C9xx cores implement an earlier version 0.7.1
> +	  of the vector extensions.
> +
> +	  This will apply the necessary errata to handle the non-standard
> +	  behaviour via when switch to and from vector mode for processes.
> +
> +	  If you don't know what to do here, say "Y".
> +
>  endmenu # "CPU errata selection"
> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> index fac5742d1c1e..55b3aaa2468a 100644
> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c
> @@ -12,6 +12,7 @@
>  #include <asm/cacheflush.h>
>  #include <asm/errata_list.h>
>  #include <asm/patch.h>
> +#include <asm/vector.h>
>  #include <asm/vendorid_list.h>
>  
>  static bool errata_probe_pbmt(unsigned int stage,
> @@ -63,6 +64,34 @@ static bool errata_probe_pmu(unsigned int stage,
>  	return true;
>  }
>  
> +static bool errata_probe_vector(unsigned int stage,
> +				unsigned long arch_id, unsigned long impid)
> +{
> +	if (!IS_ENABLED(CONFIG_ERRATA_THEAD_VECTOR))
> +		return false;
> +
> +	/* target-c9xx cores report arch_id and impid as 0 */
> +	if (arch_id != 0 || impid != 0)
> +		return false;
> +
> +	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT) {
> +		/*
> +		 * Disable VECTOR to detect illegal usage of vector in kernel.
> +		 * This is normally done in _start_kernel but with the
> +		 * vector-1.0 SR_VS bits. VS is using [24:23] on T-Head's
> +		 * vector-0.7.1 and the vector-1.0-bits are unused there.
> +		 */
> +		csr_clear(CSR_STATUS, SR_VS_THEAD);
> +		return false;
> +	}
> +
> +	/* let has_vector() return true and set the static vlen */
> +	static_branch_enable(&riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_VECTOR]);
> +	riscv_vsize = 128 / 8 * 32;
> +
> +	return true;
> +}
> +
>  static u32 thead_errata_probe(unsigned int stage,
>  			      unsigned long archid, unsigned long impid)
>  {
> @@ -77,6 +106,9 @@ static u32 thead_errata_probe(unsigned int stage,
>  	if (errata_probe_pmu(stage, archid, impid))
>  		cpu_req_errata |= BIT(ERRATA_THEAD_PMU);
>  
> +	if (errata_probe_vector(stage, archid, impid))
> +		cpu_req_errata |= BIT(ERRATA_THEAD_VECTOR);
> +
>  	return cpu_req_errata;
>  }
>  
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index 8b06f2472915..8d16c11487aa 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -24,11 +24,27 @@
>  #define SR_FS_CLEAN	_AC(0x00004000, UL)
>  #define SR_FS_DIRTY	_AC(0x00006000, UL)
>  
> -#define SR_VS           _AC(0x00000600, UL) /* Vector Status */
> -#define SR_VS_OFF       _AC(0x00000000, UL)
> -#define SR_VS_INITIAL   _AC(0x00000200, UL)
> -#define SR_VS_CLEAN     _AC(0x00000400, UL)
> -#define SR_VS_DIRTY     _AC(0x00000600, UL)
> +#define SR_VS_OFF		_AC(0x00000000, UL)
> +
> +#define SR_VS_1_0		_AC(0x00000600, UL) /* Vector Status */
> +#define SR_VS_INITIAL_1_0	_AC(0x00000200, UL)
> +#define SR_VS_CLEAN_1_0		_AC(0x00000400, UL)
> +#define SR_VS_DIRTY_1_0		_AC(0x00000600, UL)
> +
> +#define SR_VS_THEAD		_AC(0x01800000, UL) /* Vector Status */
> +#define SR_VS_INITIAL_THEAD	_AC(0x00800000, UL)
> +#define SR_VS_CLEAN_THEAD	_AC(0x01000000, UL)
> +#define SR_VS_DIRTY_THEAD	_AC(0x01800000, UL)
> +
> +/*
> + * Always default to vector-1.0 handling in assembly and let the broken
> + * implementations handle their case separately.
> + */
> +#ifdef __ASSEMBLY__
> +#define SR_VS			SR_VS_1_0
> +#else
> +
> +#endif
>  
>  #define SR_XS		_AC(0x00018000, UL) /* Extension Status */
>  #define SR_XS_OFF	_AC(0x00000000, UL)
> diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> index 95e626b7281e..3f93cdd1599f 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -19,7 +19,8 @@
>  #define	ERRATA_THEAD_PBMT 0
>  #define	ERRATA_THEAD_CMO 1
>  #define	ERRATA_THEAD_PMU 2
> -#define	ERRATA_THEAD_NUMBER 3
> +#define	ERRATA_THEAD_VECTOR 3
> +#define	ERRATA_THEAD_NUMBER 4
>  #endif
>  
>  #define	CPUFEATURE_SVPBMT 0
> @@ -157,6 +158,65 @@ asm volatile(ALTERNATIVE(						\
>  	: "=r" (__ovl) :						\
>  	: "memory")
>  
> +#ifdef CONFIG_ERRATA_THEAD_PBMT
> +
> +#define ALT_VS_SHIFT 61
> +#define ALT_THEAD_VS_SHIFT 59
> +#define ALT_THEAD_SR_VS(_val, _vs)					\
> +asm(ALTERNATIVE(  "li %0, %1\t\nslli %0,%0,%3",				\
> +		  "li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID,	\
> +			ERRATA_THEAD_PBMT, CONFIG_ERRATA_THEAD_PBMT)	\
> +		: "=r"(_val)						\
> +		: "I"(prot##_MAIN >> ALT_SVPBMT_SHIFT),		\
> +		  "I"(prot##_THEAD >> ALT_THEAD_PBMT_SHIFT),		\
> +		  "I"(ALT_SVPBMT_SHIFT),				\
> +		  "I"(ALT_THEAD_PBMT_SHIFT))
> +#else
> +#define ALT_THEAD_SR_VS(_val ## _MAIN)
> +#endif
> +
> +#ifdef CONFIG_ERRATA_THEAD_VECTOR
> +
> +#define THEAD_C9XX_CSR_VXSAT			0x9
> +#define THEAD_C9XX_CSR_VXRM			0xa
> +
> +/*
> + * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older
> + * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for
> + * vsetvli	t4, x0, e8, m8, d1
> + */
> +#define THEAD_VSETVLI_T4X0E8M8D1	".long	0x00307ed7\n\t"
> +
> +/*
> + * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same
> + * encoding as the standard vse8.v and vle8.v, compilers seem to optimize
> + * the call resulting in a different encoding and then using a value for
> + * the "mop" field that is not part of vector-0.7.1
> + * So encode specific variants for vstate_save and _restore.
> + */
> +#define THEAD_VSB_V_V0T0		".long	0x02028027\n\t"
> +#define THEAD_VSB_V_V8T0		".long	0x02028427\n\t"
> +#define THEAD_VSB_V_V16T0		".long	0x02028827\n\t"
> +#define THEAD_VSB_V_V24T0		".long	0x02028c27\n\t"
> +#define THEAD_VLB_V_V0T0		".long	0x012028007\n\t"
> +#define THEAD_VLB_V_V8T0		".long	0x012028407\n\t"
> +#define THEAD_VLB_V_V16T0		".long	0x012028807\n\t"
> +#define THEAD_VLB_V_V24T0		".long	0x012028c07\n\t"
> +
> +#define ALT_SR_VS_VECTOR_1_0_SHIFT	9
> +#define ALT_SR_VS_THEAD_SHIFT		23
> +
> +#define ALT_SR_VS(_val, prot)						\
> +asm(ALTERNATIVE("li %0, %1\t\nslli %0,%0,%3",				\
> +		"li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID,		\
> +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)	\
> +		: "=r"(_val)						\
> +		: "I"(prot##_1_0 >> ALT_SR_VS_VECTOR_1_0_SHIFT),	\
> +		  "I"(prot##_THEAD >> ALT_SR_VS_THEAD_SHIFT),		\
> +		  "I"(ALT_SR_VS_VECTOR_1_0_SHIFT),			\
> +		  "I"(ALT_SR_VS_THEAD_SHIFT))
> +#endif /* CONFIG_ERRATA_THEAD_VECTOR */
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif
> diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> index ad9e6161dd89..ad91f783316e 100644
> --- a/arch/riscv/include/asm/vector.h
> +++ b/arch/riscv/include/asm/vector.h
> @@ -15,6 +15,55 @@
>  #include <asm/hwcap.h>
>  #include <asm/csr.h>
>  #include <asm/asm.h>
> +#include <asm/errata_list.h>
> +
> +#ifdef CONFIG_ERRATA_THEAD_VECTOR
> +
> +static inline u32 riscv_sr_vs(void)
> +{
> +	u32 val;
> +
> +	ALT_SR_VS(val, SR_VS);
> +	return val;
> +}
> +
> +static inline u32 riscv_sr_vs_initial(void)
> +{
> +	u32 val;
> +
> +	ALT_SR_VS(val, SR_VS_INITIAL);
> +	return val;
> +}
> +
> +static inline u32 riscv_sr_vs_clean(void)
> +{
> +	u32 val;
> +
> +	ALT_SR_VS(val, SR_VS_CLEAN);
> +	return val;
> +}
> +
> +static inline u32 riscv_sr_vs_dirty(void)
> +{
> +	u32 val;
> +
> +	ALT_SR_VS(val, SR_VS_DIRTY);
> +	return val;
> +}
> +
> +#define SR_VS		riscv_sr_vs()
> +#define SR_VS_INITIAL	riscv_sr_vs_initial()
> +#define SR_VS_CLEAN	riscv_sr_vs_clean()
> +#define SR_VS_DIRTY	riscv_sr_vs_dirty()
> +
> +#else /* CONFIG_ERRATA_THEAD_VECTOR */
> +
> +#define SR_VS		SR_VS_1_0
> +#define SR_VS_INITIAL	SR_VS_INITIAL_1_0
> +#define SR_VS_CLEAN	SR_VS_CLEAN_1_0
> +#define SR_VS_DIRTY	SR_VS_DIRTY_1_0
> +
> +#endif /* CONFIG_ERRATA_THEAD_VECTOR */
>  
>  #define CSR_STR(x) __ASM_STR(x)
>  
> @@ -29,6 +78,7 @@ static __always_inline bool has_vector(void)
>  static inline void __vstate_clean(struct pt_regs *regs)
>  {
>  	regs->status = (regs->status & ~(SR_VS)) | SR_VS_CLEAN;
> +
>  }
>  
>  static inline void vstate_off(struct pt_regs *regs)
> @@ -58,30 +108,75 @@ static __always_inline void rvv_disable(void)
>  
>  static __always_inline void __vstate_csr_save(struct __riscv_v_state *dest)
>  {
> -	asm volatile (
> +	register u32 t1 asm("t1") = (SR_FS);
> +
> +	/*
> +	 * CSR_VCSR is defined as
> +	 * [2:1] - vxrm[1:0]
> +	 * [0] - vxsat
> +	 * The earlier vector spec implemented by T-Head uses separate
> +	 * registers for the same bit-elements, so just combine those
> +	 * into the existing output field.
> +	 *
> +	 * Additionally T-Head cores need FS to be enabled when accessing
> +	 * the VXRM and VXSAT CSRs, otherwise ending in illegal instructions.
> +	 */
> +	asm volatile (ALTERNATIVE(
>  		"csrr	%0, " CSR_STR(CSR_VSTART) "\n\t"
>  		"csrr	%1, " CSR_STR(CSR_VTYPE) "\n\t"
>  		"csrr	%2, " CSR_STR(CSR_VL) "\n\t"
>  		"csrr	%3, " CSR_STR(CSR_VCSR) "\n\t"
> +		__nops(5),
> +		"csrs	sstatus, t1\n\t"
> +		"csrr	%0, " CSR_STR(CSR_VSTART) "\n\t"
> +		"csrr	%1, " CSR_STR(CSR_VTYPE) "\n\t"
> +		"csrr	%2, " CSR_STR(CSR_VL) "\n\t"
> +		"csrr	%3, " CSR_STR(THEAD_C9XX_CSR_VXRM) "\n\t"
> +		"slliw	%3, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> +		"csrr	t4, " CSR_STR(THEAD_C9XX_CSR_VXSAT) "\n\t"
> +		"or	%3, %3, t4\n\t"
> +		"csrc	sstatus, t1\n\t",
> +		THEAD_VENDOR_ID,
> +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
>  		: "=r" (dest->vstart), "=r" (dest->vtype), "=r" (dest->vl),
> -		  "=r" (dest->vcsr) : :);
> +		  "=r" (dest->vcsr) : "r"(t1) : "t4");
>  }
>  
>  static __always_inline void __vstate_csr_restore(struct __riscv_v_state *src)
>  {
> -	asm volatile (
> +	register u32 t1 asm("t1") = (SR_FS);
> +
> +	/*
> +	 * Similar to __vstate_csr_save above, restore values for the
> +	 * separate VXRM and VXSAT CSRs from the vcsr variable.
> +	 */
> +	asm volatile (ALTERNATIVE(
>  		"vsetvl	 x0, %2, %1\n\t"
>  		"csrw	" CSR_STR(CSR_VSTART) ", %0\n\t"
>  		"csrw	" CSR_STR(CSR_VCSR) ", %3\n\t"
> +		__nops(6),
> +		"csrs	sstatus, t1\n\t"
> +		"vsetvl	 x0, %2, %1\n\t"
> +		"csrw	" CSR_STR(CSR_VSTART) ", %0\n\t"
> +		"srliw	t4, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> +		"andi	t4, t4, " CSR_STR(VCSR_VXRM_MASK) "\n\t"
> +		"csrw	" CSR_STR(THEAD_C9XX_CSR_VXRM) ", t4\n\t"
> +		"andi	%3, %3, " CSR_STR(VCSR_VXSAT_MASK) "\n\t"
> +		"csrw	" CSR_STR(THEAD_C9XX_CSR_VXSAT) ", %3\n\t"
> +		"csrc	sstatus, t1\n\t",
> +		THEAD_VENDOR_ID,
> +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
>  		: : "r" (src->vstart), "r" (src->vtype), "r" (src->vl),
> -		    "r" (src->vcsr) :);
> +		    "r" (src->vcsr), "r"(t1): "t4");
>  }
>  
>  static inline void __vstate_save(struct __riscv_v_state *save_to, void *datap)
>  {
>  	rvv_enable();
>  	__vstate_csr_save(save_to);
> -	asm volatile (
> +
> +	asm volatile (ALTERNATIVE(
> +		"nop\n\t"
>  		"vsetvli	t4, x0, e8, m8, ta, ma\n\t"
>  		"vse8.v		v0, (%0)\n\t"
>  		"add		%0, %0, t4\n\t"
> @@ -89,8 +184,18 @@ static inline void __vstate_save(struct __riscv_v_state *save_to, void *datap)
>  		"add		%0, %0, t4\n\t"
>  		"vse8.v		v16, (%0)\n\t"
>  		"add		%0, %0, t4\n\t"
> -		"vse8.v		v24, (%0)\n\t"
> -		: : "r" (datap) : "t4", "memory");
> +		"vse8.v		v24, (%0)\n\t",
> +		"mv		t0, %0\n\t"
> +		THEAD_VSETVLI_T4X0E8M8D1
> +		THEAD_VSB_V_V0T0
> +		"addi		t0, t0, 128\n\t"
> +		THEAD_VSB_V_V8T0
> +		"addi		t0, t0, 128\n\t"
> +		THEAD_VSB_V_V16T0
> +		"addi		t0, t0, 128\n\t"
> +		THEAD_VSB_V_V24T0, THEAD_VENDOR_ID,
> +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
> +		: : "r" (datap) : "t0", "t4", "memory");
>  	rvv_disable();
>  }
>  
> @@ -98,7 +203,9 @@ static inline void __vstate_restore(struct __riscv_v_state *restore_from,
>  				    void *datap)
>  {
>  	rvv_enable();
> -	asm volatile (
> +
> +	asm volatile (ALTERNATIVE(
> +		"nop\n\t"
>  		"vsetvli	t4, x0, e8, m8, ta, ma\n\t"
>  		"vle8.v		v0, (%0)\n\t"
>  		"add		%0, %0, t4\n\t"
> @@ -106,8 +213,20 @@ static inline void __vstate_restore(struct __riscv_v_state *restore_from,
>  		"add		%0, %0, t4\n\t"
>  		"vle8.v		v16, (%0)\n\t"
>  		"add		%0, %0, t4\n\t"
> -		"vle8.v		v24, (%0)\n\t"
> -		: : "r" (datap) : "t4");
> +		"vle8.v		v24, (%0)\n\t",
> +
> +		"mv		t0, %0\n\t"
> +		THEAD_VSETVLI_T4X0E8M8D1
> +		THEAD_VLB_V_V0T0
> +		"addi		t0, t0, 128\n\t"
> +		THEAD_VLB_V_V8T0
> +		"addi		%0, %0, 128\n\t"
> +		THEAD_VLB_V_V16T0
> +		"addi		%0, %0, 128\n\t"
> +		THEAD_VLB_V_V24T0, THEAD_VENDOR_ID,
> +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
> +		: : "r" (datap) : "t0", "t4");
> +
>  	__vstate_csr_restore(restore_from);
>  	rvv_disable();
>  }
> -- 
> 2.39.0
> 

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/2] RISC-V: T-Head vector handling
  2023-03-15  5:29 ` Palmer Dabbelt
  2023-03-15  6:31   ` Heiko Stuebner
@ 2023-06-12 15:29   ` Palmer Dabbelt
  2023-06-12 15:44     ` Heiko Stübner
  1 sibling, 1 reply; 21+ messages in thread
From: Palmer Dabbelt @ 2023-06-12 15:29 UTC (permalink / raw)
  To: heiko, guoren
  Cc: linux-riscv, samuel, christoph.muellner, heiko, Conor Dooley,
	linux-kernel, heiko.stuebner

On Tue, 14 Mar 2023 22:29:41 PDT (-0700), Palmer Dabbelt wrote:
> On Tue, 28 Feb 2023 13:54:33 PST (-0800), heiko@sntech.de wrote:
>> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>>
>> As is widely known the T-Head C9xx cores used for example in the
>> Allwinner D1 implement an older non-ratified variant of the vector spec.
>>
>> While userspace will probably have a lot more problems implementing
>> support for both, on the kernel side the needed changes are actually
>> somewhat small'ish and can be handled via alternatives somewhat nicely.
>>
>> With this patchset I could run the same userspace program (picked from
>> some riscv-vector-test repository) that does some vector additions on
>> both qemu and a d1-nezha board. On both platforms it ran sucessfully and
>> even produced the same results.
>>
>>
>> As can be seen in the todo list, there are 2 places where the changed
>> SR_VS location still needs to be handled in the next revision
>> (assembly + ALTERNATIVES + constants + probably stringify resulted in
>>  some grey hair so far already)
>>
>>
>> ToDo:
>> - follow along with the base vector patchset
>> - handle SR_VS access in _save_context and _secondary_start_sbi
>>
>>
>> Heiko Stuebner (2):
>>   RISC-V: define the elements of the VCSR vector CSR
>>   RISC-V: add T-Head vector errata handling
>>
>>  arch/riscv/Kconfig.erratas           |  13 +++
>>  arch/riscv/errata/thead/errata.c     |  32 ++++++
>>  arch/riscv/include/asm/csr.h         |  31 +++++-
>>  arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
>>  arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
>>  5 files changed, 261 insertions(+), 16 deletions(-)
>
> I have no opposition to calling the T-Head vector stuff an errata
> against V, the RISC-V folks have already made it quite apparent that
> anything goes here.  I would like to get the standard V uABI sorted out
> first, though, as there's still a lot of moving pieces there.  It's kind
> of hard here as T-Head got thrown under the bus, but I'm not sure what
> else to do about it.

The V-1.0 support has been merged, so I think we're good to go.  Does 
someone mind re-spinning this against for-next so it lines up with all 
the new user interfaces?

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/2] RISC-V: T-Head vector handling
  2023-06-12 15:29   ` Palmer Dabbelt
@ 2023-06-12 15:44     ` Heiko Stübner
  0 siblings, 0 replies; 21+ messages in thread
From: Heiko Stübner @ 2023-06-12 15:44 UTC (permalink / raw)
  To: guoren, Palmer Dabbelt
  Cc: linux-riscv, samuel, christoph.muellner, Conor Dooley, linux-kernel

Hi,

Am Montag, 12. Juni 2023, 17:29:49 CEST schrieb Palmer Dabbelt:
> On Tue, 14 Mar 2023 22:29:41 PDT (-0700), Palmer Dabbelt wrote:
> > On Tue, 28 Feb 2023 13:54:33 PST (-0800), heiko@sntech.de wrote:
> >> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> >>
> >> As is widely known the T-Head C9xx cores used for example in the
> >> Allwinner D1 implement an older non-ratified variant of the vector spec.
> >>
> >> While userspace will probably have a lot more problems implementing
> >> support for both, on the kernel side the needed changes are actually
> >> somewhat small'ish and can be handled via alternatives somewhat nicely.
> >>
> >> With this patchset I could run the same userspace program (picked from
> >> some riscv-vector-test repository) that does some vector additions on
> >> both qemu and a d1-nezha board. On both platforms it ran sucessfully and
> >> even produced the same results.
> >>
> >>
> >> As can be seen in the todo list, there are 2 places where the changed
> >> SR_VS location still needs to be handled in the next revision
> >> (assembly + ALTERNATIVES + constants + probably stringify resulted in
> >>  some grey hair so far already)
> >>
> >>
> >> ToDo:
> >> - follow along with the base vector patchset
> >> - handle SR_VS access in _save_context and _secondary_start_sbi
> >>
> >>
> >> Heiko Stuebner (2):
> >>   RISC-V: define the elements of the VCSR vector CSR
> >>   RISC-V: add T-Head vector errata handling
> >>
> >>  arch/riscv/Kconfig.erratas           |  13 +++
> >>  arch/riscv/errata/thead/errata.c     |  32 ++++++
> >>  arch/riscv/include/asm/csr.h         |  31 +++++-
> >>  arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
> >>  arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
> >>  5 files changed, 261 insertions(+), 16 deletions(-)
> >
> > I have no opposition to calling the T-Head vector stuff an errata
> > against V, the RISC-V folks have already made it quite apparent that
> > anything goes here.  I would like to get the standard V uABI sorted out
> > first, though, as there's still a lot of moving pieces there.  It's kind
> > of hard here as T-Head got thrown under the bus, but I'm not sure what
> > else to do about it.
> 
> The V-1.0 support has been merged, so I think we're good to go.  Does 
> someone mind re-spinning this against for-next so it lines up with all 
> the new user interfaces?

glad to hear that. I found the merge message now as well.
Somehow I was only Cc'ed on individual patches but not on the
cover-letter, so didn't realize the merge till now.

I'll try to re-spin and adapt to the changes since the initial submission.

Heiko


 I'll try to re-spin and adapt to the changes that
happened since the original submission.




_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-02-28 21:54 ` [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling Heiko Stuebner
  2023-03-01  2:12   ` Guo Ren
  2023-03-15 18:56   ` Conor Dooley
@ 2023-06-13  6:35   ` Stefan O'Rear
  2023-06-22 17:39     ` Heiko Stübner
  2023-06-23  9:12   ` Emil Renner Berthing
  3 siblings, 1 reply; 21+ messages in thread
From: Stefan O'Rear @ 2023-06-13  6:35 UTC (permalink / raw)
  To: Heiko Stuebner, palmer
  Cc: linux-riscv, samuel, guoren, christoph.muellner, conor.dooley,
	linux-kernel, Heiko Stuebner

On Tue, Feb 28, 2023, at 4:54 PM, Heiko Stuebner wrote:
> @@ -29,6 +78,7 @@ static __always_inline bool has_vector(void)
>  static inline void __vstate_clean(struct pt_regs *regs)
>  {
>  	regs->status = (regs->status & ~(SR_VS)) | SR_VS_CLEAN;
> +
>  }
> 
>  static inline void vstate_off(struct pt_regs *regs)
> @@ -58,30 +108,75 @@ static __always_inline void rvv_disable(void)
> 
>  static __always_inline void __vstate_csr_save(struct __riscv_v_state *dest)
>  {
> -	asm volatile (
> +	register u32 t1 asm("t1") = (SR_FS);
> +
> +	/*
> +	 * CSR_VCSR is defined as
> +	 * [2:1] - vxrm[1:0]
> +	 * [0] - vxsat
> +	 * The earlier vector spec implemented by T-Head uses separate
> +	 * registers for the same bit-elements, so just combine those
> +	 * into the existing output field.
> +	 *
> +	 * Additionally T-Head cores need FS to be enabled when accessing
> +	 * the VXRM and VXSAT CSRs, otherwise ending in illegal instructions.
> +	 */
> +	asm volatile (ALTERNATIVE(
>  		"csrr	%0, " CSR_STR(CSR_VSTART) "\n\t"
>  		"csrr	%1, " CSR_STR(CSR_VTYPE) "\n\t"
>  		"csrr	%2, " CSR_STR(CSR_VL) "\n\t"
>  		"csrr	%3, " CSR_STR(CSR_VCSR) "\n\t"
> +		__nops(5),
> +		"csrs	sstatus, t1\n\t"
> +		"csrr	%0, " CSR_STR(CSR_VSTART) "\n\t"
> +		"csrr	%1, " CSR_STR(CSR_VTYPE) "\n\t"
> +		"csrr	%2, " CSR_STR(CSR_VL) "\n\t"
> +		"csrr	%3, " CSR_STR(THEAD_C9XX_CSR_VXRM) "\n\t"
> +		"slliw	%3, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> +		"csrr	t4, " CSR_STR(THEAD_C9XX_CSR_VXSAT) "\n\t"
> +		"or	%3, %3, t4\n\t"
> +		"csrc	sstatus, t1\n\t",
> +		THEAD_VENDOR_ID,
> +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
>  		: "=r" (dest->vstart), "=r" (dest->vtype), "=r" (dest->vl),
> -		  "=r" (dest->vcsr) : :);
> +		  "=r" (dest->vcsr) : "r"(t1) : "t4");
>  }
> 
>  static __always_inline void __vstate_csr_restore(struct __riscv_v_state *src)
>  {
> -	asm volatile (
> +	register u32 t1 asm("t1") = (SR_FS);
> +
> +	/*
> +	 * Similar to __vstate_csr_save above, restore values for the
> +	 * separate VXRM and VXSAT CSRs from the vcsr variable.
> +	 */
> +	asm volatile (ALTERNATIVE(
>  		"vsetvl	 x0, %2, %1\n\t"
>  		"csrw	" CSR_STR(CSR_VSTART) ", %0\n\t"
>  		"csrw	" CSR_STR(CSR_VCSR) ", %3\n\t"
> +		__nops(6),
> +		"csrs	sstatus, t1\n\t"
> +		"vsetvl	 x0, %2, %1\n\t"
> +		"csrw	" CSR_STR(CSR_VSTART) ", %0\n\t"
> +		"srliw	t4, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> +		"andi	t4, t4, " CSR_STR(VCSR_VXRM_MASK) "\n\t"
> +		"csrw	" CSR_STR(THEAD_C9XX_CSR_VXRM) ", t4\n\t"
> +		"andi	%3, %3, " CSR_STR(VCSR_VXSAT_MASK) "\n\t"
> +		"csrw	" CSR_STR(THEAD_C9XX_CSR_VXSAT) ", %3\n\t"
> +		"csrc	sstatus, t1\n\t",
> +		THEAD_VENDOR_ID,
> +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
>  		: : "r" (src->vstart), "r" (src->vtype), "r" (src->vl),
> -		    "r" (src->vcsr) :);
> +		    "r" (src->vcsr), "r"(t1): "t4");
>  }

vxrm and vxsat are part of fcsr in 0.7.1, so they should already have been
handled by __fstate_save and __fstate_restore, and this code is likely to
misbehave (saving the new process's vxrm/vxsat in the old process's save area
because float state is swapped before vector state in __switch_to).

-s

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-06-13  6:35   ` Stefan O'Rear
@ 2023-06-22 17:39     ` Heiko Stübner
  2023-06-22 18:58       ` Stefan O'Rear
  0 siblings, 1 reply; 21+ messages in thread
From: Heiko Stübner @ 2023-06-22 17:39 UTC (permalink / raw)
  To: palmer, Stefan O'Rear
  Cc: linux-riscv, samuel, guoren, christoph.muellner, conor.dooley,
	linux-kernel

Hi Stefan,

Am Dienstag, 13. Juni 2023, 08:35:53 CEST schrieb Stefan O'Rear:
> On Tue, Feb 28, 2023, at 4:54 PM, Heiko Stuebner wrote:
> > @@ -29,6 +78,7 @@ static __always_inline bool has_vector(void)
> >  static inline void __vstate_clean(struct pt_regs *regs)
> >  {
> >  	regs->status = (regs->status & ~(SR_VS)) | SR_VS_CLEAN;
> > +
> >  }
> > 
> >  static inline void vstate_off(struct pt_regs *regs)
> > @@ -58,30 +108,75 @@ static __always_inline void rvv_disable(void)
> > 
> >  static __always_inline void __vstate_csr_save(struct __riscv_v_state *dest)
> >  {
> > -	asm volatile (
> > +	register u32 t1 asm("t1") = (SR_FS);
> > +
> > +	/*
> > +	 * CSR_VCSR is defined as
> > +	 * [2:1] - vxrm[1:0]
> > +	 * [0] - vxsat
> > +	 * The earlier vector spec implemented by T-Head uses separate
> > +	 * registers for the same bit-elements, so just combine those
> > +	 * into the existing output field.
> > +	 *
> > +	 * Additionally T-Head cores need FS to be enabled when accessing
> > +	 * the VXRM and VXSAT CSRs, otherwise ending in illegal instructions.
> > +	 */
> > +	asm volatile (ALTERNATIVE(
> >  		"csrr	%0, " CSR_STR(CSR_VSTART) "\n\t"
> >  		"csrr	%1, " CSR_STR(CSR_VTYPE) "\n\t"
> >  		"csrr	%2, " CSR_STR(CSR_VL) "\n\t"
> >  		"csrr	%3, " CSR_STR(CSR_VCSR) "\n\t"
> > +		__nops(5),
> > +		"csrs	sstatus, t1\n\t"
> > +		"csrr	%0, " CSR_STR(CSR_VSTART) "\n\t"
> > +		"csrr	%1, " CSR_STR(CSR_VTYPE) "\n\t"
> > +		"csrr	%2, " CSR_STR(CSR_VL) "\n\t"
> > +		"csrr	%3, " CSR_STR(THEAD_C9XX_CSR_VXRM) "\n\t"
> > +		"slliw	%3, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> > +		"csrr	t4, " CSR_STR(THEAD_C9XX_CSR_VXSAT) "\n\t"
> > +		"or	%3, %3, t4\n\t"
> > +		"csrc	sstatus, t1\n\t",
> > +		THEAD_VENDOR_ID,
> > +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
> >  		: "=r" (dest->vstart), "=r" (dest->vtype), "=r" (dest->vl),
> > -		  "=r" (dest->vcsr) : :);
> > +		  "=r" (dest->vcsr) : "r"(t1) : "t4");
> >  }
> > 
> >  static __always_inline void __vstate_csr_restore(struct __riscv_v_state *src)
> >  {
> > -	asm volatile (
> > +	register u32 t1 asm("t1") = (SR_FS);
> > +
> > +	/*
> > +	 * Similar to __vstate_csr_save above, restore values for the
> > +	 * separate VXRM and VXSAT CSRs from the vcsr variable.
> > +	 */
> > +	asm volatile (ALTERNATIVE(
> >  		"vsetvl	 x0, %2, %1\n\t"
> >  		"csrw	" CSR_STR(CSR_VSTART) ", %0\n\t"
> >  		"csrw	" CSR_STR(CSR_VCSR) ", %3\n\t"
> > +		__nops(6),
> > +		"csrs	sstatus, t1\n\t"
> > +		"vsetvl	 x0, %2, %1\n\t"
> > +		"csrw	" CSR_STR(CSR_VSTART) ", %0\n\t"
> > +		"srliw	t4, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> > +		"andi	t4, t4, " CSR_STR(VCSR_VXRM_MASK) "\n\t"
> > +		"csrw	" CSR_STR(THEAD_C9XX_CSR_VXRM) ", t4\n\t"
> > +		"andi	%3, %3, " CSR_STR(VCSR_VXSAT_MASK) "\n\t"
> > +		"csrw	" CSR_STR(THEAD_C9XX_CSR_VXSAT) ", %3\n\t"
> > +		"csrc	sstatus, t1\n\t",
> > +		THEAD_VENDOR_ID,
> > +		ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
> >  		: : "r" (src->vstart), "r" (src->vtype), "r" (src->vl),
> > -		    "r" (src->vcsr) :);
> > +		    "r" (src->vcsr), "r"(t1): "t4");
> >  }
> 
> vxrm and vxsat are part of fcsr in 0.7.1, so they should already have been
> handled by __fstate_save and __fstate_restore, and this code is likely to
> misbehave (saving the new process's vxrm/vxsat in the old process's save area
> because float state is swapped before vector state in __switch_to).

I'm not sure I follow your description but may be overlooking or have
misunderstood something.

Somehow I way to often have trouble resolving CSR addresses, but according
to openSBI, FCSR has the location of 0x3
(#define CSR_FCSR 0x003 in include/sbi/riscv_encoding.h)

where CSR_VXSAT and CSR_VXRM are at 0x9 and 0xa respectively.
(#define CSR_VXSAT 0x9 and  #define CSR_VXRM 0xa)


And looking at __fstate_save + __fstate_restore the only CSRs accessed seem
to be CSR_STATUS and FCSR itself.

I definitly won't claim to be right, but don't see the issue yet.


Thanks for a hint
Heiko



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-06-22 17:39     ` Heiko Stübner
@ 2023-06-22 18:58       ` Stefan O'Rear
  2023-06-22 20:35         ` Heiko Stübner
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan O'Rear @ 2023-06-22 18:58 UTC (permalink / raw)
  To: Heiko Stuebner, Palmer Dabbelt
  Cc: linux-riscv, samuel, guoren, christoph.muellner, conor.dooley,
	linux-kernel

On Thu, Jun 22, 2023, at 1:39 PM, Heiko Stübner wrote:
> Am Dienstag, 13. Juni 2023, 08:35:53 CEST schrieb Stefan O'Rear:
>> vxrm and vxsat are part of fcsr in 0.7.1, so they should already have been
>> handled by __fstate_save and __fstate_restore, and this code is likely to
>> misbehave (saving the new process's vxrm/vxsat in the old process's save area
>> because float state is swapped before vector state in __switch_to).
>
> I'm not sure I follow your description but may be overlooking or have
> misunderstood something.
>
> Somehow I way to often have trouble resolving CSR addresses, but according
> to openSBI, FCSR has the location of 0x3
> (#define CSR_FCSR 0x003 in include/sbi/riscv_encoding.h)
>
> where CSR_VXSAT and CSR_VXRM are at 0x9 and 0xa respectively.
> (#define CSR_VXSAT 0x9 and  #define CSR_VXRM 0xa)
>
>
> And looking at __fstate_save + __fstate_restore the only CSRs accessed seem
> to be CSR_STATUS and FCSR itself.
>
> I definitly won't claim to be right, but don't see the issue yet.
>
>
> Thanks for a hint
> Heiko

Are you aware of "3.7. Vector Fixed-Point Fields in fcsr" in
riscv-v-spec-0.7.1.pdf?

-s

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-06-22 18:58       ` Stefan O'Rear
@ 2023-06-22 20:35         ` Heiko Stübner
  2023-06-23  3:06           ` Stefan O'Rear
  0 siblings, 1 reply; 21+ messages in thread
From: Heiko Stübner @ 2023-06-22 20:35 UTC (permalink / raw)
  To: Palmer Dabbelt, Stefan O'Rear
  Cc: linux-riscv, samuel, guoren, christoph.muellner, conor.dooley,
	linux-kernel

Hi Stefan,

Am Donnerstag, 22. Juni 2023, 20:58:37 CEST schrieb Stefan O'Rear:
> On Thu, Jun 22, 2023, at 1:39 PM, Heiko Stübner wrote:
> > Am Dienstag, 13. Juni 2023, 08:35:53 CEST schrieb Stefan O'Rear:
> >> vxrm and vxsat are part of fcsr in 0.7.1, so they should already have been
> >> handled by __fstate_save and __fstate_restore, and this code is likely to
> >> misbehave (saving the new process's vxrm/vxsat in the old process's save area
> >> because float state is swapped before vector state in __switch_to).
> >
> > I'm not sure I follow your description but may be overlooking or have
> > misunderstood something.
> >
> > Somehow I way to often have trouble resolving CSR addresses, but according
> > to openSBI, FCSR has the location of 0x3
> > (#define CSR_FCSR 0x003 in include/sbi/riscv_encoding.h)
> >
> > where CSR_VXSAT and CSR_VXRM are at 0x9 and 0xa respectively.
> > (#define CSR_VXSAT 0x9 and  #define CSR_VXRM 0xa)
> >
> >
> > And looking at __fstate_save + __fstate_restore the only CSRs accessed seem
> > to be CSR_STATUS and FCSR itself.
> >
> > I definitly won't claim to be right, but don't see the issue yet.
> >
> >
> > Thanks for a hint
> > Heiko
> 
> Are you aware of "3.7. Vector Fixed-Point Fields in fcsr" in
> riscv-v-spec-0.7.1.pdf?

oh wow, thanks a lot for that pointer, now I understand your concern.

So in vector-0.7.1 fcsr[10:9] mirrors vxrm and fcsr[8] mirrors vxsat.


On a positive note, the T-Head cores seem to not implement the full
vector 0.7.1 specification after all, in the documentation I have [0]
fcsr[31:8] are declared as "0" and uppermost bits are [7:5] for the "frm"
field.

So I guess a code comment should suffice to explain :-)


Regards
Heiko


[0] https://github.com/T-head-Semi/openc906/blob/main/doc/%E7%8E%84%E9%93%81C906%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C.pdf
16.3.1.3 浮点控制状态寄存器(FCSR) on page 334



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-06-22 20:35         ` Heiko Stübner
@ 2023-06-23  3:06           ` Stefan O'Rear
  2023-06-23 10:22             ` Heiko Stübner
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan O'Rear @ 2023-06-23  3:06 UTC (permalink / raw)
  To: Heiko Stuebner, Palmer Dabbelt
  Cc: linux-riscv, samuel, guoren, christoph.muellner, conor.dooley,
	linux-kernel

On Thu, Jun 22, 2023, at 4:35 PM, Heiko Stübner wrote:
> Am Donnerstag, 22. Juni 2023, 20:58:37 CEST schrieb Stefan O'Rear:
>> Are you aware of "3.7. Vector Fixed-Point Fields in fcsr" in
>> riscv-v-spec-0.7.1.pdf?
>
> oh wow, thanks a lot for that pointer, now I understand your concern.
>
> So in vector-0.7.1 fcsr[10:9] mirrors vxrm and fcsr[8] mirrors vxsat.
>
>
> On a positive note, the T-Head cores seem to not implement the full
> vector 0.7.1 specification after all, in the documentation I have [0]
> fcsr[31:8] are declared as "0" and uppermost bits are [7:5] for the "frm"
> field.

Given that the pdf you linked does not mention any vector CSRs, I am not
confident that it provides a complete and accurate description of vector
functionality in other registers for the C906 with vector extension.

Assuming that you have access to such a chip, I would be much happier with
the proposed "just a comment" approach if our understanding of the behavior
were confirmed on hardware (specifically: csr_write(CSR_FCSR, 0x700) should
not affect csr_read(CSR_VXRM) or csr_read(CSR_VXSAT)).

-s

> So I guess a code comment should suffice to explain :-)
>
>
> Regards
> Heiko
>
>
> [0] 
> https://github.com/T-head-Semi/openc906/blob/main/doc/%E7%8E%84%E9%93%81C906%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C.pdf
> 16.3.1.3 浮点控制状态寄存器(FCSR) on page 334

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-02-28 21:54 ` [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling Heiko Stuebner
                     ` (2 preceding siblings ...)
  2023-06-13  6:35   ` Stefan O'Rear
@ 2023-06-23  9:12   ` Emil Renner Berthing
  3 siblings, 0 replies; 21+ messages in thread
From: Emil Renner Berthing @ 2023-06-23  9:12 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: palmer, linux-riscv, samuel, guoren, christoph.muellner,
	conor.dooley, linux-kernel, Heiko Stuebner

On Tue, 28 Feb 2023 at 22:56, Heiko Stuebner <heiko@sntech.de> wrote:
>
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> T-Head C9xx cores implement an older version (0.7.1) of the vector
> specification.
>
> Relevant changes concerning the kernel are:
> - different placement of the SR_VS bit for the vector unit status
> - different encoding of the vsetvli instruction
> - different instructions for loads and stores
>
> And a fixed VLEN of 128.
>
> The in-kernel access to vector instances is limited to the save and
> restore of process states so the above mentioned areas can simply be
> handled via the alternatives framework, similar to other T-Head specific
> issues.
>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> ---
>  arch/riscv/Kconfig.erratas           |  13 +++
>  arch/riscv/errata/thead/errata.c     |  32 ++++++
>  arch/riscv/include/asm/csr.h         |  26 ++++-
>  arch/riscv/include/asm/errata_list.h |  62 +++++++++++-
>  arch/riscv/include/asm/vector.h      | 139 +++++++++++++++++++++++++--
>  5 files changed, 256 insertions(+), 16 deletions(-)
>
> diff --git a/arch/riscv/Kconfig.erratas b/arch/riscv/Kconfig.erratas
> index 69621ae6d647..624cefc9fcd7 100644
> --- a/arch/riscv/Kconfig.erratas
> +++ b/arch/riscv/Kconfig.erratas
> @@ -79,4 +79,17 @@ config ERRATA_THEAD_PMU
>
>           If you don't know what to do here, say "Y".
>
> +config ERRATA_THEAD_VECTOR
> +       bool "Apply T-Head Vector errata"
> +       depends on ERRATA_THEAD && RISCV_ISA_V
> +       default y
> +       help
> +         The T-Head C9xx cores implement an earlier version 0.7.1
> +         of the vector extensions.
> +
> +         This will apply the necessary errata to handle the non-standard
> +         behaviour via when switch to and from vector mode for processes.
> +
> +         If you don't know what to do here, say "Y".
> +
>  endmenu # "CPU errata selection"
> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> index fac5742d1c1e..55b3aaa2468a 100644
> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c
> @@ -12,6 +12,7 @@
>  #include <asm/cacheflush.h>
>  #include <asm/errata_list.h>
>  #include <asm/patch.h>
> +#include <asm/vector.h>
>  #include <asm/vendorid_list.h>
>
>  static bool errata_probe_pbmt(unsigned int stage,
> @@ -63,6 +64,34 @@ static bool errata_probe_pmu(unsigned int stage,
>         return true;
>  }
>
> +static bool errata_probe_vector(unsigned int stage,
> +                               unsigned long arch_id, unsigned long impid)
> +{
> +       if (!IS_ENABLED(CONFIG_ERRATA_THEAD_VECTOR))
> +               return false;
> +
> +       /* target-c9xx cores report arch_id and impid as 0 */
> +       if (arch_id != 0 || impid != 0)
> +               return false;
> +
> +       if (stage == RISCV_ALTERNATIVES_EARLY_BOOT) {
> +               /*
> +                * Disable VECTOR to detect illegal usage of vector in kernel.
> +                * This is normally done in _start_kernel but with the
> +                * vector-1.0 SR_VS bits. VS is using [24:23] on T-Head's
> +                * vector-0.7.1 and the vector-1.0-bits are unused there.
> +                */
> +               csr_clear(CSR_STATUS, SR_VS_THEAD);
> +               return false;
> +       }
> +
> +       /* let has_vector() return true and set the static vlen */
> +       static_branch_enable(&riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_VECTOR]);
> +       riscv_vsize = 128 / 8 * 32;
> +
> +       return true;
> +}
> +
>  static u32 thead_errata_probe(unsigned int stage,
>                               unsigned long archid, unsigned long impid)
>  {
> @@ -77,6 +106,9 @@ static u32 thead_errata_probe(unsigned int stage,
>         if (errata_probe_pmu(stage, archid, impid))
>                 cpu_req_errata |= BIT(ERRATA_THEAD_PMU);
>
> +       if (errata_probe_vector(stage, archid, impid))
> +               cpu_req_errata |= BIT(ERRATA_THEAD_VECTOR);
> +
>         return cpu_req_errata;
>  }
>
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index 8b06f2472915..8d16c11487aa 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -24,11 +24,27 @@
>  #define SR_FS_CLEAN    _AC(0x00004000, UL)
>  #define SR_FS_DIRTY    _AC(0x00006000, UL)
>
> -#define SR_VS           _AC(0x00000600, UL) /* Vector Status */
> -#define SR_VS_OFF       _AC(0x00000000, UL)
> -#define SR_VS_INITIAL   _AC(0x00000200, UL)
> -#define SR_VS_CLEAN     _AC(0x00000400, UL)
> -#define SR_VS_DIRTY     _AC(0x00000600, UL)
> +#define SR_VS_OFF              _AC(0x00000000, UL)
> +
> +#define SR_VS_1_0              _AC(0x00000600, UL) /* Vector Status */
> +#define SR_VS_INITIAL_1_0      _AC(0x00000200, UL)
> +#define SR_VS_CLEAN_1_0                _AC(0x00000400, UL)
> +#define SR_VS_DIRTY_1_0                _AC(0x00000600, UL)
> +
> +#define SR_VS_THEAD            _AC(0x01800000, UL) /* Vector Status */
> +#define SR_VS_INITIAL_THEAD    _AC(0x00800000, UL)
> +#define SR_VS_CLEAN_THEAD      _AC(0x01000000, UL)
> +#define SR_VS_DIRTY_THEAD      _AC(0x01800000, UL)
> +
> +/*
> + * Always default to vector-1.0 handling in assembly and let the broken
> + * implementations handle their case separately.
> + */
> +#ifdef __ASSEMBLY__
> +#define SR_VS                  SR_VS_1_0
> +#else
> +
> +#endif
>
>  #define SR_XS          _AC(0x00018000, UL) /* Extension Status */
>  #define SR_XS_OFF      _AC(0x00000000, UL)
> diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> index 95e626b7281e..3f93cdd1599f 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -19,7 +19,8 @@
>  #define        ERRATA_THEAD_PBMT 0
>  #define        ERRATA_THEAD_CMO 1
>  #define        ERRATA_THEAD_PMU 2
> -#define        ERRATA_THEAD_NUMBER 3
> +#define        ERRATA_THEAD_VECTOR 3
> +#define        ERRATA_THEAD_NUMBER 4
>  #endif
>
>  #define        CPUFEATURE_SVPBMT 0
> @@ -157,6 +158,65 @@ asm volatile(ALTERNATIVE(                                          \
>         : "=r" (__ovl) :                                                \
>         : "memory")
>
> +#ifdef CONFIG_ERRATA_THEAD_PBMT
> +
> +#define ALT_VS_SHIFT 61
> +#define ALT_THEAD_VS_SHIFT 59
> +#define ALT_THEAD_SR_VS(_val, _vs)                                     \
> +asm(ALTERNATIVE(  "li %0, %1\t\nslli %0,%0,%3",                                \
> +                 "li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID,        \
> +                       ERRATA_THEAD_PBMT, CONFIG_ERRATA_THEAD_PBMT)    \
> +               : "=r"(_val)                                            \
> +               : "I"(prot##_MAIN >> ALT_SVPBMT_SHIFT),         \
> +                 "I"(prot##_THEAD >> ALT_THEAD_PBMT_SHIFT),            \
> +                 "I"(ALT_SVPBMT_SHIFT),                                \
> +                 "I"(ALT_THEAD_PBMT_SHIFT))
> +#else
> +#define ALT_THEAD_SR_VS(_val ## _MAIN)
> +#endif
> +
> +#ifdef CONFIG_ERRATA_THEAD_VECTOR
> +
> +#define THEAD_C9XX_CSR_VXSAT                   0x9
> +#define THEAD_C9XX_CSR_VXRM                    0xa
> +
> +/*
> + * Vector 0.7.1 as used for example on T-Head Xuantie cores, uses an older
> + * encoding for vsetvli (ta, ma vs. d1), so provide an instruction for
> + * vsetvli     t4, x0, e8, m8, d1
> + */
> +#define THEAD_VSETVLI_T4X0E8M8D1       ".long  0x00307ed7\n\t"
> +
> +/*
> + * While in theory, the vector-0.7.1 vsb.v and vlb.v result in the same
> + * encoding as the standard vse8.v and vle8.v, compilers seem to optimize
> + * the call resulting in a different encoding and then using a value for
> + * the "mop" field that is not part of vector-0.7.1
> + * So encode specific variants for vstate_save and _restore.
> + */
> +#define THEAD_VSB_V_V0T0               ".long  0x02028027\n\t"
> +#define THEAD_VSB_V_V8T0               ".long  0x02028427\n\t"
> +#define THEAD_VSB_V_V16T0              ".long  0x02028827\n\t"
> +#define THEAD_VSB_V_V24T0              ".long  0x02028c27\n\t"
> +#define THEAD_VLB_V_V0T0               ".long  0x012028007\n\t"
> +#define THEAD_VLB_V_V8T0               ".long  0x012028407\n\t"
> +#define THEAD_VLB_V_V16T0              ".long  0x012028807\n\t"
> +#define THEAD_VLB_V_V24T0              ".long  0x012028c07\n\t"
> +
> +#define ALT_SR_VS_VECTOR_1_0_SHIFT     9
> +#define ALT_SR_VS_THEAD_SHIFT          23
> +
> +#define ALT_SR_VS(_val, prot)                                          \
> +asm(ALTERNATIVE("li %0, %1\t\nslli %0,%0,%3",                          \
> +               "li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID,          \
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)        \
> +               : "=r"(_val)                                            \
> +               : "I"(prot##_1_0 >> ALT_SR_VS_VECTOR_1_0_SHIFT),        \
> +                 "I"(prot##_THEAD >> ALT_SR_VS_THEAD_SHIFT),           \
> +                 "I"(ALT_SR_VS_VECTOR_1_0_SHIFT),                      \
> +                 "I"(ALT_SR_VS_THEAD_SHIFT))
> +#endif /* CONFIG_ERRATA_THEAD_VECTOR */
> +
>  #endif /* __ASSEMBLY__ */
>
>  #endif
> diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> index ad9e6161dd89..ad91f783316e 100644
> --- a/arch/riscv/include/asm/vector.h
> +++ b/arch/riscv/include/asm/vector.h
> @@ -15,6 +15,55 @@
>  #include <asm/hwcap.h>
>  #include <asm/csr.h>
>  #include <asm/asm.h>
> +#include <asm/errata_list.h>
> +
> +#ifdef CONFIG_ERRATA_THEAD_VECTOR
> +
> +static inline u32 riscv_sr_vs(void)
> +{
> +       u32 val;
> +
> +       ALT_SR_VS(val, SR_VS);
> +       return val;
> +}
> +
> +static inline u32 riscv_sr_vs_initial(void)
> +{
> +       u32 val;
> +
> +       ALT_SR_VS(val, SR_VS_INITIAL);
> +       return val;
> +}
> +
> +static inline u32 riscv_sr_vs_clean(void)
> +{
> +       u32 val;
> +
> +       ALT_SR_VS(val, SR_VS_CLEAN);
> +       return val;
> +}
> +
> +static inline u32 riscv_sr_vs_dirty(void)
> +{
> +       u32 val;
> +
> +       ALT_SR_VS(val, SR_VS_DIRTY);
> +       return val;
> +}
> +
> +#define SR_VS          riscv_sr_vs()
> +#define SR_VS_INITIAL  riscv_sr_vs_initial()
> +#define SR_VS_CLEAN    riscv_sr_vs_clean()
> +#define SR_VS_DIRTY    riscv_sr_vs_dirty()
> +
> +#else /* CONFIG_ERRATA_THEAD_VECTOR */
> +
> +#define SR_VS          SR_VS_1_0
> +#define SR_VS_INITIAL  SR_VS_INITIAL_1_0
> +#define SR_VS_CLEAN    SR_VS_CLEAN_1_0
> +#define SR_VS_DIRTY    SR_VS_DIRTY_1_0
> +
> +#endif /* CONFIG_ERRATA_THEAD_VECTOR */
>
>  #define CSR_STR(x) __ASM_STR(x)
>
> @@ -29,6 +78,7 @@ static __always_inline bool has_vector(void)
>  static inline void __vstate_clean(struct pt_regs *regs)
>  {
>         regs->status = (regs->status & ~(SR_VS)) | SR_VS_CLEAN;
> +
>  }
>
>  static inline void vstate_off(struct pt_regs *regs)
> @@ -58,30 +108,75 @@ static __always_inline void rvv_disable(void)
>
>  static __always_inline void __vstate_csr_save(struct __riscv_v_state *dest)
>  {
> -       asm volatile (
> +       register u32 t1 asm("t1") = (SR_FS);
> +
> +       /*
> +        * CSR_VCSR is defined as
> +        * [2:1] - vxrm[1:0]
> +        * [0] - vxsat
> +        * The earlier vector spec implemented by T-Head uses separate
> +        * registers for the same bit-elements, so just combine those
> +        * into the existing output field.
> +        *
> +        * Additionally T-Head cores need FS to be enabled when accessing
> +        * the VXRM and VXSAT CSRs, otherwise ending in illegal instructions.
> +        */
> +       asm volatile (ALTERNATIVE(
>                 "csrr   %0, " CSR_STR(CSR_VSTART) "\n\t"
>                 "csrr   %1, " CSR_STR(CSR_VTYPE) "\n\t"
>                 "csrr   %2, " CSR_STR(CSR_VL) "\n\t"
>                 "csrr   %3, " CSR_STR(CSR_VCSR) "\n\t"
> +               __nops(5),
> +               "csrs   sstatus, t1\n\t"
> +               "csrr   %0, " CSR_STR(CSR_VSTART) "\n\t"
> +               "csrr   %1, " CSR_STR(CSR_VTYPE) "\n\t"
> +               "csrr   %2, " CSR_STR(CSR_VL) "\n\t"
> +               "csrr   %3, " CSR_STR(THEAD_C9XX_CSR_VXRM) "\n\t"
> +               "slliw  %3, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> +               "csrr   t4, " CSR_STR(THEAD_C9XX_CSR_VXSAT) "\n\t"
> +               "or     %3, %3, t4\n\t"
> +               "csrc   sstatus, t1\n\t",
> +               THEAD_VENDOR_ID,
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
>                 : "=r" (dest->vstart), "=r" (dest->vtype), "=r" (dest->vl),
> -                 "=r" (dest->vcsr) : :);
> +                 "=r" (dest->vcsr) : "r"(t1) : "t4");
>  }
>
>  static __always_inline void __vstate_csr_restore(struct __riscv_v_state *src)
>  {
> -       asm volatile (
> +       register u32 t1 asm("t1") = (SR_FS);
> +
> +       /*
> +        * Similar to __vstate_csr_save above, restore values for the
> +        * separate VXRM and VXSAT CSRs from the vcsr variable.
> +        */
> +       asm volatile (ALTERNATIVE(
>                 "vsetvl  x0, %2, %1\n\t"
>                 "csrw   " CSR_STR(CSR_VSTART) ", %0\n\t"
>                 "csrw   " CSR_STR(CSR_VCSR) ", %3\n\t"
> +               __nops(6),
> +               "csrs   sstatus, t1\n\t"
> +               "vsetvl  x0, %2, %1\n\t"
> +               "csrw   " CSR_STR(CSR_VSTART) ", %0\n\t"
> +               "srliw  t4, %3, " CSR_STR(VCSR_VXRM_SHIFT) "\n\t"
> +               "andi   t4, t4, " CSR_STR(VCSR_VXRM_MASK) "\n\t"
> +               "csrw   " CSR_STR(THEAD_C9XX_CSR_VXRM) ", t4\n\t"
> +               "andi   %3, %3, " CSR_STR(VCSR_VXSAT_MASK) "\n\t"
> +               "csrw   " CSR_STR(THEAD_C9XX_CSR_VXSAT) ", %3\n\t"
> +               "csrc   sstatus, t1\n\t",
> +               THEAD_VENDOR_ID,
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
>                 : : "r" (src->vstart), "r" (src->vtype), "r" (src->vl),
> -                   "r" (src->vcsr) :);
> +                   "r" (src->vcsr), "r"(t1): "t4");
>  }

Hi Heiko,

Just for my understanding. Here you're adding 5 and 6 nops to the (in
the future) common case? If so, then why not use a static branch? That
should only add 1 nop and be easier to read/understand.

/Emil

>  static inline void __vstate_save(struct __riscv_v_state *save_to, void *datap)
>  {
>         rvv_enable();
>         __vstate_csr_save(save_to);
> -       asm volatile (
> +
> +       asm volatile (ALTERNATIVE(
> +               "nop\n\t"
>                 "vsetvli        t4, x0, e8, m8, ta, ma\n\t"
>                 "vse8.v         v0, (%0)\n\t"
>                 "add            %0, %0, t4\n\t"
> @@ -89,8 +184,18 @@ static inline void __vstate_save(struct __riscv_v_state *save_to, void *datap)
>                 "add            %0, %0, t4\n\t"
>                 "vse8.v         v16, (%0)\n\t"
>                 "add            %0, %0, t4\n\t"
> -               "vse8.v         v24, (%0)\n\t"
> -               : : "r" (datap) : "t4", "memory");
> +               "vse8.v         v24, (%0)\n\t",
> +               "mv             t0, %0\n\t"
> +               THEAD_VSETVLI_T4X0E8M8D1
> +               THEAD_VSB_V_V0T0
> +               "addi           t0, t0, 128\n\t"
> +               THEAD_VSB_V_V8T0
> +               "addi           t0, t0, 128\n\t"
> +               THEAD_VSB_V_V16T0
> +               "addi           t0, t0, 128\n\t"
> +               THEAD_VSB_V_V24T0, THEAD_VENDOR_ID,
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
> +               : : "r" (datap) : "t0", "t4", "memory");
>         rvv_disable();
>  }
>
> @@ -98,7 +203,9 @@ static inline void __vstate_restore(struct __riscv_v_state *restore_from,
>                                     void *datap)
>  {
>         rvv_enable();
> -       asm volatile (
> +
> +       asm volatile (ALTERNATIVE(
> +               "nop\n\t"
>                 "vsetvli        t4, x0, e8, m8, ta, ma\n\t"
>                 "vle8.v         v0, (%0)\n\t"
>                 "add            %0, %0, t4\n\t"
> @@ -106,8 +213,20 @@ static inline void __vstate_restore(struct __riscv_v_state *restore_from,
>                 "add            %0, %0, t4\n\t"
>                 "vle8.v         v16, (%0)\n\t"
>                 "add            %0, %0, t4\n\t"
> -               "vle8.v         v24, (%0)\n\t"
> -               : : "r" (datap) : "t4");
> +               "vle8.v         v24, (%0)\n\t",
> +
> +               "mv             t0, %0\n\t"
> +               THEAD_VSETVLI_T4X0E8M8D1
> +               THEAD_VLB_V_V0T0
> +               "addi           t0, t0, 128\n\t"
> +               THEAD_VLB_V_V8T0
> +               "addi           %0, %0, 128\n\t"
> +               THEAD_VLB_V_V16T0
> +               "addi           %0, %0, 128\n\t"
> +               THEAD_VLB_V_V24T0, THEAD_VENDOR_ID,
> +               ERRATA_THEAD_VECTOR, CONFIG_ERRATA_THEAD_VECTOR)
> +               : : "r" (datap) : "t0", "t4");
> +
>         __vstate_csr_restore(restore_from);
>         rvv_disable();
>  }
> --
> 2.39.0
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-06-23  3:06           ` Stefan O'Rear
@ 2023-06-23 10:22             ` Heiko Stübner
  2023-06-23 23:26               ` Heiko Stübner
  0 siblings, 1 reply; 21+ messages in thread
From: Heiko Stübner @ 2023-06-23 10:22 UTC (permalink / raw)
  To: Palmer Dabbelt, Stefan O'Rear
  Cc: linux-riscv, samuel, guoren, christoph.muellner, conor.dooley,
	linux-kernel

Am Freitag, 23. Juni 2023, 05:06:44 CEST schrieb Stefan O'Rear:
> On Thu, Jun 22, 2023, at 4:35 PM, Heiko Stübner wrote:
> > Am Donnerstag, 22. Juni 2023, 20:58:37 CEST schrieb Stefan O'Rear:
> >> Are you aware of "3.7. Vector Fixed-Point Fields in fcsr" in
> >> riscv-v-spec-0.7.1.pdf?
> >
> > oh wow, thanks a lot for that pointer, now I understand your concern.
> >
> > So in vector-0.7.1 fcsr[10:9] mirrors vxrm and fcsr[8] mirrors vxsat.
> >
> >
> > On a positive note, the T-Head cores seem to not implement the full
> > vector 0.7.1 specification after all, in the documentation I have [0]
> > fcsr[31:8] are declared as "0" and uppermost bits are [7:5] for the "frm"
> > field.
> 
> Given that the pdf you linked does not mention any vector CSRs, I am not
> confident that it provides a complete and accurate description of vector
> functionality in other registers for the C906 with vector extension.
> 
> Assuming that you have access to such a chip, I would be much happier with
> the proposed "just a comment" approach if our understanding of the behavior
> were confirmed on hardware (specifically: csr_write(CSR_FCSR, 0x700) should
> not affect csr_read(CSR_VXRM) or csr_read(CSR_VXSAT)).

For one, you're right that I should definitly try to confirm this on hardware :-) .

But also the VXSAT and VXRM CSRs are actually documented in that pdf
on page 335


Heiko


> 
> -s
> 
> > So I guess a code comment should suffice to explain :-)
> >
> >
> > Regards
> > Heiko
> >
> >
> > [0] 
> > https://github.com/T-head-Semi/openc906/blob/main/doc/%E7%8E%84%E9%93%81C906%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C.pdf
> > 16.3.1.3 浮点控制状态寄存器(FCSR) on page 334
> 





_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-06-23 10:22             ` Heiko Stübner
@ 2023-06-23 23:26               ` Heiko Stübner
  2023-06-24  3:23                 ` Stefan O'Rear
  0 siblings, 1 reply; 21+ messages in thread
From: Heiko Stübner @ 2023-06-23 23:26 UTC (permalink / raw)
  To: Palmer Dabbelt, Stefan O'Rear
  Cc: linux-riscv, samuel, guoren, christoph.muellner, conor.dooley,
	linux-kernel

Am Freitag, 23. Juni 2023, 12:22:35 CEST schrieb Heiko Stübner:
> Am Freitag, 23. Juni 2023, 05:06:44 CEST schrieb Stefan O'Rear:
> > On Thu, Jun 22, 2023, at 4:35 PM, Heiko Stübner wrote:
> > > Am Donnerstag, 22. Juni 2023, 20:58:37 CEST schrieb Stefan O'Rear:
> > >> Are you aware of "3.7. Vector Fixed-Point Fields in fcsr" in
> > >> riscv-v-spec-0.7.1.pdf?
> > >
> > > oh wow, thanks a lot for that pointer, now I understand your concern.
> > >
> > > So in vector-0.7.1 fcsr[10:9] mirrors vxrm and fcsr[8] mirrors vxsat.
> > >
> > >
> > > On a positive note, the T-Head cores seem to not implement the full
> > > vector 0.7.1 specification after all, in the documentation I have [0]
> > > fcsr[31:8] are declared as "0" and uppermost bits are [7:5] for the "frm"
> > > field.
> > 
> > Given that the pdf you linked does not mention any vector CSRs, I am not
> > confident that it provides a complete and accurate description of vector
> > functionality in other registers for the C906 with vector extension.
> > 
> > Assuming that you have access to such a chip, I would be much happier with
> > the proposed "just a comment" approach if our understanding of the behavior
> > were confirmed on hardware (specifically: csr_write(CSR_FCSR, 0x700) should
> > not affect csr_read(CSR_VXRM) or csr_read(CSR_VXSAT)).
> 
> For one, you're right that I should definitly try to confirm this on hardware :-) .

ok, so now I know the documentation is wrong.

	before, vxrm 0x0, vxsat 0x0
writing the 0x700 to fcsr
	after, vxrm 0x3, vxsat 0x1

Essentially the link between the CSRs really is there - oh fun.
So we're back at your original concern - sadly.

I guess I need to figure out how to not have this stuff break
because relying on the fpu parts to handle feels not correct
at first glance.


Heiko



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling
  2023-06-23 23:26               ` Heiko Stübner
@ 2023-06-24  3:23                 ` Stefan O'Rear
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan O'Rear @ 2023-06-24  3:23 UTC (permalink / raw)
  To: Heiko Stuebner, Palmer Dabbelt
  Cc: linux-riscv, samuel, guoren, christoph.muellner, conor.dooley,
	linux-kernel

On Fri, Jun 23, 2023, at 7:26 PM, Heiko Stübner wrote:
> Am Freitag, 23. Juni 2023, 12:22:35 CEST schrieb Heiko Stübner:
>> Am Freitag, 23. Juni 2023, 05:06:44 CEST schrieb Stefan O'Rear:
>> > On Thu, Jun 22, 2023, at 4:35 PM, Heiko Stübner wrote:
>> > > Am Donnerstag, 22. Juni 2023, 20:58:37 CEST schrieb Stefan O'Rear:
>> > >> Are you aware of "3.7. Vector Fixed-Point Fields in fcsr" in
>> > >> riscv-v-spec-0.7.1.pdf?
>> > >
>> > > oh wow, thanks a lot for that pointer, now I understand your concern.
>> > >
>> > > So in vector-0.7.1 fcsr[10:9] mirrors vxrm and fcsr[8] mirrors vxsat.
>> > >
>> > >
>> > > On a positive note, the T-Head cores seem to not implement the full
>> > > vector 0.7.1 specification after all, in the documentation I have [0]
>> > > fcsr[31:8] are declared as "0" and uppermost bits are [7:5] for the "frm"
>> > > field.
>> > 
>> > Given that the pdf you linked does not mention any vector CSRs, I am not
>> > confident that it provides a complete and accurate description of vector
>> > functionality in other registers for the C906 with vector extension.
>> > 
>> > Assuming that you have access to such a chip, I would be much happier with
>> > the proposed "just a comment" approach if our understanding of the behavior
>> > were confirmed on hardware (specifically: csr_write(CSR_FCSR, 0x700) should
>> > not affect csr_read(CSR_VXRM) or csr_read(CSR_VXSAT)).
>> 
>> For one, you're right that I should definitly try to confirm this on hardware :-) .
>
> ok, so now I know the documentation is wrong.
>
> 	before, vxrm 0x0, vxsat 0x0
> writing the 0x700 to fcsr
> 	after, vxrm 0x3, vxsat 0x1
>
> Essentially the link between the CSRs really is there - oh fun.
> So we're back at your original concern - sadly.
>
> I guess I need to figure out how to not have this stuff break
> because relying on the fpu parts to handle feels not correct
> at first glance.

I don't see a conceptual problem here.

There are a large number of vector instructions that access the floating point
state (frm, fflags, f{0..31}).  These instructions require a valid floating
point context, sstatus.FS>0, trap if sstatus.FS=0, and set sstatus.FS=3 if they
change anything.  vfadd, vfsub, vfmul, vfdiv, vfmv, etc, etc in both 0.7.1 and 1.0.

0.7.1 extends the floating point state to include vxrm and vxsat and adds vaaddu,
vnclip, vsmul, etc to the list of instructions which access both floating point and
vector state.  This breaks floating-point emulation (if a core provides integer
vectors in hardware, it provides a fcsr register with three writeable bits, which
means that accesses to fcsr won't trap and can't be emulated to provide access to
the software frm and fflags), which is probably why the behavior was changed in 1.0,
but is otherwise internally consistent.

You can continue to treat "fpu parts" and "vector parts" as independent as long as
you recognize vxrm and vxsat as _exclusively_ owned by the "fpu parts".

Access to vxrm and vxsat should be exclusively controlled by sstatus.FS since they
are aliases for fields in fcsr, while access to vstart/vtype/vlen should be
exclusively controlled by sstatus.VS.  It would be good to test this, since it's not
actually in the spec (risc-v has a rather systematic underspecification problem),
and T-Head's idea of obviously correct behavior may differ from mine.

1.0 puts vxrm and vxsat into the vector state, controlled by sstatus.VS; vsmul now
works on the vector state only, so as far as state management is concerned it now
acts like vadd and not like vfadd.  This is also internally consistent, but vxcsr
must be owned by whoever manages sstatus.VS.

It's a little bit weird to say "vxrm and vxsat live in different parts of the kernel
state depending on V extension version" but it might be less weird to say "fcsr and
all its parts lives somewhere, vxcsr and all its parts lives somewhere else,
vxcsr doesn't exist in 0.7.1".  Is this adequate?

-s

>
> Heiko

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-06-24  3:24 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-28 21:54 [PATCH RFC 0/2] RISC-V: T-Head vector handling Heiko Stuebner
2023-02-28 21:54 ` [PATCH RFC 1/2] RISC-V: define the elements of the VCSR vector CSR Heiko Stuebner
2023-03-01  2:22   ` Guo Ren
2023-03-15 18:31   ` Conor Dooley
2023-02-28 21:54 ` [PATCH RFC 2/2] RISC-V: add T-Head vector errata handling Heiko Stuebner
2023-03-01  2:12   ` Guo Ren
2023-03-15 18:56   ` Conor Dooley
2023-06-13  6:35   ` Stefan O'Rear
2023-06-22 17:39     ` Heiko Stübner
2023-06-22 18:58       ` Stefan O'Rear
2023-06-22 20:35         ` Heiko Stübner
2023-06-23  3:06           ` Stefan O'Rear
2023-06-23 10:22             ` Heiko Stübner
2023-06-23 23:26               ` Heiko Stübner
2023-06-24  3:23                 ` Stefan O'Rear
2023-06-23  9:12   ` Emil Renner Berthing
2023-03-01  2:21 ` [PATCH RFC 0/2] RISC-V: T-Head vector handling Guo Ren
2023-03-15  5:29 ` Palmer Dabbelt
2023-03-15  6:31   ` Heiko Stuebner
2023-06-12 15:29   ` Palmer Dabbelt
2023-06-12 15:44     ` Heiko Stübner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).