* [RFC 0/2] riscv: Idle thread using Zawrs extension
@ 2024-04-18 11:49 ` Xu Lu
  0 siblings, 0 replies; 31+ messages in thread
From: Xu Lu @ 2024-04-18 11:49 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, andy.chiu, guoren
  Cc: linux-riscv, linux-kernel, lihangjing, dengliang.1214, xieyongji,
	chaiwen.cc, Xu Lu

This patch series introduces a new implementation of the idle thread
using the Zawrs extension.

The Zawrs[0] extension introduces two new instructions named WRS.STO and
WRS.NTO in RISC-V. When software registers a reservation set using LR
instruction, a subsequent WRS.STO or WRS.NTO instruction will cause the
hart to stall in a low-power state until a store happens to the
reservation set or an interrupt becomes pending. The difference between
these two instructions is that WRS.STO will terminate stall after an
implementation-defined timeout while WRS.NTO won't.
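
As an illustration only (not code from this series), the wait pattern
described above can be modeled in portable user-space C. Everything
prefixed with model_ is a hypothetical name, and model_wrs_nto() is a
stand-in that merely simulates another hart storing to the watched word;
on real hardware that step would be the LR + WRS.NTO pair. The loop
re-checks the value after every wakeup, since a stall can end for reasons
other than the store the waiter is interested in (for example a pending
interrupt).

```c
#include <stdatomic.h>

static atomic_int model_word;   /* the watched memory location */
static int model_stalls;        /* how many times the modeled hart stalled */

/*
 * Hypothetical stand-in for "LR + WRS.NTO". A real hart would stall here
 * until a store hits the reservation set or an interrupt becomes pending.
 * In this model, the third stall plays the role of a remote store.
 */
static void model_wrs_nto(atomic_int *addr)
{
	if (++model_stalls == 3)
		atomic_store_explicit(addr, 1, memory_order_release);
}

/* Wait until *addr no longer holds 'expected', then return the new value. */
static int model_wait_for_change(atomic_int *addr, int expected)
{
	int seen;

	while ((seen = atomic_load_explicit(addr, memory_order_acquire)) == expected)
		model_wrs_nto(addr);

	return seen;
}
```

Calling model_wait_for_change(&model_word, 0) stalls three times in this
model before the simulated store makes the loop exit.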

This patch series implements the idle thread using the WRS.NTO instruction.
In addition, we found there is no need to send a real IPI to wake up an
idle CPU. Instead, we write the IPI information to the reservation set of
the idle CPU, which wakes it up and lets it handle the IPI quickly without
going through the traditional interrupt handling routine.

[0] https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc

Xu Lu (2):
  riscv: process: Introduce idle thread using Zawrs extension
  riscv: Use Zawrs to accelerate IPI to idle cpu

 arch/riscv/Kconfig                 |  24 +++++++
 arch/riscv/include/asm/cpuidle.h   |  11 +---
 arch/riscv/include/asm/hwcap.h     |   1 +
 arch/riscv/include/asm/processor.h |  31 +++++++++
 arch/riscv/include/asm/smp.h       |  14 ++++
 arch/riscv/kernel/cpu.c            |   5 ++
 arch/riscv/kernel/cpufeature.c     |   1 +
 arch/riscv/kernel/process.c        | 102 ++++++++++++++++++++++++++++-
 arch/riscv/kernel/smp.c            |  39 +++++++----
 9 files changed, 205 insertions(+), 23 deletions(-)

-- 
2.20.1



* [RFC 1/2] riscv: process: Introduce idle thread using Zawrs extension
  2024-04-18 11:49 ` Xu Lu
@ 2024-04-18 11:49   ` Xu Lu
  -1 siblings, 0 replies; 31+ messages in thread
From: Xu Lu @ 2024-04-18 11:49 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, andy.chiu, guoren
  Cc: linux-riscv, linux-kernel, lihangjing, dengliang.1214, xieyongji,
	chaiwen.cc, Xu Lu

The Zawrs extension introduces a new instruction, WRS.NTO, which causes
the hart to temporarily stall execution in a low-power state until a store
occurs to the reservation set (registered by a preceding LR instruction)
or an interrupt is observed.

This commit implements a new version of the idle thread for RISC-V using
the Zawrs extension.
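
For readers unfamiliar with the dispatch approach, here is a minimal
user-space sketch of the selection logic, assuming a plain function
pointer in place of the kernel's static_call machinery. It is not code
from this patch; all model_ names are hypothetical, and the two idle
routines only count invocations instead of executing WFI or WRS.NTO.

```c
#include <stdbool.h>

typedef void (*model_idle_fn)(void);

static int model_wfi_calls;   /* times the WFI-style routine ran */
static int model_wrs_calls;   /* times the WRS-style routine ran */

/* Stand-in for the WFI-based idle routine. */
static void model_default_idle(void)
{
	model_wfi_calls++;
}

/* Stand-in for the WRS.NTO-based idle routine. */
static void model_wrs_idle(void)
{
	model_wrs_calls++;
}

/* Starts out pointing at the WFI-style routine as a safe default. */
static model_idle_fn model_riscv_idle = model_default_idle;

/* Pick the idle routine once, from the build option and the CPU feature. */
static void model_select_idle_routine(bool config_enabled, bool has_zawrs)
{
	if (config_enabled && has_zawrs)
		model_riscv_idle = model_wrs_idle;
	else
		model_riscv_idle = model_default_idle;
}

/* The idle entry point simply dispatches through the selected routine. */
static void model_cpu_do_idle(void)
{
	model_riscv_idle();
}
```

One design point this sketch makes visible: initializing the pointer to
the default routine means an idle call issued before selection still has
a valid target.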

Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Reviewed-by: Hangjing Li <lihangjing@bytedance.com>
Reviewed-by: Liang Deng <dengliang.1214@bytedance.com>
Reviewed-by: Wen Chai <chaiwen.cc@bytedance.com>
---
 arch/riscv/Kconfig                 | 24 +++++++++++++++++
 arch/riscv/include/asm/cpuidle.h   | 11 +-------
 arch/riscv/include/asm/hwcap.h     |  1 +
 arch/riscv/include/asm/processor.h | 17 +++++++++++++
 arch/riscv/kernel/cpu.c            |  5 ++++
 arch/riscv/kernel/cpufeature.c     |  1 +
 arch/riscv/kernel/process.c        | 41 +++++++++++++++++++++++++++++-
 7 files changed, 89 insertions(+), 11 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index be09c8836d56..a0d344e9803f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -19,6 +19,7 @@ config RISCV
 	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
 	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
 	select ARCH_HAS_BINFMT_FLAT
+	select ARCH_HAS_CPU_FINALIZE_INIT
 	select ARCH_HAS_CURRENT_STACK_POINTER
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
 	select ARCH_HAS_DEBUG_VM_PGTABLE
@@ -525,6 +526,20 @@ config RISCV_ISA_SVPBMT
 
 	   If you don't know what to do here, say Y.
 
+config RISCV_ISA_ZAWRS
+	bool "Zawrs extension support for wait-on-reservation-set instructions"
+	depends on RISCV_ALTERNATIVE
+	default y
+	help
+	   Adds support to dynamically detect the presence of the Zawrs
+	   extension and enable its usage.
+
+	   The Zawrs extension defines a pair of instructions to be used
+	   in polling loops that allows a core to enter a low-power state
+	   and wait on a store to a memory location.
+
+	   If you don't know what to do here, say Y.
+
 config TOOLCHAIN_HAS_V
 	bool
 	default y
@@ -1075,6 +1090,15 @@ endmenu # "Power management options"
 
 menu "CPU Power Management"
 
+config RISCV_ZAWRS_IDLE
+	bool "Idle thread using ZAWRS extensions"
+	depends on RISCV_ISA_ZAWRS
+	default y
+	help
+		Adds support to implement idle thread using ZAWRS extension.
+
+		If you don't know what to do here, say Y.
+
 source "drivers/cpuidle/Kconfig"
 
 source "drivers/cpufreq/Kconfig"
diff --git a/arch/riscv/include/asm/cpuidle.h b/arch/riscv/include/asm/cpuidle.h
index 71fdc607d4bc..94c9ecb46571 100644
--- a/arch/riscv/include/asm/cpuidle.h
+++ b/arch/riscv/include/asm/cpuidle.h
@@ -10,15 +10,6 @@
 #include <asm/barrier.h>
 #include <asm/processor.h>
 
-static inline void cpu_do_idle(void)
-{
-	/*
-	 * Add mb() here to ensure that all
-	 * IO/MEM accesses are completed prior
-	 * to entering WFI.
-	 */
-	mb();
-	wait_for_interrupt();
-}
+void cpu_do_idle(void);
 
 #endif
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index e17d0078a651..5b358c3cf212 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -81,6 +81,7 @@
 #define RISCV_ISA_EXT_ZTSO		72
 #define RISCV_ISA_EXT_ZACAS		73
 #define RISCV_ISA_EXT_XANDESPMU		74
+#define RISCV_ISA_EXT_ZAWRS		75
 
 #define RISCV_ISA_EXT_XLINUXENVCFG	127
 
diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 0faf5f161f1e..1143367de8c6 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -157,6 +157,21 @@ static inline void wait_for_interrupt(void)
 	__asm__ __volatile__ ("wfi");
 }
 
+static inline void wrs_nto(unsigned long *addr)
+{
+	unsigned long val;
+
+	__asm__ __volatile__(
+#ifdef CONFIG_64BIT
+			"lr.d %[p], %[v] \n\t"
+#else
+			"lr.w %[p], %[v] \n\t"
+#endif
+			".long 0x00d00073 \n\t"
+			: [p] "=&r" (val), [v] "+A" (*addr)
+			: : "memory");
+}
+
 extern phys_addr_t dma32_phys_limit;
 
 struct device_node;
@@ -183,6 +198,8 @@ extern int set_unalign_ctl(struct task_struct *tsk, unsigned int val);
 #define GET_UNALIGN_CTL(tsk, addr)	get_unalign_ctl((tsk), (addr))
 #define SET_UNALIGN_CTL(tsk, val)	set_unalign_ctl((tsk), (val))
 
+extern void select_idle_routine(void);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_RISCV_PROCESSOR_H */
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index d11d6320fb0d..69cebd41f5f3 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -22,6 +22,11 @@ bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
 	return phys_id == cpuid_to_hartid_map(cpu);
 }
 
+void __init arch_cpu_finalize_init(void)
+{
+	select_idle_routine();
+}
+
 /*
  * Returns the hart ID of the given device tree node, or -ENODEV if the node
  * isn't an enabled and valid RISC-V hart node.
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 3ed2359eae35..c080e6ca54ba 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -305,6 +305,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
 	__RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT),
 	__RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT),
 	__RISCV_ISA_EXT_DATA(xandespmu, RISCV_ISA_EXT_XANDESPMU),
+	__RISCV_ISA_EXT_DATA(zawrs, RISCV_ISA_EXT_ZAWRS),
 };
 
 const size_t riscv_isa_ext_count = ARRAY_SIZE(riscv_isa_ext);
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 92922dbd5b5c..9f0f7b888bc1 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -15,6 +15,7 @@
 #include <linux/tick.h>
 #include <linux/ptrace.h>
 #include <linux/uaccess.h>
+#include <linux/static_call.h>
 
 #include <asm/unistd.h>
 #include <asm/processor.h>
@@ -37,11 +38,49 @@ EXPORT_SYMBOL(__stack_chk_guard);
 
 extern asmlinkage void ret_from_fork(void);
 
-void arch_cpu_idle(void)
+static __cpuidle void default_idle(void)
+{
+	/*
+	 * Add mb() here to ensure that all
+	 * IO/MEM accesses are completed prior
+	 * to entering WFI.
+	 */
+	mb();
+	wait_for_interrupt();
+}
+
+static __cpuidle void wrs_idle(void)
+{
+	/*
+	 * Add mb() here to ensure that all
+	 * IO/MEM accesses are completed prior
+	 * to entering WRS.NTO.
+	 */
+	mb();
+	wrs_nto(&current_thread_info()->flags);
+}
+
+DEFINE_STATIC_CALL_NULL(riscv_idle, default_idle);
+
+void __cpuidle cpu_do_idle(void)
+{
+	static_call(riscv_idle)();
+}
+
+void __cpuidle arch_cpu_idle(void)
 {
 	cpu_do_idle();
 }
 
+void __init select_idle_routine(void)
+{
+	if (IS_ENABLED(CONFIG_RISCV_ZAWRS_IDLE) &&
+			riscv_has_extension_likely(RISCV_ISA_EXT_ZAWRS))
+		static_call_update(riscv_idle, wrs_idle);
+	else
+		static_call_update(riscv_idle, default_idle);
+}
+
 int set_unalign_ctl(struct task_struct *tsk, unsigned int val)
 {
 	if (!unaligned_ctl_available())
-- 
2.20.1



* [RFC 2/2] riscv: Use Zawrs to accelerate IPI to idle cpu
  2024-04-18 11:49 ` Xu Lu
@ 2024-04-18 11:49   ` Xu Lu
  -1 siblings, 0 replies; 31+ messages in thread
From: Xu Lu @ 2024-04-18 11:49 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, andy.chiu, guoren
  Cc: linux-riscv, linux-kernel, lihangjing, dengliang.1214, xieyongji,
	chaiwen.cc, Xu Lu

When sending an IPI to a CPU that has entered the idle state using the
Zawrs extension, there is no need to send a physical software interrupt.
Instead, we can write the IPI information to the address reserved by the
target CPU, which wakes it from WRS.NTO. The target CPU can then handle
the IPI directly without going through the traditional interrupt handling
routine.
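
The handshake described above can be sketched as a single-threaded
user-space model using C11 atomics. This is an illustration under stated
assumptions, not the patch code: the model_ and MODEL_ names are
hypothetical, only two IPI types are modeled, a single mask word stands in
for the per-CPU variable, and "sending a real IPI" and "handling an IPI"
are reduced to counters.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum model_ipi { MODEL_IPI_RESCHEDULE, MODEL_IPI_CALL_FUNC, MODEL_IPI_MAX };
#define MODEL_BIT(n) (1 << (n))

static atomic_int model_idle_ipi_mask;    /* per-CPU in the real patch */
static int model_real_ipis;               /* physical IPIs that were sent */
static int model_handled[MODEL_IPI_MAX];  /* IPIs handled on idle exit */

/* Target CPU: mark itself idle by setting the MODEL_IPI_MAX bit. */
static void model_idle_enter(void)
{
	atomic_store(&model_idle_ipi_mask, MODEL_BIT(MODEL_IPI_MAX));
}

/*
 * Sender: record the request in the target's mask. Only if the target was
 * not marked idle does a physical IPI have to be sent. Returns true when a
 * physical IPI was needed.
 */
static bool model_send_ipi(enum model_ipi op)
{
	int old = atomic_fetch_or_explicit(&model_idle_ipi_mask, MODEL_BIT(op),
					   memory_order_relaxed);

	if (!(old & MODEL_BIT(MODEL_IPI_MAX))) {
		model_real_ipis++;
		return true;
	}
	return false;
}

/* Target CPU: leave idle, clear the mask, handle whatever was recorded. */
static void model_idle_exit(void)
{
	int pending = atomic_exchange_explicit(&model_idle_ipi_mask, 0,
					       memory_order_relaxed);

	for (int ipi = 0; ipi < MODEL_IPI_MAX; ipi++)
		if (pending & MODEL_BIT(ipi))
			model_handled[ipi]++;
}
```

In this model, a request sent while the target is marked idle is only
recorded in the mask and is picked up by model_idle_exit(); a request sent
at any other time falls back to the physical IPI path.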

Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
---
 arch/riscv/include/asm/processor.h | 14 +++++++
 arch/riscv/include/asm/smp.h       | 14 +++++++
 arch/riscv/kernel/process.c        | 65 +++++++++++++++++++++++++++++-
 arch/riscv/kernel/smp.c            | 39 ++++++++++++------
 4 files changed, 118 insertions(+), 14 deletions(-)

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 1143367de8c6..76091cf2e8be 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -172,6 +172,20 @@ static inline void wrs_nto(unsigned long *addr)
 			: : "memory");
 }
 
+static inline void wrs_nto_if(int *addr, int val)
+{
+	int prev;
+
+	__asm__ __volatile__(
+			"lr.w %[p], %[a] \n\t"
+			"bne %[p], %[v], 1f \n\t"
+			".long 0x00d00073 \n\t"
+			"1: \n\t"
+			: [p] "=&r" (prev), [a] "+A" (*addr)
+			: [v] "r" (val)
+			: "memory");
+}
+
 extern phys_addr_t dma32_phys_limit;
 
 struct device_node;
diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 0d555847cde6..2f27fd743092 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -19,6 +19,20 @@ extern unsigned long boot_cpu_hartid;
 
 #include <linux/jump_label.h>
 
+enum ipi_message_type {
+	IPI_RESCHEDULE,
+	IPI_CALL_FUNC,
+	IPI_CPU_STOP,
+	IPI_CPU_CRASH_STOP,
+	IPI_IRQ_WORK,
+	IPI_TIMER,
+	IPI_MAX
+};
+
+int ipi_virq_base_get(void);
+
+irqreturn_t handle_IPI(int irq, void *data);
+
 /*
  * Mapping between linux logical cpu index and hartid.
  */
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 9f0f7b888bc1..7d6bf780d334 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -16,6 +16,7 @@
 #include <linux/ptrace.h>
 #include <linux/uaccess.h>
 #include <linux/static_call.h>
+#include <linux/hardirq.h>
 
 #include <asm/unistd.h>
 #include <asm/processor.h>
@@ -27,6 +28,7 @@
 #include <asm/cpuidle.h>
 #include <asm/vector.h>
 #include <asm/cpufeature.h>
+#include <asm/smp.h>
 
 register unsigned long gp_in_global __asm__("gp");
 
@@ -38,6 +40,8 @@ EXPORT_SYMBOL(__stack_chk_guard);
 
 extern asmlinkage void ret_from_fork(void);
 
+DEFINE_PER_CPU(atomic_t, idle_ipi_mask);
+
 static __cpuidle void default_idle(void)
 {
 	/*
@@ -49,6 +53,16 @@ static __cpuidle void default_idle(void)
 	wait_for_interrupt();
 }
 
+static __cpuidle void default_idle_enter(void)
+{
+	/* Do nothing */
+}
+
+static __cpuidle void default_idle_exit(void)
+{
+	/* Do nothing */
+}
+
 static __cpuidle void wrs_idle(void)
 {
 	/*
@@ -57,10 +71,42 @@ static __cpuidle void wrs_idle(void)
 	 * to entering WRS.NTO.
 	 */
 	mb();
+#ifdef CONFIG_SMP
+	wrs_nto_if(&this_cpu_ptr(&idle_ipi_mask)->counter, BIT(IPI_MAX));
+#else
 	wrs_nto(&current_thread_info()->flags);
+#endif
+}
+
+static __cpuidle void wrs_idle_enter(void)
+{
+#ifdef CONFIG_SMP
+	atomic_set(this_cpu_ptr(&idle_ipi_mask), BIT(IPI_MAX));
+#endif
+}
+
+static __cpuidle void wrs_idle_exit(void)
+{
+#ifdef CONFIG_SMP
+	int pending;
+	unsigned long flags;
+	enum ipi_message_type ipi;
+
+	local_irq_save(flags);
+	pending = atomic_xchg_relaxed(this_cpu_ptr(&idle_ipi_mask), 0);
+	for (ipi = IPI_RESCHEDULE; ipi < IPI_MAX; ipi++)
+		if (pending & BIT(ipi)) {
+			irq_enter();
+			handle_IPI(ipi_virq_base_get() + ipi, NULL);
+			irq_exit();
+		}
+	local_irq_restore(flags);
+#endif
 }
 
 DEFINE_STATIC_CALL_NULL(riscv_idle, default_idle);
+DEFINE_STATIC_CALL_NULL(riscv_idle_enter, default_idle_enter);
+DEFINE_STATIC_CALL_NULL(riscv_idle_exit, default_idle_exit);
 
 void __cpuidle cpu_do_idle(void)
 {
@@ -72,13 +118,28 @@ void __cpuidle arch_cpu_idle(void)
 	cpu_do_idle();
 }
 
+void __cpuidle arch_cpu_idle_enter(void)
+{
+	static_call(riscv_idle_enter)();
+}
+
+void __cpuidle arch_cpu_idle_exit(void)
+{
+	static_call(riscv_idle_exit)();
+}
+
 void __init select_idle_routine(void)
 {
 	if (IS_ENABLED(CONFIG_RISCV_ZAWRS_IDLE) &&
-			riscv_has_extension_likely(RISCV_ISA_EXT_ZAWRS))
+			riscv_has_extension_likely(RISCV_ISA_EXT_ZAWRS)) {
 		static_call_update(riscv_idle, wrs_idle);
-	else
+		static_call_update(riscv_idle_enter, wrs_idle_enter);
+		static_call_update(riscv_idle_exit, wrs_idle_exit);
+	} else {
 		static_call_update(riscv_idle, default_idle);
+		static_call_update(riscv_idle_enter, default_idle_enter);
+		static_call_update(riscv_idle_exit, default_idle_exit);
+	}
 }
 
 int set_unalign_ctl(struct task_struct *tsk, unsigned int val)
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 45dd4035416e..b5416ee41967 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -26,16 +26,6 @@
 #include <asm/cacheflush.h>
 #include <asm/cpu_ops.h>
 
-enum ipi_message_type {
-	IPI_RESCHEDULE,
-	IPI_CALL_FUNC,
-	IPI_CPU_STOP,
-	IPI_CPU_CRASH_STOP,
-	IPI_IRQ_WORK,
-	IPI_TIMER,
-	IPI_MAX
-};
-
 unsigned long __cpuid_to_hartid_map[NR_CPUS] __ro_after_init = {
 	[0 ... NR_CPUS-1] = INVALID_HARTID
 };
@@ -94,14 +84,34 @@ static inline void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
 }
 #endif
 
+#if defined(CONFIG_RISCV_ZAWRS_IDLE) && defined(CONFIG_SMP)
+DECLARE_PER_CPU(atomic_t, idle_ipi_mask);
+#endif
+
 static void send_ipi_mask(const struct cpumask *mask, enum ipi_message_type op)
 {
+#if defined(CONFIG_RISCV_ZAWRS_IDLE) && defined(CONFIG_SMP)
+	int cpu, val;
+
+	for_each_cpu(cpu, mask) {
+		val = atomic_fetch_or_relaxed(BIT(op), per_cpu_ptr(&idle_ipi_mask, cpu));
+		if (likely(!(val & BIT(IPI_MAX))))
+			__ipi_send_mask(ipi_desc[op], cpumask_of(cpu));
+	}
+#else
 	__ipi_send_mask(ipi_desc[op], mask);
+#endif
 }
 
 static void send_ipi_single(int cpu, enum ipi_message_type op)
 {
-	__ipi_send_mask(ipi_desc[op], cpumask_of(cpu));
+#if defined(CONFIG_RISCV_ZAWRS_IDLE) && defined(CONFIG_SMP)
+	int val;
+
+	val = atomic_fetch_or_relaxed(BIT(op), per_cpu_ptr(&idle_ipi_mask, cpu));
+	if (likely(!(val & BIT(IPI_MAX))))
+#endif
+		__ipi_send_mask(ipi_desc[op], cpumask_of(cpu));
 }
 
 #ifdef CONFIG_IRQ_WORK
@@ -111,7 +121,7 @@ void arch_irq_work_raise(void)
 }
 #endif
 
-static irqreturn_t handle_IPI(int irq, void *data)
+irqreturn_t handle_IPI(int irq, void *data)
 {
 	int ipi = irq - ipi_virq_base;
 
@@ -332,3 +342,8 @@ void arch_smp_send_reschedule(int cpu)
 	send_ipi_single(cpu, IPI_RESCHEDULE);
 }
 EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
+
+int ipi_virq_base_get(void)
+{
+	return ipi_virq_base;
+}
-- 
2.20.1



* [RFC 2/2] riscv: Use Zawrs to accelerate IPI to idle cpu
@ 2024-04-18 11:49   ` Xu Lu
  0 siblings, 0 replies; 31+ messages in thread
From: Xu Lu @ 2024-04-18 11:49 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, andy.chiu, guoren
  Cc: linux-riscv, linux-kernel, lihangjing, dengliang.1214, xieyongji,
	chaiwen.cc, Xu Lu

When sending IPI to a cpu which has entered idle state using Zawrs
extension, there is no need to send a physical software interrupt.
Instead, we can write the IPI information to the address reserved by
target cpu, which will wake it from WRS.NTO. Then the target cpu can
handle the IPI directly without falling into traditional interrupt
handling routine.

Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
---
 arch/riscv/include/asm/processor.h | 14 +++++++
 arch/riscv/include/asm/smp.h       | 14 +++++++
 arch/riscv/kernel/process.c        | 65 +++++++++++++++++++++++++++++-
 arch/riscv/kernel/smp.c            | 39 ++++++++++++------
 4 files changed, 118 insertions(+), 14 deletions(-)

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 1143367de8c6..76091cf2e8be 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -172,6 +172,20 @@ static inline void wrs_nto(unsigned long *addr)
 			: : "memory");
 }
 
+static inline void wrs_nto_if(int *addr, int val)
+{
+	int prev;
+
+	__asm__ __volatile__(
+			"lr.w %[p], %[a] \n\t"
+			"bne %[p], %[v], 1f \n\t"
+			".long 0x00d00073 \n\t"
+			"1: \n\t"
+			: [p] "=&r" (prev), [a] "+A" (*addr)
+			: [v] "r" (val)
+			: "memory");
+}
+
 extern phys_addr_t dma32_phys_limit;
 
 struct device_node;
diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 0d555847cde6..2f27fd743092 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -19,6 +19,20 @@ extern unsigned long boot_cpu_hartid;
 
 #include <linux/jump_label.h>
 
+enum ipi_message_type {
+	IPI_RESCHEDULE,
+	IPI_CALL_FUNC,
+	IPI_CPU_STOP,
+	IPI_CPU_CRASH_STOP,
+	IPI_IRQ_WORK,
+	IPI_TIMER,
+	IPI_MAX
+};
+
+int ipi_virq_base_get(void);
+
+irqreturn_t handle_IPI(int irq, void *data);
+
 /*
  * Mapping between linux logical cpu index and hartid.
  */
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 9f0f7b888bc1..7d6bf780d334 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -16,6 +16,7 @@
 #include <linux/ptrace.h>
 #include <linux/uaccess.h>
 #include <linux/static_call.h>
+#include <linux/hardirq.h>
 
 #include <asm/unistd.h>
 #include <asm/processor.h>
@@ -27,6 +28,7 @@
 #include <asm/cpuidle.h>
 #include <asm/vector.h>
 #include <asm/cpufeature.h>
+#include <asm/smp.h>
 
 register unsigned long gp_in_global __asm__("gp");
 
@@ -38,6 +40,8 @@ EXPORT_SYMBOL(__stack_chk_guard);
 
 extern asmlinkage void ret_from_fork(void);
 
+DEFINE_PER_CPU(atomic_t, idle_ipi_mask);
+
 static __cpuidle void default_idle(void)
 {
 	/*
@@ -49,6 +53,16 @@ static __cpuidle void default_idle(void)
 	wait_for_interrupt();
 }
 
+static __cpuidle void default_idle_enter(void)
+{
+	/* Do nothing */
+}
+
+static __cpuidle void default_idle_exit(void)
+{
+	/* Do nothing */
+}
+
 static __cpuidle void wrs_idle(void)
 {
 	/*
@@ -57,10 +71,42 @@ static __cpuidle void wrs_idle(void)
 	 * to entering WRS.NTO.
 	 */
 	mb();
+#ifdef CONFIG_SMP
+	wrs_nto_if(&this_cpu_ptr(&idle_ipi_mask)->counter, BIT(IPI_MAX));
+#else
 	wrs_nto(&current_thread_info()->flags);
+#endif
+}
+
+static __cpuidle void wrs_idle_enter(void)
+{
+#ifdef CONFIG_SMP
+	atomic_set(this_cpu_ptr(&idle_ipi_mask), BIT(IPI_MAX));
+#endif
+}
+
+static __cpuidle void wrs_idle_exit(void)
+{
+#ifdef CONFIG_SMP
+	int pending;
+	unsigned long flags;
+	enum ipi_message_type ipi;
+
+	local_irq_save(flags);
+	pending = atomic_xchg_relaxed(this_cpu_ptr(&idle_ipi_mask), 0);
+	for (ipi = IPI_RESCHEDULE; ipi < IPI_MAX; ipi++)
+		if (pending & BIT(ipi)) {
+			irq_enter();
+			handle_IPI(ipi_virq_base_get() + ipi, NULL);
+			irq_exit();
+		}
+	local_irq_restore(flags);
+#endif
 }
 
 DEFINE_STATIC_CALL_NULL(riscv_idle, default_idle);
+DEFINE_STATIC_CALL_NULL(riscv_idle_enter, default_idle_enter);
+DEFINE_STATIC_CALL_NULL(riscv_idle_exit, default_idle_exit);
 
 void __cpuidle cpu_do_idle(void)
 {
@@ -72,13 +118,28 @@ void __cpuidle arch_cpu_idle(void)
 	cpu_do_idle();
 }
 
+void __cpuidle arch_cpu_idle_enter(void)
+{
+	static_call(riscv_idle_enter)();
+}
+
+void __cpuidle arch_cpu_idle_exit(void)
+{
+	static_call(riscv_idle_exit)();
+}
+
 void __init select_idle_routine(void)
 {
 	if (IS_ENABLED(CONFIG_RISCV_ZAWRS_IDLE) &&
-			riscv_has_extension_likely(RISCV_ISA_EXT_ZAWRS))
+			riscv_has_extension_likely(RISCV_ISA_EXT_ZAWRS)) {
 		static_call_update(riscv_idle, wrs_idle);
-	else
+		static_call_update(riscv_idle_enter, wrs_idle_enter);
+		static_call_update(riscv_idle_exit, wrs_idle_exit);
+	} else {
 		static_call_update(riscv_idle, default_idle);
+		static_call_update(riscv_idle_enter, default_idle_enter);
+		static_call_update(riscv_idle_exit, default_idle_exit);
+	}
 }
 
 int set_unalign_ctl(struct task_struct *tsk, unsigned int val)
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 45dd4035416e..b5416ee41967 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -26,16 +26,6 @@
 #include <asm/cacheflush.h>
 #include <asm/cpu_ops.h>
 
-enum ipi_message_type {
-	IPI_RESCHEDULE,
-	IPI_CALL_FUNC,
-	IPI_CPU_STOP,
-	IPI_CPU_CRASH_STOP,
-	IPI_IRQ_WORK,
-	IPI_TIMER,
-	IPI_MAX
-};
-
 unsigned long __cpuid_to_hartid_map[NR_CPUS] __ro_after_init = {
 	[0 ... NR_CPUS-1] = INVALID_HARTID
 };
@@ -94,14 +84,34 @@ static inline void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
 }
 #endif
 
+#if defined(CONFIG_RISCV_ZAWRS_IDLE) && defined(CONFIG_SMP)
+DECLARE_PER_CPU(atomic_t, idle_ipi_mask);
+#endif
+
 static void send_ipi_mask(const struct cpumask *mask, enum ipi_message_type op)
 {
+#if defined(CONFIG_RISCV_ZAWRS_IDLE) && defined(CONFIG_SMP)
+	int cpu, val;
+
+	for_each_cpu(cpu, mask) {
+		val = atomic_fetch_or_relaxed(BIT(op), per_cpu_ptr(&idle_ipi_mask, cpu));
+		if (likely(!(val & BIT(IPI_MAX))))
+			__ipi_send_mask(ipi_desc[op], cpumask_of(cpu));
+	}
+#else
 	__ipi_send_mask(ipi_desc[op], mask);
+#endif
 }
 
 static void send_ipi_single(int cpu, enum ipi_message_type op)
 {
-	__ipi_send_mask(ipi_desc[op], cpumask_of(cpu));
+#if defined(CONFIG_RISCV_ZAWRS_IDLE) && defined(CONFIG_SMP)
+	int val;
+
+	val = atomic_fetch_or_relaxed(BIT(op), per_cpu_ptr(&idle_ipi_mask, cpu));
+	if (likely(!(val & BIT(IPI_MAX))))
+#endif
+		__ipi_send_mask(ipi_desc[op], cpumask_of(cpu));
 }
 
 #ifdef CONFIG_IRQ_WORK
@@ -111,7 +121,7 @@ void arch_irq_work_raise(void)
 }
 #endif
 
-static irqreturn_t handle_IPI(int irq, void *data)
+irqreturn_t handle_IPI(int irq, void *data)
 {
 	int ipi = irq - ipi_virq_base;
 
@@ -332,3 +342,8 @@ void arch_smp_send_reschedule(int cpu)
 	send_ipi_single(cpu, IPI_RESCHEDULE);
 }
 EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
+
+int ipi_virq_base_get(void)
+{
+	return ipi_virq_base;
+}
-- 
2.20.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 31+ messages in thread
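
For readers following the smp.c hunk above, the sender-side check can be
sketched in portable C11 as below. This is an illustration only, not code
from the patch: NR_CPUS is shrunk to 4, send_real_ipi() is a hypothetical
stand-in for __ipi_send_mask(), a plain global array replaces the per-CPU
idle_ipi_mask, and it assumes (as the hunk suggests) that an idle hart
advertises itself by setting bit IPI_MAX in its own mask word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Message types as in the patch; bit IPI_MAX is assumed to be the idle flag. */
enum ipi_message_type {
	IPI_RESCHEDULE,
	IPI_CALL_FUNC,
	IPI_CPU_STOP,
	IPI_CPU_CRASH_STOP,
	IPI_IRQ_WORK,
	IPI_TIMER,
	IPI_MAX
};

#define BIT(n)	(1u << (n))
#define NR_CPUS	4	/* hypothetical size, for the sketch only */

/* Stand-in for the per-CPU atomic_t idle_ipi_mask. */
static atomic_uint idle_ipi_mask[NR_CPUS];

/* Counts how often the fallback path (a real IPI) was taken. */
static unsigned int real_ipis_sent;

/* Hypothetical stand-in for __ipi_send_mask(). */
static void send_real_ipi(int cpu, enum ipi_message_type op)
{
	(void)cpu;
	(void)op;
	real_ipis_sent++;
}

/*
 * Post the message bit first, then look at the old value: if the target
 * had not flagged itself idle, fall back to a real IPI.
 * Returns true when a real IPI was sent.
 */
static bool send_ipi_single(int cpu, enum ipi_message_type op)
{
	unsigned int old;

	assert(cpu >= 0 && cpu < NR_CPUS);

	old = atomic_fetch_or_explicit(&idle_ipi_mask[cpu], BIT(op),
				       memory_order_relaxed);
	if (!(old & BIT(IPI_MAX))) {
		send_real_ipi(cpu, op);
		return true;
	}
	/* Target is idle: the store to its mask word is the wakeup. */
	return false;
}
```

The point of fetch-or over a separate load and store is that posting the
message and sampling the idle flag happen in one atomic step, so the sender
cannot observe a stale flag for the word it just modified. Whether relaxed
ordering is sufficient on the kernel side is a separate question that the
sketch does not try to answer.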

* Re: [RFC 0/2] riscv: Idle thread using Zawrs extension
  2024-04-18 11:49 ` Xu Lu
@ 2024-04-18 12:26   ` Christoph Müllner
  -1 siblings, 0 replies; 31+ messages in thread
From: Christoph Müllner @ 2024-04-18 12:26 UTC (permalink / raw)
  To: Xu Lu
  Cc: paul.walmsley, palmer, aou, andy.chiu, guoren, linux-riscv,
	linux-kernel, lihangjing, dengliang.1214, xieyongji, chaiwen.cc,
	Andrew Jones, Conor Dooley

On Thu, Apr 18, 2024 at 1:50 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
>
> This patch series introduces a new implementation of idle thread using
> Zawrs extension.

This overlaps with the following series:
  https://lore.kernel.org/all/20240315134009.580167-7-ajones@ventanamicro.com/

BR
Christoph

>
> The Zawrs[0] extension introduces two new instructions named WRS.STO and
> WRS.NTO in RISC-V. When software registers a reservation set using LR
> instruction, a subsequent WRS.STO or WRS.NTO instruction will cause the
> hart to stall in a low-power state until a store happens to the
> reservation set or an interrupt becomes pending. The difference between
> these two instructions is that WRS.STO will terminate stall after an
> implementation-defined timeout while WRS.NTO won't.
>
> This patch series implements idle thread using WRS.NTO instruction.
> Besides, we found there is no need to send a real IPI to wake up an idle
> CPU. Instead, we write IPI information to the reservation set of an idle
> CPU to wake it up and let it handle IPI quickly, without going through
> traditional interrupt handling routine.
>
> [0] https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc
>
> Xu Lu (2):
>   riscv: process: Introduce idle thread using Zawrs extension
>   riscv: Use Zawrs to accelerate IPI to idle cpu
>
>  arch/riscv/Kconfig                 |  24 +++++++
>  arch/riscv/include/asm/cpuidle.h   |  11 +---
>  arch/riscv/include/asm/hwcap.h     |   1 +
>  arch/riscv/include/asm/processor.h |  31 +++++++++
>  arch/riscv/include/asm/smp.h       |  14 ++++
>  arch/riscv/kernel/cpu.c            |   5 ++
>  arch/riscv/kernel/cpufeature.c     |   1 +
>  arch/riscv/kernel/process.c        | 102 ++++++++++++++++++++++++++++-
>  arch/riscv/kernel/smp.c            |  39 +++++++----
>  9 files changed, 205 insertions(+), 23 deletions(-)
>
> --
> 2.20.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [External] Re: [RFC 0/2] riscv: Idle thread using Zawrs extension
  2024-04-18 12:26   ` Christoph Müllner
@ 2024-04-18 12:44     ` Xu Lu
  -1 siblings, 0 replies; 31+ messages in thread
From: Xu Lu @ 2024-04-18 12:44 UTC (permalink / raw)
  To: Christoph Müllner
  Cc: paul.walmsley, palmer, aou, andy.chiu, guoren, linux-riscv,
	linux-kernel, lihangjing, dengliang.1214, xieyongji, chaiwen.cc,
	Andrew Jones, Conor Dooley

On Thu, Apr 18, 2024 at 8:26 PM Christoph Müllner
<christoph.muellner@vrull.eu> wrote:
>
> On Thu, Apr 18, 2024 at 1:50 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
> >
> > This patch series introduces a new implementation of idle thread using
> > Zawrs extension.
>
> This overlaps with the following series:
>   https://lore.kernel.org/all/20240315134009.580167-7-ajones@ventanamicro.com/

Hi Christoph.
Thanks for your reply!
Actually, our patch series is different from that one. The work at your
link focuses on providing support for Zawrs and implementing spinlocks
using it, while our work focuses on implementing the idle thread using
Zawrs and accelerating IPIs to idle CPUs. Of course, the ISA ZAWRS
config part can be merged. We will refine our code in the next version
to reduce code conflicts.
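
To make the "accelerating IPI to idle cpu" part more concrete, here is a
rough userspace sketch of the receiving side. It is not the wrs_idle code
from the patch: the function names are hypothetical, the busy-wait loop is
only a stand-in for LR + WRS.NTO (which cannot be written in portable C),
and it assumes the idle hart flags itself with bit IPI_MAX in its mask word.

```c
#include <stdatomic.h>

/* Message types as in the patch; bit IPI_MAX is assumed to be the idle flag. */
enum ipi_message_type {
	IPI_RESCHEDULE,
	IPI_CALL_FUNC,
	IPI_CPU_STOP,
	IPI_CPU_CRASH_STOP,
	IPI_IRQ_WORK,
	IPI_TIMER,
	IPI_MAX
};

#define BIT(n)		(1u << (n))
#define IDLE_FLAG	BIT(IPI_MAX)

/* Advertise that this CPU is idle and can be woken by a store. */
static void idle_enter(atomic_uint *mask)
{
	atomic_fetch_or(mask, IDLE_FLAG);
}

/* Atomically take all pending message bits, leaving the idle flag alone. */
static unsigned int take_pending(atomic_uint *mask)
{
	return atomic_fetch_and(mask, IDLE_FLAG) & ~IDLE_FLAG;
}

/* Count one "handled" event per pending message type. */
static int dispatch(unsigned int pending, int counts[IPI_MAX])
{
	int handled = 0;

	for (int op = 0; op < IPI_MAX; op++) {
		if (pending & BIT(op)) {
			counts[op]++;
			handled++;
		}
	}
	return handled;
}

/*
 * Wait until some sender has posted a message bit, then handle it
 * directly, without taking an interrupt. The loop models the stall;
 * on real hardware the hart would sleep in WRS.NTO instead of spinning.
 */
static int idle_wait_once(atomic_uint *mask, int counts[IPI_MAX])
{
	while (atomic_load_explicit(mask, memory_order_acquire) == IDLE_FLAG)
		;
	return dispatch(take_pending(mask), counts);
}

/* Drop the idle flag; return any message bits that raced with the exit. */
static unsigned int idle_exit(atomic_uint *mask)
{
	return atomic_exchange(mask, 0) & ~IDLE_FLAG;
}
```

idle_exit() returns the leftover bits because a sender may still see the
flag set and skip the real IPI just before it is cleared; in this sketch the
caller is expected to dispatch those bits as well.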

>
> BR
> Christoph
>
> >
> > The Zawrs[0] extension introduces two new instructions named WRS.STO and
> > WRS.NTO in RISC-V. When software registers a reservation set using LR
> > instruction, a subsequent WRS.STO or WRS.NTO instruction will cause the
> > hart to stall in a low-power state until a store happens to the
> > reservation set or an interrupt becomes pending. The difference between
> > these two instructions is that WRS.STO will terminate stall after an
> > implementation-defined timeout while WRS.NTO won't.
> >
> > This patch series implements idle thread using WRS.NTO instruction.
> > Besides, we found there is no need to send a real IPI to wake up an idle
> > CPU. Instead, we write IPI information to the reservation set of an idle
> > CPU to wake it up and let it handle IPI quickly, without going through
> > traditional interrupt handling routine.
> >
> > [0] https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc
> >
> > Xu Lu (2):
> >   riscv: process: Introduce idle thread using Zawrs extension
> >   riscv: Use Zawrs to accelerate IPI to idle cpu
> >
> >  arch/riscv/Kconfig                 |  24 +++++++
> >  arch/riscv/include/asm/cpuidle.h   |  11 +---
> >  arch/riscv/include/asm/hwcap.h     |   1 +
> >  arch/riscv/include/asm/processor.h |  31 +++++++++
> >  arch/riscv/include/asm/smp.h       |  14 ++++
> >  arch/riscv/kernel/cpu.c            |   5 ++
> >  arch/riscv/kernel/cpufeature.c     |   1 +
> >  arch/riscv/kernel/process.c        | 102 ++++++++++++++++++++++++++++-
> >  arch/riscv/kernel/smp.c            |  39 +++++++----
> >  9 files changed, 205 insertions(+), 23 deletions(-)
> >
> > --
> > 2.20.1
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [External] Re: [RFC 0/2] riscv: Idle thread using Zawrs extension
  2024-04-18 12:44     ` Xu Lu
@ 2024-04-18 12:56       ` Christoph Müllner
  -1 siblings, 0 replies; 31+ messages in thread
From: Christoph Müllner @ 2024-04-18 12:56 UTC (permalink / raw)
  To: Xu Lu, Andrew Jones
  Cc: paul.walmsley, palmer, aou, andy.chiu, guoren, linux-riscv,
	linux-kernel, lihangjing, dengliang.1214, xieyongji, chaiwen.cc,
	Conor Dooley

On Thu, Apr 18, 2024 at 2:44 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
>
> On Thu, Apr 18, 2024 at 8:26 PM Christoph Müllner
> <christoph.muellner@vrull.eu> wrote:
> >
> > On Thu, Apr 18, 2024 at 1:50 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
> > >
> > > This patch series introduces a new implementation of idle thread using
> > > Zawrs extension.
> >
> > This overlaps with the following series:
> >   https://lore.kernel.org/all/20240315134009.580167-7-ajones@ventanamicro.com/
>
> Hi Christoph.
> Thanks for your reply!
> Actually our patch series is different from this. The work from your
> link focuses on providing support for Zawrs and implementing spinlock
> using it, while our work focuses on implementing idle thread using
> Zawrs and accelerating IPI to idle cpu. Of course, the ISA ZAWRS
> config part can be merged. We will refine our code in the next version
> to reduce code conflicts.

Yes, I've seen that this targets another optimization, but the basic
Zawrs support would be identical to the other patchset (even if it is not).
I would propose that we work on a basic Zawrs support patchset that introduces
the Kconfig, DTS and hwprobe parts (a subset of Andrew's patchset).
Once this is merged, all other optimizations can be built upon it
(spinlocks, idle thread, glibc CPU spinning).
If this proposal is fine for the maintainers/reviewers, then Andrew could resend
these basic-support patches.

BR
Christoph


>
> >
> > BR
> > Christoph
> >
> > >
> > > The Zawrs[0] extension introduces two new instructions named WRS.STO and
> > > WRS.NTO in RISC-V. When software registers a reservation set using LR
> > > instruction, a subsequent WRS.STO or WRS.NTO instruction will cause the
> > > hart to stall in a low-power state until a store happens to the
> > > reservation set or an interrupt becomes pending. The difference between
> > > these two instructions is that WRS.STO will terminate stall after an
> > > implementation-defined timeout while WRS.NTO won't.
> > >
> > > This patch series implements idle thread using WRS.NTO instruction.
> > > Besides, we found there is no need to send a real IPI to wake up an idle
> > > CPU. Instead, we write IPI information to the reservation set of an idle
> > > CPU to wake it up and let it handle IPI quickly, without going through
> > > traditional interrupt handling routine.
> > >
> > > [0] https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc
> > >
> > > Xu Lu (2):
> > >   riscv: process: Introduce idle thread using Zawrs extension
> > >   riscv: Use Zawrs to accelerate IPI to idle cpu
> > >
> > >  arch/riscv/Kconfig                 |  24 +++++++
> > >  arch/riscv/include/asm/cpuidle.h   |  11 +---
> > >  arch/riscv/include/asm/hwcap.h     |   1 +
> > >  arch/riscv/include/asm/processor.h |  31 +++++++++
> > >  arch/riscv/include/asm/smp.h       |  14 ++++
> > >  arch/riscv/kernel/cpu.c            |   5 ++
> > >  arch/riscv/kernel/cpufeature.c     |   1 +
> > >  arch/riscv/kernel/process.c        | 102 ++++++++++++++++++++++++++++-
> > >  arch/riscv/kernel/smp.c            |  39 +++++++----
> > >  9 files changed, 205 insertions(+), 23 deletions(-)
> > >
> > > --
> > > 2.20.1
> > >
> > >
> > > _______________________________________________
> > > linux-riscv mailing list
> > > linux-riscv@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [External] Re: [RFC 0/2] riscv: Idle thread using Zawrs extension
  2024-04-18 12:56       ` Christoph Müllner
@ 2024-04-18 13:09         ` Xu Lu
  -1 siblings, 0 replies; 31+ messages in thread
From: Xu Lu @ 2024-04-18 13:09 UTC (permalink / raw)
  To: Christoph Müllner
  Cc: Andrew Jones, paul.walmsley, palmer, aou, andy.chiu, guoren,
	linux-riscv, linux-kernel, lihangjing, dengliang.1214, xieyongji,
	chaiwen.cc, Conor Dooley

On Thu, Apr 18, 2024 at 8:56 PM Christoph Müllner
<christoph.muellner@vrull.eu> wrote:
>
> On Thu, Apr 18, 2024 at 2:44 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
> >
> > On Thu, Apr 18, 2024 at 8:26 PM Christoph Müllner
> > <christoph.muellner@vrull.eu> wrote:
> > >
> > > On Thu, Apr 18, 2024 at 1:50 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
> > > >
> > > > This patch series introduces a new implementation of idle thread using
> > > > Zawrs extension.
> > >
> > > This overlaps with the following series:
> > >   https://lore.kernel.org/all/20240315134009.580167-7-ajones@ventanamicro.com/
> >
> > Hi Christoph.
> > Thanks for your reply!
> > Actually our patch series is different from this. The work from your
> > link focuses on providing support for Zawrs and implementing spinlock
> > using it, while our work focuses on implementing idle thread using
> > Zawrs and accelerating IPI to idle cpu. Of course, the ISA ZAWRS
> > config part can be merged. We will refine our code in the next version
> > to reduce code conflicts.
>
> Yes, I've seen that this targets another optimization, but the basic
> Zawrs support would be identical to the other patchset (even if it is not).
> I would propose that we work on a basic Zawrs support patchset that introduces
> the Kconfig, DTS and hwprobe parts (a subset of Andrew's patchset).
> Once this is merged, all other optimizations can be built upon it
> (spinlocks, idle thread, glibc CPU spinning).
> If this proposal is fine for the maintainers/reviewers, then Andrew could resend
> these basic-support patches.
>
> BR
> Christoph

Roger that! This does make more sense. We will rebase our code on
Andrew's basic support patches in the next version.

Regards,
Xu Lu

>
>
> >
> > >
> > > BR
> > > Christoph
> > >
> > > >
> > > > The Zawrs[0] extension introduces two new instructions named WRS.STO and
> > > > WRS.NTO in RISC-V. When software registers a reservation set using LR
> > > > instruction, a subsequent WRS.STO or WRS.NTO instruction will cause the
> > > > hart to stall in a low-power state until a store happens to the
> > > > reservation set or an interrupt becomes pending. The difference between
> > > > these two instructions is that WRS.STO will terminate stall after an
> > > > implementation-defined timeout while WRS.NTO won't.
> > > >
> > > > This patch series implements idle thread using WRS.NTO instruction.
> > > > Besides, we found there is no need to send a real IPI to wake up an idle
> > > > CPU. Instead, we write IPI information to the reservation set of an idle
> > > > CPU to wake it up and let it handle IPI quickly, without going through
> > > > traditional interrupt handling routine.
> > > >
> > > > [0] https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc
> > > >
> > > > Xu Lu (2):
> > > >   riscv: process: Introduce idle thread using Zawrs extension
> > > >   riscv: Use Zawrs to accelerate IPI to idle cpu
> > > >
> > > >  arch/riscv/Kconfig                 |  24 +++++++
> > > >  arch/riscv/include/asm/cpuidle.h   |  11 +---
> > > >  arch/riscv/include/asm/hwcap.h     |   1 +
> > > >  arch/riscv/include/asm/processor.h |  31 +++++++++
> > > >  arch/riscv/include/asm/smp.h       |  14 ++++
> > > >  arch/riscv/kernel/cpu.c            |   5 ++
> > > >  arch/riscv/kernel/cpufeature.c     |   1 +
> > > >  arch/riscv/kernel/process.c        | 102 ++++++++++++++++++++++++++++-
> > > >  arch/riscv/kernel/smp.c            |  39 +++++++----
> > > >  9 files changed, 205 insertions(+), 23 deletions(-)
> > > >
> > > > --
> > > > 2.20.1
> > > >
> > > >
> > > > _______________________________________________
> > > > linux-riscv mailing list
> > > > linux-riscv@lists.infradead.org
> > > > http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [External] Re: [RFC 0/2] riscv: Idle thread using Zawrs extension
  2024-04-18 13:09         ` Xu Lu
@ 2024-04-18 14:08           ` Conor Dooley
  -1 siblings, 0 replies; 31+ messages in thread
From: Conor Dooley @ 2024-04-18 14:08 UTC (permalink / raw)
  To: Xu Lu
  Cc: Christoph Müllner, Andrew Jones, paul.walmsley, palmer, aou,
	andy.chiu, guoren, linux-riscv, linux-kernel, lihangjing,
	dengliang.1214, xieyongji, chaiwen.cc, Conor Dooley

[-- Attachment #1: Type: text/plain, Size: 2094 bytes --]

On Thu, Apr 18, 2024 at 09:09:06PM +0800, Xu Lu wrote:
> On Thu, Apr 18, 2024 at 8:56 PM Christoph Müllner
> <christoph.muellner@vrull.eu> wrote:
> >
> > On Thu, Apr 18, 2024 at 2:44 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
> > >
> > > On Thu, Apr 18, 2024 at 8:26 PM Christoph Müllner
> > > <christoph.muellner@vrull.eu> wrote:
> > > >
> > > > On Thu, Apr 18, 2024 at 1:50 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
> > > > >
> > > > > This patch series introduces a new implementation of idle thread using
> > > > > Zawrs extension.
> > > >
> > > > This overlaps with the following series:
> > > >   https://lore.kernel.org/all/20240315134009.580167-7-ajones@ventanamicro.com/
> > >
> > > Hi Christoph.
> > > Thanks for your reply!
> > > Actually our patch series is different from this. The work from your
> > > link focuses on providing support for Zawrs and implementing spinlock
> > > using it, while our work focuses on implementing idle thread using
> > > Zawrs and accelerating IPI to idle cpu. Of course, the ISA ZAWRS
> > > config part can be merged. We will refine our code in the next version
> > > to reduce code conflicts.
> >
> > Yes, I've seen that this targets another optimization, but the basic
> > Zawrs support would be identical to the other patchset (even if it is not).
> > I would propose that we work on a basic Zawrs support patchset that introduces
> > the Kconfig, DTS and hwprobe parts (a subset of Andrew's patchset).
> > Once this is merged, all other optimizations can be built upon it
> > (spinlocks, idle thread, glibc CPU spinning).
> > If this proposal is fine for the maintainers/reviewers, then Andrew could resend
> > these basic-support patches.
> >
> > BR
> > Christoph
> 
> Roger that! This does make more sense. We will rebase our code on
> Andrew's basic support patches in the next version.

IIRC Drew's working on a new version of the linked series (we were
talking about it yesterday) so hold off for that before doing a rebase
and sending a new version I think.

Thanks,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [External] Re: [RFC 0/2] riscv: Idle thread using Zawrs extension
  2024-04-18 13:09         ` Xu Lu
@ 2024-04-18 14:10           ` Andrew Jones
  -1 siblings, 0 replies; 31+ messages in thread
From: Andrew Jones @ 2024-04-18 14:10 UTC (permalink / raw)
  To: Xu Lu
  Cc: Christoph Müllner, paul.walmsley, palmer, aou, andy.chiu,
	guoren, linux-riscv, linux-kernel, lihangjing, dengliang.1214,
	xieyongji, chaiwen.cc, Conor Dooley

On Thu, Apr 18, 2024 at 09:09:06PM +0800, Xu Lu wrote:
> On Thu, Apr 18, 2024 at 8:56 PM Christoph Müllner
> <christoph.muellner@vrull.eu> wrote:
> >
> > On Thu, Apr 18, 2024 at 2:44 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
> > >
> > > On Thu, Apr 18, 2024 at 8:26 PM Christoph Müllner
> > > <christoph.muellner@vrull.eu> wrote:
> > > >
> > > > On Thu, Apr 18, 2024 at 1:50 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
> > > > >
> > > > > This patch series introduces a new implementation of idle thread using
> > > > > Zawrs extension.
> > > >
> > > > This overlaps with the following series:
> > > >   https://lore.kernel.org/all/20240315134009.580167-7-ajones@ventanamicro.com/
> > >
> > > Hi Christoph.
> > > Thanks for your reply!
> > > Actually our patch series is different from this. The work from your
> > > link focuses on providing support for Zawrs and implementing spinlock
> > > using it, while our work focuses on implementing idle thread using
> > > Zawrs and accelerating IPI to idle cpu. Of course, the ISA ZAWRS
> > > config part can be merged. We will refine our code in the next version
> > > to reduce code conflicts.
> >
> > > Yes, I've seen that this targets another optimization, but the basic
> > > Zawrs support would be identical to the other patchset (even if it is not).
> > I would propose that we work on a basic Zawrs support patchset that introduces
> > the Kconfig, DTS and hwprobe parts (a subset of Andrew's patchset).
> > Once this is merged, all other optimizations can be built upon it
> > (spinlocks, idle thread, glibc CPU spinning).
> > If this proposal is fine for the maintainers/reviewers, then Andrew could resend
> > these basic-support patches.
> >
> > BR
> > Christoph
> 
> Roger that! This does make more sense. We will rebase our code on
> Andrew's basic support patches in the next version.

And I'm just about to send that next version. I'll send tomorrow morning
if not yet today.

Thanks,
drew



> 
> Regards,
> Xu Lu
> 
> >
> >
> > >
> > > >
> > > > BR
> > > > Christoph
> > > >
> > > > >
> > > > > The Zawrs[0] extension introduces two new instructions named WRS.STO and
> > > > > WRS.NTO in RISC-V. When software registers a reservation set using LR
> > > > > instruction, a subsequent WRS.STO or WRS.NTO instruction will cause the
> > > > > hart to stall in a low-power state until a store happens to the
> > > > > reservation set or an interrupt becomes pending. The difference between
> > > > > these two instructions is that WRS.STO will terminate stall after an
> > > > > implementation-defined timeout while WRS.NTO won't.
> > > > >
> > > > > This patch series implements idle thread using WRS.NTO instruction.
> > > > > Besides, we found there is no need to send a real IPI to wake up an idle
> > > > > CPU. Instead, we write IPI information to the reservation set of an idle
> > > > > CPU to wake it up and let it handle IPI quickly, without going through
> > > > > > traditional interrupt handling routine.
> > > > >
> > > > > [0] https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc
> > > > >
> > > > > Xu Lu (2):
> > > > >   riscv: process: Introduce idle thread using Zawrs extension
> > > > >   riscv: Use Zawrs to accelerate IPI to idle cpu
> > > > >
> > > > >  arch/riscv/Kconfig                 |  24 +++++++
> > > > >  arch/riscv/include/asm/cpuidle.h   |  11 +---
> > > > >  arch/riscv/include/asm/hwcap.h     |   1 +
> > > > >  arch/riscv/include/asm/processor.h |  31 +++++++++
> > > > >  arch/riscv/include/asm/smp.h       |  14 ++++
> > > > >  arch/riscv/kernel/cpu.c            |   5 ++
> > > > >  arch/riscv/kernel/cpufeature.c     |   1 +
> > > > >  arch/riscv/kernel/process.c        | 102 ++++++++++++++++++++++++++++-
> > > > >  arch/riscv/kernel/smp.c            |  39 +++++++----
> > > > >  9 files changed, 205 insertions(+), 23 deletions(-)
> > > > >
> > > > > --
> > > > > 2.20.1
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > linux-riscv mailing list
> > > > > linux-riscv@lists.infradead.org
> > > > > http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 1/2] riscv: process: Introduce idle thread using Zawrs extension
  2024-04-18 11:49   ` Xu Lu
@ 2024-04-18 15:05     ` Conor Dooley
  -1 siblings, 0 replies; 31+ messages in thread
From: Conor Dooley @ 2024-04-18 15:05 UTC (permalink / raw)
  To: Xu Lu
  Cc: paul.walmsley, palmer, aou, andy.chiu, guoren, linux-riscv,
	linux-kernel, lihangjing, dengliang.1214, xieyongji, chaiwen.cc,
	Andrew Jones

+ Drew,

On Thu, Apr 18, 2024 at 07:49:41PM +0800, Xu Lu wrote:
> The Zawrs extension introduces a new instruction WRS.NTO, which will
> register a reservation set and causes the hart to temporarily stall
> execution in a low-power state until a store occurs to the reservation
> set or an interrupt is observed.
> 
> This commit implements new version of idle thread for RISC-V via Zawrs
> extension.
> 
> Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
> Reviewed-by: Hangjing Li <lihangjing@bytedance.com>
> Reviewed-by: Liang Deng <dengliang.1214@bytedance.com>
> Reviewed-by: Wen Chai <chaiwen.cc@bytedance.com>
> ---
>  arch/riscv/Kconfig                 | 24 +++++++++++++++++
>  arch/riscv/include/asm/cpuidle.h   | 11 +-------
>  arch/riscv/include/asm/hwcap.h     |  1 +
>  arch/riscv/include/asm/processor.h | 17 +++++++++++++
>  arch/riscv/kernel/cpu.c            |  5 ++++
>  arch/riscv/kernel/cpufeature.c     |  1 +
>  arch/riscv/kernel/process.c        | 41 +++++++++++++++++++++++++++++-
>  7 files changed, 89 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index be09c8836d56..a0d344e9803f 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -19,6 +19,7 @@ config RISCV
>  	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
>  	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
>  	select ARCH_HAS_BINFMT_FLAT
> +	select ARCH_HAS_CPU_FINALIZE_INIT
>  	select ARCH_HAS_CURRENT_STACK_POINTER
>  	select ARCH_HAS_DEBUG_VIRTUAL if MMU
>  	select ARCH_HAS_DEBUG_VM_PGTABLE
> @@ -525,6 +526,20 @@ config RISCV_ISA_SVPBMT
>  
>  	   If you don't know what to do here, say Y.
>  
> +config RISCV_ISA_ZAWRS
> +	bool "Zawrs extension support for wait-on-reservation-set instructions"
> +	depends on RISCV_ALTERNATIVE
> +	default y
> +	help
> +	   Adds support to dynamically detect the presence of the Zawrs
> +	   extension and enable its usage.

Drew, could you, in your update, use the wording:
	   Add support for enabling optimisations in the kernel when the
	   Zawrs extension is detected at boot.

There was some confusion recently about what these options were actually
for, because this option doesn't control "dynamic detection" as the
ACPI or DT detection is compiled at all times. I had written a patch for
this wording in other options at the time but had forgotten to properly
send it:
https://lore.kernel.org/linux-riscv/20240418-stable-railway-7cce07e1e440@spud/T/#u

> +
> +	   The Zawrs extension defines a pair of instructions to be used
> +	   in polling loops that allows a core to enter a low-power state
> +	   and wait on a store to a memory location.
> +
> +	   If you don't know what to do here, say Y.
> +
>  config TOOLCHAIN_HAS_V
>  	bool
>  	default y
> @@ -1075,6 +1090,15 @@ endmenu # "Power management options"
>  
>  menu "CPU Power Management"
>  
> +config RISCV_ZAWRS_IDLE
> +	bool "Idle thread using ZAWRS extensions"
> +	depends on RISCV_ISA_ZAWRS
> +	default y
> +	help
> +		Adds support to implement idle thread using ZAWRS extension.
> +
> +		If you don't know what to do here, say Y.

I don't think this second option is needed, why would we not always want
to use the Zawrs version of this when it is available? Can we do it
unconditionally when RISCV_ISA_ZAWRS is set and the extension is
detected at runtime?

Cheers,
Conor.
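
The gating suggested above could be sketched roughly as follows. This is a
user-space C model, not kernel code: ZAWRS_BUILT_IN, struct
cpu_features_model and select_idle_backend() are hypothetical names made up
for illustration. ZAWRS_BUILT_IN stands in for the single RISCV_ISA_ZAWRS
option and has_zawrs for the result of boot-time detection. The point is only
that no second, idle-specific option is needed to pick the idle path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one Kconfig option (compile-time choice). */
#define ZAWRS_BUILT_IN 1

/* Hypothetical stand-in for what boot-time ISA detection found. */
struct cpu_features_model {
	bool has_zawrs;
};

enum idle_backend { IDLE_WFI, IDLE_WRS_NTO };

/*
 * Select the idle back end without a second, idle-specific option:
 * the Zawrs path is taken whenever it is built in and detected.
 */
enum idle_backend select_idle_backend(const struct cpu_features_model *f)
{
	assert(f != NULL);

	if (ZAWRS_BUILT_IN && f->has_zawrs)
		return IDLE_WRS_NTO;
	return IDLE_WFI;
}
```

With ZAWRS_BUILT_IN set to 0 the compiler can drop the Zawrs branch entirely,
which is the effect a compile-time option check would be expected to have.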



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [External] Re: [RFC 1/2] riscv: process: Introduce idle thread using Zawrs extension
  2024-04-18 15:05     ` Conor Dooley
@ 2024-04-18 16:14       ` Xu Lu
  -1 siblings, 0 replies; 31+ messages in thread
From: Xu Lu @ 2024-04-18 16:14 UTC (permalink / raw)
  To: Conor Dooley
  Cc: paul.walmsley, palmer, aou, andy.chiu, guoren, linux-riscv,
	linux-kernel, lihangjing, dengliang.1214, xieyongji, chaiwen.cc,
	Andrew Jones

On Thu, Apr 18, 2024 at 11:06 PM Conor Dooley <conor@kernel.org> wrote:
>
> + Drew,
>
> On Thu, Apr 18, 2024 at 07:49:41PM +0800, Xu Lu wrote:
> > The Zawrs extension introduces a new instruction WRS.NTO, which will
> > register a reservation set and causes the hart to temporarily stall
> > execution in a low-power state until a store occurs to the reservation
> > set or an interrupt is observed.
> >
> > This commit implements new version of idle thread for RISC-V via Zawrs
> > extension.
> >
> > Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
> > Reviewed-by: Hangjing Li <lihangjing@bytedance.com>
> > Reviewed-by: Liang Deng <dengliang.1214@bytedance.com>
> > Reviewed-by: Wen Chai <chaiwen.cc@bytedance.com>
> > ---
> >  arch/riscv/Kconfig                 | 24 +++++++++++++++++
> >  arch/riscv/include/asm/cpuidle.h   | 11 +-------
> >  arch/riscv/include/asm/hwcap.h     |  1 +
> >  arch/riscv/include/asm/processor.h | 17 +++++++++++++
> >  arch/riscv/kernel/cpu.c            |  5 ++++
> >  arch/riscv/kernel/cpufeature.c     |  1 +
> >  arch/riscv/kernel/process.c        | 41 +++++++++++++++++++++++++++++-
> >  7 files changed, 89 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index be09c8836d56..a0d344e9803f 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -19,6 +19,7 @@ config RISCV
> >       select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
> >       select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
> >       select ARCH_HAS_BINFMT_FLAT
> > +     select ARCH_HAS_CPU_FINALIZE_INIT
> >       select ARCH_HAS_CURRENT_STACK_POINTER
> >       select ARCH_HAS_DEBUG_VIRTUAL if MMU
> >       select ARCH_HAS_DEBUG_VM_PGTABLE
> > @@ -525,6 +526,20 @@ config RISCV_ISA_SVPBMT
> >
> >          If you don't know what to do here, say Y.
> >
> > +config RISCV_ISA_ZAWRS
> > +     bool "Zawrs extension support for wait-on-reservation-set instructions"
> > +     depends on RISCV_ALTERNATIVE
> > +     default y
> > +     help
> > +        Adds support to dynamically detect the presence of the Zawrs
> > +        extension and enable its usage.
>
> Drew, could you, in your update, use the wording:
>            Add support for enabling optimisations in the kernel when the
>            Zawrs extension is detected at boot.
>
> There was some confusion recently about what these options were actually
> for, because this option doesn't control "dynamic detection" as the
> ACPI or DT detection is compiled at all times. I had written a patch for
> this wording in other options at the time but had forgotten to properly
> send it:
> https://lore.kernel.org/linux-riscv/20240418-stable-railway-7cce07e1e440@spud/T/#u
>
> > +
> > +        The Zawrs extension defines a pair of instructions to be used
> > +        in polling loops that allows a core to enter a low-power state
> > +        and wait on a store to a memory location.
> > +
> > +        If you don't know what to do here, say Y.
> > +
> >  config TOOLCHAIN_HAS_V
> >       bool
> >       default y
> > @@ -1075,6 +1090,15 @@ endmenu # "Power management options"
> >
> >  menu "CPU Power Management"
> >
> > +config RISCV_ZAWRS_IDLE
> > +     bool "Idle thread using ZAWRS extensions"
> > +     depends on RISCV_ISA_ZAWRS
> > +     default y
> > +     help
> > +             Adds support to implement idle thread using ZAWRS extension.
> > +
> > +             If you don't know what to do here, say Y.
>
> I don't think this second option is needed, why would we not always want
> to use the Zawrs version of this when it is available? Can we do it
> unconditionally when RISCV_ISA_ZAWRS is set and the extension is
> detected at runtime?
>
> Cheers,
> Conor.

Indeed, we can always choose WRS.NTO when entering idle.

This config is introduced for the second commit in this patch series.
In the second commit, when sending an IPI we check whether the target
CPU is idle and, if so, write the IPI info to the idle CPU's
reservation set instead of sending a physical IPI. The idle target
also does not need to go through the traditional interrupt handling
routine. However, if all CPUs are busy and rarely enter idle, the
extra instructions on the IPI send path may add overhead. Thus we
introduce this config just in case.

Regards,
Xu Lu
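
The trade-off described above can be illustrated with a small user-space C
model. It is a sketch under assumptions, not the code from the patch: struct
idle_mailbox, send_ipi_model(), idle_enter_model() and idle_exit_model() are
hypothetical names, and a C11 atomic word stands in for the memory watched
through the reservation set. The load of the idle flag in send_ipi_model() is
the extra work paid on every IPI, including when no CPU is idle. The model
does not close the window in which the target leaves idle between the
sender's check and its store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4

/* Hypothetical per-CPU state; not the layout used by the patch. */
struct idle_mailbox {
	atomic_bool idle;     /* set while the CPU sits in the idle loop */
	atomic_uint pending;  /* posted IPI bits; models the watched word */
};

static struct idle_mailbox mailbox[NR_CPUS_MODEL];

enum ipi_path { IPI_VIA_MAILBOX, IPI_VIA_INTERRUPT };

/* Sender side: post to the mailbox if the target is idle. */
enum ipi_path send_ipi_model(unsigned int cpu, unsigned int ipi_bit)
{
	assert(cpu < NR_CPUS_MODEL);
	assert(ipi_bit < 32);

	if (atomic_load(&mailbox[cpu].idle)) {
		/* On hardware this store would end the target's WRS.NTO stall. */
		atomic_fetch_or(&mailbox[cpu].pending, 1u << ipi_bit);
		return IPI_VIA_MAILBOX;
	}
	/* Busy target: a real IPI would be raised here. */
	return IPI_VIA_INTERRUPT;
}

/* Idle side: mark the CPU idle before it would execute LR + WRS.NTO. */
void idle_enter_model(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	atomic_store(&mailbox[cpu].idle, true);
}

/* Idle side: leave idle and collect the bits posted while stalled. */
unsigned int idle_exit_model(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	atomic_store(&mailbox[cpu].idle, false);
	return atomic_exchange(&mailbox[cpu].pending, 0u);
}
```

The bits returned by idle_exit_model() would be handled directly by the woken
CPU, which is how the traditional interrupt entry path is skipped.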

>
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 1/2] riscv: process: Introduce idle thread using Zawrs extension
  2024-04-18 15:05     ` Conor Dooley
@ 2024-04-18 19:10       ` Andrew Jones
  -1 siblings, 0 replies; 31+ messages in thread
From: Andrew Jones @ 2024-04-18 19:10 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Xu Lu, paul.walmsley, palmer, aou, andy.chiu, guoren,
	linux-riscv, linux-kernel, lihangjing, dengliang.1214, xieyongji,
	chaiwen.cc

On Thu, Apr 18, 2024 at 04:05:55PM +0100, Conor Dooley wrote:
> + Drew,
> 
> On Thu, Apr 18, 2024 at 07:49:41PM +0800, Xu Lu wrote:
> > The Zawrs extension introduces a new instruction WRS.NTO, which will
> > register a reservation set and causes the hart to temporarily stall
> > execution in a low-power state until a store occurs to the reservation
> > set or an interrupt is observed.
> > 
> > This commit implements new version of idle thread for RISC-V via Zawrs
> > extension.
> > 
> > Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
> > Reviewed-by: Hangjing Li <lihangjing@bytedance.com>
> > Reviewed-by: Liang Deng <dengliang.1214@bytedance.com>
> > Reviewed-by: Wen Chai <chaiwen.cc@bytedance.com>
> > ---
> >  arch/riscv/Kconfig                 | 24 +++++++++++++++++
> >  arch/riscv/include/asm/cpuidle.h   | 11 +-------
> >  arch/riscv/include/asm/hwcap.h     |  1 +
> >  arch/riscv/include/asm/processor.h | 17 +++++++++++++
> >  arch/riscv/kernel/cpu.c            |  5 ++++
> >  arch/riscv/kernel/cpufeature.c     |  1 +
> >  arch/riscv/kernel/process.c        | 41 +++++++++++++++++++++++++++++-
> >  7 files changed, 89 insertions(+), 11 deletions(-)
> > 
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index be09c8836d56..a0d344e9803f 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -19,6 +19,7 @@ config RISCV
> >  	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
> >  	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
> >  	select ARCH_HAS_BINFMT_FLAT
> > +	select ARCH_HAS_CPU_FINALIZE_INIT
> >  	select ARCH_HAS_CURRENT_STACK_POINTER
> >  	select ARCH_HAS_DEBUG_VIRTUAL if MMU
> >  	select ARCH_HAS_DEBUG_VM_PGTABLE
> > @@ -525,6 +526,20 @@ config RISCV_ISA_SVPBMT
> >  
> >  	   If you don't know what to do here, say Y.
> >  
> > +config RISCV_ISA_ZAWRS
> > +	bool "Zawrs extension support for wait-on-reservation-set instructions"
> > +	depends on RISCV_ALTERNATIVE
> > +	default y
> > +	help
> > +	   Adds support to dynamically detect the presence of the Zawrs
> > +	   extension and enable its usage.
> 
> Drew, could you, in your update, use the wording:
> 	   Add support for enabling optimisations in the kernel when the
> 	   Zawrs extension is detected at boot.

How about

  The Zawrs extension defines a pair of instructions to be used in
  polling loops which allow a hart to enter a low-power state or to
  trap to the hypervisor while waiting on a store to a memory location.
  Enable the use of these instructions when the Zawrs extension is
  detected at boot.
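
For illustration only, a polling loop of the kind that text describes could look roughly like the sketch below. wait_while_equal() is a hypothetical helper, not an existing kernel interface. The sketch assumes the toolchain defines __riscv_zawrs when Zawrs is enabled in -march, and it falls back to a plain polling load everywhere else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Wait until *addr no longer holds `old`, then return the new value.
 * Hypothetical helper for illustration; not a kernel API.
 */
static inline uint32_t wait_while_equal(volatile uint32_t *addr, uint32_t old)
{
	uint32_t val;

	assert(addr != NULL);
	for (;;) {
#if defined(__riscv) && defined(__riscv_zawrs)
		/* LR registers a reservation set that covers *addr. */
		__asm__ __volatile__("lr.w %0, %1"
				     : "=r"(val) : "A"(*addr) : "memory");
		if (val != old)
			return val;
		/*
		 * Stall until a store hits the reservation set or an
		 * interrupt becomes pending, then re-check the value.
		 */
		__asm__ __volatile__("wrs.nto" ::: "memory");
#else
		/* Portable fallback: plain polling load, no low-power stall. */
		val = *addr;
		if (val != old)
			return val;
#endif
	}
}
```

The value is always re-read after the stall, because WRS.NTO may also resume for reasons other than a store to the watched location, such as a pending interrupt.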

Thanks,
drew

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 1/2] riscv: process: Introduce idle thread using Zawrs extension
  2024-04-18 19:10       ` Andrew Jones
@ 2024-04-18 22:00         ` Samuel Holland
  -1 siblings, 0 replies; 31+ messages in thread
From: Samuel Holland @ 2024-04-18 22:00 UTC (permalink / raw)
  To: Andrew Jones, Conor Dooley
  Cc: Xu Lu, paul.walmsley, palmer, aou, andy.chiu, guoren,
	linux-riscv, linux-kernel, lihangjing, dengliang.1214, xieyongji,
	chaiwen.cc

Hi Drew,

On 2024-04-18 2:10 PM, Andrew Jones wrote:
> On Thu, Apr 18, 2024 at 04:05:55PM +0100, Conor Dooley wrote:
>> + Drew,
>>
>> On Thu, Apr 18, 2024 at 07:49:41PM +0800, Xu Lu wrote:
>>> The Zawrs extension introduces a new instruction WRS.NTO, which will
>>> register a reservation set and causes the hart to temporarily stall
>>> execution in a low-power state until a store occurs to the reservation
>>> set or an interrupt is observed.
>>>
>>> This commit implements new version of idle thread for RISC-V via Zawrs
>>> extension.
>>>
>>> Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
>>> Reviewed-by: Hangjing Li <lihangjing@bytedance.com>
>>> Reviewed-by: Liang Deng <dengliang.1214@bytedance.com>
>>> Reviewed-by: Wen Chai <chaiwen.cc@bytedance.com>
>>> ---
>>>  arch/riscv/Kconfig                 | 24 +++++++++++++++++
>>>  arch/riscv/include/asm/cpuidle.h   | 11 +-------
>>>  arch/riscv/include/asm/hwcap.h     |  1 +
>>>  arch/riscv/include/asm/processor.h | 17 +++++++++++++
>>>  arch/riscv/kernel/cpu.c            |  5 ++++
>>>  arch/riscv/kernel/cpufeature.c     |  1 +
>>>  arch/riscv/kernel/process.c        | 41 +++++++++++++++++++++++++++++-
>>>  7 files changed, 89 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>>> index be09c8836d56..a0d344e9803f 100644
>>> --- a/arch/riscv/Kconfig
>>> +++ b/arch/riscv/Kconfig
>>> @@ -19,6 +19,7 @@ config RISCV
>>>  	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
>>>  	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
>>>  	select ARCH_HAS_BINFMT_FLAT
>>> +	select ARCH_HAS_CPU_FINALIZE_INIT
>>>  	select ARCH_HAS_CURRENT_STACK_POINTER
>>>  	select ARCH_HAS_DEBUG_VIRTUAL if MMU
>>>  	select ARCH_HAS_DEBUG_VM_PGTABLE
>>> @@ -525,6 +526,20 @@ config RISCV_ISA_SVPBMT
>>>  
>>>  	   If you don't know what to do here, say Y.
>>>  
>>> +config RISCV_ISA_ZAWRS
>>> +	bool "Zawrs extension support for wait-on-reservation-set instructions"
>>> +	depends on RISCV_ALTERNATIVE
>>> +	default y
>>> +	help
>>> +	   Adds support to dynamically detect the presence of the Zawrs
>>> +	   extension and enable its usage.
>>
>> Drew, could you, in your update, use the wording:
>> 	   Add support for enabling optimisations in the kernel when the
>> 	   Zawrs extension is detected at boot.
> 
> How about
> 
>   The Zawrs extension defines a pair of instructions to be used in
>   polling loops which allow a hart to enter a low-power state or to
>   trap to the hypervisor while waiting on a store to a memory location.
>   Enable the use of these instructions when the Zawrs extension is

                                        ^ in the kernel

I believe "in the kernel" was an important part of the clarification that these
Kconfig options do not affect whether userspace can use these instructions.
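
As a rough sketch of that distinction: the build-time option and the boot-time detection together decide only what the kernel itself does. KERNEL_ZAWRS_ENABLED and kernel_wait_method() below are hypothetical stand-ins for the Kconfig symbol and the kernel's internal choice; neither is real kernel code, and nothing here models userspace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_RISCV_ISA_ZAWRS being set to y. */
#define KERNEL_ZAWRS_ENABLED 1

enum wait_method {
	WAIT_SPIN,	/* ordinary polling loop */
	WAIT_WRS,	/* polling loop that stalls with WRS.NTO/WRS.STO */
};

/*
 * Models the kernel's own choice: the option must be built in AND the
 * extension must have been detected at boot. Userspace is not consulted
 * and is not restricted by this decision.
 */
static enum wait_method kernel_wait_method(bool zawrs_detected_at_boot)
{
	if (KERNEL_ZAWRS_ENABLED && zawrs_detected_at_boot)
		return WAIT_WRS;
	return WAIT_SPIN;
}
```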

Regards,
Samuel

>   detected at boot.
> 
> Thanks,
> drew
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 1/2] riscv: process: Introduce idle thread using Zawrs extension
  2024-04-18 22:00         ` Samuel Holland
@ 2024-04-18 22:09           ` Conor Dooley
  -1 siblings, 0 replies; 31+ messages in thread
From: Conor Dooley @ 2024-04-18 22:09 UTC (permalink / raw)
  To: Samuel Holland
  Cc: Andrew Jones, Xu Lu, paul.walmsley, palmer, aou, andy.chiu,
	guoren, linux-riscv, linux-kernel, lihangjing, dengliang.1214,
	xieyongji, chaiwen.cc

[-- Attachment #1: Type: text/plain, Size: 3571 bytes --]

On Thu, Apr 18, 2024 at 05:00:42PM -0500, Samuel Holland wrote:
> Hi Drew,
> 
> On 2024-04-18 2:10 PM, Andrew Jones wrote:
> > On Thu, Apr 18, 2024 at 04:05:55PM +0100, Conor Dooley wrote:
> >> + Drew,
> >>
> >> On Thu, Apr 18, 2024 at 07:49:41PM +0800, Xu Lu wrote:
> >>> The Zawrs extension introduces a new instruction WRS.NTO, which will
> >>> register a reservation set and causes the hart to temporarily stall
> >>> execution in a low-power state until a store occurs to the reservation
> >>> set or an interrupt is observed.
> >>>
> >>> This commit implements new version of idle thread for RISC-V via Zawrs
> >>> extension.
> >>>
> >>> Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
> >>> Reviewed-by: Hangjing Li <lihangjing@bytedance.com>
> >>> Reviewed-by: Liang Deng <dengliang.1214@bytedance.com>
> >>> Reviewed-by: Wen Chai <chaiwen.cc@bytedance.com>
> >>> ---
> >>>  arch/riscv/Kconfig                 | 24 +++++++++++++++++
> >>>  arch/riscv/include/asm/cpuidle.h   | 11 +-------
> >>>  arch/riscv/include/asm/hwcap.h     |  1 +
> >>>  arch/riscv/include/asm/processor.h | 17 +++++++++++++
> >>>  arch/riscv/kernel/cpu.c            |  5 ++++
> >>>  arch/riscv/kernel/cpufeature.c     |  1 +
> >>>  arch/riscv/kernel/process.c        | 41 +++++++++++++++++++++++++++++-
> >>>  7 files changed, 89 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> >>> index be09c8836d56..a0d344e9803f 100644
> >>> --- a/arch/riscv/Kconfig
> >>> +++ b/arch/riscv/Kconfig
> >>> @@ -19,6 +19,7 @@ config RISCV
> >>>  	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
> >>>  	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
> >>>  	select ARCH_HAS_BINFMT_FLAT
> >>> +	select ARCH_HAS_CPU_FINALIZE_INIT
> >>>  	select ARCH_HAS_CURRENT_STACK_POINTER
> >>>  	select ARCH_HAS_DEBUG_VIRTUAL if MMU
> >>>  	select ARCH_HAS_DEBUG_VM_PGTABLE
> >>> @@ -525,6 +526,20 @@ config RISCV_ISA_SVPBMT
> >>>  
> >>>  	   If you don't know what to do here, say Y.
> >>>  
> >>> +config RISCV_ISA_ZAWRS
> >>> +	bool "Zawrs extension support for wait-on-reservation-set instructions"
> >>> +	depends on RISCV_ALTERNATIVE
> >>> +	default y
> >>> +	help
> >>> +	   Adds support to dynamically detect the presence of the Zawrs
> >>> +	   extension and enable its usage.
> >>
> >> Drew, could you, in your update, use the wording:
> >> 	   Add support for enabling optimisations in the kernel when the
> >> 	   Zawrs extension is detected at boot.
> > 
> > How about

Probably should have said, this was just a replacement for the first
paragraph, not the entire text.

> > 
> >   The Zawrs extension defines a pair of instructions to be used in
> >   polling loops which allow a hart to enter a low-power state or to
> >   trap to the hypervisor while waiting on a store to a memory location.
> >   Enable the use of these instructions when the Zawrs extension is
> 
>                                         ^ in the kernel
> 
> I believe "in the kernel" was an important part of the clarification that these
> Kconfig options do not affect whether userspace can use these instructions.

Meant to reply earlier but forgot. Samuel's correct, it is indeed the
key bit I wanted; I just suggested what's above to match what was in the
patch I had sent earlier today. I don't really care all that much whether
it is a match or not, but I do care about the help text actually
describing /who/ gets to use the extension when the option is enabled.

Thanks,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 2/2] riscv: Use Zawrs to accelerate IPI to idle cpu
  2024-04-18 11:49   ` Xu Lu
  (?)
@ 2024-04-20 13:30   ` kernel test robot
  -1 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2024-04-20 13:30 UTC (permalink / raw)
  To: Xu Lu; +Cc: llvm, oe-kbuild-all

Hi Xu,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on v6.9-rc2]
[cannot apply to linus/master v6.9-rc4 v6.9-rc3 next-20240419]
[If your patch was applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Xu-Lu/riscv-process-Introduce-idle-thread-using-Zawrs-extension/20240418-195356
base:   v6.9-rc2
patch link:    https://lore.kernel.org/r/20240418114942.52770-3-luxu.kernel%40bytedance.com
patch subject: [RFC 2/2] riscv: Use Zawrs to accelerate IPI to idle cpu
config: riscv-allmodconfig (https://download.01.org/0day-ci/archive/20240420/202404202129.uQmgTDG1-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 7089c359a3845323f6f30c44a47dd901f2edfe63)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240420/202404202129.uQmgTDG1-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202404202129.uQmgTDG1-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   In file included from drivers/media/platform/mediatek/mdp/mtk_mdp_core.c:11:
   In file included from include/linux/interrupt.h:21:
   In file included from arch/riscv/include/asm/sections.h:9:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/media/platform/mediatek/mdp/mtk_mdp_core.c:23:
>> drivers/media/platform/mediatek/vpu/mtk_vpu.h:63:2: error: redefinition of enumerator 'IPI_MAX'
      63 |         IPI_MAX,
         |         ^
   arch/riscv/include/asm/smp.h:29:2: note: previous definition is here
      29 |         IPI_MAX
         |         ^
   drivers/media/platform/mediatek/mdp/mtk_mdp_core.c:206:52: warning: implicit conversion from 'unsigned long long' to 'unsigned int' changes value from 18446744073709551615 to 4294967295 [-Wconstant-conversion]
     206 |         ret = vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
         |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~             ^~~~~~~~~~~~~~~~
   include/linux/dma-mapping.h:77:40: note: expanded from macro 'DMA_BIT_MASK'
      77 | #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
         |                                        ^~~~~
   6 warnings and 1 error generated.
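
   Note on the error above: C enumerators share one namespace per scope, so an
   IPI_MAX added to a widely included arch header collides with any driver enum
   that already uses that name. One possible way out, shown only as a sketch and
   not necessarily the fix the patch author will choose, is to prefix the arch
   enumerators. The RISCV_ prefix and all member names other than IPI_MAX are
   illustrative assumptions.

```c
#include <assert.h>

/* Stand-in for the arch header: enumerators carry a hypothetical prefix. */
enum ipi_message_type {
	RISCV_IPI_RESCHEDULE,
	RISCV_IPI_CALL_FUNC,
	RISCV_IPI_MAX
};

/* Stand-in for the driver header, which keeps its own IPI_MAX. */
enum ipi_id {
	IPI_VPU_INIT,
	IPI_VDEC_H264,
	IPI_MAX
};

/* Both sides of the comparison now belong to enum ipi_id. */
static int ipi_id_is_valid(enum ipi_id id)
{
	return id < IPI_MAX;
}
```

   With distinct names both enums can be visible in the same translation unit,
   and a bounds check like the one in ipi_id_is_valid() no longer mixes two
   different enumeration types.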
--
   In file included from drivers/media/platform/mediatek/mdp/mtk_mdp_m2m.c:17:
   In file included from drivers/media/platform/mediatek/mdp/mtk_mdp_core.h:12:
   In file included from include/media/v4l2-ctrls.h:14:
   In file included from include/media/media-request.h:20:
   In file included from include/media/media-device.h:16:
   In file included from include/linux/pci.h:38:
   In file included from include/linux/interrupt.h:21:
   In file included from arch/riscv/include/asm/sections.h:9:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/media/platform/mediatek/mdp/mtk_mdp_m2m.c:20:
>> drivers/media/platform/mediatek/vpu/mtk_vpu.h:63:2: error: redefinition of enumerator 'IPI_MAX'
      63 |         IPI_MAX,
         |         ^
   arch/riscv/include/asm/smp.h:29:2: note: previous definition is here
      29 |         IPI_MAX
         |         ^
   5 warnings and 1 error generated.
--
   In file included from drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_intr.c:10:
   In file included from drivers/media/platform/mediatek/vcodec/common/../decoder/mtk_vcodec_dec_drv.h:10:
   In file included from drivers/media/platform/mediatek/vcodec/common/../decoder/../common/mtk_vcodec_cmn_drv.h:12:
   In file included from include/media/v4l2-ctrls.h:14:
   In file included from include/media/media-request.h:20:
   In file included from include/media/media-device.h:16:
   In file included from include/linux/pci.h:38:
   In file included from include/linux/interrupt.h:21:
   In file included from arch/riscv/include/asm/sections.h:9:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_intr.c:10:
   In file included from drivers/media/platform/mediatek/vcodec/common/../decoder/mtk_vcodec_dec_drv.h:12:
   In file included from drivers/media/platform/mediatek/vcodec/common/../decoder/../common/mtk_vcodec_fw_priv.h:6:
   In file included from drivers/media/platform/mediatek/vcodec/common/../decoder/../common/mtk_vcodec_fw.h:9:
>> drivers/media/platform/mediatek/vcodec/common/../decoder/../common/../../vpu/mtk_vpu.h:63:2: error: redefinition of enumerator 'IPI_MAX'
      63 |         IPI_MAX,
         |         ^
   arch/riscv/include/asm/smp.h:29:2: note: previous definition is here
      29 |         IPI_MAX
         |         ^
   5 warnings and 1 error generated.
--
   In file included from drivers/media/platform/mediatek/vpu/mtk_vpu.c:9:
   In file included from include/linux/interrupt.h:21:
   In file included from arch/riscv/include/asm/sections.h:9:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/media/platform/mediatek/vpu/mtk_vpu.c:20:
>> drivers/media/platform/mediatek/vpu/mtk_vpu.h:63:2: error: redefinition of enumerator 'IPI_MAX'
      63 |         IPI_MAX,
         |         ^
   arch/riscv/include/asm/smp.h:29:2: note: previous definition is here
      29 |         IPI_MAX
         |         ^
>> drivers/media/platform/mediatek/vpu/mtk_vpu.c:299:9: warning: comparison of different enumeration types ('enum ipi_id' and 'enum ipi_message_type') [-Wenum-compare]
     299 |         if (id < IPI_MAX && handler) {
         |             ~~ ^ ~~~~~~~
   drivers/media/platform/mediatek/vpu/mtk_vpu.c:322:31: warning: comparison of different enumeration types ('enum ipi_id' and 'enum ipi_message_type') [-Wenum-compare]
     322 |         if (id <= IPI_VPU_INIT || id >= IPI_MAX ||
         |                                   ~~ ^  ~~~~~~~
   7 warnings and 1 error generated.
--
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/venc_vp8_if.c:8:
   In file included from include/linux/interrupt.h:21:
   In file included from arch/riscv/include/asm/sections.h:9:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/venc_vp8_if.c:12:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/../mtk_vcodec_enc_drv.h:12:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/../../common/mtk_vcodec_fw_priv.h:6:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/../../common/mtk_vcodec_fw.h:9:
>> drivers/media/platform/mediatek/vcodec/encoder/venc/../../common/../../vpu/mtk_vpu.h:63:2: error: redefinition of enumerator 'IPI_MAX'
      63 |         IPI_MAX,
         |         ^
   arch/riscv/include/asm/smp.h:29:2: note: previous definition is here
      29 |         IPI_MAX
         |         ^
   5 warnings and 1 error generated.
--
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/venc_h264_if.c:9:
   In file included from include/linux/interrupt.h:21:
   In file included from arch/riscv/include/asm/sections.h:9:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/venc_h264_if.c:13:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/../mtk_vcodec_enc_drv.h:12:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/../../common/mtk_vcodec_fw_priv.h:6:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/venc/../../common/mtk_vcodec_fw.h:9:
>> drivers/media/platform/mediatek/vcodec/encoder/venc/../../common/../../vpu/mtk_vpu.h:63:2: error: redefinition of enumerator 'IPI_MAX'
      63 |         IPI_MAX,
         |         ^
   arch/riscv/include/asm/smp.h:29:2: note: previous definition is here
      29 |         IPI_MAX
         |         ^
   drivers/media/platform/mediatek/vcodec/encoder/venc/venc_h264_if.c:596:29: warning: conditional expression between different enumeration types ('enum scp_ipi_id' and 'enum ipi_id') [-Wenum-compare-conditional]
     596 |         inst->vpu_inst.id = is_ext ? SCP_IPI_VENC_H264 : IPI_VENC_H264;
         |                                    ^ ~~~~~~~~~~~~~~~~~   ~~~~~~~~~~~~~
   6 warnings and 1 error generated.
--
   In file included from drivers/media/platform/mediatek/vcodec/encoder/mtk_vcodec_enc.c:9:
   In file included from include/media/v4l2-mem2mem.h:16:
   In file included from include/media/videobuf2-v4l2.h:16:
   In file included from include/media/videobuf2-core.h:18:
   In file included from include/linux/dma-buf.h:19:
   In file included from include/linux/scatterlist.h:8:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/media/platform/mediatek/vcodec/encoder/mtk_vcodec_enc.c:13:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/mtk_vcodec_enc.h:14:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/mtk_vcodec_enc_drv.h:12:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/../common/mtk_vcodec_fw_priv.h:6:
   In file included from drivers/media/platform/mediatek/vcodec/encoder/../common/mtk_vcodec_fw.h:9:
>> drivers/media/platform/mediatek/vcodec/encoder/../common/../../vpu/mtk_vpu.h:63:2: error: redefinition of enumerator 'IPI_MAX'
      63 |         IPI_MAX,
         |         ^
   arch/riscv/include/asm/smp.h:29:2: note: previous definition is here
      29 |         IPI_MAX
         |         ^
   5 warnings and 1 error generated.
--
   In file included from drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_if.c:10:
   In file included from drivers/media/platform/mediatek/vcodec/decoder/vdec/../vdec_drv_if.h:11:
   In file included from drivers/media/platform/mediatek/vcodec/decoder/vdec/../mtk_vcodec_dec.h:11:
   In file included from include/media/videobuf2-core.h:18:
   In file included from include/linux/dma-buf.h:19:
   In file included from include/linux/scatterlist.h:8:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_if.c:10:
   In file included from drivers/media/platform/mediatek/vcodec/decoder/vdec/../vdec_drv_if.h:11:
   In file included from drivers/media/platform/mediatek/vcodec/decoder/vdec/../mtk_vcodec_dec.h:14:
   In file included from drivers/media/platform/mediatek/vcodec/decoder/vdec/../mtk_vcodec_dec_drv.h:12:
   In file included from drivers/media/platform/mediatek/vcodec/decoder/vdec/../../common/mtk_vcodec_fw_priv.h:6:
   In file included from drivers/media/platform/mediatek/vcodec/decoder/vdec/../../common/mtk_vcodec_fw.h:9:
>> drivers/media/platform/mediatek/vcodec/decoder/vdec/../../common/../../vpu/mtk_vpu.h:63:2: error: redefinition of enumerator 'IPI_MAX'
      63 |         IPI_MAX,
         |         ^
   arch/riscv/include/asm/smp.h:29:2: note: previous definition is here
      29 |         IPI_MAX
         |         ^
   5 warnings and 1 error generated.
..


vim +/IPI_MAX +63 drivers/media/platform/mediatek/vpu/mtk_vpu.h

3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  11  
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  12  /**
0f02beec61875e drivers/media/platform/mtk-vpu/mtk_vpu.h      Hans Verkuil   2021-03-11  13   * DOC: VPU
0f02beec61875e drivers/media/platform/mtk-vpu/mtk_vpu.h      Hans Verkuil   2021-03-11  14   *
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  15   * VPU (video processor unit) is a tiny processor controlling video hardware
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  16   * related to video codec, scaling and color format converting.
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  17   * VPU interfaces with other blocks by share memory and interrupt.
0f02beec61875e drivers/media/platform/mtk-vpu/mtk_vpu.h      Hans Verkuil   2021-03-11  18   */
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  19  
bfb1b99802ef16 drivers/media/platform/mediatek/vpu/mtk_vpu.h Arnd Bergmann  2024-02-24  20  typedef void (*ipi_handler_t) (void *data,
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  21  			       unsigned int len,
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  22  			       void *priv);
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  23  
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  24  /**
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  25   * enum ipi_id - the id of inter-processor interrupt
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  26   *
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  27   * @IPI_VPU_INIT:	 The interrupt from vpu is to notfiy kernel
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  28   *			 VPU initialization completed.
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  29   *			 IPI_VPU_INIT is sent from VPU when firmware is
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  30   *			 loaded. AP doesn't need to send IPI_VPU_INIT
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  31   *			 command to VPU.
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  32   *			 For other IPI below, AP should send the request
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  33   *			 to VPU to trigger the interrupt.
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  34   * @IPI_VDEC_H264:	 The interrupt from vpu is to notify kernel to
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  35   *			 handle H264 vidoe decoder job, and vice versa.
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  36   *			 Decode output format is always MT21 no matter what
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  37   *			 the input format is.
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  38   * @IPI_VDEC_VP8:	 The interrupt from is to notify kernel to
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  39   *			 handle VP8 video decoder job, and vice versa.
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  40   *			 Decode output format is always MT21 no matter what
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  41   *			 the input format is.
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  42   * @IPI_VDEC_VP9:	 The interrupt from vpu is to notify kernel to
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  43   *			 handle VP9 video decoder job, and vice versa.
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  44   *			 Decode output format is always MT21 no matter what
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  45   *			 the input format is.
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  46   * @IPI_VENC_H264:	 The interrupt from vpu is to notify kernel to
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  47   *			 handle H264 video encoder job, and vice versa.
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  48   * @IPI_VENC_VP8:	 The interrupt fro vpu is to notify kernel to
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  49   *			 handle VP8 video encoder job,, and vice versa.
737ea6cfd22631 drivers/media/platform/mtk-vpu/mtk_vpu.h      Minghsiu Tsai  2016-09-08  50   * @IPI_MDP:		 The interrupt from vpu is to notify kernel to
737ea6cfd22631 drivers/media/platform/mtk-vpu/mtk_vpu.h      Minghsiu Tsai  2016-09-08  51   *			 handle MDP (Media Data Path) job, and vice versa.
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  52   * @IPI_MAX:		 The maximum IPI number
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  53   */
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  54  
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  55  enum ipi_id {
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  56  	IPI_VPU_INIT = 0,
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  57  	IPI_VDEC_H264,
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  58  	IPI_VDEC_VP8,
e2818a59f7ca42 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-09-02  59  	IPI_VDEC_VP9,
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  60  	IPI_VENC_H264,
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  61  	IPI_VENC_VP8,
737ea6cfd22631 drivers/media/platform/mtk-vpu/mtk_vpu.h      Minghsiu Tsai  2016-09-08  62  	IPI_MDP,
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03 @63  	IPI_MAX,
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  64  };
3003a180ef6b94 drivers/media/platform/mtk-vpu/mtk_vpu.h      Andrew-CT Chen 2016-05-03  65  
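
The redefinition errors above follow from a C rule: the enumerators of every enum visible in a translation unit share one ordinary identifier namespace. The IPI_MAX defined at arch/riscv/include/asm/smp.h:29 therefore collides with the IPI_MAX in mtk_vpu.h as soon as both headers are included. Below is a minimal sketch of one possible way out, assuming the arch-side enumerators gain a prefix. The RISCV_ names and the two helper functions are hypothetical, the arch enum is an abbreviated stand-in, and only enum ipi_id is copied from the excerpt above.

```c
#include <stdbool.h>

/* Driver-side enum, as shown in the mtk_vpu.h excerpt above. */
enum ipi_id {
	IPI_VPU_INIT = 0,
	IPI_VDEC_H264,
	IPI_VDEC_VP8,
	IPI_VDEC_VP9,
	IPI_VENC_H264,
	IPI_VENC_VP8,
	IPI_MDP,
	IPI_MAX,
};

/*
 * Arch-side enum: an abbreviated stand-in with a hypothetical RISCV_ prefix.
 * If the last enumerator were named IPI_MAX instead, this file would fail
 * to compile with "redefinition of enumerator 'IPI_MAX'", because both
 * enums live in the same scope.
 */
enum ipi_message_type {
	RISCV_IPI_RESCHEDULE,
	RISCV_IPI_CALL_FUNC,
	RISCV_IPI_MAX,
};

/* Hypothetical helper: range check against the driver's own sentinel. */
static bool vpu_ipi_id_valid(int id)
{
	return id >= 0 && id < IPI_MAX;
}

/* Hypothetical helper: range check against the arch sentinel. */
static bool riscv_ipi_valid(int op)
{
	return op >= 0 && op < RISCV_IPI_MAX;
}
```

With distinct sentinel names, each range check compares against its own enum, which would also avoid the -Wenum-compare warnings reported for mtk_vpu.c. Keeping the arch enum out of widely included headers would be another option.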

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [External] Re: [RFC 1/2] riscv: process: Introduce idle thread using Zawrs extension
  2024-04-18 16:14       ` Xu Lu
@ 2024-04-22  8:21         ` Conor Dooley
  -1 siblings, 0 replies; 31+ messages in thread
From: Conor Dooley @ 2024-04-22  8:21 UTC (permalink / raw)
  To: Xu Lu
  Cc: Conor Dooley, paul.walmsley, palmer, aou, andy.chiu, guoren,
	linux-riscv, linux-kernel, lihangjing, dengliang.1214, xieyongji,
	chaiwen.cc, Andrew Jones

On Fri, Apr 19, 2024 at 12:14:47AM +0800, Xu Lu wrote:
> On Thu, Apr 18, 2024 at 11:06 PM Conor Dooley <conor@kernel.org> wrote:
> > On Thu, Apr 18, 2024 at 07:49:41PM +0800, Xu Lu wrote:

> > > +        The Zawrs extension defines a pair of instructions to be used
> > > +        in polling loops that allows a core to enter a low-power state
> > > +        and wait on a store to a memory location.
> > > +
> > > +        If you don't know what to do here, say Y.
> > > +
> > >  config TOOLCHAIN_HAS_V
> > >       bool
> > >       default y
> > > @@ -1075,6 +1090,15 @@ endmenu # "Power management options"
> > >
> > >  menu "CPU Power Management"
> > >
> > > +config RISCV_ZAWRS_IDLE
> > > +     bool "Idle thread using ZAWRS extensions"
> > > +     depends on RISCV_ISA_ZAWRS
> > > +     default y
> > > +     help
> > > +             Adds support to implement idle thread using ZAWRS extension.
> > > +
> > > +             If you don't know what to do here, say Y.
> >
> > I don't think this second option is needed, why would we not always want
> > to use the Zawrs version of this when it is available? Can we do it
> > unconditionally when RISCV_ISA_ZAWRS is set and the extension is
> > detected at runtime?
> >
> > Cheers,
> > Conor.
> 
> Indeed, we can always choose WRS.NTO when entering idle.
> 
> This config is introduced for the second commit in this patch series.
> In the second commit, we detect whether the target cpu is idle when
> sending an IPI and write the IPI info to the reservation set of the idle
> cpu so as to avoid sending a physical IPI. Besides, the target idle cpu
> need not go through the traditional interrupt handling routine. However,
> if all cpus are busy and rarely enter idle, this commit may introduce
> the performance overhead of extra instructions when sending an IPI.
> Thus we introduce this config just in case.

Could you add the downsides into the help text of the config option so
that people can understand why to enable/disable the option?

Thanks,
Conor.
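
The trade-off discussed above can be sketched in userspace C. This is an illustration under stated assumptions, not the code from the series: struct cpu_state, send_ipi(), idle_collect_ipis(), NR_CPUS and zawrs_idle_enabled are all hypothetical names, and a C11 atomic word stands in for the reservation set that an idle hart would monitor with LR and WRS.NTO.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for the sketch */

struct cpu_state {
	atomic_bool idle;		/* set while the CPU sits in its idle loop */
	atomic_ulong pending_ipis;	/* stand-in for the monitored word */
};

static struct cpu_state cpus[NR_CPUS];

/* Counts how often the sender had to fall back to a "physical" IPI. */
static unsigned long physical_ipis_sent;

/* Stand-in for the config option plus runtime detection of Zawrs. */
static bool zawrs_idle_enabled = true;

/*
 * Returns true when the IPI was delivered by a memory write alone.
 * The idle check is the extra work every sender pays, even when the
 * target turns out to be busy. A real implementation would also have
 * to close the race where the target leaves idle between the check
 * and the store; this sketch ignores it.
 */
static bool send_ipi(int cpu, unsigned int op)
{
	struct cpu_state *target = &cpus[cpu];

	if (zawrs_idle_enabled && atomic_load(&target->idle)) {
		atomic_fetch_or(&target->pending_ipis, 1UL << op);
		return true;
	}

	physical_ipis_sent++;
	return false;
}

/* Called by the woken CPU: fetch and clear all pending IPI bits. */
static unsigned long idle_collect_ipis(int cpu)
{
	return atomic_exchange(&cpus[cpu].pending_ipis, 0UL);
}
```

When the target is idle, the sender pays for one load and one atomic OR instead of an interrupt. When every CPU is busy, the load in send_ipi() is pure overhead, which is the downside the help text would need to describe.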

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2024-04-22  8:22 UTC | newest]

Thread overview: 31+ messages
2024-04-18 11:49 [RFC 0/2] riscv: Idle thread using Zawrs extension Xu Lu
2024-04-18 11:49 ` Xu Lu
2024-04-18 11:49 ` [RFC 1/2] riscv: process: Introduce idle " Xu Lu
2024-04-18 11:49   ` Xu Lu
2024-04-18 15:05   ` Conor Dooley
2024-04-18 15:05     ` Conor Dooley
2024-04-18 16:14     ` [External] " Xu Lu
2024-04-18 16:14       ` Xu Lu
2024-04-22  8:21       ` Conor Dooley
2024-04-22  8:21         ` Conor Dooley
2024-04-18 19:10     ` Andrew Jones
2024-04-18 19:10       ` Andrew Jones
2024-04-18 22:00       ` Samuel Holland
2024-04-18 22:00         ` Samuel Holland
2024-04-18 22:09         ` Conor Dooley
2024-04-18 22:09           ` Conor Dooley
2024-04-18 11:49 ` [RFC 2/2] riscv: Use Zawrs to accelerate IPI to idle cpu Xu Lu
2024-04-18 11:49   ` Xu Lu
2024-04-20 13:30   ` kernel test robot
2024-04-18 12:26 ` [RFC 0/2] riscv: Idle thread using Zawrs extension Christoph Müllner
2024-04-18 12:26   ` Christoph Müllner
2024-04-18 12:44   ` [External] " Xu Lu
2024-04-18 12:44     ` Xu Lu
2024-04-18 12:56     ` Christoph Müllner
2024-04-18 12:56       ` Christoph Müllner
2024-04-18 13:09       ` Xu Lu
2024-04-18 13:09         ` Xu Lu
2024-04-18 14:08         ` Conor Dooley
2024-04-18 14:08           ` Conor Dooley
2024-04-18 14:10         ` Andrew Jones
2024-04-18 14:10           ` Andrew Jones
