linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features
@ 2022-09-18 15:52 guoren
  2022-09-18 15:52 ` [PATCH V5 01/11] riscv: elf_kexec: Fixup compile warning guoren
                   ` (10 more replies)
  0 siblings, 11 replies; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

The patches convert riscv to use the generic entry infrastructure from
kernel/entry/*. Additionally, add independent irq stacks (IRQ_STACKS)
for percpu to prevent kernel stack overflows. Add generic_entry based
STACKLEAK support.

[1] https://github.com/guoren83/linux/tree/generic_entry_v5

v5:
 - Add riscv own stackleak patch instead of generic entry modification
   (by Mark Rutland)
 - Add EXPERT dependency for THREAD_SIZE (by Arnd)
 - Add EXPERT dependency for IRQ_STACK (by Sebastian, David Laight)
 - Corrected __trap_section (by Peter Ziji)
 - Add Tested-by (Yipeng Zou)
 - Use CONFIG_SOFTIRQ_ON_OWN_STACK replace "#ifndef CONFIG_PREEMPT_RT"
 - Fixup systrace_enter compile error
 - Fixup exit_to_user_mode_prepare preempt_disable warning

V4:
https://lore.kernel.org/linux-riscv/20220908022506.1275799-1-guoren@kernel.org/
 - Fixup entry.S with "la" bug (by Conor.Dooley)
 - Fixup missing noinstr bug (by Peter Ziji)

V3:
https://lore.kernel.org/linux-riscv/20220906035423.634617-1-guoren@kernel.org/
 - Fixup CONFIG_COMPAT=n compile error
 - Add THREAD_SIZE_ORDER config
 - Optimize elf_kexec.c warning fixup
 - Add static to irq_stack_ptr definition

V2:
https://lore.kernel.org/linux-riscv/20220904072637.8619-1-guoren@kernel.org/
 - Fixup compile error by include "riscv: ptrace: Remove duplicate
   operation"
 - Fixup compile warning
   Reported-by: kernel test robot <lkp@intel.com>
 - Add test repo link in cover letter

V1:
https://lore.kernel.org/linux-riscv/20220903163808.1954131-1-guoren@kernel.org/

Dao Lu (1):
  riscv: Add support for STACKLEAK gcc plugin

Guo Ren (9):
  riscv: elf_kexec: Fixup compile warning
  riscv: compat_syscall_table: Fixup compile warning
  riscv: ptrace: Remove duplicate operation
  riscv: traps: Add noinstr to prevent instrumentation inserted
  entry: Prevent DEBUG_PREEMPT warning
  riscv: convert to generic entry
  riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  riscv: Support HAVE_SOFTIRQ_ON_OWN_STACK
  riscv: Add config of thread stack size

Lai Jiangshan (1):
  compiler_types.h: Add __noinstr_section() for noinstr

 arch/riscv/Kconfig                    |  29 +++
 arch/riscv/include/asm/csr.h          |   1 -
 arch/riscv/include/asm/entry-common.h |   8 +
 arch/riscv/include/asm/irq.h          |   3 +
 arch/riscv/include/asm/ptrace.h       |  10 +-
 arch/riscv/include/asm/stacktrace.h   |   5 +
 arch/riscv/include/asm/syscall.h      |   6 +
 arch/riscv/include/asm/thread_info.h  |  27 +--
 arch/riscv/include/asm/vmap_stack.h   |  28 +++
 arch/riscv/kernel/Makefile            |   1 +
 arch/riscv/kernel/elf_kexec.c         |   2 +-
 arch/riscv/kernel/entry.S             | 257 ++++++--------------------
 arch/riscv/kernel/irq.c               |  75 ++++++++
 arch/riscv/kernel/ptrace.c            |  44 -----
 arch/riscv/kernel/signal.c            |  21 +--
 arch/riscv/kernel/sys_riscv.c         |  27 +++
 arch/riscv/kernel/traps.c             |  15 +-
 arch/riscv/mm/fault.c                 |  12 +-
 drivers/firmware/efi/libstub/Makefile |   2 +-
 include/linux/compiler_types.h        |   8 +-
 kernel/entry/common.c                 |   2 +
 21 files changed, 282 insertions(+), 301 deletions(-)
 create mode 100644 arch/riscv/include/asm/entry-common.h
 create mode 100644 arch/riscv/include/asm/vmap_stack.h

-- 
2.36.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH V5 01/11] riscv: elf_kexec: Fixup compile warning
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-18 15:52 ` [PATCH V5 02/11] riscv: compat_syscall_table: " guoren
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren, kernel test robot

From: Guo Ren <guoren@linux.alibaba.com>

If CRYTPO or CRYPTO_SHA256 or KEXE_FILE is not enabled, then:

COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1
O=build_dir ARCH=riscv SHELL=/bin/bash arch/riscv/

../arch/riscv/kernel/elf_kexec.c: In function 'elf_kexec_load':
../arch/riscv/kernel/elf_kexec.c:185:23: warning: variable
'kernel_start' set but not used [-Wunused-but-set-variable]
  185 |         unsigned long kernel_start;
      |                       ^~~~~~~~~~~~

Fixes: 838b3e28488f ("RISC-V: Load purgatory in kexec_file")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reported-by: kernel test robot <lkp@intel.com>
---
 arch/riscv/kernel/elf_kexec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/elf_kexec.c b/arch/riscv/kernel/elf_kexec.c
index 0cb94992c15b..4b9264340b78 100644
--- a/arch/riscv/kernel/elf_kexec.c
+++ b/arch/riscv/kernel/elf_kexec.c
@@ -198,7 +198,7 @@ static void *elf_kexec_load(struct kimage *image, char *kernel_buf,
 	if (ret)
 		goto out;
 	kernel_start = image->start;
-	pr_notice("The entry point of kernel at 0x%lx\n", image->start);
+	pr_notice("The entry point of kernel at 0x%lx\n", kernel_start);
 
 	/* Add the kernel binary to the image */
 	ret = riscv_kexec_elf_load(image, &ehdr, &elf_info,
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 02/11] riscv: compat_syscall_table: Fixup compile warning
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
  2022-09-18 15:52 ` [PATCH V5 01/11] riscv: elf_kexec: Fixup compile warning guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-18 15:52 ` [PATCH V5 03/11] riscv: ptrace: Remove duplicate operation guoren
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren, kernel test robot

From: Guo Ren <guoren@linux.alibaba.com>

../arch/riscv/kernel/compat_syscall_table.c:12:41: warning: initialized
field overwritten [-Woverride-init]
   12 | #define __SYSCALL(nr, call)      [nr] = (call),
      |                                         ^
../include/uapi/asm-generic/unistd.h:567:1: note: in expansion of macro
'__SYSCALL'
  567 | __SYSCALL(__NR_semget, sys_semget)

Fixes: 59c10c52f573 ("riscv: compat: syscall: Add compat_sys_call_table implementation")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reported-by: kernel test robot <lkp@intel.com>
---
 arch/riscv/kernel/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 33bb60a354cd..01da14e21019 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -9,6 +9,7 @@ CFLAGS_REMOVE_patch.o	= $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_sbi.o	= $(CC_FLAGS_FTRACE)
 endif
 CFLAGS_syscall_table.o	+= $(call cc-option,-Wno-override-init,)
+CFLAGS_compat_syscall_table.o += $(call cc-option,-Wno-override-init,)
 
 ifdef CONFIG_KEXEC
 AFLAGS_kexec_relocate.o := -mcmodel=medany $(call cc-option,-mno-relax)
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 03/11] riscv: ptrace: Remove duplicate operation
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
  2022-09-18 15:52 ` [PATCH V5 01/11] riscv: elf_kexec: Fixup compile warning guoren
  2022-09-18 15:52 ` [PATCH V5 02/11] riscv: compat_syscall_table: " guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-18 15:52 ` [PATCH V5 04/11] compiler_types.h: Add __noinstr_section() for noinstr guoren
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren, Oleg Nesterov

From: Guo Ren <guoren@linux.alibaba.com>

The TIF_SYSCALL_TRACE is controlled by a common code, see
kernel/ptrace.c and include/linux/thread.h.

clear_task_syscall_work(child, SYSCALL_TRACE);

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
---
 arch/riscv/kernel/ptrace.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
index 2ae8280ae475..44f4b1ca315d 100644
--- a/arch/riscv/kernel/ptrace.c
+++ b/arch/riscv/kernel/ptrace.c
@@ -212,7 +212,6 @@ unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs, unsigned int n)
 
 void ptrace_disable(struct task_struct *child)
 {
-	clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
 }
 
 long arch_ptrace(struct task_struct *child, long request,
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 04/11] compiler_types.h: Add __noinstr_section() for noinstr
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
                   ` (2 preceding siblings ...)
  2022-09-18 15:52 ` [PATCH V5 03/11] riscv: ptrace: Remove duplicate operation guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-18 15:52 ` [PATCH V5 05/11] riscv: traps: Add noinstr to prevent instrumentation inserted guoren
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Lai Jiangshan,
	Borislav Petkov, Miguel Ojeda, Kees Cook, Nick Desaulniers

From: Lai Jiangshan <laijs@linux.alibaba.com>

And it will be extended for C entry code.

Cc: Borislav Petkov <bp@alien8.de>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 include/linux/compiler_types.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 4f2a819fd60a..e9ce11ea4d8b 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -227,9 +227,11 @@ struct ftrace_likely_data {
 #endif
 
 /* Section for code which can't be instrumented at all */
-#define noinstr								\
-	noinline notrace __attribute((__section__(".noinstr.text")))	\
-	__no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage
+#define __noinstr_section(section)				\
+	noinline notrace __section(section) __no_profile	\
+	__no_kcsan __no_sanitize_address __no_sanitize_coverage
+
+#define noinstr __noinstr_section(".noinstr.text")
 
 #endif /* __KERNEL__ */
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 05/11] riscv: traps: Add noinstr to prevent instrumentation inserted
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
                   ` (3 preceding siblings ...)
  2022-09-18 15:52 ` [PATCH V5 04/11] compiler_types.h: Add __noinstr_section() for noinstr guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-18 15:52 ` [PATCH V5 06/11] entry: Prevent DEBUG_PREEMPT warning guoren
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Without noinstr the compiler is free to insert instrumentation (think
all the k*SAN, KCov, GCov, ftrace etc..) which can call code we're not
yet ready to run this early in the entry path, for instance it could
rely on RCU which isn't on yet, or expect lockdep state. (by peterz)

Link: https://lore.kernel.org/linux-riscv/YxcQ6NoPf3AH0EXe@hirez.programming.kicks-ass.net/raw
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/kernel/traps.c | 4 ++--
 arch/riscv/mm/fault.c     | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 635e6ec26938..588e17c386c6 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -92,9 +92,9 @@ static void do_trap_error(struct pt_regs *regs, int signo, int code,
 }
 
 #if defined(CONFIG_XIP_KERNEL) && defined(CONFIG_RISCV_ALTERNATIVE)
-#define __trap_section		__section(".xip.traps")
+#define __trap_section __noinstr_section(".xip.traps")
 #else
-#define __trap_section
+#define __trap_section noinstr
 #endif
 #define DO_ERROR_INFO(name, signo, code, str)				\
 asmlinkage __visible __trap_section void name(struct pt_regs *regs)	\
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index f2fbd1400b7c..c7829289e806 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -203,7 +203,7 @@ static inline bool access_error(unsigned long cause, struct vm_area_struct *vma)
  * This routine handles page faults.  It determines the address and the
  * problem, and then passes it off to one of the appropriate routines.
  */
-asmlinkage void do_page_fault(struct pt_regs *regs)
+asmlinkage void noinstr do_page_fault(struct pt_regs *regs)
 {
 	struct task_struct *tsk;
 	struct vm_area_struct *vma;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 06/11] entry: Prevent DEBUG_PREEMPT warning
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
                   ` (4 preceding siblings ...)
  2022-09-18 15:52 ` [PATCH V5 05/11] riscv: traps: Add noinstr to prevent instrumentation inserted guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-19 11:58   ` Peter Zijlstra
  2022-09-18 15:52 ` [PATCH V5 07/11] riscv: convert to generic entry guoren
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

When DEBUG_PREEMPT=y,
	exit_to_user_mode_prepare
	->tick_nohz_user_enter_prepare
	  ->tick_nohz_full_cpu(smp_processor_id())
	    ->smp_processor_id()
	      ->debug_smp_processor_id()
		->check preempt_count() then:

[    5.717610] BUG: using smp_processor_id() in preemptible [00000000]
code: S20urandom/94
[    5.718111] caller is debug_smp_processor_id+0x24/0x38
[    5.718417] CPU: 1 PID: 94 Comm: S20urandom Not tainted
6.0.0-rc3-00010-gfd0a0d619c63-dirty #238
[    5.718886] Hardware name: riscv-virtio,qemu (DT)
[    5.719136] Call Trace:
[    5.719281] [<ffffffff800055fc>] dump_backtrace+0x2c/0x3c
[    5.719566] [<ffffffff80ae6cb0>] show_stack+0x44/0x5c
[    5.720023] [<ffffffff80aee870>] dump_stack_lvl+0x74/0xa4
[    5.720557] [<ffffffff80aee8bc>] dump_stack+0x1c/0x2c
[    5.721033] [<ffffffff80af65c0>]
check_preemption_disabled+0x104/0x108
[    5.721538] [<ffffffff80af65e8>] debug_smp_processor_id+0x24/0x38
[    5.722001] [<ffffffff800aee30>] exit_to_user_mode_prepare+0x48/0x178
[    5.722355] [<ffffffff80af5bf4>] irqentry_exit_to_user_mode+0x18/0x30
[    5.722685] [<ffffffff80af5c70>] irqentry_exit+0x64/0xa4
[    5.722953] [<ffffffff80af52f4>] do_page_fault+0x1d8/0x544
[    5.723291] [<ffffffff80003310>] ret_from_exception+0x0/0xb8

(Above is found in riscv platform with generic_entry)

The smp_processor_id() needs irqs disable or preempt_disable, so use
preempt dis/in protecting the tick_nohz_user_enter_prepare().

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 kernel/entry/common.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 063068a9ea9b..36e4cd50531c 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -194,8 +194,10 @@ static void exit_to_user_mode_prepare(struct pt_regs *regs)
 
 	lockdep_assert_irqs_disabled();
 
+	preempt_disable();
 	/* Flush pending rcuog wakeup before the last need_resched() check */
 	tick_nohz_user_enter_prepare();
+	preempt_enable();
 
 	if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
 		ti_work = exit_to_user_mode_loop(regs, ti_work);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 07/11] riscv: convert to generic entry
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
                   ` (5 preceding siblings ...)
  2022-09-18 15:52 ` [PATCH V5 06/11] entry: Prevent DEBUG_PREEMPT warning guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-19 13:34   ` Peter Zijlstra
  2022-09-18 15:52 ` [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK guoren
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

This patch converts riscv to use the generic entry infrastructure from
kernel/entry/*. The generic entry makes maintainers' work easier and
codes more elegant. Here are the changes than before:

 - More clear entry.S with handle_exception and ret_from_exception
 - Get rid of complex custom signal implementation
 - More readable syscall procedure
 - Little modification on ret_from_fork & ret_from_kernel_thread
 - Wrap with irqentry_enter/exit and syscall_enter/exit_from_user_mode
 - Use the standard preemption code instead of custom

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Suggested-by: Huacai Chen <chenhuacai@kernel.org>
Tested-by: Yipeng Zou <zouyipeng@huawei.com>
---
 arch/riscv/Kconfig                    |   1 +
 arch/riscv/include/asm/csr.h          |   1 -
 arch/riscv/include/asm/entry-common.h |   8 +
 arch/riscv/include/asm/ptrace.h       |  10 +-
 arch/riscv/include/asm/stacktrace.h   |   5 +
 arch/riscv/include/asm/syscall.h      |   6 +
 arch/riscv/include/asm/thread_info.h  |  13 +-
 arch/riscv/kernel/entry.S             | 228 +++-----------------------
 arch/riscv/kernel/irq.c               |  15 ++
 arch/riscv/kernel/ptrace.c            |  43 -----
 arch/riscv/kernel/signal.c            |  21 +--
 arch/riscv/kernel/sys_riscv.c         |  27 +++
 arch/riscv/kernel/traps.c             |  11 ++
 arch/riscv/mm/fault.c                 |  12 +-
 14 files changed, 117 insertions(+), 284 deletions(-)
 create mode 100644 arch/riscv/include/asm/entry-common.h

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index ed66c31e4655..a07bb3b73b5b 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -56,6 +56,7 @@ config RISCV
 	select GENERIC_ATOMIC64 if !64BIT
 	select GENERIC_CLOCKEVENTS_BROADCAST if SMP
 	select GENERIC_EARLY_IOREMAP
+	select GENERIC_ENTRY
 	select GENERIC_GETTIMEOFDAY if HAVE_GENERIC_VDSO
 	select GENERIC_IDLE_POLL_SETUP
 	select GENERIC_IOREMAP if MMU
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 0e571f6483d9..7c2b8cdb7b77 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -40,7 +40,6 @@
 #define SR_UXL		_AC(0x300000000, UL) /* XLEN mask for U-mode */
 #define SR_UXL_32	_AC(0x100000000, UL) /* XLEN = 32 for U-mode */
 #define SR_UXL_64	_AC(0x200000000, UL) /* XLEN = 64 for U-mode */
-#define SR_UXL_SHIFT	32
 #endif
 
 /* SATP flags */
diff --git a/arch/riscv/include/asm/entry-common.h b/arch/riscv/include/asm/entry-common.h
new file mode 100644
index 000000000000..1636ac2af28e
--- /dev/null
+++ b/arch/riscv/include/asm/entry-common.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_RISCV_ENTRY_COMMON_H
+#define _ASM_RISCV_ENTRY_COMMON_H
+
+#include <asm/stacktrace.h>
+
+#endif /* _ASM_RISCV_ENTRY_COMMON_H */
diff --git a/arch/riscv/include/asm/ptrace.h b/arch/riscv/include/asm/ptrace.h
index 6ecd461129d2..4e46a611f255 100644
--- a/arch/riscv/include/asm/ptrace.h
+++ b/arch/riscv/include/asm/ptrace.h
@@ -53,6 +53,9 @@ struct pt_regs {
 	unsigned long orig_a0;
 };
 
+#define PTRACE_SYSEMU			0x1f
+#define PTRACE_SYSEMU_SINGLESTEP	0x20
+
 #ifdef CONFIG_64BIT
 #define REG_FMT "%016lx"
 #else
@@ -121,8 +124,6 @@ extern unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
 
 void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 			   unsigned long frame_pointer);
-int do_syscall_trace_enter(struct pt_regs *regs);
-void do_syscall_trace_exit(struct pt_regs *regs);
 
 /**
  * regs_get_register() - get register value from its offset
@@ -172,6 +173,11 @@ static inline unsigned long regs_get_kernel_argument(struct pt_regs *regs,
 	return 0;
 }
 
+static inline int regs_irqs_disabled(struct pt_regs *regs)
+{
+	return !(regs->status & SR_IE);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_RISCV_PTRACE_H */
diff --git a/arch/riscv/include/asm/stacktrace.h b/arch/riscv/include/asm/stacktrace.h
index 3450c1912afd..f7e8ef2418b9 100644
--- a/arch/riscv/include/asm/stacktrace.h
+++ b/arch/riscv/include/asm/stacktrace.h
@@ -16,4 +16,9 @@ extern void notrace walk_stackframe(struct task_struct *task, struct pt_regs *re
 extern void dump_backtrace(struct pt_regs *regs, struct task_struct *task,
 			   const char *loglvl);
 
+static inline bool on_thread_stack(void)
+{
+	return !(((unsigned long)(current->stack) ^ current_stack_pointer) & ~(THREAD_SIZE - 1));
+}
+
 #endif /* _ASM_RISCV_STACKTRACE_H */
diff --git a/arch/riscv/include/asm/syscall.h b/arch/riscv/include/asm/syscall.h
index 384a63b86420..6c573f18030b 100644
--- a/arch/riscv/include/asm/syscall.h
+++ b/arch/riscv/include/asm/syscall.h
@@ -74,5 +74,11 @@ static inline int syscall_get_arch(struct task_struct *task)
 #endif
 }
 
+static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
+{
+	return false;
+}
+
 asmlinkage long sys_riscv_flush_icache(uintptr_t, uintptr_t, uintptr_t);
+asmlinkage void do_sys_ecall_u(struct pt_regs *regs);
 #endif	/* _ASM_RISCV_SYSCALL_H */
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index 67322f878e0d..7de4fb96f0b5 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -66,6 +66,7 @@ struct thread_info {
 	long			kernel_sp;	/* Kernel stack pointer */
 	long			user_sp;	/* User stack pointer */
 	int			cpu;
+	unsigned long		syscall_work;	/* SYSCALL_WORK_ flags */
 };
 
 /*
@@ -88,26 +89,18 @@ struct thread_info {
  * - pending work-to-be-done flags are in lowest half-word
  * - other flags in upper half-word(s)
  */
-#define TIF_SYSCALL_TRACE	0	/* syscall trace active */
 #define TIF_NOTIFY_RESUME	1	/* callback before returning to user */
 #define TIF_SIGPENDING		2	/* signal pending */
 #define TIF_NEED_RESCHED	3	/* rescheduling necessary */
 #define TIF_RESTORE_SIGMASK	4	/* restore signal mask in do_signal() */
 #define TIF_MEMDIE		5	/* is terminating due to OOM killer */
-#define TIF_SYSCALL_TRACEPOINT  6       /* syscall tracepoint instrumentation */
-#define TIF_SYSCALL_AUDIT	7	/* syscall auditing */
-#define TIF_SECCOMP		8	/* syscall secure computing */
 #define TIF_NOTIFY_SIGNAL	9	/* signal notifications exist */
 #define TIF_UPROBE		10	/* uprobe breakpoint or singlestep */
 #define TIF_32BIT		11	/* compat-mode 32bit process */
 
-#define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
-#define _TIF_SYSCALL_TRACEPOINT	(1 << TIF_SYSCALL_TRACEPOINT)
-#define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
-#define _TIF_SECCOMP		(1 << TIF_SECCOMP)
 #define _TIF_NOTIFY_SIGNAL	(1 << TIF_NOTIFY_SIGNAL)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
 
@@ -115,8 +108,4 @@ struct thread_info {
 	(_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED | \
 	 _TIF_NOTIFY_SIGNAL | _TIF_UPROBE)
 
-#define _TIF_SYSCALL_WORK \
-	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT | \
-	 _TIF_SECCOMP)
-
 #endif /* _ASM_RISCV_THREAD_INFO_H */
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index b9eda3fcbd6d..5f49517cd3a2 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -14,10 +14,6 @@
 #include <asm/asm-offsets.h>
 #include <asm/errata_list.h>
 
-#if !IS_ENABLED(CONFIG_PREEMPTION)
-.set resume_kernel, restore_all
-#endif
-
 ENTRY(handle_exception)
 	/*
 	 * If coming from userspace, preserve the user thread pointer and load
@@ -106,19 +102,8 @@ _save_context:
 .option norelax
 	la gp, __global_pointer$
 .option pop
-
-#ifdef CONFIG_TRACE_IRQFLAGS
-	call __trace_hardirqs_off
-#endif
-
-#ifdef CONFIG_CONTEXT_TRACKING_USER
-	/* If previous state is in user mode, call user_exit_callable(). */
-	li   a0, SR_PP
-	and a0, s1, a0
-	bnez a0, skip_context_tracking
-	call user_exit_callable
-skip_context_tracking:
-#endif
+	move a0, sp /* pt_regs */
+	la ra, ret_from_exception
 
 	/*
 	 * MSB of cause differentiates between
@@ -126,134 +111,26 @@ skip_context_tracking:
 	 */
 	bge s4, zero, 1f
 
-	la ra, ret_from_exception
-
 	/* Handle interrupts */
-	move a0, sp /* pt_regs */
-	la a1, generic_handle_arch_irq
-	jr a1
+	tail do_riscv_irq
 1:
-	/*
-	 * Exceptions run with interrupts enabled or disabled depending on the
-	 * state of SR_PIE in m/sstatus.
-	 */
-	andi t0, s1, SR_PIE
-	beqz t0, 1f
-	/* kprobes, entered via ebreak, must have interrupts disabled. */
-	li t0, EXC_BREAKPOINT
-	beq s4, t0, 1f
-#ifdef CONFIG_TRACE_IRQFLAGS
-	call __trace_hardirqs_on
-#endif
-	csrs CSR_STATUS, SR_IE
-
-1:
-	la ra, ret_from_exception
-	/* Handle syscalls */
-	li t0, EXC_SYSCALL
-	beq s4, t0, handle_syscall
-
 	/* Handle other exceptions */
 	slli t0, s4, RISCV_LGPTR
 	la t1, excp_vect_table
 	la t2, excp_vect_table_end
-	move a0, sp /* pt_regs */
 	add t0, t1, t0
 	/* Check if exception code lies within bounds */
-	bgeu t0, t2, 1f
+	bgeu t0, t2, 2f
 	REG_L t0, 0(t0)
 	jr t0
-1:
+2:
 	tail do_trap_unknown
+END(handle_exception)
 
-handle_syscall:
-#ifdef CONFIG_RISCV_M_MODE
-	/*
-	 * When running is M-Mode (no MMU config), MPIE does not get set.
-	 * As a result, we need to force enable interrupts here because
-	 * handle_exception did not do set SR_IE as it always sees SR_PIE
-	 * being cleared.
-	 */
-	csrs CSR_STATUS, SR_IE
-#endif
-#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING_USER)
-	/* Recover a0 - a7 for system calls */
-	REG_L a0, PT_A0(sp)
-	REG_L a1, PT_A1(sp)
-	REG_L a2, PT_A2(sp)
-	REG_L a3, PT_A3(sp)
-	REG_L a4, PT_A4(sp)
-	REG_L a5, PT_A5(sp)
-	REG_L a6, PT_A6(sp)
-	REG_L a7, PT_A7(sp)
-#endif
-	 /* save the initial A0 value (needed in signal handlers) */
-	REG_S a0, PT_ORIG_A0(sp)
-	/*
-	 * Advance SEPC to avoid executing the original
-	 * scall instruction on sret
-	 */
-	addi s2, s2, 0x4
-	REG_S s2, PT_EPC(sp)
-	/* Trace syscalls, but only if requested by the user. */
-	REG_L t0, TASK_TI_FLAGS(tp)
-	andi t0, t0, _TIF_SYSCALL_WORK
-	bnez t0, handle_syscall_trace_enter
-check_syscall_nr:
-	/* Check to make sure we don't jump to a bogus syscall number. */
-	li t0, __NR_syscalls
-	la s0, sys_ni_syscall
-	/*
-	 * Syscall number held in a7.
-	 * If syscall number is above allowed value, redirect to ni_syscall.
-	 */
-	bgeu a7, t0, 3f
-#ifdef CONFIG_COMPAT
+ENTRY(ret_from_exception)
 	REG_L s0, PT_STATUS(sp)
-	srli s0, s0, SR_UXL_SHIFT
-	andi s0, s0, (SR_UXL >> SR_UXL_SHIFT)
-	li t0, (SR_UXL_32 >> SR_UXL_SHIFT)
-	sub t0, s0, t0
-	bnez t0, 1f
-
-	/* Call compat_syscall */
-	la s0, compat_sys_call_table
-	j 2f
-1:
-#endif
-	/* Call syscall */
-	la s0, sys_call_table
-2:
-	slli t0, a7, RISCV_LGPTR
-	add s0, s0, t0
-	REG_L s0, 0(s0)
-3:
-	jalr s0
 
-ret_from_syscall:
-	/* Set user a0 to kernel a0 */
-	REG_S a0, PT_A0(sp)
-	/*
-	 * We didn't execute the actual syscall.
-	 * Seccomp already set return value for the current task pt_regs.
-	 * (If it was configured with SECCOMP_RET_ERRNO/TRACE)
-	 */
-ret_from_syscall_rejected:
-#ifdef CONFIG_DEBUG_RSEQ
-	move a0, sp
-	call rseq_syscall
-#endif
-	/* Trace syscalls, but only if requested by the user. */
-	REG_L t0, TASK_TI_FLAGS(tp)
-	andi t0, t0, _TIF_SYSCALL_WORK
-	bnez t0, handle_syscall_trace_exit
-
-ret_from_exception:
-	REG_L s0, PT_STATUS(sp)
 	csrc CSR_STATUS, SR_IE
-#ifdef CONFIG_TRACE_IRQFLAGS
-	call __trace_hardirqs_off
-#endif
 #ifdef CONFIG_RISCV_M_MODE
 	/* the MPP value is too large to be used as an immediate arg for addi */
 	li t0, SR_MPP
@@ -261,17 +138,7 @@ ret_from_exception:
 #else
 	andi s0, s0, SR_SPP
 #endif
-	bnez s0, resume_kernel
-
-resume_userspace:
-	/* Interrupts must be disabled here so flags are checked atomically */
-	REG_L s0, TASK_TI_FLAGS(tp) /* current_thread_info->flags */
-	andi s1, s0, _TIF_WORK_MASK
-	bnez s1, work_pending
-
-#ifdef CONFIG_CONTEXT_TRACKING_USER
-	call user_enter_callable
-#endif
+	bnez s0, 1f
 
 	/* Save unwound kernel stack pointer in thread_info */
 	addi s0, sp, PT_SIZE_ON_STACK
@@ -282,19 +149,7 @@ resume_userspace:
 	 * structures again.
 	 */
 	csrw CSR_SCRATCH, tp
-
-restore_all:
-#ifdef CONFIG_TRACE_IRQFLAGS
-	REG_L s1, PT_STATUS(sp)
-	andi t0, s1, SR_PIE
-	beqz t0, 1f
-	call __trace_hardirqs_on
-	j 2f
 1:
-	call __trace_hardirqs_off
-2:
-#endif
-	REG_L a0, PT_STATUS(sp)
 	/*
 	 * The current load reservation is effectively part of the processor's
 	 * state, in the sense that load reservations cannot be shared between
@@ -315,9 +170,11 @@ restore_all:
 	REG_L  a2, PT_EPC(sp)
 	REG_SC x0, a2, PT_EPC(sp)
 
-	csrw CSR_STATUS, a0
 	csrw CSR_EPC, a2
 
+	REG_L a0, PT_STATUS(sp)
+	csrw CSR_STATUS, a0
+
 	REG_L x1,  PT_RA(sp)
 	REG_L x3,  PT_GP(sp)
 	REG_L x4,  PT_TP(sp)
@@ -356,54 +213,10 @@ restore_all:
 #else
 	sret
 #endif
-
-#if IS_ENABLED(CONFIG_PREEMPTION)
-resume_kernel:
-	REG_L s0, TASK_TI_PREEMPT_COUNT(tp)
-	bnez s0, restore_all
-	REG_L s0, TASK_TI_FLAGS(tp)
-	andi s0, s0, _TIF_NEED_RESCHED
-	beqz s0, restore_all
-	call preempt_schedule_irq
-	j restore_all
-#endif
-
-work_pending:
-	/* Enter slow path for supplementary processing */
-	la ra, ret_from_exception
-	andi s1, s0, _TIF_NEED_RESCHED
-	bnez s1, work_resched
-work_notifysig:
-	/* Handle pending signals and notify-resume requests */
-	csrs CSR_STATUS, SR_IE /* Enable interrupts for do_notify_resume() */
-	move a0, sp /* pt_regs */
-	move a1, s0 /* current_thread_info->flags */
-	tail do_notify_resume
-work_resched:
-	tail schedule
-
-/* Slow paths for ptrace. */
-handle_syscall_trace_enter:
-	move a0, sp
-	call do_syscall_trace_enter
-	move t0, a0
-	REG_L a0, PT_A0(sp)
-	REG_L a1, PT_A1(sp)
-	REG_L a2, PT_A2(sp)
-	REG_L a3, PT_A3(sp)
-	REG_L a4, PT_A4(sp)
-	REG_L a5, PT_A5(sp)
-	REG_L a6, PT_A6(sp)
-	REG_L a7, PT_A7(sp)
-	bnez t0, ret_from_syscall_rejected
-	j check_syscall_nr
-handle_syscall_trace_exit:
-	move a0, sp
-	call do_syscall_trace_exit
-	j ret_from_exception
+END(ret_from_exception)
 
 #ifdef CONFIG_VMAP_STACK
-handle_kernel_stack_overflow:
+ENTRY(handle_kernel_stack_overflow)
 	la sp, shadow_stack
 	addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE
 
@@ -499,21 +312,24 @@ restore_caller_reg:
 	REG_S s5, PT_TP(sp)
 	move a0, sp
 	tail handle_bad_stack
+END(handle_kernel_stack_overflow)
 #endif
 
-END(handle_exception)
-
 ENTRY(ret_from_fork)
+	call schedule_tail
+	move a0, sp /* pt_regs */
 	la ra, ret_from_exception
-	tail schedule_tail
+	tail syscall_exit_to_user_mode
 ENDPROC(ret_from_fork)
 
 ENTRY(ret_from_kernel_thread)
 	call schedule_tail
 	/* Call fn(arg) */
-	la ra, ret_from_exception
 	move a0, s1
-	jr s0
+	jalr s0
+	move a0, sp /* pt_regs */
+	la ra, ret_from_exception
+	tail syscall_exit_to_user_mode
 ENDPROC(ret_from_kernel_thread)
 
 
@@ -582,7 +398,7 @@ ENTRY(excp_vect_table)
 	RISCV_PTR do_trap_load_fault
 	RISCV_PTR do_trap_store_misaligned
 	RISCV_PTR do_trap_store_fault
-	RISCV_PTR do_trap_ecall_u /* system call, gets intercepted */
+	RISCV_PTR do_sys_ecall_u /* system call */
 	RISCV_PTR do_trap_ecall_s
 	RISCV_PTR do_trap_unknown
 	RISCV_PTR do_trap_ecall_m
diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
index 7207fa08d78f..24c2e1bd756a 100644
--- a/arch/riscv/kernel/irq.c
+++ b/arch/riscv/kernel/irq.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2018 Christoph Hellwig
  */
 
+#include <linux/entry-common.h>
 #include <linux/interrupt.h>
 #include <linux/irqchip.h>
 #include <linux/seq_file.h>
@@ -22,3 +23,17 @@ void __init init_IRQ(void)
 	if (!handle_arch_irq)
 		panic("No interrupt controller found.");
 }
+
+asmlinkage void noinstr do_riscv_irq(struct pt_regs *regs)
+{
+	struct pt_regs *old_regs;
+	irqentry_state_t state = irqentry_enter(regs);
+
+	irq_enter_rcu();
+	old_regs = set_irq_regs(regs);
+	handle_arch_irq(regs);
+	set_irq_regs(old_regs);
+	irq_exit_rcu();
+
+	irqentry_exit(regs, state);
+}
diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
index 44f4b1ca315d..23c48b14a0e7 100644
--- a/arch/riscv/kernel/ptrace.c
+++ b/arch/riscv/kernel/ptrace.c
@@ -19,9 +19,6 @@
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
 
-#define CREATE_TRACE_POINTS
-#include <trace/events/syscalls.h>
-
 enum riscv_regset {
 	REGSET_X,
 #ifdef CONFIG_FPU
@@ -228,46 +225,6 @@ long arch_ptrace(struct task_struct *child, long request,
 	return ret;
 }
 
-/*
- * Allows PTRACE_SYSCALL to work.  These are called from entry.S in
- * {handle,ret_from}_syscall.
- */
-__visible int do_syscall_trace_enter(struct pt_regs *regs)
-{
-	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		if (ptrace_report_syscall_entry(regs))
-			return -1;
-
-	/*
-	 * Do the secure computing after ptrace; failures should be fast.
-	 * If this fails we might have return value in a0 from seccomp
-	 * (via SECCOMP_RET_ERRNO/TRACE).
-	 */
-	if (secure_computing() == -1)
-		return -1;
-
-#ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
-	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
-		trace_sys_enter(regs, syscall_get_nr(current, regs));
-#endif
-
-	audit_syscall_entry(regs->a7, regs->a0, regs->a1, regs->a2, regs->a3);
-	return 0;
-}
-
-__visible void do_syscall_trace_exit(struct pt_regs *regs)
-{
-	audit_syscall_exit(regs);
-
-	if (test_thread_flag(TIF_SYSCALL_TRACE))
-		ptrace_report_syscall_exit(regs, 0);
-
-#ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
-	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
-		trace_sys_exit(regs, regs_return_value(regs));
-#endif
-}
-
 #ifdef CONFIG_COMPAT
 static int compat_riscv_gpr_get(struct task_struct *target,
 				const struct user_regset *regset,
diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
index 5a2de6b6f882..5871eccbbd94 100644
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -12,6 +12,7 @@
 #include <linux/syscalls.h>
 #include <linux/resume_user_mode.h>
 #include <linux/linkage.h>
+#include <linux/entry-common.h>
 
 #include <asm/ucontext.h>
 #include <asm/vdso.h>
@@ -272,7 +273,7 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 	signal_setup_done(ret, ksig, 0);
 }
 
-static void do_signal(struct pt_regs *regs)
+void arch_do_signal_or_restart(struct pt_regs *regs)
 {
 	struct ksignal ksig;
 
@@ -309,21 +310,3 @@ static void do_signal(struct pt_regs *regs)
 	 */
 	restore_saved_sigmask();
 }
-
-/*
- * notification of userspace execution resumption
- * - triggered by the _TIF_WORK_MASK flags
- */
-asmlinkage __visible void do_notify_resume(struct pt_regs *regs,
-					   unsigned long thread_info_flags)
-{
-	if (thread_info_flags & _TIF_UPROBE)
-		uprobe_notify_resume(regs);
-
-	/* Handle pending signal delivery */
-	if (thread_info_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
-		do_signal(regs);
-
-	if (thread_info_flags & _TIF_NOTIFY_RESUME)
-		resume_user_mode_work(regs);
-}
diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
index 571556bb9261..41cc1c4bccb3 100644
--- a/arch/riscv/kernel/sys_riscv.c
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -5,8 +5,10 @@
  * Copyright (C) 2017 SiFive
  */
 
+#include <linux/entry-common.h>
 #include <linux/syscalls.h>
 #include <asm/unistd.h>
+#include <asm/syscall.h>
 #include <asm/cacheflush.h>
 #include <asm-generic/mman-common.h>
 
@@ -72,3 +74,28 @@ SYSCALL_DEFINE3(riscv_flush_icache, uintptr_t, start, uintptr_t, end,
 
 	return 0;
 }
+
+typedef long (*syscall_t)(ulong, ulong, ulong, ulong, ulong, ulong, ulong);
+
+asmlinkage void do_sys_ecall_u(struct pt_regs *regs)
+{
+	syscall_t syscall;
+	ulong nr = regs->a7;
+
+	regs->epc += 4;
+	regs->orig_a0 = regs->a0;
+	regs->a0 = -ENOSYS;
+
+	nr = syscall_enter_from_user_mode(regs, nr);
+#ifdef CONFIG_COMPAT
+	if ((regs->status & SR_UXL) == SR_UXL_32)
+		syscall = compat_sys_call_table[nr];
+	else
+#endif
+		syscall = sys_call_table[nr];
+
+	if (nr < NR_syscalls)
+		regs->a0 = syscall(regs->orig_a0, regs->a1, regs->a2,
+				   regs->a3, regs->a4, regs->a5, regs->a6);
+	syscall_exit_to_user_mode(regs);
+}
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 588e17c386c6..73f06cd149d9 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -17,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/irq.h>
 #include <linux/kexec.h>
+#include <linux/entry-common.h>
 
 #include <asm/asm-prototypes.h>
 #include <asm/bug.h>
@@ -99,7 +100,9 @@ static void do_trap_error(struct pt_regs *regs, int signo, int code,
 #define DO_ERROR_INFO(name, signo, code, str)				\
 asmlinkage __visible __trap_section void name(struct pt_regs *regs)	\
 {									\
+	irqentry_state_t state = irqentry_enter(regs);			\
 	do_trap_error(regs, signo, code, regs->epc, "Oops - " str);	\
+	irqentry_exit(regs, state);					\
 }
 
 DO_ERROR_INFO(do_trap_unknown,
@@ -123,18 +126,22 @@ int handle_misaligned_store(struct pt_regs *regs);
 
 asmlinkage void __trap_section do_trap_load_misaligned(struct pt_regs *regs)
 {
+	irqentry_state_t state = irqentry_enter(regs);
 	if (!handle_misaligned_load(regs))
 		return;
 	do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
 		      "Oops - load address misaligned");
+	irqentry_exit(regs, state);
 }
 
 asmlinkage void __trap_section do_trap_store_misaligned(struct pt_regs *regs)
 {
+	irqentry_state_t state = irqentry_enter(regs);
 	if (!handle_misaligned_store(regs))
 		return;
 	do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
 		      "Oops - store (or AMO) address misaligned");
+	irqentry_exit(regs, state);
 }
 #endif
 DO_ERROR_INFO(do_trap_store_fault,
@@ -158,6 +165,8 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
 
 asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
 {
+	irqentry_state_t state = irqentry_enter(regs);
+
 #ifdef CONFIG_KPROBES
 	if (kprobe_single_step_handler(regs))
 		return;
@@ -185,6 +194,8 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
 		regs->epc += get_break_insn_length(regs->epc);
 	else
 		die(regs, "Kernel BUG");
+
+	irqentry_exit(regs, state);
 }
 NOKPROBE_SYMBOL(do_trap_break);
 
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index c7829289e806..cc8e642a91ea 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -15,6 +15,7 @@
 #include <linux/uaccess.h>
 #include <linux/kprobes.h>
 #include <linux/kfence.h>
+#include <linux/entry-common.h>
 
 #include <asm/ptrace.h>
 #include <asm/tlbflush.h>
@@ -203,7 +204,7 @@ static inline bool access_error(unsigned long cause, struct vm_area_struct *vma)
  * This routine handles page faults.  It determines the address and the
  * problem, and then passes it off to one of the appropriate routines.
  */
-asmlinkage void noinstr do_page_fault(struct pt_regs *regs)
+static void __do_page_fault(struct pt_regs *regs)
 {
 	struct task_struct *tsk;
 	struct vm_area_struct *vma;
@@ -350,4 +351,13 @@ asmlinkage void noinstr do_page_fault(struct pt_regs *regs)
 	}
 	return;
 }
+
+asmlinkage void noinstr do_page_fault(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+
+	__do_page_fault(regs);
+
+	irqentry_exit(regs, state);
+}
 NOKPROBE_SYMBOL(do_page_fault);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
                   ` (6 preceding siblings ...)
  2022-09-18 15:52 ` [PATCH V5 07/11] riscv: convert to generic entry guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-19 13:45   ` Peter Zijlstra
  2022-09-21  8:34   ` Chen Zhongjin
  2022-09-18 15:52 ` [PATCH V5 09/11] riscv: Support HAVE_SOFTIRQ_ON_OWN_STACK guoren
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Add independent irq stacks for percpu to prevent kernel stack overflows.
It is also compatible with VMAP_STACK by implementing
arch_alloc_vmap_stack.  Many architectures have supported
HAVE_IRQ_EXIT_ON_IRQ_STACK, riscv should follow up.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/Kconfig                   |  8 +++++
 arch/riscv/include/asm/irq.h         |  3 ++
 arch/riscv/include/asm/thread_info.h |  2 ++
 arch/riscv/include/asm/vmap_stack.h  | 28 ++++++++++++++++
 arch/riscv/kernel/entry.S            | 27 ++++++++++++++++
 arch/riscv/kernel/irq.c              | 48 ++++++++++++++++++++++++++--
 6 files changed, 114 insertions(+), 2 deletions(-)
 create mode 100644 arch/riscv/include/asm/vmap_stack.h

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index a07bb3b73b5b..75db47a983f2 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -433,6 +433,14 @@ config FPU
 
 	  If you don't know what to do here, say Y.
 
+config IRQ_STACKS
+	bool "Independent irq stacks" if EXPERT
+	default y
+	select HAVE_IRQ_EXIT_ON_IRQ_STACK
+	help
+	  Add independent irq stacks for percpu to prevent kernel stack overflows.
+	  We may save some memory footprint by disabling IRQ_STACKS.
+
 endmenu # "Platform type"
 
 menu "Kernel features"
diff --git a/arch/riscv/include/asm/irq.h b/arch/riscv/include/asm/irq.h
index e4c435509983..205e1c693dfd 100644
--- a/arch/riscv/include/asm/irq.h
+++ b/arch/riscv/include/asm/irq.h
@@ -13,5 +13,8 @@
 #include <asm-generic/irq.h>
 
 extern void __init init_IRQ(void);
+asmlinkage void call_on_stack(struct pt_regs *regs, ulong *sp,
+				     void (*fn)(struct pt_regs *), ulong tmp);
+asmlinkage void noinstr do_riscv_irq(struct pt_regs *regs);
 
 #endif /* _ASM_RISCV_IRQ_H */
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index 7de4fb96f0b5..043da8ccc7e6 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -40,6 +40,8 @@
 #define OVERFLOW_STACK_SIZE     SZ_4K
 #define SHADOW_OVERFLOW_STACK_SIZE (1024)
 
+#define IRQ_STACK_SIZE		THREAD_SIZE
+
 #ifndef __ASSEMBLY__
 
 extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)];
diff --git a/arch/riscv/include/asm/vmap_stack.h b/arch/riscv/include/asm/vmap_stack.h
new file mode 100644
index 000000000000..3fbf481abf4f
--- /dev/null
+++ b/arch/riscv/include/asm/vmap_stack.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+// Copied from arch/arm64/include/asm/vmap_stack.h.
+#ifndef _ASM_RISCV_VMAP_STACK_H
+#define _ASM_RISCV_VMAP_STACK_H
+
+#include <linux/bug.h>
+#include <linux/gfp.h>
+#include <linux/kconfig.h>
+#include <linux/vmalloc.h>
+#include <linux/pgtable.h>
+#include <asm/thread_info.h>
+
+/*
+ * To ensure that VMAP'd stack overflow detection works correctly, all VMAP'd
+ * stacks need to have the same alignment.
+ */
+static inline unsigned long *arch_alloc_vmap_stack(size_t stack_size, int node)
+{
+	void *p;
+
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_VMAP_STACK));
+
+	p = __vmalloc_node(stack_size, THREAD_ALIGN, THREADINFO_GFP, node,
+			__builtin_return_address(0));
+	return kasan_reset_tag(p);
+}
+
+#endif /* _ASM_RISCV_VMAP_STACK_H */
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 5f49517cd3a2..426529b84db0 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -332,6 +332,33 @@ ENTRY(ret_from_kernel_thread)
 	tail syscall_exit_to_user_mode
 ENDPROC(ret_from_kernel_thread)
 
+#ifdef CONFIG_IRQ_STACKS
+ENTRY(call_on_stack)
+	/* Create a frame record to save our ra and fp */
+	addi	sp, sp, -RISCV_SZPTR
+	REG_S	ra, (sp)
+	addi	sp, sp, -RISCV_SZPTR
+	REG_S	fp, (sp)
+
+	/* Save sp in fp */
+	move	fp, sp
+
+	/* Move to the new stack and call the function there */
+	li	a3, IRQ_STACK_SIZE
+	add	sp, a1, a3
+	jalr	a2
+
+	/*
+	 * Restore sp from prev fp, and fp, ra from the frame
+	 */
+	move	sp, fp
+	REG_L	fp, (sp)
+	addi	sp, sp, RISCV_SZPTR
+	REG_L	ra, (sp)
+	addi	sp, sp, RISCV_SZPTR
+	ret
+ENDPROC(call_on_stack)
+#endif
 
 /*
  * Integer register context switch
diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
index 24c2e1bd756a..5ad4952203c5 100644
--- a/arch/riscv/kernel/irq.c
+++ b/arch/riscv/kernel/irq.c
@@ -10,6 +10,37 @@
 #include <linux/irqchip.h>
 #include <linux/seq_file.h>
 #include <asm/smp.h>
+#include <asm/vmap_stack.h>
+
+#ifdef CONFIG_IRQ_STACKS
+static DEFINE_PER_CPU(ulong *, irq_stack_ptr);
+
+#ifdef CONFIG_VMAP_STACK
+static void init_irq_stacks(void)
+{
+	int cpu;
+	ulong *p;
+
+	for_each_possible_cpu(cpu) {
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
+		per_cpu(irq_stack_ptr, cpu) = p;
+	}
+}
+#else
+/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
+DEFINE_PER_CPU_ALIGNED(ulong [IRQ_STACK_SIZE/sizeof(ulong)], irq_stack);
+
+static void init_irq_stacks(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		per_cpu(irq_stack_ptr, cpu) = per_cpu(irq_stack, cpu);
+}
+#endif /* CONFIG_VMAP_STACK */
+#else
+static void init_irq_stacks(void) {}
+#endif /* CONFIG_IRQ_STACKS */
 
 int arch_show_interrupts(struct seq_file *p, int prec)
 {
@@ -19,21 +50,34 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 
 void __init init_IRQ(void)
 {
+	init_irq_stacks();
 	irqchip_init();
 	if (!handle_arch_irq)
 		panic("No interrupt controller found.");
 }
 
-asmlinkage void noinstr do_riscv_irq(struct pt_regs *regs)
+static void noinstr handle_riscv_irq(struct pt_regs *regs)
 {
 	struct pt_regs *old_regs;
-	irqentry_state_t state = irqentry_enter(regs);
 
 	irq_enter_rcu();
 	old_regs = set_irq_regs(regs);
 	handle_arch_irq(regs);
 	set_irq_regs(old_regs);
 	irq_exit_rcu();
+}
+
+asmlinkage void noinstr do_riscv_irq(struct pt_regs *regs)
+{
+	irqentry_state_t state = irqentry_enter(regs);
+#ifdef CONFIG_IRQ_STACKS
+	ulong *sp = per_cpu(irq_stack_ptr, smp_processor_id());
+
+	if (on_thread_stack())
+		call_on_stack(regs, sp, handle_riscv_irq, 0);
+	else
+#endif
+		handle_riscv_irq(regs);
 
 	irqentry_exit(regs, state);
 }
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 09/11] riscv: Support HAVE_SOFTIRQ_ON_OWN_STACK
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
                   ` (7 preceding siblings ...)
  2022-09-18 15:52 ` [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-20  0:11   ` Guo Ren
  2022-09-18 15:52 ` [PATCH V5 10/11] riscv: Add config of thread stack size guoren
  2022-09-18 15:52 ` [PATCH V5 11/11] riscv: Add support for STACKLEAK gcc plugin guoren
  10 siblings, 1 reply; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Add the HAVE_SOFTIRQ_ON_OWN_STACK feature for the IRQ_STACKS config. The
irq and softirq use the same independent irq_stack of percpu by time
division multiplexing.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/riscv/Kconfig      |  7 ++++---
 arch/riscv/kernel/irq.c | 16 ++++++++++++++++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 75db47a983f2..dfe600f3526c 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -434,12 +434,13 @@ config FPU
 	  If you don't know what to do here, say Y.
 
 config IRQ_STACKS
-	bool "Independent irq stacks" if EXPERT
+	bool "Independent irq & softirq stacks" if EXPERT
 	default y
 	select HAVE_IRQ_EXIT_ON_IRQ_STACK
+	select HAVE_SOFTIRQ_ON_OWN_STACK
 	help
-	  Add independent irq stacks for percpu to prevent kernel stack overflows.
-	  We may save some memory footprint by disabling IRQ_STACKS.
+	  Add independent irq & softirq stacks for percpu to prevent kernel stack
+	  overflows. We may save some memory footprint by disabling IRQ_STACKS.
 
 endmenu # "Platform type"
 
diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
index 5ad4952203c5..6dc9ccd01470 100644
--- a/arch/riscv/kernel/irq.c
+++ b/arch/riscv/kernel/irq.c
@@ -11,6 +11,7 @@
 #include <linux/seq_file.h>
 #include <asm/smp.h>
 #include <asm/vmap_stack.h>
+#include <asm/softirq_stack.h>
 
 #ifdef CONFIG_IRQ_STACKS
 static DEFINE_PER_CPU(ulong *, irq_stack_ptr);
@@ -38,6 +39,21 @@ static void init_irq_stacks(void)
 		per_cpu(irq_stack_ptr, cpu) = per_cpu(irq_stack, cpu);
 }
 #endif /* CONFIG_VMAP_STACK */
+
+#ifdef CONFIG_SOFTIRQ_ON_OWN_STACK
+static void do_riscv_softirq(struct pt_regs *regs)
+{
+	__do_softirq();
+}
+
+void do_softirq_own_stack(void)
+{
+	ulong *sp = per_cpu(irq_stack_ptr, smp_processor_id());
+
+	call_on_stack(NULL, sp, do_riscv_softirq, 0);
+}
+#endif /* CONFIG_SOFTIRQ_ON_OWN_STACK */
+
 #else
 static void init_irq_stacks(void) {}
 #endif /* CONFIG_IRQ_STACKS */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 10/11] riscv: Add config of thread stack size
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
                   ` (8 preceding siblings ...)
  2022-09-18 15:52 ` [PATCH V5 09/11] riscv: Support HAVE_SOFTIRQ_ON_OWN_STACK guoren
@ 2022-09-18 15:52 ` guoren
  2022-09-18 15:52 ` [PATCH V5 11/11] riscv: Add support for STACKLEAK gcc plugin guoren
  10 siblings, 0 replies; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren, Andreas Schwab

From: Guo Ren <guoren@linux.alibaba.com>

0cac21b02ba5 ("risc v: use 16KB kernel stack on 64-bit") increase the
thread size mandatory, but some scenarios, such as D1 with a small
memory footprint, would suffer from that. After independent irq stack
support, let's give users a choice to determine their custom stack size.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Andreas Schwab <schwab@suse.de>
---
 arch/riscv/Kconfig                   | 18 ++++++++++++++++++
 arch/riscv/include/asm/thread_info.h | 12 +-----------
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index dfe600f3526c..8241b12399d7 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -442,6 +442,24 @@ config IRQ_STACKS
 	  Add independent irq & softirq stacks for percpu to prevent kernel stack
 	  overflows. We may save some memory footprint by disabling IRQ_STACKS.
 
+config THREAD_SIZE
+	int "Kernel stack size (in bytes)" if EXPERT
+	range 4096 65536
+	default 8192 if 32BIT && !KASAN
+	default 32768 if 64BIT && KASAN
+	default 16384
+	help
+	  Specify the Pages of thread stack size (from 4KB to 64KB), which also
+	  affects irq stack size, which is equal to thread stack size.
+
+config THREAD_SIZE_ORDER
+	int
+	default 0 if THREAD_SIZE = 4096
+	default 1 if THREAD_SIZE <= 8192
+	default 2 if THREAD_SIZE <= 16384
+	default 3 if THREAD_SIZE <= 32768
+	default 4
+
 endmenu # "Platform type"
 
 menu "Kernel features"
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index 043da8ccc7e6..c970d41dc4c6 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -11,18 +11,8 @@
 #include <asm/page.h>
 #include <linux/const.h>
 
-#ifdef CONFIG_KASAN
-#define KASAN_STACK_ORDER 1
-#else
-#define KASAN_STACK_ORDER 0
-#endif
-
 /* thread information allocation */
-#ifdef CONFIG_64BIT
-#define THREAD_SIZE_ORDER	(2 + KASAN_STACK_ORDER)
-#else
-#define THREAD_SIZE_ORDER	(1 + KASAN_STACK_ORDER)
-#endif
+#define THREAD_SIZE_ORDER	CONFIG_THREAD_SIZE_ORDER
 #define THREAD_SIZE		(PAGE_SIZE << THREAD_SIZE_ORDER)
 
 /*
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH V5 11/11] riscv: Add support for STACKLEAK gcc plugin
  2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
                   ` (9 preceding siblings ...)
  2022-09-18 15:52 ` [PATCH V5 10/11] riscv: Add config of thread stack size guoren
@ 2022-09-18 15:52 ` guoren
  10 siblings, 0 replies; 29+ messages in thread
From: guoren @ 2022-09-18 15:52 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Dao Lu, Xianting Tian,
	Conor Dooley

From: Dao Lu <daolu@rivosinc.com>

Add support for STACKLEAK gcc plugin to riscv by implementing
stackleak_check_alloca, based heavily on the arm64 version, and
modifying the entry.S. Additionally, this disables the plugin for EFI
stub code for riscv. All modifications base on generic_entry.

Link: https://lore.kernel.org/linux-riscv/20220615213834.3116135-1-daolu@rivosinc.com/
Signed-off-by: Dao Lu <daolu@rivosinc.com>
Co-developed-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Co-developed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Conor Dooley <Conor.Dooley@microchip.com>
Cc: Mark Rutland <mark.rutland@arm.com>
---
 arch/riscv/Kconfig                    | 1 +
 arch/riscv/kernel/entry.S             | 8 +++++++-
 drivers/firmware/efi/libstub/Makefile | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 8241b12399d7..b4476f17fed6 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -81,6 +81,7 @@ config RISCV
 	select HAVE_ARCH_MMAP_RND_BITS if MMU
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
 	select HAVE_ARCH_SECCOMP_FILTER
+	select HAVE_ARCH_STACKLEAK
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU
 	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 426529b84db0..2207cf44a3bc 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -130,7 +130,6 @@ END(handle_exception)
 ENTRY(ret_from_exception)
 	REG_L s0, PT_STATUS(sp)
 
-	csrc CSR_STATUS, SR_IE
 #ifdef CONFIG_RISCV_M_MODE
 	/* the MPP value is too large to be used as an immediate arg for addi */
 	li t0, SR_MPP
@@ -139,6 +138,9 @@ ENTRY(ret_from_exception)
 	andi s0, s0, SR_SPP
 #endif
 	bnez s0, 1f
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	call stackleak_erase
+#endif
 
 	/* Save unwound kernel stack pointer in thread_info */
 	addi s0, sp, PT_SIZE_ON_STACK
@@ -148,8 +150,12 @@ ENTRY(ret_from_exception)
 	 * Save TP into the scratch register , so we can find the kernel data
 	 * structures again.
 	 */
+	csrc CSR_STATUS, SR_IE
 	csrw CSR_SCRATCH, tp
+	j 2f
 1:
+	csrc CSR_STATUS, SR_IE
+2:
 	/*
 	 * The current load reservation is effectively part of the processor's
 	 * state, in the sense that load reservations cannot be shared between
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index d0537573501e..5e1fc4f82883 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -25,7 +25,7 @@ cflags-$(CONFIG_ARM)		:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
 				   -fno-builtin -fpic \
 				   $(call cc-option,-mno-single-pic-base)
 cflags-$(CONFIG_RISCV)		:= $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
-				   -fpic
+				   -fpic $(DISABLE_STACKLEAK_PLUGIN)
 
 cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 06/11] entry: Prevent DEBUG_PREEMPT warning
  2022-09-18 15:52 ` [PATCH V5 06/11] entry: Prevent DEBUG_PREEMPT warning guoren
@ 2022-09-19 11:58   ` Peter Zijlstra
  2022-09-20  1:45     ` Guo Ren
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-09-19 11:58 UTC (permalink / raw)
  To: guoren
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

On Sun, Sep 18, 2022 at 11:52:41AM -0400, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> When DEBUG_PREEMPT=y,
> 	exit_to_user_mode_prepare
> 	->tick_nohz_user_enter_prepare
> 	  ->tick_nohz_full_cpu(smp_processor_id())
> 	    ->smp_processor_id()
> 	      ->debug_smp_processor_id()
> 		->check preempt_count() then:
> 
> [    5.717610] BUG: using smp_processor_id() in preemptible [00000000]
> code: S20urandom/94
> [    5.718111] caller is debug_smp_processor_id+0x24/0x38
> [    5.718417] CPU: 1 PID: 94 Comm: S20urandom Not tainted
> 6.0.0-rc3-00010-gfd0a0d619c63-dirty #238
> [    5.718886] Hardware name: riscv-virtio,qemu (DT)
> [    5.719136] Call Trace:
> [    5.719281] [<ffffffff800055fc>] dump_backtrace+0x2c/0x3c
> [    5.719566] [<ffffffff80ae6cb0>] show_stack+0x44/0x5c
> [    5.720023] [<ffffffff80aee870>] dump_stack_lvl+0x74/0xa4
> [    5.720557] [<ffffffff80aee8bc>] dump_stack+0x1c/0x2c
> [    5.721033] [<ffffffff80af65c0>]
> check_preemption_disabled+0x104/0x108
> [    5.721538] [<ffffffff80af65e8>] debug_smp_processor_id+0x24/0x38
> [    5.722001] [<ffffffff800aee30>] exit_to_user_mode_prepare+0x48/0x178
> [    5.722355] [<ffffffff80af5bf4>] irqentry_exit_to_user_mode+0x18/0x30
> [    5.722685] [<ffffffff80af5c70>] irqentry_exit+0x64/0xa4
> [    5.722953] [<ffffffff80af52f4>] do_page_fault+0x1d8/0x544
> [    5.723291] [<ffffffff80003310>] ret_from_exception+0x0/0xb8
> 
> (Above is found in riscv platform with generic_entry)
> 
> The smp_processor_id() needs irqs disable or preempt_disable, so use
> preempt dis/in protecting the tick_nohz_user_enter_prepare().
> 
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> ---
>  kernel/entry/common.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index 063068a9ea9b..36e4cd50531c 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -194,8 +194,10 @@ static void exit_to_user_mode_prepare(struct pt_regs *regs)
>  
>  	lockdep_assert_irqs_disabled();

    Observe ^^^^

>  
> +	preempt_disable();
>  	/* Flush pending rcuog wakeup before the last need_resched() check */
>  	tick_nohz_user_enter_prepare();
> +	preempt_enable();

This makes no sense; if IRQs are disabled, check_preemption_disabled()
should bail early per:

	if (irqs_disabled())
		goto out;

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 07/11] riscv: convert to generic entry
  2022-09-18 15:52 ` [PATCH V5 07/11] riscv: convert to generic entry guoren
@ 2022-09-19 13:34   ` Peter Zijlstra
  2022-09-20  6:36     ` Guo Ren
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-09-19 13:34 UTC (permalink / raw)
  To: guoren
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

On Sun, Sep 18, 2022 at 11:52:42AM -0400, guoren@kernel.org wrote:

> @@ -123,18 +126,22 @@ int handle_misaligned_store(struct pt_regs *regs);
>  
>  asmlinkage void __trap_section do_trap_load_misaligned(struct pt_regs *regs)
>  {
> +	irqentry_state_t state = irqentry_enter(regs);
>  	if (!handle_misaligned_load(regs))
>  		return;
>  	do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
>  		      "Oops - load address misaligned");
> +	irqentry_exit(regs, state);
>  }
>  
>  asmlinkage void __trap_section do_trap_store_misaligned(struct pt_regs *regs)
>  {
> +	irqentry_state_t state = irqentry_enter(regs);
>  	if (!handle_misaligned_store(regs))
>  		return;
>  	do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
>  		      "Oops - store (or AMO) address misaligned");
> +	irqentry_exit(regs, state);
>  }
>  #endif
>  DO_ERROR_INFO(do_trap_store_fault,
> @@ -158,6 +165,8 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
>  
>  asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
>  {
> +	irqentry_state_t state = irqentry_enter(regs);
> +
>  #ifdef CONFIG_KPROBES
>  	if (kprobe_single_step_handler(regs))
>  		return;

FWIW; on x86 I've classified many of the 'from-kernel' traps as
NMI-like, since those traps can happen from any context, including with
IRQs disabled.

The basic shape of the trap handlers looks a little like:

	if (user_mode(regs)) {
		irqenrty_enter_from_user_mode(regs);
		do_user_trap();
		irqentry_exit_to_user_mode(regs);
	} else {
		irqentry_state_t state = irqentry_nmi_enter(regs);
		do_kernel_trap();
		irqentry_nmi_exit(regs, state);
	}

Not saying you have to match Risc-V in this patch-set, just something to
consider.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-18 15:52 ` [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK guoren
@ 2022-09-19 13:45   ` Peter Zijlstra
  2022-09-20  6:08     ` Guo Ren
  2022-09-21  8:34   ` Chen Zhongjin
  1 sibling, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-09-19 13:45 UTC (permalink / raw)
  To: guoren
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren, keescook

On Sun, Sep 18, 2022 at 11:52:43AM -0400, guoren@kernel.org wrote:

> +ENTRY(call_on_stack)
> +	/* Create a frame record to save our ra and fp */
> +	addi	sp, sp, -RISCV_SZPTR
> +	REG_S	ra, (sp)
> +	addi	sp, sp, -RISCV_SZPTR
> +	REG_S	fp, (sp)
> +
> +	/* Save sp in fp */
> +	move	fp, sp
> +
> +	/* Move to the new stack and call the function there */
> +	li	a3, IRQ_STACK_SIZE
> +	add	sp, a1, a3
> +	jalr	a2
> +
> +	/*
> +	 * Restore sp from prev fp, and fp, ra from the frame
> +	 */
> +	move	sp, fp
> +	REG_L	fp, (sp)
> +	addi	sp, sp, RISCV_SZPTR
> +	REG_L	ra, (sp)
> +	addi	sp, sp, RISCV_SZPTR
> +	ret
> +ENDPROC(call_on_stack)

IIRC x86_64 moved away from a stack-switch function like this because it
presents a convenient exploit gadget.

I'm not much of an exploit writer and I've no idea how effective our
inline stategy is, perhaps other can comment.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 09/11] riscv: Support HAVE_SOFTIRQ_ON_OWN_STACK
  2022-09-18 15:52 ` [PATCH V5 09/11] riscv: Support HAVE_SOFTIRQ_ON_OWN_STACK guoren
@ 2022-09-20  0:11   ` Guo Ren
  0 siblings, 0 replies; 29+ messages in thread
From: Guo Ren @ 2022-09-20  0:11 UTC (permalink / raw)
  To: arnd, guoren, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren

On Sun, Sep 18, 2022 at 11:53 PM <guoren@kernel.org> wrote:
>
> From: Guo Ren <guoren@linux.alibaba.com>
>
> Add the HAVE_SOFTIRQ_ON_OWN_STACK feature for the IRQ_STACKS config. The
> irq and softirq use the same independent irq_stack of percpu by time
> division multiplexing.
>
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/riscv/Kconfig      |  7 ++++---
>  arch/riscv/kernel/irq.c | 16 ++++++++++++++++
>  2 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 75db47a983f2..dfe600f3526c 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -434,12 +434,13 @@ config FPU
>           If you don't know what to do here, say Y.
>
>  config IRQ_STACKS
> -       bool "Independent irq stacks" if EXPERT
> +       bool "Independent irq & softirq stacks" if EXPERT
>         default y
>         select HAVE_IRQ_EXIT_ON_IRQ_STACK
> +       select HAVE_SOFTIRQ_ON_OWN_STACK
>         help
> -         Add independent irq stacks for percpu to prevent kernel stack overflows.
> -         We may save some memory footprint by disabling IRQ_STACKS.
> +         Add independent irq & softirq stacks for percpu to prevent kernel stack
> +         overflows. We may save some memory footprint by disabling IRQ_STACKS.
>
>  endmenu # "Platform type"
>
> diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
> index 5ad4952203c5..6dc9ccd01470 100644
> --- a/arch/riscv/kernel/irq.c
> +++ b/arch/riscv/kernel/irq.c
> @@ -11,6 +11,7 @@
>  #include <linux/seq_file.h>
>  #include <asm/smp.h>
>  #include <asm/vmap_stack.h>
> +#include <asm/softirq_stack.h>
>
>  #ifdef CONFIG_IRQ_STACKS
>  static DEFINE_PER_CPU(ulong *, irq_stack_ptr);
> @@ -38,6 +39,21 @@ static void init_irq_stacks(void)
>                 per_cpu(irq_stack_ptr, cpu) = per_cpu(irq_stack, cpu);
>  }
>  #endif /* CONFIG_VMAP_STACK */
> +
> +#ifdef CONFIG_SOFTIRQ_ON_OWN_STACK
Sorry, it should be. If you compiled an error, please modify it manually,
+#ifdef CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK

> +static void do_riscv_softirq(struct pt_regs *regs)
> +{
> +       __do_softirq();
> +}
> +
> +void do_softirq_own_stack(void)
> +{
> +       ulong *sp = per_cpu(irq_stack_ptr, smp_processor_id());
> +
> +       call_on_stack(NULL, sp, do_riscv_softirq, 0);
> +}
> +#endif /* CONFIG_SOFTIRQ_ON_OWN_STACK */
> +
>  #else
>  static void init_irq_stacks(void) {}
>  #endif /* CONFIG_IRQ_STACKS */
> --
> 2.36.1
>


-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 06/11] entry: Prevent DEBUG_PREEMPT warning
  2022-09-19 11:58   ` Peter Zijlstra
@ 2022-09-20  1:45     ` Guo Ren
  2022-09-30 12:27       ` Guo Ren
  0 siblings, 1 reply; 29+ messages in thread
From: Guo Ren @ 2022-09-20  1:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

On Mon, Sep 19, 2022 at 7:59 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sun, Sep 18, 2022 at 11:52:41AM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > When DEBUG_PREEMPT=y,
> >       exit_to_user_mode_prepare
> >       ->tick_nohz_user_enter_prepare
> >         ->tick_nohz_full_cpu(smp_processor_id())
> >           ->smp_processor_id()
> >             ->debug_smp_processor_id()
> >               ->check preempt_count() then:
> >
> > [    5.717610] BUG: using smp_processor_id() in preemptible [00000000]
> > code: S20urandom/94
> > [    5.718111] caller is debug_smp_processor_id+0x24/0x38
> > [    5.718417] CPU: 1 PID: 94 Comm: S20urandom Not tainted
> > 6.0.0-rc3-00010-gfd0a0d619c63-dirty #238
> > [    5.718886] Hardware name: riscv-virtio,qemu (DT)
> > [    5.719136] Call Trace:
> > [    5.719281] [<ffffffff800055fc>] dump_backtrace+0x2c/0x3c
> > [    5.719566] [<ffffffff80ae6cb0>] show_stack+0x44/0x5c
> > [    5.720023] [<ffffffff80aee870>] dump_stack_lvl+0x74/0xa4
> > [    5.720557] [<ffffffff80aee8bc>] dump_stack+0x1c/0x2c
> > [    5.721033] [<ffffffff80af65c0>]
> > check_preemption_disabled+0x104/0x108
> > [    5.721538] [<ffffffff80af65e8>] debug_smp_processor_id+0x24/0x38
> > [    5.722001] [<ffffffff800aee30>] exit_to_user_mode_prepare+0x48/0x178
> > [    5.722355] [<ffffffff80af5bf4>] irqentry_exit_to_user_mode+0x18/0x30
> > [    5.722685] [<ffffffff80af5c70>] irqentry_exit+0x64/0xa4
> > [    5.722953] [<ffffffff80af52f4>] do_page_fault+0x1d8/0x544
> > [    5.723291] [<ffffffff80003310>] ret_from_exception+0x0/0xb8
> >
> > (Above is found in riscv platform with generic_entry)
> >
> > The smp_processor_id() needs irqs disable or preempt_disable, so use
> > preempt dis/in protecting the tick_nohz_user_enter_prepare().
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > ---
> >  kernel/entry/common.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> > index 063068a9ea9b..36e4cd50531c 100644
> > --- a/kernel/entry/common.c
> > +++ b/kernel/entry/common.c
> > @@ -194,8 +194,10 @@ static void exit_to_user_mode_prepare(struct pt_regs *regs)
> >
> >       lockdep_assert_irqs_disabled();
>
>     Observe ^^^^
Thanks! I would enable PROVE_LOCKING for test.

>
> >
> > +     preempt_disable();
> >       /* Flush pending rcuog wakeup before the last need_resched() check */
> >       tick_nohz_user_enter_prepare();
> > +     preempt_enable();
>
> This makes no sense; if IRQs are disabled, check_preemption_disabled()
> should bail early per:
>
>         if (irqs_disabled())
>                 goto out;
Ditto.


-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-19 13:45   ` Peter Zijlstra
@ 2022-09-20  6:08     ` Guo Ren
  2022-09-20  7:27       ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Guo Ren @ 2022-09-20  6:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren, keescook

On Mon, Sep 19, 2022 at 9:45 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sun, Sep 18, 2022 at 11:52:43AM -0400, guoren@kernel.org wrote:
>
> > +ENTRY(call_on_stack)
> > +     /* Create a frame record to save our ra and fp */
> > +     addi    sp, sp, -RISCV_SZPTR
> > +     REG_S   ra, (sp)
> > +     addi    sp, sp, -RISCV_SZPTR
> > +     REG_S   fp, (sp)
> > +
> > +     /* Save sp in fp */
> > +     move    fp, sp
> > +
> > +     /* Move to the new stack and call the function there */
> > +     li      a3, IRQ_STACK_SIZE
> > +     add     sp, a1, a3
> > +     jalr    a2
> > +
> > +     /*
> > +      * Restore sp from prev fp, and fp, ra from the frame
> > +      */
> > +     move    sp, fp
> > +     REG_L   fp, (sp)
> > +     addi    sp, sp, RISCV_SZPTR
> > +     REG_L   ra, (sp)
> > +     addi    sp, sp, RISCV_SZPTR
> > +     ret
> > +ENDPROC(call_on_stack)
>
> IIRC x86_64 moved away from a stack-switch function like this because it
> presents a convenient exploit gadget.
I found:
https://lore.kernel.org/all/20210204204903.350275743@linutronix.de/

  - The fact that the stack switching code ended up being an easy to find
    exploit gadget.

What's the exploit gadget? Do you have a ref link? Thx.

>
> I'm not much of an exploit writer and I've no idea how effective our
> inline stategy is, perhaps other can comment.
It seems that I should move to an inline flavor. a0cfc74d0b00
("x86/irq: Provide macro for inlining irq stack switching")

-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 07/11] riscv: convert to generic entry
  2022-09-19 13:34   ` Peter Zijlstra
@ 2022-09-20  6:36     ` Guo Ren
  2022-09-20  7:22       ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Guo Ren @ 2022-09-20  6:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

On Mon, Sep 19, 2022 at 9:34 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sun, Sep 18, 2022 at 11:52:42AM -0400, guoren@kernel.org wrote:
>
> > @@ -123,18 +126,22 @@ int handle_misaligned_store(struct pt_regs *regs);
> >
> >  asmlinkage void __trap_section do_trap_load_misaligned(struct pt_regs *regs)
> >  {
> > +     irqentry_state_t state = irqentry_enter(regs);
> >       if (!handle_misaligned_load(regs))
> >               return;
> >       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
> >                     "Oops - load address misaligned");
> > +     irqentry_exit(regs, state);
> >  }
> >
> >  asmlinkage void __trap_section do_trap_store_misaligned(struct pt_regs *regs)
> >  {
> > +     irqentry_state_t state = irqentry_enter(regs);
> >       if (!handle_misaligned_store(regs))
> >               return;
> >       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
> >                     "Oops - store (or AMO) address misaligned");
> > +     irqentry_exit(regs, state);
> >  }
> >  #endif
> >  DO_ERROR_INFO(do_trap_store_fault,
> > @@ -158,6 +165,8 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
> >
> >  asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
> >  {
> > +     irqentry_state_t state = irqentry_enter(regs);
> > +
> >  #ifdef CONFIG_KPROBES
> >       if (kprobe_single_step_handler(regs))
> >               return;
>
> FWIW; on x86 I've classified many of the 'from-kernel' traps as
> NMI-like, since those traps can happen from any context, including with
> IRQs disabled.
The do_trap_break is for ebreak instruction, not NMI. RISC-V NMI has
separate CSR. ref:

This proposal adds support for resumable non-maskable interrupts
(RNMIs) to RISC-V. The extension adds four new CSRs (`mnepc`,
`mncause`, `mnstatus`, and `mnscratch`) to hold the interrupted state,
and a new instruction to resume from the RNMI handler.

>
> The basic shape of the trap handlers looks a little like:
>
>         if (user_mode(regs)) {
If nmi comes from user_mode, why we using
irqenrty_enter/exit_from/to_user_mode instead of
irqentry_nmi_enter/exit?

>                 irqenrty_enter_from_user_mode(regs);
>                 do_user_trap();
>                 irqentry_exit_to_user_mode(regs);
>         } else {
>                 irqentry_state_t state = irqentry_nmi_enter(regs);
>                 do_kernel_trap();
>                 irqentry_nmi_exit(regs, state);
>         }
>
> Not saying you have to match Risc-V in this patch-set, just something to
> consider.
I think the shape of the riscv NMI handler looks a little like this:

asmlinkage __visible __trap_section void do_trap_nmi(struct pt_regs *regs)
{
                 irqentry_state_t state = irqentry_nmi_enter(regs);
                 do_nmi_trap();
                 irqentry_nmi_exit(regs, state);
}

-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 07/11] riscv: convert to generic entry
  2022-09-20  6:36     ` Guo Ren
@ 2022-09-20  7:22       ` Peter Zijlstra
  2022-09-30 11:28         ` Guo Ren
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-09-20  7:22 UTC (permalink / raw)
  To: Guo Ren
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

On Tue, Sep 20, 2022 at 02:36:33PM +0800, Guo Ren wrote:
> On Mon, Sep 19, 2022 at 9:34 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Sun, Sep 18, 2022 at 11:52:42AM -0400, guoren@kernel.org wrote:
> >
> > > @@ -123,18 +126,22 @@ int handle_misaligned_store(struct pt_regs *regs);
> > >
> > >  asmlinkage void __trap_section do_trap_load_misaligned(struct pt_regs *regs)
> > >  {
> > > +     irqentry_state_t state = irqentry_enter(regs);
> > >       if (!handle_misaligned_load(regs))
> > >               return;
> > >       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
> > >                     "Oops - load address misaligned");
> > > +     irqentry_exit(regs, state);
> > >  }
> > >
> > >  asmlinkage void __trap_section do_trap_store_misaligned(struct pt_regs *regs)
> > >  {
> > > +     irqentry_state_t state = irqentry_enter(regs);
> > >       if (!handle_misaligned_store(regs))
> > >               return;
> > >       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
> > >                     "Oops - store (or AMO) address misaligned");
> > > +     irqentry_exit(regs, state);
> > >  }
> > >  #endif
> > >  DO_ERROR_INFO(do_trap_store_fault,
> > > @@ -158,6 +165,8 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
> > >
> > >  asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
> > >  {
> > > +     irqentry_state_t state = irqentry_enter(regs);
> > > +
> > >  #ifdef CONFIG_KPROBES
> > >       if (kprobe_single_step_handler(regs))
> > >               return;
> >
> > FWIW; on x86 I've classified many of the 'from-kernel' traps as
> > NMI-like, since those traps can happen from any context, including with
> > IRQs disabled.
> The do_trap_break is for ebreak instruction, not NMI. RISC-V NMI has
> separate CSR. ref:
> 
> This proposal adds support for resumable non-maskable interrupts
> (RNMIs) to RISC-V. The extension adds four new CSRs (`mnepc`,
> `mncause`, `mnstatus`, and `mnscratch`) to hold the interrupted state,
> and a new instruction to resume from the RNMI handler.

Yes, but that's not what I'm saying. I'm saying I've classified
'from-kernel' traps as NMI-like.

Consider:

	raw_spin_lock_irq(&foo);
	...
	<trap>

Then you want the trap to behave as if it were an NMI; that is abide by
the rules of NMI (strictly wait-free code).

So yes, they are not NMI, but they nest just like it, so we want the
handlers to abide by the same rules.

Does that make sense?

> >
> > The basic shape of the trap handlers looks a little like:
> >
> >         if (user_mode(regs)) {
> If nmi comes from user_mode, why we using
> irqenrty_enter/exit_from/to_user_mode instead of
> irqentry_nmi_enter/exit?

s/nmi/trap/ because the 'from-user' trap never nests inside kernel code.

Additionally, many 'from-user' traps want to do 'silly' things like send
signals, which is something that requires scheduling.

They're fundamentally different from 'from-kernel' traps, which per the
above, nest most dangerously.

> >                 irqenrty_enter_from_user_mode(regs);
> >                 do_user_trap();
> >                 irqentry_exit_to_user_mode(regs);
> >         } else {
> >                 irqentry_state_t state = irqentry_nmi_enter(regs);
> >                 do_kernel_trap();
> >                 irqentry_nmi_exit(regs, state);
> >         }
> >
> > Not saying you have to match Risc-V in this patch-set, just something to
> > consider.
> I think the shape of the riscv NMI handler looks a little like this:
> 
> asmlinkage __visible __trap_section void do_trap_nmi(struct pt_regs *regs)
> {
>                  irqentry_state_t state = irqentry_nmi_enter(regs);
>                  do_nmi_trap();
>                  irqentry_nmi_exit(regs, state);
> }

That is correct for the NMI handler; but here I'm specifically talking
about traps, like the unalign trap, break trap etc. Those that can
happen *anywhere* in kernel code and nest most unfortunate.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-20  6:08     ` Guo Ren
@ 2022-09-20  7:27       ` Peter Zijlstra
  2022-09-20  7:34         ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-09-20  7:27 UTC (permalink / raw)
  To: Guo Ren
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren, keescook

On Tue, Sep 20, 2022 at 02:08:55PM +0800, Guo Ren wrote:
> On Mon, Sep 19, 2022 at 9:45 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Sun, Sep 18, 2022 at 11:52:43AM -0400, guoren@kernel.org wrote:
> >
> > > +ENTRY(call_on_stack)
> > > +     /* Create a frame record to save our ra and fp */
> > > +     addi    sp, sp, -RISCV_SZPTR
> > > +     REG_S   ra, (sp)
> > > +     addi    sp, sp, -RISCV_SZPTR
> > > +     REG_S   fp, (sp)
> > > +
> > > +     /* Save sp in fp */
> > > +     move    fp, sp
> > > +
> > > +     /* Move to the new stack and call the function there */
> > > +     li      a3, IRQ_STACK_SIZE
> > > +     add     sp, a1, a3
> > > +     jalr    a2
> > > +
> > > +     /*
> > > +      * Restore sp from prev fp, and fp, ra from the frame
> > > +      */
> > > +     move    sp, fp
> > > +     REG_L   fp, (sp)
> > > +     addi    sp, sp, RISCV_SZPTR
> > > +     REG_L   ra, (sp)
> > > +     addi    sp, sp, RISCV_SZPTR
> > > +     ret
> > > +ENDPROC(call_on_stack)
> >
> > IIRC x86_64 moved away from a stack-switch function like this because it
> > presents a convenient exploit gadget.
> I found:
> https://lore.kernel.org/all/20210204204903.350275743@linutronix.de/
> 
>   - The fact that the stack switching code ended up being an easy to find
>     exploit gadget.
> 
> What's the exploit gadget? Do you have a ref link? Thx.

Sadly no, I do not. Kees might. But basically it boils down to this
function taking both a stack pointer and a function pointer as
arguments (@a1 and @a2 resp. if I'm not reading this wrong).

If an attacker can call this with arguments of its choice then it gains
full control of subsequent execution.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-20  7:27       ` Peter Zijlstra
@ 2022-09-20  7:34         ` Peter Zijlstra
  2022-09-21  6:16           ` Guo Ren
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-09-20  7:34 UTC (permalink / raw)
  To: Guo Ren
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren, keescook

On Tue, Sep 20, 2022 at 09:27:51AM +0200, Peter Zijlstra wrote:
> On Tue, Sep 20, 2022 at 02:08:55PM +0800, Guo Ren wrote:
> > On Mon, Sep 19, 2022 at 9:45 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Sun, Sep 18, 2022 at 11:52:43AM -0400, guoren@kernel.org wrote:
> > >
> > > > +ENTRY(call_on_stack)
> > > > +     /* Create a frame record to save our ra and fp */
> > > > +     addi    sp, sp, -RISCV_SZPTR
> > > > +     REG_S   ra, (sp)
> > > > +     addi    sp, sp, -RISCV_SZPTR
> > > > +     REG_S   fp, (sp)
> > > > +
> > > > +     /* Save sp in fp */
> > > > +     move    fp, sp
> > > > +
> > > > +     /* Move to the new stack and call the function there */
> > > > +     li      a3, IRQ_STACK_SIZE
> > > > +     add     sp, a1, a3
> > > > +     jalr    a2
> > > > +
> > > > +     /*
> > > > +      * Restore sp from prev fp, and fp, ra from the frame
> > > > +      */
> > > > +     move    sp, fp
> > > > +     REG_L   fp, (sp)
> > > > +     addi    sp, sp, RISCV_SZPTR
> > > > +     REG_L   ra, (sp)
> > > > +     addi    sp, sp, RISCV_SZPTR
> > > > +     ret
> > > > +ENDPROC(call_on_stack)
> > >
> > > IIRC x86_64 moved away from a stack-switch function like this because it
> > > presents a convenient exploit gadget.
> > I found:
> > https://lore.kernel.org/all/20210204204903.350275743@linutronix.de/
> > 
> >   - The fact that the stack switching code ended up being an easy to find
> >     exploit gadget.
> > 
> > What's the exploit gadget? Do you have a ref link? Thx.
> 
> Sadly no, I do not. Kees might. But basically it boils down to this
> function taking both a stack pointer and a function pointer as
> arguments (@a1 and @a2 resp. if I'm not reading this wrong).
> 
> If an attacker can call this with arguments of its choice then it gains
> full control of subsequent execution.

If you inline it the hope is that the function pointers go away or at
least the encompassing function doesn't have quite such a 'convenient'
signature to hijack control flow.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-20  7:34         ` Peter Zijlstra
@ 2022-09-21  6:16           ` Guo Ren
  0 siblings, 0 replies; 29+ messages in thread
From: Guo Ren @ 2022-09-21  6:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren, keescook

On Tue, Sep 20, 2022 at 3:34 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Sep 20, 2022 at 09:27:51AM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 20, 2022 at 02:08:55PM +0800, Guo Ren wrote:
> > > On Mon, Sep 19, 2022 at 9:45 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > On Sun, Sep 18, 2022 at 11:52:43AM -0400, guoren@kernel.org wrote:
> > > >
> > > > > +ENTRY(call_on_stack)
> > > > > +     /* Create a frame record to save our ra and fp */
> > > > > +     addi    sp, sp, -RISCV_SZPTR
> > > > > +     REG_S   ra, (sp)
> > > > > +     addi    sp, sp, -RISCV_SZPTR
> > > > > +     REG_S   fp, (sp)
> > > > > +
> > > > > +     /* Save sp in fp */
> > > > > +     move    fp, sp
> > > > > +
> > > > > +     /* Move to the new stack and call the function there */
> > > > > +     li      a3, IRQ_STACK_SIZE
> > > > > +     add     sp, a1, a3
> > > > > +     jalr    a2
> > > > > +
> > > > > +     /*
> > > > > +      * Restore sp from prev fp, and fp, ra from the frame
> > > > > +      */
> > > > > +     move    sp, fp
> > > > > +     REG_L   fp, (sp)
> > > > > +     addi    sp, sp, RISCV_SZPTR
> > > > > +     REG_L   ra, (sp)
> > > > > +     addi    sp, sp, RISCV_SZPTR
> > > > > +     ret
> > > > > +ENDPROC(call_on_stack)
> > > >
> > > > IIRC x86_64 moved away from a stack-switch function like this because it
> > > > presents a convenient exploit gadget.
> > > I found:
> > > https://lore.kernel.org/all/20210204204903.350275743@linutronix.de/
> > >
> > >   - The fact that the stack switching code ended up being an easy to find
> > >     exploit gadget.
> > >
> > > What's the exploit gadget? Do you have a ref link? Thx.
> >
> > Sadly no, I do not. Kees might. But basically it boils down to this
> > function taking both a stack pointer and a function pointer as
> > arguments (@a1 and @a2 resp. if I'm not reading this wrong).
> >
> > If an attacker can call this with arguments of its choice then it gains
> > full control of subsequent execution.
>
> If you inline it the hope is that the function pointers go away or at
> least the encompassing function doesn't have quite such a 'convenient'
> signature to hijack control flow.
Thanks for mentioning it. I would change to an inline style.


-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-18 15:52 ` [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK guoren
  2022-09-19 13:45   ` Peter Zijlstra
@ 2022-09-21  8:34   ` Chen Zhongjin
  2022-09-21  9:53     ` Guo Ren
  1 sibling, 1 reply; 29+ messages in thread
From: Chen Zhongjin @ 2022-09-21  8:34 UTC (permalink / raw)
  To: guoren, arnd, palmer, tglx, peterz, luto, conor.dooley, heiko,
	jszhang, lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight
  Cc: linux-arch, linux-kernel, linux-riscv, Guo Ren

Hi,

On 2022/9/18 23:52, guoren@kernel.org wrote:
> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index 5f49517cd3a2..426529b84db0 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -332,6 +332,33 @@ ENTRY(ret_from_kernel_thread)
>   	tail syscall_exit_to_user_mode
>   ENDPROC(ret_from_kernel_thread)
>   
> +#ifdef CONFIG_IRQ_STACKS
> +ENTRY(call_on_stack)
> +	/* Create a frame record to save our ra and fp */
> +	addi	sp, sp, -RISCV_SZPTR
> +	REG_S	ra, (sp)
> +	addi	sp, sp, -RISCV_SZPTR
> +	REG_S	fp, (sp)
> +
> +	/* Save sp in fp */
> +	move	fp, sp
> +
> +	/* Move to the new stack and call the function there */
> +	li	a3, IRQ_STACK_SIZE
> +	add	sp, a1, a3
> +	jalr	a2
> +
> +	/*
> +	 * Restore sp from prev fp, and fp, ra from the frame
> +	 */
> +	move	sp, fp
> +	REG_L	fp, (sp)
> +	addi	sp, sp, RISCV_SZPTR
> +	REG_L	ra, (sp)
> +	addi	sp, sp, RISCV_SZPTR
> +	ret
> +ENDPROC(call_on_stack)
> +#endif

Seems my compiler (riscv64-linux-gnu-gcc 8.4.0, cross compiling from 
x86) cannot recognize the register `fp`.

After I changed it to `s0` this can pass compiling.


Seems there is nowhere else using `fp`, can this just using `s0` instead?

Best,

Chen


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-21  8:34   ` Chen Zhongjin
@ 2022-09-21  9:53     ` Guo Ren
  2022-09-21 11:56       ` Chen Zhongjin
  0 siblings, 1 reply; 29+ messages in thread
From: Guo Ren @ 2022-09-21  9:53 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: arnd, palmer, tglx, peterz, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

On Wed, Sep 21, 2022 at 4:34 PM Chen Zhongjin <chenzhongjin@huawei.com> wrote:
>
> Hi,
>
> On 2022/9/18 23:52, guoren@kernel.org wrote:
> > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> > index 5f49517cd3a2..426529b84db0 100644
> > --- a/arch/riscv/kernel/entry.S
> > +++ b/arch/riscv/kernel/entry.S
> > @@ -332,6 +332,33 @@ ENTRY(ret_from_kernel_thread)
> >       tail syscall_exit_to_user_mode
> >   ENDPROC(ret_from_kernel_thread)
> >
> > +#ifdef CONFIG_IRQ_STACKS
> > +ENTRY(call_on_stack)
> > +     /* Create a frame record to save our ra and fp */
> > +     addi    sp, sp, -RISCV_SZPTR
> > +     REG_S   ra, (sp)
> > +     addi    sp, sp, -RISCV_SZPTR
> > +     REG_S   fp, (sp)
> > +
> > +     /* Save sp in fp */
> > +     move    fp, sp
> > +
> > +     /* Move to the new stack and call the function there */
> > +     li      a3, IRQ_STACK_SIZE
> > +     add     sp, a1, a3
> > +     jalr    a2
> > +
> > +     /*
> > +      * Restore sp from prev fp, and fp, ra from the frame
> > +      */
> > +     move    sp, fp
> > +     REG_L   fp, (sp)
> > +     addi    sp, sp, RISCV_SZPTR
> > +     REG_L   ra, (sp)
> > +     addi    sp, sp, RISCV_SZPTR
> > +     ret
> > +ENDPROC(call_on_stack)
> > +#endif
>
> Seems my compiler (riscv64-linux-gnu-gcc 8.4.0, cross compiling from
> x86) cannot recognize the register `fp`.
The whole entry.S uses s0 instead of fp, so I approve of your advice. Thx.

>
> After I changed it to `s0` this can pass compiling.
>
>
> Seems there is nowhere else using `fp`, can this just using `s0` instead?
>
> Best,
>
> Chen
>


-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-21  9:53     ` Guo Ren
@ 2022-09-21 11:56       ` Chen Zhongjin
  2022-09-22  1:26         ` Guo Ren
  0 siblings, 1 reply; 29+ messages in thread
From: Chen Zhongjin @ 2022-09-21 11:56 UTC (permalink / raw)
  To: Guo Ren
  Cc: arnd, palmer, tglx, peterz, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

Hi,

Sorry to bother again, I just finished the test with your patches on 
mine patch set.

On 2022/9/21 17:53, Guo Ren wrote:
> On Wed, Sep 21, 2022 at 4:34 PM Chen Zhongjin <chenzhongjin@huawei.com> wrote:
>> Hi,
>>
>> On 2022/9/18 23:52, guoren@kernel.org wrote:
>>> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
>>> index 5f49517cd3a2..426529b84db0 100644
>>> --- a/arch/riscv/kernel/entry.S
>>> +++ b/arch/riscv/kernel/entry.S
>>> @@ -332,6 +332,33 @@ ENTRY(ret_from_kernel_thread)
>>>        tail syscall_exit_to_user_mode
>>>    ENDPROC(ret_from_kernel_thread)
>>>
>>> +#ifdef CONFIG_IRQ_STACKS
>>> +ENTRY(call_on_stack)
>>> +     /* Create a frame record to save our ra and fp */
>>> +     addi    sp, sp, -RISCV_SZPTR
>>> +     REG_S   ra, (sp)
>>> +     addi    sp, sp, -RISCV_SZPTR
>>> +     REG_S   fp, (sp)
>>> +
>>> +     /* Save sp in fp */
>>> +     move    fp, sp
>>> +

Considering that s0 points to previous sp normally, I think here we 
should have 'addi fp, sp, 2*RISCV_SZPTR'.

An example below:

     addi    sp, sp, -16
     sd  ra, 8(sp)
     sd  s0, 0(sp)
     addi    s0, sp, 16    <- s0 is set to previous sp
     ...

     ld  ra,8(sp)
     ld  s0,0(sp)
     addi    sp,sp,16

So maybe it's better to save the stack frame as below:

     addi    sp, sp, -2*RISCV_SZPTR
     REG_S   ra, RISCV_SZPTR(sp)
     REG_S   s0, (sp)

     /* Save sp in fp */
     addi    s0, sp, 2*RISCV_SZPTR

     ...

     /*
      * Restore sp from prev fp, and fp, ra from the frame
      */
     addi    sp, s0, -2*RISCV_SZPTR
     REG_L   ra, RISCV_SZPTR(sp)
     REG_L   s0, (sp)
     addi    sp, sp, 2*RISCV_SZPTR


Anyway, lets set fp as sp + 2 * RISCV_SZPTR, so that unwinder can 
connect two stacks same as normal function.

I tested this with my patch and the unwinder works properly.


Thanks for your time!

Best,

Chen

>>> +     /* Move to the new stack and call the function there */
>>> +     li      a3, IRQ_STACK_SIZE
>>> +     add     sp, a1, a3
>>> +     jalr    a2
>>> +
>>> +     /*
>>> +      * Restore sp from prev fp, and fp, ra from the frame
>>> +      */
>>> +     move    sp, fp
>>> +     REG_L   fp, (sp)
>>> +     addi    sp, sp, RISCV_SZPTR
>>> +     REG_L   ra, (sp)
>>> +     addi    sp, sp, RISCV_SZPTR
>>> +     ret
>>> +ENDPROC(call_on_stack)
>>> +#endif
>> Seems my compiler (riscv64-linux-gnu-gcc 8.4.0, cross compiling from
>> x86) cannot recognize the register `fp`.
> The whole entry.S uses s0 instead of fp, so I approve of your advice. Thx.
>
>> After I changed it to `s0` this can pass compiling.
>>
>>
>> Seems there is nowhere else using `fp`, can this just using `s0` instead?
>>
>> Best,
>>
>> Chen
>>
>>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK
  2022-09-21 11:56       ` Chen Zhongjin
@ 2022-09-22  1:26         ` Guo Ren
  0 siblings, 0 replies; 29+ messages in thread
From: Guo Ren @ 2022-09-22  1:26 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: arnd, palmer, tglx, peterz, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

On Wed, Sep 21, 2022 at 7:56 PM Chen Zhongjin <chenzhongjin@huawei.com> wrote:
>
> Hi,
>
> Sorry to bother again, I just finished the test with your patches on
> mine patch set.
>
> On 2022/9/21 17:53, Guo Ren wrote:
> > On Wed, Sep 21, 2022 at 4:34 PM Chen Zhongjin <chenzhongjin@huawei.com> wrote:
> >> Hi,
> >>
> >> On 2022/9/18 23:52, guoren@kernel.org wrote:
> >>> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> >>> index 5f49517cd3a2..426529b84db0 100644
> >>> --- a/arch/riscv/kernel/entry.S
> >>> +++ b/arch/riscv/kernel/entry.S
> >>> @@ -332,6 +332,33 @@ ENTRY(ret_from_kernel_thread)
> >>>        tail syscall_exit_to_user_mode
> >>>    ENDPROC(ret_from_kernel_thread)
> >>>
> >>> +#ifdef CONFIG_IRQ_STACKS
> >>> +ENTRY(call_on_stack)
> >>> +     /* Create a frame record to save our ra and fp */
> >>> +     addi    sp, sp, -RISCV_SZPTR
> >>> +     REG_S   ra, (sp)
> >>> +     addi    sp, sp, -RISCV_SZPTR
> >>> +     REG_S   fp, (sp)
> >>> +
> >>> +     /* Save sp in fp */
> >>> +     move    fp, sp
> >>> +
>
> Considering that s0 points to previous sp normally, I think here we
> should have 'addi fp, sp, 2*RISCV_SZPTR'.
>
> An example below:
>
>      addi    sp, sp, -16
>      sd  ra, 8(sp)
>      sd  s0, 0(sp)
>      addi    s0, sp, 16    <- s0 is set to previous sp
>      ...
>
>      ld  ra,8(sp)
>      ld  s0,0(sp)
>      addi    sp,sp,16
>
> So maybe it's better to save the stack frame as below:
>
>      addi    sp, sp, -2*RISCV_SZPTR
>      REG_S   ra, RISCV_SZPTR(sp)
>      REG_S   s0, (sp)
>
>      /* Save sp in fp */
>      addi    s0, sp, 2*RISCV_SZPTR
>
>      ...
>
>      /*
>       * Restore sp from prev fp, and fp, ra from the frame
>       */
>      addi    sp, s0, -2*RISCV_SZPTR
>      REG_L   ra, RISCV_SZPTR(sp)
>      REG_L   s0, (sp)
>      addi    sp, sp, 2*RISCV_SZPTR
>
>
> Anyway, lets set fp as sp + 2 * RISCV_SZPTR, so that unwinder can
> connect two stacks same as normal function.
>
> I tested this with my patch and the unwinder works properly.
Thx, you got it. My patch broke the fp chain. I would fix it in the
next version.

>
>
> Thanks for your time!
>
> Best,
>
> Chen
>
> >>> +     /* Move to the new stack and call the function there */
> >>> +     li      a3, IRQ_STACK_SIZE
> >>> +     add     sp, a1, a3
> >>> +     jalr    a2
> >>> +
> >>> +     /*
> >>> +      * Restore sp from prev fp, and fp, ra from the frame
> >>> +      */
> >>> +     move    sp, fp
> >>> +     REG_L   fp, (sp)
> >>> +     addi    sp, sp, RISCV_SZPTR
> >>> +     REG_L   ra, (sp)
> >>> +     addi    sp, sp, RISCV_SZPTR
> >>> +     ret
> >>> +ENDPROC(call_on_stack)
> >>> +#endif
> >> Seems my compiler (riscv64-linux-gnu-gcc 8.4.0, cross compiling from
> >> x86) cannot recognize the register `fp`.
> > The whole entry.S uses s0 instead of fp, so I approve of your advice. Thx.
> >
> >> After I changed it to `s0` this can pass compiling.
> >>
> >>
> >> Seems there is nowhere else using `fp`, can this just using `s0` instead?
> >>
> >> Best,
> >>
> >> Chen
> >>
> >>



-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 07/11] riscv: convert to generic entry
  2022-09-20  7:22       ` Peter Zijlstra
@ 2022-09-30 11:28         ` Guo Ren
  0 siblings, 0 replies; 29+ messages in thread
From: Guo Ren @ 2022-09-30 11:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

On Tue, Sep 20, 2022 at 3:23 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Sep 20, 2022 at 02:36:33PM +0800, Guo Ren wrote:
> > On Mon, Sep 19, 2022 at 9:34 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Sun, Sep 18, 2022 at 11:52:42AM -0400, guoren@kernel.org wrote:
> > >
> > > > @@ -123,18 +126,22 @@ int handle_misaligned_store(struct pt_regs *regs);
> > > >
> > > >  asmlinkage void __trap_section do_trap_load_misaligned(struct pt_regs *regs)
> > > >  {
> > > > +     irqentry_state_t state = irqentry_enter(regs);
> > > >       if (!handle_misaligned_load(regs))
> > > >               return;
> > > >       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
> > > >                     "Oops - load address misaligned");
> > > > +     irqentry_exit(regs, state);
> > > >  }
> > > >
> > > >  asmlinkage void __trap_section do_trap_store_misaligned(struct pt_regs *regs)
> > > >  {
> > > > +     irqentry_state_t state = irqentry_enter(regs);
> > > >       if (!handle_misaligned_store(regs))
> > > >               return;
> > > >       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
> > > >                     "Oops - store (or AMO) address misaligned");
> > > > +     irqentry_exit(regs, state);
> > > >  }
> > > >  #endif
> > > >  DO_ERROR_INFO(do_trap_store_fault,
> > > > @@ -158,6 +165,8 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
> > > >
> > > >  asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
> > > >  {
> > > > +     irqentry_state_t state = irqentry_enter(regs);
> > > > +
> > > >  #ifdef CONFIG_KPROBES
> > > >       if (kprobe_single_step_handler(regs))
> > > >               return;
> > >
> > > FWIW; on x86 I've classified many of the 'from-kernel' traps as
> > > NMI-like, since those traps can happen from any context, including with
> > > IRQs disabled.
> > The do_trap_break is for ebreak instruction, not NMI. RISC-V NMI has
> > separate CSR. ref:
> >
> > This proposal adds support for resumable non-maskable interrupts
> > (RNMIs) to RISC-V. The extension adds four new CSRs (`mnepc`,
> > `mncause`, `mnstatus`, and `mnscratch`) to hold the interrupted state,
> > and a new instruction to resume from the RNMI handler.
>
> Yes, but that's not what I'm saying. I'm saying I've classified
> 'from-kernel' traps as NMI-like.
>
> Consider:
>
>         raw_spin_lock_irq(&foo);
>         ...
>         <trap>
>
> Then you want the trap to behave as if it were an NMI; that is abide by
> the rules of NMI (strictly wait-free code).
>
> So yes, they are not NMI, but they nest just like it, so we want the
> handlers to abide by the same rules.
>
> Does that make sense?
Yes, thx for clarification. I've looked at exc_int3 of x86. Here is my
new version:

diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
index 571556bb9261..41cc1c4bccb3 100644
--- a/arch/riscv/kernel/sys_riscv.c
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -5,8 +5,10 @@
  * Copyright (C) 2017 SiFive
  */

+#include <linux/entry-common.h>
 #include <linux/syscalls.h>
 #include <asm/unistd.h>
+#include <asm/syscall.h>
 #include <asm/cacheflush.h>
 #include <asm-generic/mman-common.h>

@@ -72,3 +74,28 @@ SYSCALL_DEFINE3(riscv_flush_icache, uintptr_t,
start, uintptr_t, end,

        return 0;
 }
+
+typedef long (*syscall_t)(ulong, ulong, ulong, ulong, ulong, ulong, ulong);
+
+asmlinkage void do_sys_ecall_u(struct pt_regs *regs)
+{
+       syscall_t syscall;
+       ulong nr = regs->a7;
+
+       regs->epc += 4;
+       regs->orig_a0 = regs->a0;
+       regs->a0 = -ENOSYS;
+
+       nr = syscall_enter_from_user_mode(regs, nr);
+#ifdef CONFIG_COMPAT
+       if ((regs->status & SR_UXL) == SR_UXL_32)
+               syscall = compat_sys_call_table[nr];
+       else
+#endif
+               syscall = sys_call_table[nr];
+
+       if (nr < NR_syscalls)
+               regs->a0 = syscall(regs->orig_a0, regs->a1, regs->a2,
+                                  regs->a3, regs->a4, regs->a5, regs->a6);
+       syscall_exit_to_user_mode(regs);
+}
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 588e17c386c6..d20037585c2f 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -17,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/irq.h>
 #include <linux/kexec.h>
+#include <linux/entry-common.h>

 #include <asm/asm-prototypes.h>
 #include <asm/bug.h>
@@ -96,10 +97,18 @@ static void do_trap_error(struct pt_regs *regs,
int signo, int code,
 #else
 #define __trap_section noinstr
 #endif
-#define DO_ERROR_INFO(name, signo, code, str)                          \
-asmlinkage __visible __trap_section void name(struct pt_regs *regs)    \
-{                                                                      \
-       do_trap_error(regs, signo, code, regs->epc, "Oops - " str);     \
+#define DO_ERROR_INFO(name, signo, code, str)
         \
+asmlinkage __visible __trap_section void name(struct pt_regs *regs)
         \
+{
         \
+       if (user_mode(regs)) {
         \
+               irqentry_enter_from_user_mode(regs);
         \
+               do_trap_error(regs, signo, code, regs->epc, "Oops - "
str);     \
+               irqentry_exit_to_user_mode(regs);
         \
+       } else {
         \
+               irqentry_state_t irq_state = irqentry_nmi_enter(regs);
         \
+               do_trap_error(regs, signo, code, regs->epc, "Oops - "
str);     \
+               irqentry_nmi_exit(regs, irq_state);
         \
+       }
         \
 }

 DO_ERROR_INFO(do_trap_unknown,
@@ -123,18 +132,36 @@ int handle_misaligned_store(struct pt_regs *regs);

 asmlinkage void __trap_section do_trap_load_misaligned(struct pt_regs *regs)
 {
-       if (!handle_misaligned_load(regs))
-               return;
-       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
-                     "Oops - load address misaligned");
+       if (user_mode(regs)) {
+               irqentry_enter_from_user_mode(regs);
+               if (handle_misaligned_load(regs))
+                       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
+                             "Oops - load address misaligned");
+               irqentry_exit_to_user_mode(regs);
+       } else {
+               irqentry_state_t irq_state = irqentry_nmi_enter(regs);
+               if (handle_misaligned_load(regs))
+                       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
+                             "Oops - load address misaligned");
+               irqentry_nmi_exit(regs, irq_state);
+       }
 }

 asmlinkage void __trap_section do_trap_store_misaligned(struct pt_regs *regs)
 {
-       if (!handle_misaligned_store(regs))
-               return;
-       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
-                     "Oops - store (or AMO) address misaligned");
+       if (user_mode(regs)) {
+               irqentry_enter_from_user_mode(regs);
+               if (handle_misaligned_store(regs))
+                       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
+                               "Oops - store (or AMO) address misaligned");
+               irqentry_exit_to_user_mode(regs);
+       } else {
+               irqentry_state_t irq_state = irqentry_nmi_enter(regs);
+               if (handle_misaligned_store(regs))
+                       do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
+                               "Oops - store (or AMO) address misaligned");
+               irqentry_nmi_exit(regs, irq_state);
+       }
 }
 #endif
 DO_ERROR_INFO(do_trap_store_fault,
@@ -156,7 +183,7 @@ static inline unsigned long
get_break_insn_length(unsigned long pc)
        return GET_INSN_LENGTH(insn);
 }

-asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
+static void __do_trap_break(struct pt_regs *regs)
 {
 #ifdef CONFIG_KPROBES
        if (kprobe_single_step_handler(regs))
@@ -186,6 +213,19 @@ asmlinkage __visible __trap_section void
do_trap_break(struct pt_regs *regs)
        else
                die(regs, "Kernel BUG");
 }
+
+asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
+{
+       if (user_mode(regs)) {
+               irqentry_enter_from_user_mode(regs);
+               __do_trap_break(regs);
+               irqentry_exit_to_user_mode(regs);
+       } else {
+               irqentry_state_t irq_state = irqentry_nmi_enter(regs);
+               __do_trap_break(regs);
+               irqentry_nmi_exit(regs, irq_state);
+       }
+}
 NOKPROBE_SYMBOL(do_trap_break);

 #ifdef CONFIG_GENERIC_BUG
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index c7829289e806..cc8e642a91ea 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -15,6 +15,7 @@
 #include <linux/uaccess.h>
 #include <linux/kprobes.h>
 #include <linux/kfence.h>
+#include <linux/entry-common.h>

 #include <asm/ptrace.h>
 #include <asm/tlbflush.h>
@@ -203,7 +204,7 @@ static inline bool access_error(unsigned long
cause, struct vm_area_struct *vma)
  * This routine handles page faults.  It determines the address and the
  * problem, and then passes it off to one of the appropriate routines.
  */
-asmlinkage void noinstr do_page_fault(struct pt_regs *regs)
+static void __do_page_fault(struct pt_regs *regs)
 {
        struct task_struct *tsk;
        struct vm_area_struct *vma;
@@ -350,4 +351,13 @@ asmlinkage void noinstr do_page_fault(struct pt_regs *regs)
        }
        return;
 }
+
+asmlinkage void noinstr do_page_fault(struct pt_regs *regs)
+{
+       irqentry_state_t state = irqentry_enter(regs);
+
+       __do_page_fault(regs);
+
+       irqentry_exit(regs, state);
+}
 NOKPROBE_SYMBOL(do_page_fault);


        }
}

>
> > >
> > > The basic shape of the trap handlers looks a little like:
> > >
> > >         if (user_mode(regs)) {
> > If nmi comes from user_mode, why we using
> > irqenrty_enter/exit_from/to_user_mode instead of
> > irqentry_nmi_enter/exit?
>
> s/nmi/trap/ because the 'from-user' trap never nests inside kernel code.
>
> Additionally, many 'from-user' traps want to do 'silly' things like send
> signals, which is something that requires scheduling.
>
> They're fundamentally different from 'from-kernel' traps, which per the
> above, nest most dangerously.
>
> > >                 irqenrty_enter_from_user_mode(regs);
> > >                 do_user_trap();
> > >                 irqentry_exit_to_user_mode(regs);
> > >         } else {
> > >                 irqentry_state_t state = irqentry_nmi_enter(regs);
> > >                 do_kernel_trap();
> > >                 irqentry_nmi_exit(regs, state);
> > >         }
> > >
> > > Not saying you have to match Risc-V in this patch-set, just something to
> > > consider.
> > I think the shape of the riscv NMI handler looks a little like this:
> >
> > asmlinkage __visible __trap_section void do_trap_nmi(struct pt_regs *regs)
> > {
> >                  irqentry_state_t state = irqentry_nmi_enter(regs);
> >                  do_nmi_trap();
> >                  irqentry_nmi_exit(regs, state);
> > }
>
> That is correct for the NMI handler; but here I'm specifically talking
> about traps, like the unalign trap, break trap etc. Those that can
> happen *anywhere* in kernel code and nest most unfortunate.



--
Best Regards
 Guo Ren

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH V5 06/11] entry: Prevent DEBUG_PREEMPT warning
  2022-09-20  1:45     ` Guo Ren
@ 2022-09-30 12:27       ` Guo Ren
  0 siblings, 0 replies; 29+ messages in thread
From: Guo Ren @ 2022-09-30 12:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
	lazyparser, falcon, chenhuacai, apatel, atishp, palmer,
	paul.walmsley, mark.rutland, zouyipeng, bigeasy, David.Laight,
	linux-arch, linux-kernel, linux-riscv, Guo Ren

On Tue, Sep 20, 2022 at 9:45 AM Guo Ren <guoren@kernel.org> wrote:
>
> On Mon, Sep 19, 2022 at 7:59 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Sun, Sep 18, 2022 at 11:52:41AM -0400, guoren@kernel.org wrote:
> > > From: Guo Ren <guoren@linux.alibaba.com>
> > >
> > > When DEBUG_PREEMPT=y,
> > >       exit_to_user_mode_prepare
> > >       ->tick_nohz_user_enter_prepare
> > >         ->tick_nohz_full_cpu(smp_processor_id())
> > >           ->smp_processor_id()
> > >             ->debug_smp_processor_id()
> > >               ->check preempt_count() then:
> > >
> > > [    5.717610] BUG: using smp_processor_id() in preemptible [00000000]
> > > code: S20urandom/94
> > > [    5.718111] caller is debug_smp_processor_id+0x24/0x38
> > > [    5.718417] CPU: 1 PID: 94 Comm: S20urandom Not tainted
> > > 6.0.0-rc3-00010-gfd0a0d619c63-dirty #238
> > > [    5.718886] Hardware name: riscv-virtio,qemu (DT)
> > > [    5.719136] Call Trace:
> > > [    5.719281] [<ffffffff800055fc>] dump_backtrace+0x2c/0x3c
> > > [    5.719566] [<ffffffff80ae6cb0>] show_stack+0x44/0x5c
> > > [    5.720023] [<ffffffff80aee870>] dump_stack_lvl+0x74/0xa4
> > > [    5.720557] [<ffffffff80aee8bc>] dump_stack+0x1c/0x2c
> > > [    5.721033] [<ffffffff80af65c0>]
> > > check_preemption_disabled+0x104/0x108
> > > [    5.721538] [<ffffffff80af65e8>] debug_smp_processor_id+0x24/0x38
> > > [    5.722001] [<ffffffff800aee30>] exit_to_user_mode_prepare+0x48/0x178
> > > [    5.722355] [<ffffffff80af5bf4>] irqentry_exit_to_user_mode+0x18/0x30
> > > [    5.722685] [<ffffffff80af5c70>] irqentry_exit+0x64/0xa4
> > > [    5.722953] [<ffffffff80af52f4>] do_page_fault+0x1d8/0x544
> > > [    5.723291] [<ffffffff80003310>] ret_from_exception+0x0/0xb8
> > >
> > > (Above is found in riscv platform with generic_entry)
> > >
> > > The smp_processor_id() needs irqs disable or preempt_disable, so use
> > > preempt dis/in protecting the tick_nohz_user_enter_prepare().
> > >
> > > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > > Signed-off-by: Guo Ren <guoren@kernel.org>
> > > ---
> > >  kernel/entry/common.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> > > index 063068a9ea9b..36e4cd50531c 100644
> > > --- a/kernel/entry/common.c
> > > +++ b/kernel/entry/common.c
> > > @@ -194,8 +194,10 @@ static void exit_to_user_mode_prepare(struct pt_regs *regs)
> > >
> > >       lockdep_assert_irqs_disabled();
> >
> >     Observe ^^^^
> Thanks! I would enable PROVE_LOCKING for test.
It's my bug in page_fault, here is the solution:

@@ -358,6 +358,8 @@ asmlinkage void noinstr do_page_fault(struct pt_regs *regs)

        __do_page_fault(regs);

+       local_irq_disable();
+
        irqentry_exit(regs, state);
 }
 NOKPROBE_SYMBOL(do_page_fault);

>
> >
> > >
> > > +     preempt_disable();
> > >       /* Flush pending rcuog wakeup before the last need_resched() check */
> > >       tick_nohz_user_enter_prepare();
> > > +     preempt_enable();
> >
> > This makes no sense; if IRQs are disabled, check_preemption_disabled()
> > should bail early per:
> >
> >         if (irqs_disabled())
> >                 goto out;
> Ditto.
>
>
> --
> Best Regards
>  Guo Ren



-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2022-09-30 12:28 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-18 15:52 [PATCH V5 00/11] riscv: Add GENERIC_ENTRY support and related features guoren
2022-09-18 15:52 ` [PATCH V5 01/11] riscv: elf_kexec: Fixup compile warning guoren
2022-09-18 15:52 ` [PATCH V5 02/11] riscv: compat_syscall_table: " guoren
2022-09-18 15:52 ` [PATCH V5 03/11] riscv: ptrace: Remove duplicate operation guoren
2022-09-18 15:52 ` [PATCH V5 04/11] compiler_types.h: Add __noinstr_section() for noinstr guoren
2022-09-18 15:52 ` [PATCH V5 05/11] riscv: traps: Add noinstr to prevent instrumentation inserted guoren
2022-09-18 15:52 ` [PATCH V5 06/11] entry: Prevent DEBUG_PREEMPT warning guoren
2022-09-19 11:58   ` Peter Zijlstra
2022-09-20  1:45     ` Guo Ren
2022-09-30 12:27       ` Guo Ren
2022-09-18 15:52 ` [PATCH V5 07/11] riscv: convert to generic entry guoren
2022-09-19 13:34   ` Peter Zijlstra
2022-09-20  6:36     ` Guo Ren
2022-09-20  7:22       ` Peter Zijlstra
2022-09-30 11:28         ` Guo Ren
2022-09-18 15:52 ` [PATCH V5 08/11] riscv: Support HAVE_IRQ_EXIT_ON_IRQ_STACK guoren
2022-09-19 13:45   ` Peter Zijlstra
2022-09-20  6:08     ` Guo Ren
2022-09-20  7:27       ` Peter Zijlstra
2022-09-20  7:34         ` Peter Zijlstra
2022-09-21  6:16           ` Guo Ren
2022-09-21  8:34   ` Chen Zhongjin
2022-09-21  9:53     ` Guo Ren
2022-09-21 11:56       ` Chen Zhongjin
2022-09-22  1:26         ` Guo Ren
2022-09-18 15:52 ` [PATCH V5 09/11] riscv: Support HAVE_SOFTIRQ_ON_OWN_STACK guoren
2022-09-20  0:11   ` Guo Ren
2022-09-18 15:52 ` [PATCH V5 10/11] riscv: Add config of thread stack size guoren
2022-09-18 15:52 ` [PATCH V5 11/11] riscv: Add support for STACKLEAK gcc plugin guoren

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).