* [PATCH v4] riscv: fix race when vmap stack overflow
@ 2022-11-30  2:24 ` Palmer Dabbelt
  0 siblings, 0 replies; 16+ messages in thread
From: Palmer Dabbelt @ 2022-11-30  2:24 UTC (permalink / raw)
  To: jszhang, guoren, linux-riscv, linux-kernel; +Cc: Palmer Dabbelt

From: Jisheng Zhang <jszhang@kernel.org>

Currently, when detecting a vmap stack overflow, riscv first switches
to the so-called shadow stack, then uses this shadow stack to call
get_overflow_stack() to get the overflow stack. However, there's
a race here if two or more harts use the same shadow stack at the
same time.

To solve this race, we introduce the spin_shadow_stack variable, which
is atomically swapped between its own address and 0: when the
variable is set, the shadow_stack is being used; when the variable
is cleared, the shadow_stack isn't being used.

Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Suggested-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20221030124517.2370-1-jszhang@kernel.org
[Palmer: Add AQ to the swap, and also some comments.]
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
---
Sorry to just re-spin this one without any warning, but I'd read the patch
a few times and every time I'd managed to convince myself there was a much
simpler way of doing this.  By the time I'd figured out why that's not
the case, it seemed faster to just write the comments.

I've stashed this, right on top of the offending commit, at
palmer/riscv-fix_vmap_stack.

Since v3:
 - Add AQ to the swap.
 - Add a bunch of comments.

Since v2:
 - use REG_AMOSWAP
 - add comment to the purpose of smp_store_release()

Since v1:
 - use smp_store_release directly
 - use unsigned int instead of atomic_t
---
 arch/riscv/include/asm/asm.h |  1 +
 arch/riscv/kernel/entry.S    | 13 +++++++++++++
 arch/riscv/kernel/traps.c    | 18 ++++++++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
index 618d7c5af1a2..e15a1c9f1cf8 100644
--- a/arch/riscv/include/asm/asm.h
+++ b/arch/riscv/include/asm/asm.h
@@ -23,6 +23,7 @@
 #define REG_L		__REG_SEL(ld, lw)
 #define REG_S		__REG_SEL(sd, sw)
 #define REG_SC		__REG_SEL(sc.d, sc.w)
+#define REG_AMOSWAP_AQ	__REG_SEL(amoswap.d.aq, amoswap.w.aq)
 #define REG_ASM		__REG_SEL(.dword, .word)
 #define SZREG		__REG_SEL(8, 4)
 #define LGREG		__REG_SEL(3, 2)
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 98f502654edd..5fdb6ba09600 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -387,6 +387,19 @@ handle_syscall_trace_exit:
 
 #ifdef CONFIG_VMAP_STACK
 handle_kernel_stack_overflow:
+	/*
+	 * Takes the pseudo-spinlock for the shadow stack, in case multiple
+	 * harts are concurrently overflowing their kernel stacks.  We could
+	 * store any value here, but since we're overflowing the kernel stack
+	 * already we only have SP to use as a scratch register.  So we just
+	 * swap in the address of the spinlock, as that's definitely non-zero.
+	 *
+	 * Pairs with a store_release in handle_bad_stack().
+	 */
+1:	la sp, spin_shadow_stack
+	REG_AMOSWAP_AQ sp, sp, (sp)
+	bnez sp, 1b
+
 	la sp, shadow_stack
 	addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE
 
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index bb6a450f0ecc..be54ccea8c47 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -213,11 +213,29 @@ asmlinkage unsigned long get_overflow_stack(void)
 		OVERFLOW_STACK_SIZE;
 }
 
+/*
+ * A pseudo spinlock to protect the shadow stack from being used by multiple
+ * harts concurrently.  This isn't a real spinlock because the lock side must
+ * be taken without a valid stack and with only a single register; it's only
+ * taken while in the process of panicking anyway, so the performance and
+ * error checking a proper spinlock gives us doesn't matter.
+ */
+unsigned long spin_shadow_stack;
+
 asmlinkage void handle_bad_stack(struct pt_regs *regs)
 {
 	unsigned long tsk_stk = (unsigned long)current->stack;
 	unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack);
 
+	/*
+	 * We're done with the shadow stack by this point, as we're on the
+	 * overflow stack.  Tell any other concurrent overflowing harts that
+	 * they can proceed with panicking by releasing the pseudo-spinlock.
+	 *
+	 * This pairs with an amoswap.aq in handle_kernel_stack_overflow.
+	 */
+	smp_store_release(&spin_shadow_stack, 0);
+
 	console_verbose();
 
 	pr_emerg("Insufficient stack space to handle exception!\n");
-- 
2.38.1



* Re: [PATCH v4] riscv: fix race when vmap stack overflow
  2022-11-30  2:24 ` Palmer Dabbelt
@ 2022-11-30  7:15   ` Guo Ren
  -1 siblings, 0 replies; 16+ messages in thread
From: Guo Ren @ 2022-11-30  7:15 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: jszhang, linux-riscv, linux-kernel

The comments read better now. Thx.

On Wed, Nov 30, 2022 at 10:29 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>
> [...]
> +#define REG_AMOSWAP_AQ __REG_SEL(amoswap.d.aq, amoswap.w.aq)
Below is the reason why I used the relaxed version here:
https://lore.kernel.org/all/CAJF2gTRAEX_jQ_w5H05dyafZzHq+P5j05TJ=C+v+OL__GQam4A@mail.gmail.com/T/#u

> [...]


--
Best Regards

 Guo Ren


* Re: [PATCH v4] riscv: fix race when vmap stack overflow
  2022-11-30  7:15   ` Guo Ren
@ 2022-11-30 16:54     ` Palmer Dabbelt
  -1 siblings, 0 replies; 16+ messages in thread
From: Palmer Dabbelt @ 2022-11-30 16:54 UTC (permalink / raw)
  To: guoren, Andrea Parri; +Cc: jszhang, linux-riscv, linux-kernel

On Tue, 29 Nov 2022 23:15:40 PST (-0800), guoren@kernel.org wrote:
> The comment becomes better. Thx.
>
> On Wed, Nov 30, 2022 at 10:29 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>>
>> [...]
>> +#define REG_AMOSWAP_AQ __REG_SEL(amoswap.d.aq, amoswap.w.aq)
> Below is the reason why I used the relaxed version here:
> https://lore.kernel.org/all/CAJF2gTRAEX_jQ_w5H05dyafZzHq+P5j05TJ=C+v+OL__GQam4A@mail.gmail.com/T/#u

Sorry, I hadn't seen that one.  Adding Andrea.  IMO the acquire/release 
pair is necessary here: with only a relaxed swap, the stack stores 
performed inside the lock could show up on the next hart trying to use 
the stack.

>> [...]


* Re: [PATCH v4] riscv: fix race when vmap stack overflow
@ 2022-11-30 16:54     ` Palmer Dabbelt
  0 siblings, 0 replies; 16+ messages in thread
From: Palmer Dabbelt @ 2022-11-30 16:54 UTC (permalink / raw)
  To: guoren, Andrea Parri; +Cc: jszhang, linux-riscv, linux-kernel

On Tue, 29 Nov 2022 23:15:40 PST (-0800), guoren@kernel.org wrote:
> The comment becomes better. Thx.
>
> On Wed, Nov 30, 2022 at 10:29 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>>
>> From: Jisheng Zhang <jszhang@kernel.org>
>>
>> Currently, when detecting vmap stack overflow, riscv firstly switches
>> to the so called shadow stack, then use this shadow stack to call the
>> get_overflow_stack() to get the overflow stack. However, there's
>> a race here if two or more harts use the same shadow stack at the same
>> time.
>>
>> To solve this race, we introduce spin_shadow_stack atomic var, which
>> will be swap between its own address and 0 in atomic way, when the
>> var is set, it means the shadow_stack is being used; when the var
>> is cleared, it means the shadow_stack isn't being used.
>>
>> Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection")
>> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
>> Suggested-by: Guo Ren <guoren@kernel.org>
>> Reviewed-by: Guo Ren <guoren@kernel.org>
>> Link: https://lore.kernel.org/r/20221030124517.2370-1-jszhang@kernel.org
>> [Palmer: Add AQ to the swap, and also some comments.]
>> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
>> ---
>> Sorry to just re-spin this one without any warning, but I'd read patch a
>> few times and every time I'd managed to convice myself there was a much
>> simpler way of doing this.  By the time I'd figured out why that's not
>> the case it seemed faster to just write the comments.
>>
>> I've stashed this, right on top of the offending commit, at
>> palmer/riscv-fix_vmap_stack.
>>
>> Since v3:
>>  - Add AQ to the swap.
>>  - Add a bunch of comments.
>>
>> Since v2:
>>  - use REG_AMOSWAP
>>  - add comment to the purpose of smp_store_release()
>>
>> Since v1:
>>  - use smp_store_release directly
>>  - use unsigned int instead of atomic_t
>> ---
>>  arch/riscv/include/asm/asm.h |  1 +
>>  arch/riscv/kernel/entry.S    | 13 +++++++++++++
>>  arch/riscv/kernel/traps.c    | 18 ++++++++++++++++++
>>  3 files changed, 32 insertions(+)
>>
>> diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
>> index 618d7c5af1a2..e15a1c9f1cf8 100644
>> --- a/arch/riscv/include/asm/asm.h
>> +++ b/arch/riscv/include/asm/asm.h
>> @@ -23,6 +23,7 @@
>>  #define REG_L          __REG_SEL(ld, lw)
>>  #define REG_S          __REG_SEL(sd, sw)
>>  #define REG_SC         __REG_SEL(sc.d, sc.w)
>> +#define REG_AMOSWAP_AQ __REG_SEL(amoswap.d.aq, amoswap.w.aq)
> Below is the reason why I use the relax version here:
> https://lore.kernel.org/all/CAJF2gTRAEX_jQ_w5H05dyafZzHq+P5j05TJ=C+v+OL__GQam4A@mail.gmail.com/T/#u

Sorry, I hadn't seen that one.  Adding Andrea.  IMO the acquire/release 
pair is necessary here, with just relaxed the stack stores inside the 
lock could show up on the next hart trying to use the stack.

>>  #define REG_ASM                __REG_SEL(.dword, .word)
>>  #define SZREG          __REG_SEL(8, 4)
>>  #define LGREG          __REG_SEL(3, 2)
>> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
>> index 98f502654edd..5fdb6ba09600 100644
>> --- a/arch/riscv/kernel/entry.S
>> +++ b/arch/riscv/kernel/entry.S
>> @@ -387,6 +387,19 @@ handle_syscall_trace_exit:
>>
>>  #ifdef CONFIG_VMAP_STACK
>>  handle_kernel_stack_overflow:
>> +       /*
>> +        * Takes the psuedo-spinlock for the shadow stack, in case multiple
>> +        * harts are concurrently overflowing their kernel stacks.  We could
>> +        * store any value here, but since we're overflowing the kernel stack
>> +        * already we only have SP to use as a scratch register.  So we just
>> +        * swap in the address of the spinlock, as that's definately non-zero.
>> +        *
>> +        * Pairs with a store_release in handle_bad_stack().
>> +        */
>> +1:     la sp, spin_shadow_stack
>> +       REG_AMOSWAP_AQ sp, sp, (sp)
>> +       bnez sp, 1b
>> +
>>         la sp, shadow_stack
>>         addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE
>>
>> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
>> index bb6a450f0ecc..be54ccea8c47 100644
>> --- a/arch/riscv/kernel/traps.c
>> +++ b/arch/riscv/kernel/traps.c
>> @@ -213,11 +213,29 @@ asmlinkage unsigned long get_overflow_stack(void)
>>                 OVERFLOW_STACK_SIZE;
>>  }
>>
>> +/*
>> + * A pseudo spinlock to protect the shadow stack from being used by multiple
>> + * harts concurrently.  This isn't a real spinlock because the lock side must
>> + * be taken without a valid stack and only a single register, it's only taken
>> + * while in the process of panicing anyway so the performance and error
>> + * checking a proper spinlock gives us doesn't matter.
>> + */
>> +unsigned long spin_shadow_stack;
>> +
>>  asmlinkage void handle_bad_stack(struct pt_regs *regs)
>>  {
>>         unsigned long tsk_stk = (unsigned long)current->stack;
>>         unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack);
>>
>> +       /*
>> +        * We're done with the shadow stack by this point, as we're on the
>> +        * overflow stack.  Tell any other concurrent overflowing harts that
>> +        * they can proceed with panicing by releasing the pseudo-spinlock.
>> +        *
>> +        * This pairs with an amoswap.aq in handle_kernel_stack_overflow.
>> +        */
>> +       smp_store_release(&spin_shadow_stack, 0);
>> +
>>         console_verbose();
>>
>>         pr_emerg("Insufficient stack space to handle exception!\n");
>> --
>> 2.38.1
>>

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

* Re: [PATCH v4] riscv: fix race when vmap stack overflow
  2022-11-30 16:54     ` Palmer Dabbelt
@ 2022-12-01  1:17       ` Guo Ren
  -1 siblings, 0 replies; 16+ messages in thread
From: Guo Ren @ 2022-12-01  1:17 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: Andrea Parri, jszhang, linux-riscv, linux-kernel

On Thu, Dec 1, 2022 at 12:54 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>
> On Tue, 29 Nov 2022 23:15:40 PST (-0800), guoren@kernel.org wrote:
> > The comment becomes better. Thx.
> >
> > On Wed, Nov 30, 2022 at 10:29 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
> >>
> >> From: Jisheng Zhang <jszhang@kernel.org>
> >>
> >> Currently, when detecting a vmap stack overflow, riscv first switches
> >> to the so-called shadow stack, then uses this shadow stack to call
> >> get_overflow_stack() to get the overflow stack. However, there's
> >> a race here if two or more harts use the same shadow stack at the
> >> same time.
> >>
> >> To solve this race, we introduce the spin_shadow_stack variable,
> >> which is atomically swapped between its own address and 0: when the
> >> var is set, the shadow_stack is being used; when the var is cleared,
> >> the shadow_stack isn't being used.
> >>
> >> Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection")
> >> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> >> Suggested-by: Guo Ren <guoren@kernel.org>
> >> Reviewed-by: Guo Ren <guoren@kernel.org>
> >> Link: https://lore.kernel.org/r/20221030124517.2370-1-jszhang@kernel.org
> >> [Palmer: Add AQ to the swap, and also some comments.]
> >> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> >> ---
> >> Sorry to just re-spin this one without any warning, but I'd read the patch a
> >> few times and every time I'd managed to convince myself there was a much
> >> simpler way of doing this.  By the time I'd figured out why that's not
> >> the case it seemed faster to just write the comments.
> >>
> >> I've stashed this, right on top of the offending commit, at
> >> palmer/riscv-fix_vmap_stack.
> >>
> >> Since v3:
> >>  - Add AQ to the swap.
> >>  - Add a bunch of comments.
> >>
> >> Since v2:
> >>  - use REG_AMOSWAP
> >>  - add comment to the purpose of smp_store_release()
> >>
> >> Since v1:
> >>  - use smp_store_release directly
> >>  - use unsigned int instead of atomic_t
> >> ---
> >>  arch/riscv/include/asm/asm.h |  1 +
> >>  arch/riscv/kernel/entry.S    | 13 +++++++++++++
> >>  arch/riscv/kernel/traps.c    | 18 ++++++++++++++++++
> >>  3 files changed, 32 insertions(+)
> >>
> >> diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
> >> index 618d7c5af1a2..e15a1c9f1cf8 100644
> >> --- a/arch/riscv/include/asm/asm.h
> >> +++ b/arch/riscv/include/asm/asm.h
> >> @@ -23,6 +23,7 @@
> >>  #define REG_L          __REG_SEL(ld, lw)
> >>  #define REG_S          __REG_SEL(sd, sw)
> >>  #define REG_SC         __REG_SEL(sc.d, sc.w)
> >> +#define REG_AMOSWAP_AQ __REG_SEL(amoswap.d.aq, amoswap.w.aq)
> > Below is the reason why I use the relaxed version here:
> > https://lore.kernel.org/all/CAJF2gTRAEX_jQ_w5H05dyafZzHq+P5j05TJ=C+v+OL__GQam4A@mail.gmail.com/T/#u
>
> Sorry, I hadn't seen that one.  Adding Andrea.  IMO the acquire/release
> pair is necessary here; with just relaxed, the stack stores inside the
> lock could show up on the next hart trying to use the stack.
Don't worry about relaxing amoswap, sp could give WAR & WAW
dependency. You could add acquire here, just for appearance.

>
> >>  #define REG_ASM                __REG_SEL(.dword, .word)
> >>  #define SZREG          __REG_SEL(8, 4)
> >>  #define LGREG          __REG_SEL(3, 2)
> >> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> >> index 98f502654edd..5fdb6ba09600 100644
> >> --- a/arch/riscv/kernel/entry.S
> >> +++ b/arch/riscv/kernel/entry.S
> >> @@ -387,6 +387,19 @@ handle_syscall_trace_exit:
> >>
> >>  #ifdef CONFIG_VMAP_STACK
> >>  handle_kernel_stack_overflow:
> >> +       /*
> >> +        * Takes the pseudo-spinlock for the shadow stack, in case multiple
> >> +        * harts are concurrently overflowing their kernel stacks.  We could
> >> +        * store any value here, but since we're overflowing the kernel stack
> >> +        * already we only have SP to use as a scratch register.  So we just
> >> +        * swap in the address of the spinlock, as that's definitely non-zero.
> >> +        *
> >> +        * Pairs with a store_release in handle_bad_stack().
> >> +        */
> >> +1:     la sp, spin_shadow_stack
> >> +       REG_AMOSWAP_AQ sp, sp, (sp)
> >> +       bnez sp, 1b
> >> +
> >>         la sp, shadow_stack
> >>         addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE
> >>
> >> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> >> index bb6a450f0ecc..be54ccea8c47 100644
> >> --- a/arch/riscv/kernel/traps.c
> >> +++ b/arch/riscv/kernel/traps.c
> >> @@ -213,11 +213,29 @@ asmlinkage unsigned long get_overflow_stack(void)
> >>                 OVERFLOW_STACK_SIZE;
> >>  }
> >>
> >> +/*
> >> + * A pseudo spinlock to protect the shadow stack from being used by multiple
> >> + * harts concurrently.  This isn't a real spinlock because the lock side must
> >> + * be taken without a valid stack and with only a single register; it's
> >> + * only taken while in the process of panicking anyway, so the performance
> >> + * and error checking a proper spinlock gives us don't matter.
> >> + */
> >> +unsigned long spin_shadow_stack;
> >> +
> >>  asmlinkage void handle_bad_stack(struct pt_regs *regs)
> >>  {
> >>         unsigned long tsk_stk = (unsigned long)current->stack;
> >>         unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack);
> >>
> >> +       /*
> >> +        * We're done with the shadow stack by this point, as we're on the
> >> +        * overflow stack.  Tell any other concurrent overflowing harts that
> >> +        * they can proceed with panicking by releasing the pseudo-spinlock.
> >> +        *
> >> +        * This pairs with an amoswap.aq in handle_kernel_stack_overflow.
> >> +        */
> >> +       smp_store_release(&spin_shadow_stack, 0);
> >> +
> >>         console_verbose();
> >>
> >>         pr_emerg("Insufficient stack space to handle exception!\n");
> >> --
> >> 2.38.1
> >>



-- 
Best Regards
 Guo Ren

* Re: [PATCH v4] riscv: fix race when vmap stack overflow
  2022-11-30 16:54     ` Palmer Dabbelt
@ 2022-12-01  1:55       ` Jessica Clarke
  -1 siblings, 0 replies; 16+ messages in thread
From: Jessica Clarke @ 2022-12-01  1:55 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: guoren, Andrea Parri, jszhang, linux-riscv, linux-kernel

On 30 Nov 2022, at 16:54, Palmer Dabbelt <palmer@rivosinc.com> wrote:
> 
> On Tue, 29 Nov 2022 23:15:40 PST (-0800), guoren@kernel.org wrote:
>> The comment becomes better. Thx.
>> 
>> On Wed, Nov 30, 2022 at 10:29 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>>> 
>>> From: Jisheng Zhang <jszhang@kernel.org>
>>> 
>>> Currently, when detecting a vmap stack overflow, riscv first switches
>>> to the so-called shadow stack, then uses this shadow stack to call
>>> get_overflow_stack() to get the overflow stack. However, there's
>>> a race here if two or more harts use the same shadow stack at the
>>> same time.
>>> 
>>> To solve this race, we introduce the spin_shadow_stack variable,
>>> which is atomically swapped between its own address and 0: when the
>>> var is set, the shadow_stack is being used; when the var is cleared,
>>> the shadow_stack isn't being used.
>>> 
>>> Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection")
>>> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
>>> Suggested-by: Guo Ren <guoren@kernel.org>
>>> Reviewed-by: Guo Ren <guoren@kernel.org>
>>> Link: https://lore.kernel.org/r/20221030124517.2370-1-jszhang@kernel.org
>>> [Palmer: Add AQ to the swap, and also some comments.]
>>> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
>>> ---
>>> Sorry to just re-spin this one without any warning, but I'd read the patch a
>>> few times and every time I'd managed to convince myself there was a much
>>> simpler way of doing this.  By the time I'd figured out why that's not
>>> the case it seemed faster to just write the comments.
>>> 
>>> I've stashed this, right on top of the offending commit, at
>>> palmer/riscv-fix_vmap_stack.
>>> 
>>> Since v3:
>>> - Add AQ to the swap.
>>> - Add a bunch of comments.
>>> 
>>> Since v2:
>>> - use REG_AMOSWAP
>>> - add comment to the purpose of smp_store_release()
>>> 
>>> Since v1:
>>> - use smp_store_release directly
>>> - use unsigned int instead of atomic_t
>>> ---
>>> arch/riscv/include/asm/asm.h |  1 +
>>> arch/riscv/kernel/entry.S    | 13 +++++++++++++
>>> arch/riscv/kernel/traps.c    | 18 ++++++++++++++++++
>>> 3 files changed, 32 insertions(+)
>>> 
>>> diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
>>> index 618d7c5af1a2..e15a1c9f1cf8 100644
>>> --- a/arch/riscv/include/asm/asm.h
>>> +++ b/arch/riscv/include/asm/asm.h
>>> @@ -23,6 +23,7 @@
>>> #define REG_L          __REG_SEL(ld, lw)
>>> #define REG_S          __REG_SEL(sd, sw)
>>> #define REG_SC         __REG_SEL(sc.d, sc.w)
>>> +#define REG_AMOSWAP_AQ __REG_SEL(amoswap.d.aq, amoswap.w.aq)
>> Below is the reason why I use the relaxed version here:
>> https://lore.kernel.org/all/CAJF2gTRAEX_jQ_w5H05dyafZzHq+P5j05TJ=C+v+OL__GQam4A@mail.gmail.com/T/#u
> 
> Sorry, I hadn't seen that one.  Adding Andrea.  IMO the acquire/release pair is necessary here; with just relaxed, the stack stores inside the lock could show up on the next hart trying to use the stack.

I think what you really want is a *consume* barrier, and since you have
the data dependency between the amoswap and the memory accesses (and
this isn’t Alpha) you’re technically fine without the acquire, since
you’re writing assembly and have the data dependency as syntactic.
Though you may still want (need?) the acquire so loads/stores unrelated
to the stack pointer that happen later in program order get ordered
after the load of the new stack pointer in case there could be weird
issues *there*.

Jess

>>> #define REG_ASM                __REG_SEL(.dword, .word)
>>> #define SZREG          __REG_SEL(8, 4)
>>> #define LGREG          __REG_SEL(3, 2)
>>> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
>>> index 98f502654edd..5fdb6ba09600 100644
>>> --- a/arch/riscv/kernel/entry.S
>>> +++ b/arch/riscv/kernel/entry.S
>>> @@ -387,6 +387,19 @@ handle_syscall_trace_exit:
>>> 
>>> #ifdef CONFIG_VMAP_STACK
>>> handle_kernel_stack_overflow:
>>> +       /*
>>> +        * Takes the pseudo-spinlock for the shadow stack, in case multiple
>>> +        * harts are concurrently overflowing their kernel stacks.  We could
>>> +        * store any value here, but since we're overflowing the kernel stack
>>> +        * already we only have SP to use as a scratch register.  So we just
>>> +        * swap in the address of the spinlock, as that's definitely non-zero.
>>> +        *
>>> +        * Pairs with a store_release in handle_bad_stack().
>>> +        */
>>> +1:     la sp, spin_shadow_stack
>>> +       REG_AMOSWAP_AQ sp, sp, (sp)
>>> +       bnez sp, 1b
>>> +
>>>        la sp, shadow_stack
>>>        addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE
>>> 
>>> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
>>> index bb6a450f0ecc..be54ccea8c47 100644
>>> --- a/arch/riscv/kernel/traps.c
>>> +++ b/arch/riscv/kernel/traps.c
>>> @@ -213,11 +213,29 @@ asmlinkage unsigned long get_overflow_stack(void)
>>>                OVERFLOW_STACK_SIZE;
>>> }
>>> 
>>> +/*
>>> + * A pseudo spinlock to protect the shadow stack from being used by multiple
>>> + * harts concurrently.  This isn't a real spinlock because the lock side must
>>> + * be taken without a valid stack and with only a single register; it's
>>> + * only taken while in the process of panicking anyway, so the performance
>>> + * and error checking a proper spinlock gives us don't matter.
>>> + */
>>> +unsigned long spin_shadow_stack;
>>> +
>>> asmlinkage void handle_bad_stack(struct pt_regs *regs)
>>> {
>>>        unsigned long tsk_stk = (unsigned long)current->stack;
>>>        unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack);
>>> 
>>> +       /*
>>> +        * We're done with the shadow stack by this point, as we're on the
>>> +        * overflow stack.  Tell any other concurrent overflowing harts that
>>> +        * they can proceed with panicking by releasing the pseudo-spinlock.
>>> +        *
>>> +        * This pairs with an amoswap.aq in handle_kernel_stack_overflow.
>>> +        */
>>> +       smp_store_release(&spin_shadow_stack, 0);
>>> +
>>>        console_verbose();
>>> 
>>>        pr_emerg("Insufficient stack space to handle exception!\n");
>>> --
>>> 2.38.1
>>> 
> 

* Re: [PATCH v4] riscv: fix race when vmap stack overflow
  2022-12-01  1:55       ` Jessica Clarke
@ 2022-12-01  2:43         ` Andrea Parri
  -1 siblings, 0 replies; 16+ messages in thread
From: Andrea Parri @ 2022-12-01  2:43 UTC (permalink / raw)
  To: Jessica Clarke; +Cc: Palmer Dabbelt, guoren, jszhang, linux-riscv, linux-kernel

> >>> @@ -23,6 +23,7 @@
> >>> #define REG_L          __REG_SEL(ld, lw)
> >>> #define REG_S          __REG_SEL(sd, sw)
> >>> #define REG_SC         __REG_SEL(sc.d, sc.w)
> >>> +#define REG_AMOSWAP_AQ __REG_SEL(amoswap.d.aq, amoswap.w.aq)
> >> Below is the reason why I use the relaxed version here:
> >> https://lore.kernel.org/all/CAJF2gTRAEX_jQ_w5H05dyafZzHq+P5j05TJ=C+v+OL__GQam4A@mail.gmail.com/T/#u
> > 
> > Sorry, I hadn't seen that one.  Adding Andrea.  IMO the acquire/release pair is necessary here; with just relaxed, the stack stores inside the lock could show up on the next hart trying to use the stack.
> 
> I think what you really want is a *consume* barrier, and since you have
> the data dependency between the amoswap and the memory accesses (and
> this isn’t Alpha) you’re technically fine without the acquire, since
> you’re writing assembly and have the data dependency as syntactic.
> Though you may still want (need?) the acquire so loads/stores unrelated
> to the stack pointer that happen later in program order get ordered
> after the load of the new stack pointer in case there could be weird
> issues *there*.

Agreed.

Just the fact that this is the 4th iteration of this discussion strongly
suggests to stick to the acquire and these inline comments to me.  ;)

  Andrea

* Re: [PATCH v4] riscv: fix race when vmap stack overflow
  2022-12-01  2:43         ` Andrea Parri
@ 2022-12-01 20:00           ` Palmer Dabbelt
  -1 siblings, 0 replies; 16+ messages in thread
From: Palmer Dabbelt @ 2022-12-01 20:00 UTC (permalink / raw)
  To: Andrea Parri; +Cc: jrtc27, guoren, jszhang, linux-riscv, linux-kernel

On Wed, 30 Nov 2022 18:43:32 PST (-0800), Andrea Parri wrote:
>> >>> @@ -23,6 +23,7 @@
>> >>> #define REG_L          __REG_SEL(ld, lw)
>> >>> #define REG_S          __REG_SEL(sd, sw)
>> >>> #define REG_SC         __REG_SEL(sc.d, sc.w)
>> >>> +#define REG_AMOSWAP_AQ __REG_SEL(amoswap.d.aq, amoswap.w.aq)
>> >> Below is the reason why I use the relaxed version here:
>> >> https://lore.kernel.org/all/CAJF2gTRAEX_jQ_w5H05dyafZzHq+P5j05TJ=C+v+OL__GQam4A@mail.gmail.com/T/#u
>> >
>> > Sorry, I hadn't seen that one.  Adding Andrea.  IMO the acquire/release pair is necessary here; with just relaxed, the stack stores inside the lock could show up on the next hart trying to use the stack.
>>
>> I think what you really want is a *consume* barrier, and since you have
>> the data dependency between the amoswap and the memory accesses (and
>> this isn’t Alpha) you’re technically fine without the acquire, since
>> you’re writing assembly and have the data dependency as syntactic.
>> Though you may still want (need?) the acquire so loads/stores unrelated
>> to the stack pointer that happen later in program order get ordered
>> after the load of the new stack pointer in case there could be weird
>> issues *there*.
>
> Agreed.
>
> Just the fact that this is the 4th iteration of this discussion strongly
> suggests to stick to the acquire and these inline comments to me.  ;)

I spent a little time last night trying to reason about the no-AQ 
version and I think it might actually be correct: the AMOSWAP is on the 
lock and SP is overwritten when loading up the actual stack, so I don't 
think that alone is enough, but the no-speculative-accesses rule might be 
enough here.  Also I think maybe none of that even matters, because the 
same-address rules might bail us out due to the nature of stack 
accesses.

That said, this is some complicated and subtle reasoning.  The 
performance here doesn't matter, so I'm just going to err on the side of 
caution, but if someone cares enough to come up with concrete reasoning 
as to why the barrier isn't necessary I'll at least look at the patch 
(though I'll probably grumble the whole time, as I hate being tricked 
into thinking).

That'd be for-next material anyway, so the yes-AQ version is on fixes 
because there's a concrete breakage being fixed.

^ permalink raw reply	[flat|nested] 16+ messages in thread


* Re: [PATCH v4] riscv: fix race when vmap stack overflow
  2022-11-30  2:24 ` Palmer Dabbelt
@ 2022-12-01 20:10   ` patchwork-bot+linux-riscv
  -1 siblings, 0 replies; 16+ messages in thread
From: patchwork-bot+linux-riscv @ 2022-12-01 20:10 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: linux-riscv, jszhang, guoren, linux-kernel

Hello:

This patch was applied to riscv/linux.git (fixes)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Tue, 29 Nov 2022 18:24:43 -0800 you wrote:
> From: Jisheng Zhang <jszhang@kernel.org>
> 
> Currently, when detecting a vmap stack overflow, riscv first switches
> to the so-called shadow stack, then uses this shadow stack to call
> get_overflow_stack() to get the overflow stack.  However, there's a
> race here if two or more harts use the same shadow stack at the same
> time.
> 
> [...]

Here is the summary with links:
  - [v4] riscv: fix race when vmap stack overflow
    https://git.kernel.org/riscv/c/7e1864332fbc

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 16+ messages in thread


end of thread, other threads:[~2022-12-01 20:13 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-30  2:24 [PATCH v4] riscv: fix race when vmap stack overflow Palmer Dabbelt
2022-11-30  2:24 ` Palmer Dabbelt
2022-11-30  7:15 ` Guo Ren
2022-11-30  7:15   ` Guo Ren
2022-11-30 16:54   ` Palmer Dabbelt
2022-11-30 16:54     ` Palmer Dabbelt
2022-12-01  1:17     ` Guo Ren
2022-12-01  1:17       ` Guo Ren
2022-12-01  1:55     ` Jessica Clarke
2022-12-01  1:55       ` Jessica Clarke
2022-12-01  2:43       ` Andrea Parri
2022-12-01  2:43         ` Andrea Parri
2022-12-01 20:00         ` Palmer Dabbelt
2022-12-01 20:00           ` Palmer Dabbelt
2022-12-01 20:10 ` patchwork-bot+linux-riscv
2022-12-01 20:10   ` patchwork-bot+linux-riscv

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.