From: guoren@kernel.org
To: anup@brainfault.org, paul.walmsley@sifive.com,
	palmer@dabbelt.com, conor.dooley@microchip.com, heiko@sntech.de,
	philipp.tomsich@vrull.eu
Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	Guo Ren <guoren@linux.alibaba.com>, Guo Ren <guoren@kernel.org>,
	Anup Patel <apatel@ventanamicro.com>,
	Palmer Dabbelt <palmer@rivosinc.com>
Subject: [PATCH RESEND] riscv: asid: Fixup stale TLB entry cause application crash
Date: Tue,  8 Nov 2022 05:20:44 -0500
Message-ID: <20221108102044.3317793-1-guoren@kernel.org>

From: Guo Ren <guoren@linux.alibaba.com>

After use_asid_allocator is enabled, userspace applications may crash
because of stale TLB entries. Using only cpumask_clear_cpu without
local_flush_tlb_all cannot guarantee that the CPU's TLB entries are
fresh. As a result, set_mm_asid can let a userspace application read a
stale value through a stale TLB entry; set_mm_noasid is not affected.

Here is the symptom of the bug:
unhandled signal 11 code 0x1 (coredump)
   0x0000003fd6d22524 <+4>:     auipc   s0,0x70
   0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
=> 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
(gdb) i r s0
s0          0x8082ed1cc3198b21       0x8082ed1cc3198b21
(gdb) x/16 0x3fd6d92490
0x3fd6d92490:   0xd80ac8a8      0x0000003f
The core dump shows that the value in register s0 is wrong, while the
value in memory is correct. This is because 'ld s0, -148(s0)' used a
stale TLB mapping entry and read a wrong value from a stale physical
address.

When the task ran on CPU0, it loaded (or speculatively loaded) the value
at address 0x3fd6d92490, and a page table walk (PTW) brought the first
version of the mapping into CPU0's TLB.
When the task was switched from CPU0 to CPU1 without local_flush_tlb_all
(because ASIDs are in use), it happened to write to address
0x3fd6d92490. That caused do_page_fault -> wp_page_copy ->
ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
flush_tlb_page used mm_cpumask(mm) to determine which CPUs needed a
TLB flush, but CPU0 had already cleared its bit in mm_cpumask during the
previous switch_mm. So only CPU1's TLB was flushed before the second
version of the PTE mapping was set. When the task switched from CPU1
back to CPU0, CPU0 still used the stale TLB entry, which contained the
wrong target physical address. When the task then read that value, the
bug was triggered.

Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Anup Patel <apatel@ventanamicro.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
---
 arch/riscv/mm/context.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 7acbfbd14557..8ad6c2493e93 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -317,7 +317,9 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 	 */
 	cpu = smp_processor_id();
 
-	cpumask_clear_cpu(cpu, mm_cpumask(prev));
+	if (!static_branch_unlikely(&use_asid_allocator))
+		cpumask_clear_cpu(cpu, mm_cpumask(prev));
+
 	cpumask_set_cpu(cpu, mm_cpumask(next));
 
 	set_mm(next, cpu);
-- 
2.36.1

