From: Guo Ren <guoren@kernel.org>
To: Daniel Thompson <daniel.thompson@linaro.org>
Cc: arnd@arndb.de, palmer@rivosinc.com, tglx@linutronix.de,
peterz@infradead.org, luto@kernel.org,
conor.dooley@microchip.com, heiko@sntech.de, jszhang@kernel.org,
lazyparser@gmail.com, falcon@tinylab.org, chenhuacai@kernel.org,
apatel@ventanamicro.com, atishp@atishpatra.org,
mark.rutland@arm.com, ben@decadent.org.uk, bjorn@kernel.org,
palmer@dabbelt.com, linux-arch@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
"Guo Ren" <guoren@linux.alibaba.com>,
"Björn Töpel" <bjorn@rivosinc.com>,
"Yipeng Zou" <zouyipeng@huawei.com>,
"Vincent Chen" <vincent.chen@sifive.com>
Subject: Re: [PATCH -next V17 4/7] riscv: entry: Convert to generic entry
Date: Sat, 1 Jul 2023 12:22:28 +0800
Message-ID: <CAJF2gTSsPMCKO-Lmc=87wqRZ_05aK8Oj78kk3vjmeNBT2c_jJg@mail.gmail.com>
In-Reply-To: <CAJF2gTSCF2VLcJ5LKe88zhqM4eep-f5kpVoYOQy1TD4ZPQRb+g@mail.gmail.com>
On Sat, Jul 1, 2023 at 11:08 AM Guo Ren <guoren@kernel.org> wrote:
>
> On Sat, Jul 1, 2023 at 10:55 AM Guo Ren <guoren@kernel.org> wrote:
> >
> > On Fri, Jun 30, 2023 at 10:51 PM Daniel Thompson
> > <daniel.thompson@linaro.org> wrote:
> > >
> > > On Fri, Jun 30, 2023 at 07:22:40AM -0400, Guo Ren wrote:
> > > > On Fri, Jun 30, 2023 at 7:16 AM Guo Ren <guoren@kernel.org> wrote:
> > > > >
> > > > > On Thu, Jun 29, 2023 at 10:02 AM Daniel Thompson
> > > > > <daniel.thompson@linaro.org> wrote:
> > > > > >
> > > > > > On Tue, Feb 21, 2023 at 10:30:18PM -0500, guoren@kernel.org wrote:
> > > > > > > From: Guo Ren <guoren@linux.alibaba.com>
> > > > > > >
> > > > > > > This patch converts riscv to use the generic entry infrastructure from
> > > > > > > kernel/entry/*. The generic entry makes maintainers' work easier and
> > > > > > > codes more elegant. Here are the changes:
> > > > > > >
> > > > > > > - More clear entry.S with handle_exception and ret_from_exception
> > > > > > > - Get rid of complex custom signal implementation
> > > > > > > - Move syscall procedure from assembly to C, which is much more
> > > > > > > readable.
> > > > > > > - Connect ret_from_fork & ret_from_kernel_thread to generic entry.
> > > > > > > - Wrap with irqentry_enter/exit and syscall_enter/exit_from_user_mode
> > > > > > > - Use the standard preemption code instead of custom
> > > > > > >
> > > > > > > Suggested-by: Huacai Chen <chenhuacai@kernel.org>
> > > > > > > Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
> > > > > > > Tested-by: Yipeng Zou <zouyipeng@huawei.com>
> > > > > > > Tested-by: Jisheng Zhang <jszhang@kernel.org>
> > > > > > > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > > > > > > Signed-off-by: Guo Ren <guoren@kernel.org>
> > > > > > > Cc: Ben Hutchings <ben@decadent.org.uk>
> > > > > >
> > > > > > Apologies for the late feedback but I've been swamped lately and only
> > > > > > recently got round to running the full kgdb test suite on the v6.4
> > > > > > series.
> > > > > >
> > > > > > The kgdb test suite includes a couple of tests that verify that the
> > > > > > system resumes after breakpointing due to a BUG():
> > > > > > https://github.com/daniel-thompson/kgdbtest/blob/master/tests/test_kdb_fault_injection.py#L24-L45
> > > > > >
> > > > > > These tests have regressed on riscv between v6.3 and v6.4 and a bisect
> > > > > > is pointing at this patch. With these changes in place then, after kdb
> > > > > > resumes the system, the BUG() message is printed as normal but then
> > > > > > immediately fails. From the backtrace it looks like the new entry/exit
> > > > > > code cannot advance past a compiled breakpoint instruction:
> > > > > > ~~~
> > > > > > PANIC: Fatal exception in interrupt
> > > > > It comes from:
> > > > > void die(struct pt_regs *regs, ...
> > > > > {
> > > > > ...
> > > > > if (in_interrupt())
> > > > > panic("Fatal exception in interrupt");
> > > > > ...
> > > > >
> > > > > We could add a dump_backtrace to see what happened:
> > > > > if (in_interrupt()) {
> > > > > + dump_backtrace(regs, NULL, KERN_DEFAULT);
> > > > Sorry, it should be:
> > > > + dump_backtrace(NULL, NULL, KERN_DEFAULT);
> > > > We need current stack info, not exception context.
> > >
> > > I added this... and I also stopped kgdb from intercepting the panic()
> > > since that interferes with the console output from dump_backtrace().
> > >
> > > ~~~
> > > # /bin/echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
> > > [ 3.380565] lkdtm: Performing direct entry BUG
> > >
> > > Entering kdb (current=0xff6000000380ab00, pid 98) on processor 0 due to NonMaskable Interrupt @ 0xffffffff8064b844
> > > kdb> go
> > > Catastrophic error detected
> > > kdb_continue_catastrophic=0, type go a second time if you really want to continue
> > > kdb> go
> > > Catastrophic error detected
> > > kdb_continue_catastrophic=0, attempting to continue
> > > [ 3.381411] ------------[ cut here ]------------
> > > [ 3.381454] kernel BUG at drivers/misc/lkdtm/bugs.c:78!
> > > [ 3.381609] Kernel BUG [#1]
> > > [ 3.381632] Modules linked in:
> > > [ 3.381734] CPU: 0 PID: 98 Comm: echo Not tainted 6.4.0-rc6-00004-ge6e9d4598760-dirty #126
> > > [ 3.381817] Hardware name: riscv-virtio,qemu (DT)
> > > [ 3.381885] epc : lkdtm_BUG+0x6/0x8
> > > [ 3.381959] ra : lkdtm_do_action+0x10/0x1c
> > > [ 3.381978] epc : ffffffff8064b844 ra : ffffffff8064afb4 sp : ff200000008c3d30
> > > [ 3.381991] gp : ffffffff810665a0 tp : ff6000000380ab00 t0 : 6500000000000000
> > > [ 3.382002] t1 : 0000000000000001 t2 : 6550203a6d74646b s0 : ff200000008c3d40
> > > [ 3.382012] s1 : ff60000003988000 a0 : ffffffff80fc0260 a1 : ff6000003ffad788
> > > [ 3.382023] a2 : ff6000003ffb9530 a3 : 0000000000000000 a4 : 0000000000000000
> > > [ 3.382034] a5 : ffffffff8064b83e a6 : 0000000000000050 a7 : 0000000000040000
> > > [ 3.382045] s2 : 0000000000000004 s3 : ffffffff80fc0260 s4 : ff200000008c3e70
> > > [ 3.382056] s5 : ff600000033223a8 s6 : 00000000000f0cc0 s7 : ff60000002211000
> > > [ 3.382066] s8 : 00ffffffafc50c08 s9 : 00ffffffafc4b9b8 s10: 0000000000000000
> > > [ 3.382077] s11: 0000000000000001 t3 : 461f715700000000 t4 : 0000000000000002
> > > [ 3.382087] t5 : 0000000000000000 t6 : ff200000008c3b58
> > > [ 3.382097] status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
> > > [ 3.382139] [<ffffffff8064b844>] lkdtm_BUG+0x6/0x8
> > > [ 3.382245] Code: 0513 9245 b097 0039 80e7 7f20 bf39 1141 e422 0800 (9002) 1141
> > > [ 3.594697] ---[ end trace 0000000000000000 ]---
> > >
> > > At this point we expect a shell prompt since we should have taken the BUG(),
> > > killed the echo process and returned to the shell. However in v6.4 we get the
> > > following instead (including the instrumentation you asked for):
> >
> > After comparing with arm64, I found that arm64 uses spinlock_irq to
> > protect the in_interrupt(). I think this would make in_interrupt() =
> > 0.
> >
> > So how about trying:
> >
> > diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> > index 5158961ea977..0ac914a99ee3 100644
> > --- a/arch/riscv/kernel/traps.c
> > +++ b/arch/riscv/kernel/traps.c
> > @@ -82,13 +82,15 @@ void die(struct pt_regs *regs, const char *str)
> >
> > bust_spinlocks(0);
> > add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);
> > - spin_unlock_irqrestore(&die_lock, flags);
> > oops_exit();
> >
> > if (in_interrupt())
> > panic("Fatal exception in interrupt");
> > if (panic_on_oops)
> > panic("Fatal exception");
> > +
> > + spin_unlock_irqrestore(&die_lock, flags);
> En... It seems it's not correct, how can I reproduce your environment
> on qemu? Sorry, I'm not familiar with kgdb.
I reproduced it.
Before generic entry (normal behavior):
# mount -t debugfs none /sys/kernel/debug/
# /bin/echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
[ 8.948041] lkdtm: Performing direct entry BUG
[ 8.949228] ------------[ cut here ]------------
[ 8.949640] kernel BUG at drivers/misc/lkdtm/bugs.c:78!
[ 8.950534] Kernel BUG [#1]
[ 8.950944] Modules linked in:
[ 8.951805] CPU: 0 PID: 106 Comm: echo Not tainted 6.3.0-rc2-00295-gb4e5219985e8 #22
[ 8.952831] Hardware name: riscv-virtio,qemu (DT)
[ 8.953587] epc : lkdtm_BUG+0x6/0x8
[ 8.954232] ra : lkdtm_do_action+0x14/0x1c
[ 8.954713] epc : ffffffff805549e2 ra : ffffffff8087245c sp : ff2000000081bd60
[ 8.955378] gp : ffffffff814ffec0 tp : ff600000023c8000 t0 : 6500000000000000
[ 8.956029] t1 : 000000000000006c t2 : 6550203a6d74646b s0 : ff2000000081bd70
[ 8.956699] s1 : ffffffff814bee50 a0 : ffffffff814bee50 a1 : ff6000001fbd8608
[ 8.957381] a2 : ff6000001fbdb868 a3 : 0000000000000000 a4 : 0000000000000000
[ 8.958035] a5 : ffffffff805549dc a6 : 0000000000000032 a7 : 0000000000000038
[ 8.958708] s2 : 0000000000000004 s3 : 00000000556371a0 s4 : ff2000000081be90
[ 8.959397] s5 : ff60000001c90000 s6 : 00000000556371a0 s7 : 0000000000000030
[ 8.960053] s8 : 000000007fffec78 s9 : 0000000000000007 s10: 0000000055637480
[ 8.960717] s11: 0000000000000001 t3 : ffffffff81512e97 t4 : ffffffff81512e97
[ 8.961379] t5 : ffffffff81512e98 t6 : ff2000000081bba8
[ 8.961888] status: 0000000100000120 badaddr: 0000000000000000 cause: 0000000000000003
[ 8.962923] [<ffffffff805549e2>] lkdtm_BUG+0x6/0x8
[ 8.964194] Code: 0513 d665 7097 0031 80e7 f000 b705 1141 e422 0800 (9002) 1141
[ 8.965847] ---[ end trace 0000000000000000 ]---
[ 8.966637] note: echo[106] exited with irqs disabled
Segmentation fault
#
After the generic entry conversion:
# mount -t debugfs none /sys/kernel/debug/
# /bin/echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
[ 8.152247] lkdtm: Performing direct entry BUG
[ 8.153652] ------------[ cut here ]------------
[ 8.153825] kernel BUG at drivers/misc/lkdtm/bugs.c:78!
[ 8.154341] Kernel BUG [#1]
[ 8.154440] Modules linked in:
[ 8.154918] CPU: 0 PID: 106 Comm: echo Not tainted 6.4.0-rc1-00055-g0ca05a4b079f #21
[ 8.155301] Hardware name: riscv-virtio,qemu (DT)
[ 8.155581] epc : lkdtm_BUG+0x6/0x8
[ 8.155880] ra : lkdtm_do_action+0x14/0x1c
[ 8.155977] epc : ffffffff8059d4b4 ra : ffffffff808c1a84 sp : ff2000000081bd40
[ 8.156030] gp : ffffffff81503c08 tp : ff600000028ebac0 t0 : 6500000000000000
[ 8.156079] t1 : 000000000000006c t2 : 6550203a6d74646b s0 : ff2000000081bd50
[ 8.156144] s1 : ffffffff814c2e88 a0 : ffffffff814c2e88 a1 : ff6000001ffd8608
[ 8.156193] a2 : ff6000001ffdb870 a3 : 0000000000000000 a4 : 0000000000000000
[ 8.156241] a5 : ffffffff8059d4ae a6 : 0000000000000032 a7 : 0000000000000038
[ 8.156288] s2 : 0000000000000004 s3 : 00000000556371a0 s4 : ff2000000081be70
[ 8.156335] s5 : ff60000002090000 s6 : 00000000556371a0 s7 : 0000000000000030
[ 8.156382] s8 : 000000007fffec78 s9 : 0000000000000007 s10: 0000000055637480
[ 8.156428] s11: 0000000000000001 t3 : ffffffff815173d7 t4 : ffffffff815173d7
[ 8.156473] t5 : ffffffff815173d8 t6 : ff2000000081bb88
[ 8.156516] status: 0000000100000120 badaddr: 0000000000000000 cause: 0000000000000003
[ 8.156830] [<ffffffff8059d4b4>] lkdtm_BUG+0x6/0x8
[ 8.157630] Code: 0513 1745 d097 0031 80e7 70a0 b705 1141 e422 0800 (9002) 1141
[ 8.169646] ---[ end trace 0000000000000000 ]---
[ 8.170148] Kernel panic - not syncing: Fatal exception in interrupt
[ 8.171839] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
I'm debugging it and will send a patch soon.
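
To make the difference between the two logs concrete, here is a small
user-space model of the decision at the tail of die(). It is only a sketch,
not kernel code. The bit layout is assumed to match include/linux/preempt.h
in v6.4, and model_in_interrupt(), model_die() and enum die_outcome are
hypothetical names made up for illustration. Whether the new entry path
really leaves hardirq or NMI bits set in preempt_count while the kernel-side
breakpoint is handled is a working hypothesis, not something confirmed yet.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit layout assumed to match include/linux/preempt.h (v6.4). */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	4

#define SOFTIRQ_SHIFT	(PREEMPT_BITS)
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

#define SOFTIRQ_MASK	(((1UL << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	(((1UL << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT)
#define NMI_MASK	(((1UL << NMI_BITS) - 1) << NMI_SHIFT)

/* Sanity-check the model against the values documented in preempt.h. */
static_assert(SOFTIRQ_MASK == 0x0000ff00UL, "softirq mask");
static_assert(HARDIRQ_MASK == 0x000f0000UL, "hardirq mask");
static_assert(NMI_MASK == 0x00f00000UL, "nmi mask");

/* Hypothetical stand-in for in_interrupt(): any irq/NMI bit set. */
bool model_in_interrupt(unsigned long preempt_count)
{
	return (preempt_count & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}

/* Hypothetical labels for the three ways the tail of die() can end. */
enum die_outcome {
	DIE_KILL_TASK,		/* make_task_dead(SIGSEGV) */
	DIE_PANIC_IN_INTERRUPT,	/* panic("Fatal exception in interrupt") */
	DIE_PANIC_ON_OOPS,	/* panic("Fatal exception") */
};

/* Mirrors the order of the checks in die(); notifier handling is omitted. */
enum die_outcome model_die(unsigned long preempt_count, bool panic_on_oops)
{
	if (model_in_interrupt(preempt_count))
		return DIE_PANIC_IN_INTERRUPT;
	if (panic_on_oops)
		return DIE_PANIC_ON_OOPS;
	return DIE_KILL_TASK;
}
```

With a count of 0 the model ends in DIE_KILL_TASK, which corresponds to the
"Segmentation fault" in the first log. With any hardirq or NMI bit set it
ends in DIE_PANIC_IN_INTERRUPT, which corresponds to the second log.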
>
> > +
> > if (ret != NOTIFY_STOP)
> > make_task_dead(SIGSEGV);
> > }
> >
> > >
> > > [ 3.594801] [<ffffffff80005e3a>] dump_backtrace+0x1c/0x24
> > > [ 3.594826] [<ffffffff800059f0>] die+0x228/0x238
> > > [ 3.594835] [<ffffffff80005b38>] handle_break+0x9a/0xe0
> > > [ 3.594843] [<ffffffff809f30d6>] do_trap_break+0x48/0x5c
> > > [ 3.594854] [<ffffffff80003ee4>] ret_from_exception+0x0/0x64
> > > [ 3.594862] [<ffffffff8064b844>] lkdtm_BUG+0x6/0x8
> > > [ 3.594959] Kernel panic - not syncing: Fatal exception in interrupt
> > > [ 3.595005] SMP: stopping secondary CPUs
> > > [ 3.596444] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
> > > ~~~
> > >
> > >
> > > Daniel.
> >
> >
> >
> > --
> > Best Regards
> > Guo Ren
>
>
>
> --
> Best Regards
> Guo Ren
--
Best Regards
Guo Ren