From: Andy Chiu <andy.chiu@sifive.com>
To: Alexandre Ghiti <alex@ghiti.fr>
Cc: Evgenii Shatokhin <e.shatokhin@yadro.com>,
palmer@dabbelt.com, paul.walmsley@sifive.com,
aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com,
peterz@infradead.org, jpoimboe@kernel.org, jbaron@akamai.com,
ardb@kernel.org, greentime.hu@sifive.com, zong.li@sifive.com,
guoren@kernel.org, Jessica Clarke <jrtc27@jrtc27.com>,
kernel@esmil.dk, linux-riscv@lists.infradead.org,
linux@yadro.com
Subject: Re: [PATCH RFC v2 riscv/for-next 0/5] Enable ftrace with kernel preemption for RISC-V
Date: Thu, 7 Mar 2024 20:27:56 +0800 [thread overview]
Message-ID: <CABgGipUtzLG2tffXOdV7-7ss+_doQvjxfwzVbfzyO56J3_3X0Q@mail.gmail.com> (raw)
In-Reply-To: <bd7c2263-4e81-4be6-af4d-22a0842ba2e4@ghiti.fr>
Hi Alex,
On Thu, Mar 7, 2024 at 4:57 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> Hi Evgenii,
>
> On 21/02/2024 17:55, Evgenii Shatokhin wrote:
> > On 21.02.2024 08:27, Andy Chiu wrote:
> >> «Внимание! Данное письмо от внешнего адресата!»
> >>
> >> On Wed, Feb 14, 2024 at 3:42 AM Evgenii Shatokhin
> >> <e.shatokhin@yadro.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On 13.09.2022 12:42, Andy Chiu wrote:
> >>>> This patch removes dependency of dynamic ftrace from calling
> >>>> stop_machine(), and makes it compatiable with kernel preemption.
> >>>> Originally, we ran into stack corruptions, or execution of partially
> >>>> updated instructions when starting or stopping ftrace on a fully
> >>>> preemptible kernel configuration. The reason is that kernel
> >>>> periodically
> >>>> calls rcu_momentary_dyntick_idle() on cores waiting for the
> >>>> code-patching
> >>>> core running in ftrace. Though rcu_momentary_dyntick_idle() itself is
> >>>> marked as notrace, it would call a bunch of tracable functions if we
> >>>> configured the kernel as preemptible. For example, these are some
> >>>> functions
> >>>> that happened to have a symbol and have not been marked as notrace
> >>>> on a
> >>>> RISC-V preemptible kernel compiled with GCC-11:
> >>>> - __rcu_report_exp_rnp()
> >>>> - rcu_report_exp_cpu_mult()
> >>>> - rcu_preempt_deferred_qs()
> >>>> - rcu_preempt_need_deferred_qs()
> >>>> - rcu_preempt_deferred_qs_irqrestore()
> >>>>
> >>>> Thus, this make it not ideal for us to rely on stop_machine() and
> >>>> handly marked "notrace"s to perform runtime code patching. To remove
> >>>> such dependency, we must make updates of code seemed atomic on running
> >>>> cores. This might not be obvious for RISC-V since it usaually uses
> >>>> a pair
> >>>> of AUIPC + JALR to perform a long jump, which cannot be modified and
> >>>> executed concurrently if we consider preemptions. As such, this patch
> >>>> proposed a way to make it possible. It embeds a 32-bit rel-address
> >>>> data
> >>>> into instructions of each ftrace prologue and jumps indirectly. In
> >>>> this
> >>>> way, we could store and load the address atomically so that the code
> >>>> patching core could run simutaneously with the rest of running cores.
> >>>>
> >>>> After applying the patchset, we compiled a preemptible kernel with all
> >>>> tracers and ftrace-selftest enabled, and booted it on a 2-core QEMU
> >>>> virt
> >>>> machine. The kernel could boot up successfully, passing all ftrace
> >>>> testsuits. Besides, we ran a script that randomly pick a tracer on
> >>>> every
> >>>> 0~5 seconds. The kernel has sustained over 20K rounds of the test. In
> >>>> contrast, a preemptible kernel without our patch would panic in few
> >>>> rounds on the same machine.
> >>>>
> >>>> Though we ran into errors when using hwlat or irqsoff tracers together
> >>>> with cpu-online stressor from stress-ng on a preemptible kernel. We
> >>>> believe the reason may be that percpu workers of the tracers are
> >>>> being
> >>>> queued into unbounded workqueue when cpu get offlined and patches
> >>>> will go
> >>>> through tracing tree.
> >>>>
> >>>> Additionally, we found patching of tracepoints unsafe since the
> >>>> instructions being patched are not naturally aligned. This may
> >>>> result in
> >>>> 2 half-word stores, which breaks atomicity, during the code patching.
> >>>>
> >>>> changes in patch v2:
> >>>> - Enforce alignments on all functions with a compiler workaround.
> >>>> - Support 64bit addressing for ftrace targets if xlen == 64
> >>>> - Initialize ftrace target addresses to avoid calling bad
> >>>> address in a
> >>>> hypothesized case.
> >>>> - Use LGPTR instead of SZPTR since .align is log-scaled for
> >>>> mcount-dyn.S
> >>>> - Require the nop instruction of all jump_labels aligns
> >>>> naturally on
> >>>> 4B.
> >>>>
> >>>> Andy Chiu (5):
> >>>> riscv: align ftrace to 4 Byte boundary and increase ftrace
> >>>> prologue
> >>>> size
> >>>> riscv: export patch_insn_write
> >>>> riscv: ftrace: use indirect jump to work with kernel preemption
> >>>> riscv: ftrace: do not use stop_machine to update code
> >>>> riscv: align arch_static_branch function
> >>>>
> >>>> arch/riscv/Makefile | 2 +-
> >>>> arch/riscv/include/asm/ftrace.h | 24 ----
> >>>> arch/riscv/include/asm/jump_label.h | 2 +
> >>>> arch/riscv/include/asm/patch.h | 1 +
> >>>> arch/riscv/kernel/ftrace.c | 179
> >>>> ++++++++++++++++++++--------
> >>>> arch/riscv/kernel/mcount-dyn.S | 69 ++++++++---
> >>>> arch/riscv/kernel/patch.c | 4 +-
> >>>> 7 files changed, 188 insertions(+), 93 deletions(-)
> >>>>
> >>>
> >>> First of all, thank you for working on making dynamic Ftrace robust in
> >>> preemptible kernels on RISC-V.
> >>> It is an important use case but, for now, dynamic Ftrace and related
> >>> tracers cannot be safely used with such kernels.
> >>>
> >>> Are there any updates on this series?
> >>> It needs a rebase, of course, but it looks doable.
> >>>
> >>> If I understand the discussion correctly, the only blocker was that
> >>> using "-falign-functions" was not enough to properly align cold
> >>> functions and "-fno-guess-branch-probability" would likely have a
> >>> performance cost.
> >>>
> >>> It seems, GCC developers have recently provided a workaround for that
> >>> (https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=0f5a9a00e3ab1fe96142f304cfbcf3f63b15f326,
> >>>
> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345#c24).
> >>>
> >>> "-fmin-function-alignment" should help but, I do not know, which GCC
> >>> versions have got that patch already. In the meantime, one could
> >>> probably check if "-fmin-function-alignment" is supported by the
> >>> compiler and use it, if it is.
> >>>
> >>> Thoughts?
> >>
> >> Hi Evgenii,
> >>
> >> Thanks for the update. Indeed, it is essential to this patch for
> >> toolchain to provide forced alignment. We can test this flag in the
> >> Makefile to sort out if toolchain supports it or not. Meanwhile, I had
> >> figured out a way for this to work on any 2-B align addresses but
> >> hadn't implemented it out yet. Basically it would require more
> >> patching space for us to do software alignment. I would opt for a
> >> special toolchain flag if the toolchain just supports it.
> >>
> >> Let me take some time to look and get back to you soon.
> >
> > Thank you! Looking forward to it.
> >
> > In case it helps, here is what I have checked so far.
> >
> > 1.
> > I added the patch
> > https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=0f5a9a00e3ab1fe96142f304cfbcf3f63b15f326
> > to the current revision of GCC 13.2.0 from RISC-V toolchain.
> >
> > Rebased your patchset on top of Linux 6.8-rc4 (mostly - context
> > changes, SYM_FUNC_START/SYM_FUNC_END for asm symbols, etc.).
> >
> > Reverted 8547649981e6 ("riscv: ftrace: Fixup panic by disabling
> > preemption").
> >
> > Switched from -falign-functions=4 to -fmin-function-alignment=4:
> > ------------------
> > diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> > index b33b787c8b07..dcd0adeebaae 100644
> > --- a/arch/riscv/Makefile
> > +++ b/arch/riscv/Makefile
> > @@ -15,9 +15,9 @@ ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
> > LDFLAGS_vmlinux += --no-relax
> > KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> > ifeq ($(CONFIG_RISCV_ISA_C),y)
> > - CC_FLAGS_FTRACE := -fpatchable-function-entry=12 -falign-functions=4
> > + CC_FLAGS_FTRACE := -fpatchable-function-entry=12
> > -fmin-function-alignment=4
> > else
> > - CC_FLAGS_FTRACE := -fpatchable-function-entry=6 -falign-functions=4
> > + CC_FLAGS_FTRACE := -fpatchable-function-entry=6
> > -fmin-function-alignment=4
> > endif
> > endif
> >
> > ------------------
> >
> > As far as I can see from objdump, the functions that were not aligned
> > at 4-byte boundary with -falign-functions=4, are now aligned correctly
> > with -fmin-function-alignment=4.
> >
> > 2.
> > I tried the kernel in a QEMU VM with 2 CPUs and "-machine virt".
> >
> > The boottime tests for Ftrace had passed, except the tests for
> > function_graph. I described the failure and the possible fix here:
> > https://lore.kernel.org/all/dcc5976d-635a-4710-92df-94a99653314e@yadro.com/
> >
> >
> > 3.
> > There were also boottime warnings about "RCU not on for:
> > arch_cpu_idle+0x0/0x2c". These are probably not related to your
> > patchset, but rather to the fact that Ftrace is enabled in a
> > preemptble kernel where RCU does different things.
> >
> > As a workaround, I disabled tracing of arch_cpu_idle() for now:
> > ------------------
> > diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> > index 92922dbd5b5c..6abeecbfc51d 100644
> > --- a/arch/riscv/kernel/process.c
> > +++ b/arch/riscv/kernel/process.c
> > @@ -37,7 +37,7 @@ EXPORT_SYMBOL(__stack_chk_guard);
> >
> > extern asmlinkage void ret_from_fork(void);
> >
> > -void arch_cpu_idle(void)
> > +void noinstr arch_cpu_idle(void)
> > {
> > cpu_do_idle();
> > }
>
>
> I came up with the same fix for this, based on a similar fix for s390. I
> have a patch ready and will send it soon since to me, it is a fix, not a
> workaround.
Just making sure we aren't duplicating works. Are you also working on
getting rid of stop_machine() while patching ftrace entries? Or to
provide a patch to fix the issue in arch_cpu_idle()? I was just about
to restart my patchset for the first purpose. In case if I missed
anything, could you help pointing me to the patchset if it's already
on the ML?
>
> Thanks,
>
> Alex
>
>
> >
> > ------------------
> >
> > 4.
> > Stress-testing revealed an issue though, which I do not understand yet.
> >
> > Probably similar to what you did earlier, I ran a script that switched
> > the current tracer to "function", "function_graph", "nop", "blk" each
> > 1-5 seconds. In another shell, "stress-ng --hrtimers 1" was running.
> >
> > The kernel usually crashed within a few minutes, in seemingly random
> > locations, but often in one of two ways:
> >
> > (a) Invalid instruction, because the address of ftrace_caller function
> > was somehow written to the body of the traced function rather than
> > just to the Ftrace prologue.
> >
> > In the following example, the crash happened at 0xffffffff800d3398.
> > "b0 d7" is actually not part of the code here, but rather the lower
> > bytes of 0xffffffff8000d7b0, the address of ftrace_caller() in this
> > kernel.
> >
> > (gdb) disas /r 0xffffffff800d3382,+0x20
> > Dump of assembler code from 0xffffffff800d3382 to 0xffffffff800d33a2:
> > ...
> > 0xffffffff800d3394 <clockevents_program_event+144>: ba 87 mv a5,a4
> > 0xffffffff800d3396 <clockevents_program_event+146>: c1 bf j
> > 0xffffffff800d3366 <clockevents_program_event+98>
> > 0xffffffff800d3398 <clockevents_program_event+148>: b0 d7 sw
> > a2,104(a5) // 0xffffffff8000d7b0, the address of ftrace_caller().
> > 0xffffffff800d339a <clockevents_program_event+150>: 00 80 .2byte
> > 0x8000
> > 0xffffffff800d339c <clockevents_program_event+152>: ff ff .2byte
> > 0xffff
> > 0xffffffff800d339e <clockevents_program_event+154>: ff ff .2byte
> > 0xffff
> > 0xffffffff800d33a0 <clockevents_program_event+156>: d5 bf j
> > 0xffffffff800d3394 <clockevents_program_event+144
> >
> > The backtrace usually contains one or more occurrences of
> > return_to_handler() in this case.
> >
> > [ 260.520394] [<ffffffff800d3398>] clockevents_program_event+0xac/0x100
> > [ 260.521195] [<ffffffff8000d2bc>] return_to_handler+0x0/0x26
> > [ 260.521843] [<ffffffff800c50ba>] hrtimer_interrupt+0x122/0x20c
> > [ 260.522492] [<ffffffff8000d2bc>] return_to_handler+0x0/0x26
> > [ 260.523132] [<ffffffff8009785e>] handle_percpu_devid_irq+0x9e/0x1ec
> > [ 260.523788] [<ffffffff8000d2bc>] return_to_handler+0x0/0x26
> > [ 260.524437] [<ffffffff8000d2bc>] return_to_handler+0x0/0x26
> > [ 260.525080] [<ffffffff80a8acfa>] handle_riscv_irq+0x4a/0x74
> > [ 260.525726] [<ffffffff80a97b9a>] call_on_irq_stack+0x32/0x40
> > ----------------------
> >
> > (b) Jump to an invalid location, e.g. to the middle of a valid 4-byte
> > instruction. %ra usually points right after the last instruction,
> > "jalr a2", in return_to_handler() in such cases, so the jump was
> > likely made from there.
> >
> > The problem is reproducible, although I have not found what causes it
> > yet.
> >
> > Any help is appreciated, of course.
> >
> >>
> >>>
> >>> Regards,
> >>> Evgenii
> >>
> >> Regards,
> >> Andy
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
Thanks,
Andy
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
next prev parent reply other threads:[~2024-03-07 12:28 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-13 9:42 [PATCH RFC v2 riscv/for-next 0/5] Enable ftrace with kernel preemption for RISC-V Andy Chiu
2022-09-13 9:42 ` [PATCH RFC v2 riscv/for-next 1/5] riscv: align ftrace to 4 Byte boundary and increase ftrace prologue size Andy Chiu
2022-09-15 13:53 ` Guo Ren
2022-09-17 1:15 ` Andy Chiu
2022-09-13 9:42 ` [PATCH RFC v2 riscv/for-next 2/5] riscv: export patch_insn_write Andy Chiu
2022-09-13 9:42 ` [PATCH RFC v2 riscv/for-next 3/5] riscv: ftrace: use indirect jump to work with kernel preemption Andy Chiu
2022-09-14 13:45 ` Guo Ren
2022-09-15 13:30 ` Guo Ren
2022-09-17 1:04 ` Andy Chiu
2022-09-17 10:56 ` Guo Ren
2024-02-20 14:17 ` Evgenii Shatokhin
2022-09-13 9:42 ` [PATCH RFC v2 riscv/for-next 4/5] riscv: ftrace: do not use stop_machine to update code Andy Chiu
2022-09-13 9:42 ` [PATCH RFC v2 riscv/for-next 5/5] riscv: align arch_static_branch function Andy Chiu
2022-09-14 14:06 ` Guo Ren
2022-09-16 23:54 ` Andy Chiu
2022-09-17 0:22 ` Guo Ren
2022-09-17 18:17 ` [PATCH] riscv: jump_label: Optimize size with RISCV_ISA_C guoren
2022-09-17 18:38 ` [PATCH RFC v2 riscv/for-next 5/5] riscv: align arch_static_branch function guoren
2022-09-17 23:49 ` Guo Ren
2022-09-17 23:59 ` Guo Ren
2022-09-18 0:12 ` Jessica Clarke
2022-09-18 0:46 ` Guo Ren
2022-09-14 14:24 ` Jessica Clarke
2022-09-15 1:47 ` Guo Ren
2022-09-15 2:34 ` Jessica Clarke
2024-02-13 19:42 ` [PATCH RFC v2 riscv/for-next 0/5] Enable ftrace with kernel preemption for RISC-V Evgenii Shatokhin
2024-02-21 5:27 ` Andy Chiu
2024-02-21 16:55 ` Evgenii Shatokhin
2024-03-06 20:57 ` Alexandre Ghiti
2024-03-07 8:35 ` Evgenii Shatokhin
2024-03-07 12:27 ` Andy Chiu [this message]
2024-03-07 13:21 ` Alexandre Ghiti
2024-03-07 15:57 ` Samuel Holland
2024-03-11 14:24 ` Andy Chiu
2024-03-19 14:50 ` Alexandre Ghiti
2024-03-19 14:58 ` Conor Dooley
2024-03-20 16:37 ` Andy Chiu
2024-03-18 15:31 ` Andy Chiu
2024-03-19 15:32 ` Evgenii Shatokhin
2024-03-20 16:38 ` Andy Chiu
2024-03-19 17:37 ` Alexandre Ghiti
2024-03-20 16:36 ` Andy Chiu
2024-03-21 11:02 ` Alexandre Ghiti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CABgGipUtzLG2tffXOdV7-7ss+_doQvjxfwzVbfzyO56J3_3X0Q@mail.gmail.com \
--to=andy.chiu@sifive.com \
--cc=alex@ghiti.fr \
--cc=aou@eecs.berkeley.edu \
--cc=ardb@kernel.org \
--cc=e.shatokhin@yadro.com \
--cc=greentime.hu@sifive.com \
--cc=guoren@kernel.org \
--cc=jbaron@akamai.com \
--cc=jpoimboe@kernel.org \
--cc=jrtc27@jrtc27.com \
--cc=kernel@esmil.dk \
--cc=linux-riscv@lists.infradead.org \
--cc=linux@yadro.com \
--cc=mingo@redhat.com \
--cc=palmer@dabbelt.com \
--cc=paul.walmsley@sifive.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=zong.li@sifive.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).