From: Indu Bhagat <indu.bhagat@oracle.com>
To: linux-toolchains@vger.kernel.org
Cc: "Jose E. Marchesi" <jose.marchesi@oracle.com>,
Daan De Meyer <daandemeyer@meta.com>,
Kris Van Hees <kris.van.hees@oracle.com>,
Elena Zannoni <elena.zannoni@oracle.com>
Subject: Unwinding user-space programs in the kernel using SFrame format
Date: Thu, 12 Jan 2023 12:30:26 -0800
Message-ID: <7dcae1d1-b0f5-a497-a473-26a43f1b1ad6@oracle.com>
Hello,
This email is to initiate discussion/collaboration on adding a new
user-space program unwinder in the kernel, an unwinder which uses the
SFrame format.
What is SFrame format?
SFrame is the Simple Frame format. It represents the minimal
information needed for backtracing, i.e. the Canonical Frame Address
(CFA), Frame Pointer (FP), and Return Address (RA). SFrame unwind information
is available in a section called .sframe, which is itself presented in a
new segment of its own, PT_GNU_SFRAME. SFrame format is supported for
AMD64 and AARCH64 (be/le) ABIs only.
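To make the CFA/FP/RA triple concrete, here is a minimal sketch of a
single unwind step. The struct frame_row below is a hypothetical,
simplified stand-in for one row of SFrame data (it is not the on-disk
layout from include/sframe.h), and the stack is simulated with an
array so the example runs in user space. The convention that a zero
fp_offset means "FP not saved" is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified row of unwind data for one PC range.  */
struct frame_row {
  int     cfa_from_fp;  /* 1: CFA base register is FP; 0: it is SP */
  int32_t cfa_offset;   /* CFA = base register + cfa_offset */
  int32_t ra_offset;    /* RA is saved at CFA + ra_offset */
  int32_t fp_offset;    /* FP is saved at CFA + fp_offset; 0 = not saved */
};

struct regs { uint64_t sp, fp, pc; };

/* Simulated stack memory: 64-bit words starting at address 'base'.  */
struct stack_image { uint64_t base; const uint64_t *words; size_t nwords; };

/* Bounds-checked read of one aligned 64-bit word from the image.  */
static int read_word(const struct stack_image *s, uint64_t addr, uint64_t *out)
{
  if (addr < s->base || (addr - s->base) % 8 != 0)
    return -1;
  size_t idx = (size_t) ((addr - s->base) / 8);
  if (idx >= s->nwords)
    return -1;
  *out = s->words[idx];
  return 0;
}

/* Recover the caller's SP, FP and PC from the callee's registers.  */
static int unwind_one_frame(const struct frame_row *row,
                            const struct stack_image *s, struct regs *r)
{
  assert(row != NULL && s != NULL && r != NULL);
  uint64_t cfa = (row->cfa_from_fp ? r->fp : r->sp) + (int64_t) row->cfa_offset;
  uint64_t ra, fp = r->fp;

  if (read_word(s, cfa + (int64_t) row->ra_offset, &ra) != 0)
    return -1;
  if (row->fp_offset != 0
      && read_word(s, cfa + (int64_t) row->fp_offset, &fp) != 0)
    return -1;

  r->pc = ra;   /* caller resumes at the return address */
  r->fp = fp;   /* caller's frame pointer (unchanged if not saved) */
  r->sp = cfa;  /* caller's SP at the call site is the CFA */
  return 0;
}
```

For an AMD64-style frame after "push %rbp; mov %rsp,%rbp", the row
would be { 1, 16, -8, -16 }: CFA is FP+16, RA sits at CFA-8 and the
saved FP at CFA-16.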
How can I experiment with the SFrame format support?
Support for the SFrame format is available in binutils trunk. The GNU
assembler, when passed the --gsframe command line option, generates the
.sframe section. The GNU assembler uses the .cfi_* asm directives
emitted by the compiler to generate an .sframe section. GNU ld merges
the input .sframe sections as necessary; no explicit command line option
is needed. There is support in objdump/readelf as well: pass the --sframe
option to dump the .sframe section in textual format.
Where can I find details about the format?
More details are available in include/sframe.h in the binutils repo
(https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=include/sframe.h).
The SFrame spec is also present in binutils trunk. Some more content
should be available online in the form of GNU Cauldron and LPC 2022
presentations: the format was presented there under the name "CTF Frame",
but has since been renamed to "SFrame".
Why is an SFrame-based unwinder useful?
Having an unwinder for user-space programs based on the SFrame format can
be useful:
- Enabling -fno-omit-frame-pointer has performance implications and
other issues.
- Compared to .eh_frame info, SFrame is a simpler format to decode
and generate backtraces from. The SFrame unwinder itself is hence small
and simple. For an idea of what an SFrame-based unwinder can look like,
see
https://github.com/oracle/binutils-gdb/blob/oracle/sframe-unwinder/libsframe/sframe-backtrace.c
This code uses libsframe APIs like sframe_decode, sframe_find_fre,
sframe_fre_get_fp_offset etc. to generate backtraces.
There was some interest, at LPC 2022, in exploring an SFrame-based
userspace unwinder for the kernel. To get started on that, some
discussion on the following items would be great. (Please feel free to
add/delete/correct any items; my knowledge of the kernel and its
internals remains limited.)
Userspace unwinder selection
----------------------------
IIUC, userspace unwinding is always frame-pointer based in the kernel.
This is unlike kernel-space unwinding, where there is a set of unwinders
to choose from: say, for x86_64, UNWINDER_ORC / UNWINDER_FRAME_POINTER /
UNWINDER_GUESS. Additionally, for kernel stack unwinding, there is also
a framework in place to plug-and-play these different unwinders.
For userspace stack unwinding, first, we may want to add new config
options, such that:
- USERSPACE_UNWINDER_SFRAME => This option enables the SFrame
unwinder as the first choice for unwinding user stack traces. User
programs must be built with SFrame support. If not, no SFrame section
will be present in the user program binary; in such a case, the
userspace unwinder falls back to frame pointer unwinding.
- USERSPACE_UNWINDER_FRAME_POINTER => userspace unwinding does frame
pointer based unwinding only. User programs must be built with frame
pointer preservation build flags to ensure useful stack traces.
Second, regarding "the framework" needed for non-frame-pointer-based
unwinders, more thought is needed.
Interface of the userspace unwinder
-----------------------------------
* OPTION 1
This one might be overly simplified but is an option. We add the
following stub:
   ...
   /* Checks for SFrame sections if CONFIG_USER_UNWINDER_SFRAME is true.  */
   if (check_sframe_state_p (current))
     /* current is implicit; stores callchain entries as it unwinds
        using .sframe sections.  */
     sframe_callchain_user (entry, regs);
   ...
in the following target APIs in x86_64 and aarch64 to give the desired
effect of "userspace unwinder selection":
-- perf_callchain_user in arch/x86/events/core.c
-- perf_callchain_user in arch/arm64/kernel/perf_callchain.c
where the functions look like:
static inline bool check_sframe_state_p(struct task_struct *task);
void sframe_callchain_user(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs);
Here, sframe_callchain_user () will first perform an operation similar
to dl_iterate_phdr, because we need the location of the SFrame sections
for unwinding. This means that, for every sframe_callchain_user () call,
we go over the memory mappings of the "current" task_struct and find the
locations of the .sframe sections of the program and its DSOs from the
pages that contain the ELF program headers. It will then decode these
SFrame sections and unwind.
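As a rough illustration of the "find the .sframe section from the
program headers" step, the sketch below scans an in-memory program
header table for a PT_GNU_SFRAME entry and computes its runtime
address. It is a user-space model, not kernel code: struct phdr64 is a
local copy of the Elf64_Phdr field layout, find_sframe_segment and
struct sframe_loc are hypothetical names, and the numeric value used
for PT_GNU_SFRAME is what I believe binutils uses but should be
checked against the headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of PT_GNU_SFRAME; verify against binutils' elf headers.  */
#define PT_GNU_SFRAME_ASSUMED 0x6474e554u

/* Local mirror of the Elf64_Phdr field layout from <elf.h>.  */
struct phdr64 {
  uint32_t p_type;
  uint32_t p_flags;
  uint64_t p_offset;
  uint64_t p_vaddr;
  uint64_t p_paddr;
  uint64_t p_filesz;
  uint64_t p_memsz;
  uint64_t p_align;
};

/* Hypothetical result type: where the .sframe data lives at run time.  */
struct sframe_loc { uint64_t addr; uint64_t size; };

/* Walk the program headers of one loaded object.  load_bias is the
   difference between link-time and run-time addresses (what
   dl_iterate_phdr reports as dlpi_addr).  Returns 0 and fills *out if
   a PT_GNU_SFRAME segment exists, -1 otherwise.  */
static int find_sframe_segment(const struct phdr64 *phdrs, size_t phnum,
                               uint64_t load_bias, struct sframe_loc *out)
{
  assert(phdrs != NULL && out != NULL);
  for (size_t i = 0; i < phnum; i++) {
    if (phdrs[i].p_type == PT_GNU_SFRAME_ASSUMED) {
      out->addr = load_bias + phdrs[i].p_vaddr;
      out->size = phdrs[i].p_memsz;
      return 0;
    }
  }
  return -1;  /* object was not built with SFrame support */
}
```

The -1 case is where the unwinder would have to fall back to frame
pointers for PCs inside that object.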
* OPTION 2
A possible optimization is to instead:
1. Cache some SFrame-related state, "struct sframe_state *", in the
"struct task_struct" (guarded by CONFIG_USER_UNWINDER_SFRAME), and
2. Use an API like "void sframe_callchain_user(struct sframe_state *state,
struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)"
This state (struct sframe_state) is, simply put, data about the size and
address of the text and SFrame segments of the program and its DSOs.
Ideally this state can be set up at task setup time and needs to be
updated only if there is any change in the DSOs (added or removed) [1].
The size of the struct sframe_state itself is small here, as SFrame
sections can be decoded on-the-fly with no need for additional mallocs.
[1] PS: That detecting "add/delete of DSOs in a user program" is
possible in some efficient way in the kernel remains an assumption; I
still need to figure things out. Any input on this is appreciated.
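A minimal sketch of what such cached state could look like follows.
Everything here is hypothetical: the field names, the fixed-size
array, and the add/remove/lookup helpers are only meant to show that
the state is a small table mapping text ranges to SFrame sections,
with updates needed only when a DSO comes or goes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SFRAME_STATE_MAX_OBJS 8  /* arbitrary bound for this sketch */

/* One entry per program or DSO: its text range and its SFrame data.  */
struct sframe_obj {
  uint64_t text_addr, text_size;
  uint64_t sframe_addr, sframe_size;
};

struct sframe_state {
  size_t nobjs;
  struct sframe_obj objs[SFRAME_STATE_MAX_OBJS];
};

/* Record a newly mapped object; returns -1 if the table is full.  */
static int sframe_state_add(struct sframe_state *st, struct sframe_obj obj)
{
  assert(st != NULL);
  if (st->nobjs == SFRAME_STATE_MAX_OBJS)
    return -1;
  st->objs[st->nobjs++] = obj;
  return 0;
}

/* Drop the object whose text starts at text_addr (e.g. on unmap).  */
static int sframe_state_remove(struct sframe_state *st, uint64_t text_addr)
{
  assert(st != NULL);
  for (size_t i = 0; i < st->nobjs; i++) {
    if (st->objs[i].text_addr == text_addr) {
      st->objs[i] = st->objs[--st->nobjs];  /* fill hole with last entry */
      return 0;
    }
  }
  return -1;
}

/* Find the object whose text range contains pc, or NULL if none.  */
static const struct sframe_obj *
sframe_state_lookup(const struct sframe_state *st, uint64_t pc)
{
  assert(st != NULL);
  for (size_t i = 0; i < st->nobjs; i++) {
    const struct sframe_obj *o = &st->objs[i];
    if (pc >= o->text_addr && pc - o->text_addr < o->text_size)
      return o;
  }
  return NULL;
}
```

With this shape, each step of the unwind is a lookup by PC followed by
decoding the matching SFrame section in place; a NULL result is the
signal to fall back to frame-pointer unwinding.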
Other framework
---------------
The kernel stack unwinders adhere to some interface allowing them to be
used interchangeably. The requirements of the userspace unwinder are a
bit different though: not all user applications may be compiled with
SFrame support, which means there needs to be a way we fall back on the
frame-pointer based unwinder in the kernel for unwinding user programs.
This requirement, however, does not mean that some framework changes
shouldn't be done now to make things work better.
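One possible shape for such a framework, sketched in plain C: an
ordered table of unwinders, each with a "can I handle this task?"
predicate, where the first match wins and the frame-pointer unwinder
sits last as the catch-all. All names below (struct task_ctx,
struct user_unwinder, select_user_unwinder, the enum values) are
hypothetical and do not correspond to existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the per-task information the predicates would consult.  */
struct task_ctx { int has_sframe; };

enum user_unwinder_kind {
  USER_UNWINDER_NONE,
  USER_UNWINDER_SFRAME,
  USER_UNWINDER_FRAME_POINTER
};

struct user_unwinder {
  enum user_unwinder_kind kind;
  int (*can_unwind)(const struct task_ctx *task);
};

/* SFrame is usable only if the task's binaries carry .sframe data.  */
static int sframe_can_unwind(const struct task_ctx *task)
{
  return task->has_sframe;
}

/* Frame-pointer unwinding is always attempted as the last resort.  */
static int fp_can_unwind(const struct task_ctx *task)
{
  (void) task;
  return 1;
}

/* Ordered by preference: first entry whose predicate holds is used.  */
static const struct user_unwinder user_unwinders[] = {
  { USER_UNWINDER_SFRAME,        sframe_can_unwind },
  { USER_UNWINDER_FRAME_POINTER, fp_can_unwind },
};

static enum user_unwinder_kind select_user_unwinder(const struct task_ctx *task)
{
  assert(task != NULL);
  for (size_t i = 0; i < sizeof user_unwinders / sizeof user_unwinders[0]; i++)
    if (user_unwinders[i].can_unwind(task))
      return user_unwinders[i].kind;
  return USER_UNWINDER_NONE;
}
```

A real implementation would likely need the fallback decision per
frame rather than per task, since a single backtrace can cross DSOs
with and without SFrame data.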
Any feedback/ideas are appreciated. I have also not yet been able to
evaluate what other impacts this could have on perf, if any.
Thanks