linux-toolchains.vger.kernel.org archive mirror
From: Indu Bhagat <indu.bhagat@oracle.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-toolchains@vger.kernel.org,
	"Jose E. Marchesi" <jose.marchesi@oracle.com>,
	Daan De Meyer <daandemeyer@meta.com>,
	Kris Van Hees <kris.van.hees@oracle.com>,
	Elena Zannoni <elena.zannoni@oracle.com>,
	Ross Zwisler <zwisler@chromium.org>,
	andrii@kernel.org
Subject: Re: Unwinding user-space programs in the kernel using SFrame fo
Date: Tue, 7 Feb 2023 00:36:57 -0800
Message-ID: <a8f2b768-08b3-1573-45ec-19fc5c680a4e@oracle.com>
In-Reply-To: <20230206144444.01862598@rorschach.local.home>

On 2/6/23 11:44, Steven Rostedt wrote:
> On Thu, 12 Jan 2023 12:30:26 -0800
> Indu Bhagat <indu.bhagat@oracle.com> wrote:
> 
>> Hello,
>>
>> This email is to initiate discussion/collaboration on adding a new
>> user-space program unwinder in the kernel, an unwinder which uses the
>> SFrame format.
> 
> This is just an FYI that I was looking to have this done too, in a
> generic manner so that perf, ftrace and BPF (and anyone else) could use it.
> 
> I sent a proposal about it to LSF/MM/BPF summit:
> 
>     https://lore.kernel.org/all/20230206103828.6efcb28f@rorschach.local.home/
> 
> -- Steve
> 

This is great!

FWIW, I've been attempting to get a _POC_ of some sort for an
SFrame-based userspace unwinder in the kernel (along the lines stated
below), but it needs more work.  Among other things, the code stubs for
accessing userspace SFrame pages from the kernel during unwinding still
need to be worked out.  And of course, testing ;-).

I am excited to see this proposal and can help with the SFrame decoder /
unwinder when it comes to that, if needed.  Binutils 2.40 ships
libsframe, but I think it is better to have a malloc-free
sframe_decode () API for the kernel use case.
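
To illustrate what "malloc-free" could mean here, below is a minimal
sketch in which the decoder context lives in caller-provided storage and
only points into the input buffer.  All names (demo_*) and the record
layout are hypothetical and heavily simplified; this is not the real
SFrame on-disk format and not the libsframe API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC 0x5346 /* hypothetical magic, not the SFrame one */

/* Hypothetical, simplified section header. */
struct demo_hdr {
    uint16_t magic;
    uint8_t  version;
    uint8_t  flags;
    uint32_t num_entries;
};

/* Hypothetical per-function record. */
struct demo_entry {
    uint32_t start;      /* function start offset */
    uint32_t size;       /* function size in bytes */
    int32_t  cfa_offset; /* CFA = SP + cfa_offset */
};

/* Decoder context: owned by the caller, no heap allocation involved. */
struct demo_decoder {
    const uint8_t *entries; /* points into the caller's buffer */
    uint32_t num_entries;
};

/* Validate the buffer and set up the context.  Returns 0 or -1. */
static int demo_decode_init(struct demo_decoder *d, const void *buf, size_t len)
{
    struct demo_hdr hdr;

    assert(d != NULL && buf != NULL);
    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr)); /* alignment-safe read */
    if (hdr.magic != DEMO_MAGIC)
        return -1;
    if ((len - sizeof(hdr)) / sizeof(struct demo_entry) < hdr.num_entries)
        return -1; /* records would run past the end of the buffer */
    d->entries = (const uint8_t *)buf + sizeof(hdr);
    d->num_entries = hdr.num_entries;
    return 0;
}

/* Find the record covering pc and copy it into *out.  Returns 0 or -1. */
static int demo_find(const struct demo_decoder *d, uint32_t pc,
                     struct demo_entry *out)
{
    for (uint32_t i = 0; i < d->num_entries; i++) {
        struct demo_entry e;

        memcpy(&e, d->entries + (size_t)i * sizeof(e), sizeof(e));
        if (pc >= e.start && pc - e.start < e.size) {
            *out = e;
            return 0;
        }
    }
    return -1;
}
```

The point of the shape is that both the context and the result are
stack or static objects supplied by the caller, so nothing needs to be
allocated or freed on the unwind path.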

Thanks

>>
>> What is SFrame format?
>> SFrame is the Simple Frame format.  It represents the minimal necessary
>> information needed for backtracing - i.e. Canonical Frame Address (CFA),
>> Frame Pointer (FP), and Return Address (RA).  SFrame unwind information
>> is available in a section called .sframe, which is itself presented in a
>> new segment of its own, PT_GNU_SFRAME.  SFrame format is supported for
>> AMD64 and AARCH64 (be/le) ABIs only.
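
To make the CFA / FP / RA triple concrete, here is a minimal sketch of a
single unwind step driven by that information.  The struct layout, the
demo_* names and the toy stack reader are assumptions for illustration
only; the real SFrame encoding and any in-kernel user-memory access
would look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of the unwind info valid at one PC. */
struct demo_fre {
    bool    cfa_from_fp; /* CFA base register: FP if true, else SP */
    int32_t cfa_offset;  /* CFA = base + cfa_offset */
    int32_t ra_offset;   /* return address is stored at CFA + ra_offset */
    bool    fp_saved;    /* whether the caller's FP was saved */
    int32_t fp_offset;   /* caller's FP is stored at CFA + fp_offset */
};

struct demo_frame { uint64_t pc, sp, fp; };

/* Toy "user stack": an array of 64-bit words starting at address base. */
struct demo_stack { uint64_t base; const uint64_t *words; size_t n; };

/* Bounds-checked read of one word; stands in for a safe user-memory read. */
static bool demo_read(const struct demo_stack *st, uint64_t addr, uint64_t *val)
{
    if (addr < st->base || (addr - st->base) % 8 != 0)
        return false;
    if ((addr - st->base) / 8 >= st->n)
        return false;
    *val = st->words[(addr - st->base) / 8];
    return true;
}

/* Advance *f from the current frame to its caller.  Returns false on failure. */
static bool demo_unwind_step(const struct demo_fre *fre, struct demo_frame *f,
                             const struct demo_stack *st)
{
    uint64_t cfa = (fre->cfa_from_fp ? f->fp : f->sp) + (int64_t)fre->cfa_offset;
    uint64_t ra, fp = f->fp;

    if (!demo_read(st, cfa + (int64_t)fre->ra_offset, &ra))
        return false;
    if (fre->fp_saved && !demo_read(st, cfa + (int64_t)fre->fp_offset, &fp))
        return false;
    f->pc = ra;  /* caller resumes at the return address */
    f->sp = cfa; /* caller's SP is the CFA */
    f->fp = fp;  /* unchanged if the callee did not save FP */
    return true;
}
```

Repeating this step, with a fresh lookup of the unwind info for each new
pc, yields the backtrace.
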
>>
>> How can I experiment with the SFrame format support?
>> Support for the SFrame format is available in binutils trunk.  The GNU
>> assembler, when passed the --gsframe command line option, generates
>> the .sframe section from the .cfi_* asm directives emitted by the
>> compiler.  GNU ld merges the input .sframe sections as necessary; no
>> explicit command line option is needed.  objdump and readelf support
>> the format as well: pass the --sframe option to dump the .sframe
>> section in textual format.
>>
>> Where can I find details about the format?
>> More details are available in include/sframe.h in the binutils repo
>> (https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=include/sframe.h).
>> The SFrame spec is also present in binutils trunk.  Some more content
>> should be available online in the form of GNU Cauldron and LPC 2022
>> presentations: this was talked about under the name "CTF Frame", but has
>> since been renamed to "SFrame".
>>
>> Why is SFrame based unwinder useful?
>> An unwinder for user-space programs based on the SFrame format can be
>> useful because:
>>     - Enabling -fno-omit-frame-pointer has performance implications and
>> other issues.
>>     - Compared to .eh_frame info, SFrame is a simpler format to decode
>> and generate backtraces from.  An SFrame unwinder itself is hence small
>> and simple.
>> https://github.com/oracle/binutils-gdb/blob/oracle/sframe-unwinder/libsframe/sframe-backtrace.c
>> shows what an SFrame based unwinder can look like.  This code uses
>> libsframe APIs like sframe_decode, sframe_find_fre,
>> sframe_fre_get_fp_offset etc. to generate backtraces.
>>
>> There was some interest, at LPC 2022, in exploring an SFrame-based
>> userspace unwinder for the kernel.  To get started on that, some
>> discussion on the following items would be great.  (Please feel free to
>> add/delete/correct any items; my knowledge about the kernel and its
>> internals remains limited.)
>>
>> Userspace unwinder selection
>> ----------------------------
>> IIUC, userspace unwinding is always frame-pointer based in the kernel.
>> This is unlike kernel-space unwinding, where there is a set of unwinders
>> to choose from: say, for x86_64, UNWINDER_ORC / UNWINDER_FRAME_POINTER /
>> UNWINDER_GUESS.  Additionally, for kernel stack unwinding, there is also
>> a framework in place to plug-and-play these different unwinders.
>>
>> For userspace stack unwinding, first, we may want to add new config
>> options, such that:
>>      - USERSPACE_UNWINDER_SFRAME => This option enables the SFrame
>> unwinder as the first choice for unwinding user stack traces.  User
>> programs must be built with SFrame support.  If not, no SFrame section
>> will be present in the user program binary; in such a case, the
>> userspace unwinder falls back to frame pointer unwinding.
>>      - USERSPACE_UNWINDER_FRAME_POINTER => userspace unwinding does frame
>> pointer based unwinding only. User programs must be built with frame
>> pointer preservation build flags to ensure useful stack traces.
>>
>> Second, regarding "the framework" needed for non-frame-pointer-based
>> unwinders, more thought is needed.
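
The selection semantics of the two proposed options can be sketched as a
small decision function.  The enum and function names are hypothetical;
in the kernel the configuration would be a Kconfig symbol rather than a
runtime argument, and has_sframe stands for "this user program carries
an SFrame section".

```c
#include <assert.h>
#include <stdbool.h>

/* Which option the kernel was configured with (hypothetical names). */
enum demo_user_unwinder_cfg { DEMO_CFG_FRAME_POINTER, DEMO_CFG_SFRAME };

/* Which unwinder ends up being used for a given user program. */
enum demo_unwinder { DEMO_UNWIND_FRAME_POINTER, DEMO_UNWIND_SFRAME };

static enum demo_unwinder
demo_pick_unwinder(enum demo_user_unwinder_cfg cfg, bool has_sframe)
{
    /* SFrame is the first choice only when configured AND available. */
    if (cfg == DEMO_CFG_SFRAME && has_sframe)
        return DEMO_UNWIND_SFRAME;
    /* Otherwise fall back to frame pointer unwinding. */
    return DEMO_UNWIND_FRAME_POINTER;
}
```
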
>>
>> Interface of the userspace unwinder
>> -----------------------------------
>> * OPTION 1
>> This one might be overly simplified but is an option.  We add the
>> following stub:
>>
>>      ...
>>      /* Checks for SFrame sections if CONFIG_USER_UNWINDER_SFRAME is
>>         true.  */
>>      if (check_sframe_state_p (current))
>>         /* current is implicit; stores callchain entries as it unwinds
>>            using .sframe sections.  */
>>         sframe_callchain_user (entry, regs);
>>      ...
>>
>> in the following target APIs in x86_64 and aarch64 to give the desired
>> effect of "userspace unwinder selection"
>>     -- perf_callchain_user in arch/x86/events/core.c
>>     -- perf_callchain_user in arch/arm64/kernel/perf_callchain.c
>>
>> where the functions look like:
>>     static inline bool check_sframe_state_p(struct task_struct *task);
>>     void sframe_callchain_user(struct perf_callchain_entry_ctx *entry,
>>                                struct pt_regs *regs);
>>
>> Here, sframe_callchain_user () will first perform an operation similar
>> to dl_iterate_phdr, because we need the location of the SFrame sections
>> for unwinding.  This means that, for every sframe_callchain_user ()
>> call, we go over the memory mappings of the "current" task_struct and
>> find the locations of the .sframe sections of the program + its DSOs
>> from the pages that contain the ELF program headers.  Next, it decodes
>> these SFrame sections and unwinds.
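
For reference, the user-space analogue of that lookup can be sketched
with glibc's dl_iterate_phdr: walk the program headers of the executable
and its DSOs and record every PT_GNU_SFRAME segment.  The demo_* names
and the fixed-size table are assumptions, and the fallback define of
PT_GNU_SFRAME is only for systems whose elf.h predates it.  An in-kernel
version would have to walk the task's mappings instead.

```c
#define _GNU_SOURCE /* for dl_iterate_phdr */
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

#ifndef PT_GNU_SFRAME
#define PT_GNU_SFRAME 0x6474e554 /* fallback for older elf.h headers */
#endif

#define DEMO_MAX_LOCS 64

struct demo_sframe_loc { uintptr_t addr; size_t size; };

struct demo_scan {
    struct demo_sframe_loc locs[DEMO_MAX_LOCS];
    size_t n_locs;    /* number of PT_GNU_SFRAME segments found */
    size_t n_objects; /* number of loaded objects visited */
};

/* Called once per loaded object (main program and each DSO). */
static int demo_phdr_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct demo_scan *scan = data;

    (void)size;
    assert(scan != NULL);
    scan->n_objects++;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];

        if (ph->p_type != PT_GNU_SFRAME)
            continue;
        if (scan->n_locs == DEMO_MAX_LOCS)
            return 1; /* table full: stop iterating */
        /* Run-time address = load bias + segment virtual address. */
        scan->locs[scan->n_locs].addr = info->dlpi_addr + ph->p_vaddr;
        scan->locs[scan->n_locs].size = ph->p_memsz;
        scan->n_locs++;
    }
    return 0; /* continue with the next object */
}

static void demo_scan_sframe(struct demo_scan *scan)
{
    scan->n_locs = 0;
    scan->n_objects = 0;
    dl_iterate_phdr(demo_phdr_cb, scan);
}
```

A program built without --gsframe simply yields n_locs == 0, which is
the case where the frame pointer fallback would apply.
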
>>
>> * OPTION 2
>> A possible optimization is to instead:
>>
>> 1. Cache some SFrame-related state, "struct sframe_state *", in the
>> "struct task_struct" (guarded by CONFIG_USER_UNWINDER_SFRAME), and
>> 2. Use an API like so: "void sframe_callchain_user(struct sframe_state *,
>> struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)"
>>
>> This state (struct sframe_state) is, simply put, data about the size and
>> address of the text and SFrame segments of the program and its DSOs.
>> Ideally this state can be setup at task setup time and needs to be
>> updated only if there is any change in the DSOs (added or removed) [1].
>> The size of the struct sframe_state itself is small here, as SFrame
>> sections can be decoded on-the-fly with no need for additional mallocs.
>>
>> [1] PS: That this detection of "add/delete of DSOs in a user program" is
>> possible in some efficient way in the kernel remains an assumption; I
>> still need to figure things out.  Any input on this is appreciated.
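
A rough sketch of what such cached per-task state could look like: a
small fixed-size table of (text range, SFrame range) pairs that is
updated when a DSO is added or removed and is searched by pc at unwind
time.  Everything here (demo_* names, field names, the table size) is an
assumption for illustration, not a proposed kernel data structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_OBJS 16

/* One loaded object: where its text lives and where its SFrame data lives. */
struct demo_obj_range {
    uint64_t text_start, text_size;
    uint64_t sframe_start, sframe_size;
};

/* Small, fixed-size state; no allocation needed at unwind time. */
struct demo_sframe_state {
    struct demo_obj_range objs[DEMO_MAX_OBJS];
    size_t n_objs;
};

/* Record a newly mapped object.  Returns false if the table is full. */
static bool demo_state_add(struct demo_sframe_state *st, struct demo_obj_range r)
{
    assert(st != NULL);
    if (st->n_objs == DEMO_MAX_OBJS)
        return false;
    st->objs[st->n_objs++] = r;
    return true;
}

/* Drop an unmapped object, identified by the start of its text range. */
static bool demo_state_remove(struct demo_sframe_state *st, uint64_t text_start)
{
    for (size_t i = 0; i < st->n_objs; i++) {
        if (st->objs[i].text_start == text_start) {
            st->objs[i] = st->objs[--st->n_objs]; /* swap in the last entry */
            return true;
        }
    }
    return false;
}

/* Find the object whose text range contains pc, or NULL if none does. */
static const struct demo_obj_range *
demo_state_find(const struct demo_sframe_state *st, uint64_t pc)
{
    for (size_t i = 0; i < st->n_objs; i++) {
        const struct demo_obj_range *r = &st->objs[i];

        if (pc >= r->text_start && pc - r->text_start < r->text_size)
            return r;
    }
    return NULL;
}
```

A NULL result from the lookup is the natural place to fall back to
frame pointer unwinding for that frame.
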
>>
>> Other framework
>> ---------------
>> The kernel stack unwinders adhere to some interface allowing them to be
>> used interchangeably.  The requirements of the userspace unwinder are a
>> bit different though: not all user applications may be compiled with
>> SFrame support, which means there needs to be a way we fall back on the
>> frame-pointer based unwinder in the kernel for unwinding user programs.
>>
>> This requirement, however, does not mean that some framework changes
>> shouldn't be done now to make things work better.
>>
>> Any feedback/ideas are appreciated.  I have also not yet been able to
>> evaluate what other impacts this could have on perf, if at all.
>>
>> Thanks
> 

