linux-toolchains.vger.kernel.org archive mirror
From: Steven Rostedt <rostedt@goodmis.org>
To: Indu Bhagat <indu.bhagat@oracle.com>
Cc: linux-toolchains@vger.kernel.org, daandemeyer@meta.com,
	andrii@kernel.org, kris.van.hees@oracle.com,
	elena.zannoni@oracle.com, nick.alcock@oracle.com
Subject: Re: [POC 0/5] SFrame based stack tracer for user space in the kernel
Date: Mon, 1 May 2023 18:15:15 -0400
Message-ID: <20230501181515.098acdce@gandalf.local.home>
In-Reply-To: <20230501200410.3973453-1-indu.bhagat@oracle.com>

On Mon,  1 May 2023 13:04:05 -0700
Indu Bhagat <indu.bhagat@oracle.com> wrote:

> Hello,
> 

Hi Indu,

This is really great! I think we should include LKML in this as well. And
possibly even linux-trace-kernel@vger.kernel.org.


> This patch set is a Proof of Concept implementation for an SFrame-based
> stack tracer for user space in the kernel. Some of you had expressed interest
> in exploring this earlier; hopefully, this POC helps discuss the design and
> take it forward.
> 
> Motivation
> ==========
> Generating stack traces is vital for all profiling, tracing and debugging
> tools. In the context of generating stack traces for user space, frame-pointer
> based unwinding works, but has its issues ([1],[2]).  EH_Frame based
> unwinding seems undesirable for kernel's unwinding needs ([3],[4]). 
> In general, EH_Frame based unwinding is undesirable in applications that need
> fast, real-time stack tracers (e.g., profilers), because of the overhead of
> interpreting and executing DWARF opcodes to calculate the relevant stack
> offsets.
> 
> SFrame (Simple Frame) stack trace format is designed to address these concerns.
> With this POC, we would like to see how to use SFrame as a viable alternative
> for user space stack tracing needs in the kernel.
> 
> [1] https://lwn.net/Articles/919940/
> [2] https://pagure.io/fesco/issue/2817
> [3] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/OOJDAKTJB5WGMOZRXTUX7FTPFBF3H7WE/#NXRMNKD4B23HX7U5ICMKFRZO6Z3VXQXL
> [4] https://lkml.org/lkml/2012/2/10/356
> 
> What is SFrame format
> =====================
> 
> SFrame is the "Simple Frame" stack trace format.  The format is documented as
> part of the binutils documentation at https://sourceware.org/binutils/docs.
> 
> Starting with binutils 2.40, the GNU assembler (as) can generate SFrame stack
> trace data based on the CFI directives found in the source assembly.  This is
> achieved by using the --gsframe command line option when invoking the
> assembler.  This option plays the same role as the existing --gdwarf-[2345]
> options, only this time referring to SFrame.  The resulting stack tracing
> information is stored in a new segment of its own with type PT_GNU_SFRAME,
> containing a section named '.sframe'.
> 
> Also starting with binutils 2.40, the GNU linker (ld) knows how to merge
> sections containing SFrame stack trace info.
> 
> SFrame based user space stack tracer POC
> ========================================
> These patches implement a POC for an SFrame based user space stack tracer (for
> x86) in the kernel.  The purpose of this code is to serve as a reference,
> initiate discussions, and perhaps serve as a starting point for a viable
> implementation of an SFrame based stack tracer.  Please keep in mind that my
> familiarity with kernel code/processes/conventions is still limited ;-).
> 
> High-level Design in this POC
> =============================
> Kconfig adds two config options for userspace unwinding
>   - config USER_UNWINDER_SFRAME to enable the SFrame userspace unwinder
>   - config USER_UNWINDER_FRAME_POINTER to enable the Frame Pointer userspace
>     unwinder
> 
> If CONFIG_USER_UNWINDER_SFRAME is set, the task_struct keeps a reference to
> the sframe_state object for the task.
> 
> For long running user programs, it makes sense to cache the sframe_state
> in the task and be able to simply do a quick do_sframe_unwind() at every
> unwind request.  Caching the sframe_state also means keeping the .sframe
> pages (for the prog and its DSOs) pinned.  The task's sframe_state is
> kmalloc'ed and initialized in load_elf_binary, when the task is about to begin
> execution.  The (open) issue with this design, however, remains that we need to
> detect when additional DSOs are brought in at run-time by the application.
> 
> The detection (and resolution) of stale sframe_state is not implemented in this
> POC.  As such, the POC at this time is fit only for applications that are
> statically linked.

So my thought on this was not to pin the sframe, but simply to note that it
exists. When perf/bpf/ftrace wants a user space stack trace, it will ask
for one (I plan on adding an interface around this process, as it will also
handle the case where the sframe is not available).

As the user stack trace will not change while the task is in the kernel, the
unwind does not need to happen at the moment it is requested. Instead, the
requester could register a callback, and then on exiting back to user space
(via the ptrace path), the kernel would do the sframe lookup and pass the
user space stack trace to the perf/bpf/ftrace handlers.

At that point, we can allow the sframe to be faulted in, as it will
be in a context where it can safely take a fault (and schedule out!). It
would be no different from any other part of the ELF file faulting in, and
it can be swapped back out under memory pressure.
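
To make the deferred scheme above a bit more concrete, here is a rough
user-space simulation of the idea, not kernel code and not what the POC
does. All names (task_sim, request_user_stacktrace, exit_to_user_mode_sim,
count_frames) are made up for illustration, and the "unwind" is just a copy
of a canned array standing in for an sframe lookup. It assumes that
several requests made during one trip through the kernel can share a single
unwind on the way out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 4
#define MAX_FRAMES  16

/* Hypothetical consumer callback (stands in for a perf/bpf/ftrace handler). */
typedef void (*ustack_cb_t)(const unsigned long *pcs, size_t n, void *data);

struct pending_req {
	ustack_cb_t cb;
	void *data;
};

/* Hypothetical per-task state; this is NOT the POC's sframe_state. */
struct task_sim {
	struct pending_req pending[MAX_PENDING];
	size_t nr_pending;
	const unsigned long *user_stack;	/* canned return addresses */
	size_t user_depth;
	unsigned int unwinds;			/* how often we really unwound */
};

/* "Kernel context": only record the request, do not touch user memory. */
static bool request_user_stacktrace(struct task_sim *t, ustack_cb_t cb,
				    void *data)
{
	assert(cb != NULL);
	if (t->nr_pending == MAX_PENDING)
		return false;
	t->pending[t->nr_pending].cb = cb;
	t->pending[t->nr_pending].data = data;
	t->nr_pending++;
	return true;
}

/* "Return to user space": unwind once, then hand the trace to everyone. */
static void exit_to_user_mode_sim(struct task_sim *t)
{
	unsigned long pcs[MAX_FRAMES];
	size_t n, i;

	if (!t->nr_pending)
		return;

	/* Stand-in for the sframe lookup; a real one could fault and sleep. */
	n = t->user_depth < MAX_FRAMES ? t->user_depth : MAX_FRAMES;
	for (i = 0; i < n; i++)
		pcs[i] = t->user_stack[i];
	t->unwinds++;

	for (i = 0; i < t->nr_pending; i++)
		t->pending[i].cb(pcs, n, t->pending[i].data);
	t->nr_pending = 0;
}

/* Example handler: adds the number of frames to a size_t counter. */
static void count_frames(const unsigned long *pcs, size_t n, void *data)
{
	(void)pcs;
	*(size_t *)data += n;
}
```

A caller would queue one or more requests with request_user_stacktrace()
while "in the kernel" and then call exit_to_user_mode_sim() once; each
registered handler receives the same trace, and nothing is unwound if
nobody asked.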

I'll go ahead and play with this code.

Thanks again, this is really helpful.

-- Steve


