From: Indu Bhagat <indu.bhagat@oracle.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-toolchains@vger.kernel.org, daandemeyer@meta.com,
andrii@kernel.org, kris.van.hees@oracle.com,
elena.zannoni@oracle.com, nick.alcock@oracle.com
Subject: Re: [POC 4/5] sframe: add an SFrame format stack tracer
Date: Mon, 1 May 2023 23:16:38 -0700 [thread overview]
Message-ID: <84bf4aee-e5a5-8e49-3826-ecb79eefc85b@oracle.com> (raw)
In-Reply-To: <20230501190018.24ae7704@gandalf.local.home>
On 5/1/23 16:00, Steven Rostedt wrote:
> On Mon, 1 May 2023 13:04:09 -0700
> Indu Bhagat <indu.bhagat@oracle.com> wrote:
>
>> This patch adds an SFrame format based stack tracer.
>>
>> The files iterate_phdr.c, iterate_phdr.h implement a dl_iterate_phdr()
>> like functionality.
>>
>> The SFrame format based stack tracer is implemented in the
>> sframe_unwind.c with architecture specific bits in the
>> arch/arm64/include/asm/sframe_regs.h and
>> arch/x86/include/asm/sframe_regs.h. Please note that the SFrame format
>> is supported for x86_64 (AMD64 ABI) and aarch64 (AAPCS64 ABI) only at
>> this time.
>>
>> The files sframe_state.[ch] implement the SFrame state management APIs.
>>
>> Some aspects of the implementation are "POC like". These will need to
>> addressed for the implementation to become more palatable:
>> - dealing with only Elf64_Phdr (no Elf32_Phdr) at this time, and other
>> TODOs in the iterate_phdr.c,
>> - detecting whether a program did a dlopen/dlclose,
>> - code stubs around user space memory access (.sframe section, ELF hdr
>> etc.) by the kernel need careful review.
>>
>> There are more aspects than above; The intention of this patch set is to
>> help drive the discussion on how to best incorporate an SFrame-based user
>> space unwinder in the kernel.
>>
>> Signed-off-by: Indu Bhagat <indu.bhagat@oracle.com>
>> ---
>> arch/arm64/include/asm/sframe_regs.h | 37 +++
>> arch/x86/include/asm/sframe_regs.h | 34 +++
>> include/sframe/sframe_regs.h | 11 +
>> include/sframe/sframe_unwind.h | 62 ++++
>> lib/sframe/Makefile | 8 +-
>> lib/sframe/iterate_phdr.c | 113 +++++++
>> lib/sframe/iterate_phdr.h | 34 +++
>> lib/sframe/sframe_state.c | 424 +++++++++++++++++++++++++++
>> lib/sframe/sframe_state.h | 80 +++++
>> lib/sframe/sframe_unwind.c | 208 +++++++++++++
>> 10 files changed, 1010 insertions(+), 1 deletion(-)
>> create mode 100644 arch/arm64/include/asm/sframe_regs.h
>> create mode 100644 arch/x86/include/asm/sframe_regs.h
>> create mode 100644 include/sframe/sframe_regs.h
>> create mode 100644 include/sframe/sframe_unwind.h
>> create mode 100644 lib/sframe/iterate_phdr.c
>> create mode 100644 lib/sframe/iterate_phdr.h
>> create mode 100644 lib/sframe/sframe_state.c
>> create mode 100644 lib/sframe/sframe_state.h
>> create mode 100644 lib/sframe/sframe_unwind.c
>>
>> diff --git a/arch/arm64/include/asm/sframe_regs.h b/arch/arm64/include/asm/sframe_regs.h
>> new file mode 100644
>> index 000000000000..ae9ab9d5d3c1
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/sframe_regs.h
>> @@ -0,0 +1,37 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#ifdef ASM_ARM64_SFRAME_REGS_H
>> +#define ASM_ARM64_SFRAME_REGS_H
>> +
>> +#define STACK_ACCESS_LEN 8
>> +
>> +static inline uint64_t
>> +get_ptregs_ip(struct pt_regs *regs)
>> +{
>> + return regs->pc;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_sp(struct pt_regs *regs)
>> +{
>> + return regs->sp;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_fp(struct pt_regs *regs)
>> +{
>> +#define UNWIND_AARCH64_X29 29 /* 64-bit frame pointer. */
>> + return (uint64_t)regs->regs[UNWIND_AARCH64_X29];
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_ra(struct pt_regs *regs)
>> +{
>> +#define UNWIND_AARCH64_X30 30 /* 64-bit link pointer. */
>> + return (uint64_t)regs->regs[UNWIND_AARCH64_X30];
>> +}
>> +
>> +#endif /* ASM_ARM64_SFRAME_REGS_H */
>> diff --git a/arch/x86/include/asm/sframe_regs.h b/arch/x86/include/asm/sframe_regs.h
>> new file mode 100644
>> index 000000000000..99f84955854a
>> --- /dev/null
>> +++ b/arch/x86/include/asm/sframe_regs.h
>> @@ -0,0 +1,34 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#ifndef ASM_X86_SFRAME_REGS_H
>> +#define ASM_X86_SFRAME_REGS_H
>> +
>> +#define STACK_ACCESS_LEN 8
>> +
>> +static inline uint64_t
>> +get_ptregs_ip(struct pt_regs *regs)
>> +{
>> + return (uint64_t)regs->ip;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_sp(struct pt_regs *regs)
>> +{
>> + return (uint64_t)regs->sp;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_fp(struct pt_regs *regs)
>> +{
>> + return (uint64_t)regs->bp;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_ra(struct pt_regs *regs)
>> +{
>> + return 0; /* SFRAME_CFA_FIXED_RA_INVALID */
>> +}
>> +#endif /* ASM_X86_SFRAME_REGS_H */
>> diff --git a/include/sframe/sframe_regs.h b/include/sframe/sframe_regs.h
>> new file mode 100644
>> index 000000000000..32b67f7a7c78
>> --- /dev/null
>> +++ b/include/sframe/sframe_regs.h
>> @@ -0,0 +1,11 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#ifndef _SFRAME_REGS_H
>> +#define _SFRAME_REGS_H
>> +
>> +#include <asm/sframe_regs.h>
>> +
>> +#endif /* _SFRAME_REGS_H */
>> diff --git a/include/sframe/sframe_unwind.h b/include/sframe/sframe_unwind.h
>> new file mode 100644
>> index 000000000000..3e2c12816b60
>> --- /dev/null
>> +++ b/include/sframe/sframe_unwind.h
>
> Also, these should probably go into include/linux, Unless there's going to
> be a lot more header files.
>
I'd expect at most the current headers:
include/sframe/sframe_unwind.h
include/sframe/sframe_regs.h
And perhaps one more, for a callchain format and callbacks suggested below ?
>> @@ -0,0 +1,62 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#ifndef _SFRAME_UNWIND_H
>> +#define _SFRAME_UNWIND_H
>> +
>> +#include <linux/sched.h>
>> +#include <linux/perf_event.h>
>> +
>> +#define PT_GNU_SFRAME 0x6474e554
>> +
>> +/*
>> + * State used for SFrame-based stack tracing for a user space task.
>> + */
>> +struct user_unwind_state {
>> + uint64_t pc, sp, fp, ra;
>
> I know this is POC, but please make each structure field a separate item.
> Also, should be tab delimited.
>
OK.
>> + enum stack_type stype;
>> + struct task_struct *task;
>> + bool error;
>> +};
>
> Also swap the task and the stype, as the pointer to the task will create a
> hole in the structure.
>
> struct user_unwind_state {
> uint64_t pc;
> uint64_t sp;
> uint64_t fp;
> uint64_t ra;
> struct task_stuct *task;
> enum stack_type stype;
> bool error;
> };
>
OK.
>> +
>> +/*
>> + * APIs for an SFrame based stack tracer.
>> + */
>> +
>> +void sframe_unwind_start(struct user_unwind_state *state,
>> + struct task_struct *task, struct pt_regs *regs);
>> +bool sframe_unwind_next_frame(struct user_unwind_state *state);
>> +uint64_t sframe_unwind_get_return_address(struct user_unwind_state *state);
>> +
>> +static inline bool sframe_unwind_done(struct user_unwind_state *state)
>> +{
>> + return state->stype == STACK_TYPE_UNKNOWN;
>> +}
>> +
>> +static inline bool sframe_unwind_error(struct user_unwind_state *state)
>> +{
>> + return state->error;
>> +}
>> +
>> +/*
>> + * APIs to manage the SFrame state per task for stack tracing.
>> + */
>> +
>> +extern struct sframe_state *unwind_sframe_state_alloc(struct task_struct *task);
>> +extern int unwind_sframe_state_update(struct task_struct *task);
>> +extern void unwind_sframe_state_cleanup(struct task_struct *task);
>> +
>> +extern bool unwind_sframe_state_valid_p(struct sframe_state *sfstate);
>> +extern bool unwind_sframe_state_ready_p(struct sframe_state *sftate);
>> +
>> +/*
>> + * Get the callchain using SFrame unwind info for the given task.
>> + */
>> +extern int
>> +sframe_callchain_user(struct task_struct *task,
>> + struct perf_callchain_entry_ctx *entry,
>> + struct pt_regs *regs);
>
>
> I plan on using this without any perf involvement, I'd like to keep perf
> separate from the sframe logic. As I mentioned in a previous email, I
> expect sframe to have callbacks. So the callchain format should be defined
> by sframe, and not reuse perf.
>
I will think about this. Do you have some model of the expected
callbacks for me to explore ?
>> +
>> +#endif /* _SFRAME_UNWIND_H */
>> diff --git a/lib/sframe/Makefile b/lib/sframe/Makefile
>> index 4e4291d9294f..5ee9e3e7ec93 100644
>> --- a/lib/sframe/Makefile
>> +++ b/lib/sframe/Makefile
>> @@ -1,5 +1,11 @@
>> # SPDX-License-Identifier: GPL-2.0
>> ##################################
>> -obj-$(CONFIG_USER_UNWINDER_SFRAME) += sframe_read.o \
>> +obj-$(CONFIG_USER_UNWINDER_SFRAME) += iterate_phdr.o \
>> + sframe_read.o \
>> + sframe_state.o \
>> + sframe_unwind.o
>
> Ah, the backslash is fixed here.
>
Yes, it was a rebase thing. It got missed when moving code between patches.
>>
>> +CFLAGS_iterate_phdr.o += -I $(srctree)/lib/sframe/ -Wno-error=declaration-after-statement
>> CFLAGS_sframe_read.o += -I $(srctree)/lib/sframe/
>> +CFLAGS_sframe_state.o += -I $(srctree)/lib/sframe/
>> +CFLAGS_sframe_unwind.o += -I $(srctree)/lib/sframe/
>> diff --git a/lib/sframe/iterate_phdr.c b/lib/sframe/iterate_phdr.c
>> new file mode 100644
>> index 000000000000..c10d590ecc67
>> --- /dev/null
>> +++ b/lib/sframe/iterate_phdr.c
>> @@ -0,0 +1,113 @@
>> +// SPDX-License-Identifier: GPL-2.0-or-later
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/elf.h>
>> +#include <linux/mm.h>
>> +#include <linux/vmalloc.h>
>> +#include <linux/mm_types.h>
>> +
>> +#include "iterate_phdr.h"
>> +
>> +/*
>> + * Iterate over the task's memory mappings and find the ELF headers.
>> + *
>> + * This is expected to be called from perf_callchain_user(), so user process
>> + * context is expected.
>
> My thought is that this will be called in the ptrace path (not the perf
> path), so yes, it will be in user process context.
>
>> + */
>> +
>> +int iterate_phdr(int (*callback)(struct phdr_info *info,
>> + struct task_struct *task,
>> + void *data),
>> + struct task_struct *task, void *data)
>> +{
>> + struct mm_struct *mm;
>> + struct vm_area_struct *vma_mt;
>> + struct page *page;
>> +
>> + Elf64_Ehdr *ehdr;
>> + struct phdr_info phinfo;
>> +
>> + int ret = 0, res = 0;
>> + int err = 0;
>> + bool first = true;
>> +
>> + memset(&phinfo, 0, sizeof(struct phdr_info));
>> +
>> + mm = task->mm;
>> +
>> + MA_STATE(mas, &mm->mm_mt, 0, 0);
>> +
>
> So this is the code I want to discuss at LSFMM :-) As there will be more
> experts about this than what I know.
>
> Let me go and start making the infrastructure to encompass this.
>
> -- Steve
>
>
>> + mas_for_each(&mas, vma_mt, ULONG_MAX) {
>> + /* ELF header has a fixed place in the file, starting at offset
>> + * zero.
>> + */
>> + if (vma_mt->vm_pgoff)
>> + continue;
>> +
>> + /* For the callback to infer if its the prog or DSO we are
>> + * dealing with.
>> + */
>> + phinfo.pi_prog = first;
>> + first = false;
>> + /* FIXME TODO
>> + * - This code assumes 64-bit ELF by using Elf64_Ehdr.
>> + * - Detect the case when ELF program headers to be of
>> + * size > 1 page.
>> + */
>> +
>> + /* FIXME TODO KERNEL
>> + * - get_user_pages_WHAT, which API.
>> + * What flags ? Is this correct ?
>> + */
>> + ret = get_user_pages_remote(mm, vma_mt->vm_start, 1, FOLL_GET,
>> + &page, &vma_mt, NULL);
>> + if (ret <= 0)
>> + continue;
>> +
>> + /* The first page must have the ELF header. */
>> + ehdr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);
>> + if (!ehdr)
>> + goto put_page;
>> +
>> + /* Check for magic bytes to make sure this is ehdr. */
>> + err = 0;
>> + err |= ((ehdr->e_ident[EI_MAG0] != ELFMAG0)
>> + || (ehdr->e_ident[EI_MAG1] != ELFMAG1)
>> + || (ehdr->e_ident[EI_MAG2] != ELFMAG2)
>> + || (ehdr->e_ident[EI_MAG3] != ELFMAG3));
>> + if (err)
>> + goto unmap;
>> +
>> + /*
>> + * FIXME TODO handle the case when number of program headers is
>> + * greater than or equal to PN_XNUM later.
>> + */
>> + if (ehdr->e_phnum == PN_XNUM)
>> + goto unmap;
>> + /*
>> + * FIXME TODO handle the case when Elf phdrs span more than one
>> + * page later ?
>> + */
>> + if ((sizeof(Elf64_Ehdr) + ehdr->e_phentsize * ehdr->e_phnum)
>> + > PAGE_SIZE)
>> + goto unmap;
>> +
>> + /* Save the location of program headers and the phnum. */
>> + phinfo.pi_addr = vma_mt->vm_start;
>> + phinfo.pi_phdr = (void *)ehdr + ehdr->e_phoff;
>> + phinfo.pi_phnum = ehdr->e_phnum;
>> +
>> + res = callback(&phinfo, task, data);
>> +unmap:
>> + vunmap(ehdr);
>> +put_page:
>> + put_page(page);
>> +
>> + if (res < 0)
>> + break;
>> + }
>> +
>> + return res;
>> +}
>>
next prev parent reply other threads:[~2023-05-02 6:17 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-01 20:04 [POC 0/5] SFrame based stack tracer for user space in the kernel Indu Bhagat
2023-05-01 20:04 ` [POC 1/5] Kconfig: x86: Add new config options for userspace unwinder Indu Bhagat
2023-05-01 20:04 ` [POC 2/5] task_struct : add additional member for sframe state Indu Bhagat
2023-05-01 20:04 ` [POC 3/5] sframe: add new SFrame library Indu Bhagat
2023-05-01 22:40 ` Steven Rostedt
2023-05-02 5:07 ` Indu Bhagat
2023-05-02 8:46 ` Peter Zijlstra
2023-05-02 9:09 ` Peter Zijlstra
2023-05-02 9:20 ` Peter Zijlstra
2023-05-02 9:28 ` Peter Zijlstra
2023-05-02 9:30 ` Peter Zijlstra
2023-05-03 6:03 ` Indu Bhagat
2023-05-02 10:31 ` Peter Zijlstra
2023-05-02 10:41 ` Peter Zijlstra
2023-05-02 15:22 ` Steven Rostedt
2023-05-01 20:04 ` [POC 4/5] sframe: add an SFrame format stack tracer Indu Bhagat
2023-05-01 23:00 ` Steven Rostedt
2023-05-02 6:16 ` Indu Bhagat [this message]
2023-05-02 8:53 ` Peter Zijlstra
2023-05-02 9:04 ` Peter Zijlstra
2023-05-01 20:04 ` [POC 5/5] x86_64: invoke SFrame based stack tracer for user space Indu Bhagat
2023-05-01 23:11 ` Steven Rostedt
2023-05-02 10:53 ` Peter Zijlstra
2023-05-02 15:27 ` Steven Rostedt
2023-05-16 17:25 ` Andrii Nakryiko
2023-05-16 17:38 ` Steven Rostedt
2023-05-16 17:51 ` Andrii Nakryiko
2024-03-13 14:37 ` Tatsuyuki Ishi
2024-03-13 14:52 ` Steven Rostedt
2024-03-13 14:58 ` Tatsuyuki Ishi
2024-03-13 15:04 ` Steven Rostedt
2023-05-01 22:15 ` [POC 0/5] SFrame based stack tracer for user space in the kernel Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=84bf4aee-e5a5-8e49-3826-ecb79eefc85b@oracle.com \
--to=indu.bhagat@oracle.com \
--cc=andrii@kernel.org \
--cc=daandemeyer@meta.com \
--cc=elena.zannoni@oracle.com \
--cc=kris.van.hees@oracle.com \
--cc=linux-toolchains@vger.kernel.org \
--cc=nick.alcock@oracle.com \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).