linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Florent Revest <revest@chromium.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	bpf@vger.kernel.org, catalin.marinas@arm.com, will@kernel.org,
	rostedt@goodmis.org, mhiramat@kernel.org, ast@kernel.org,
	daniel@iogearbox.net, andrii@kernel.org, kpsingh@kernel.org,
	jolsa@kernel.org, xukuohai@huaweicloud.com
Subject: Re: [PATCH 7/8] arm64: ftrace: Add direct call support
Date: Mon, 6 Feb 2023 17:25:41 +0100	[thread overview]
Message-ID: <CABRcYmJ7v0r+Pi5podwX0=zJQPY3S1mWWAdTdtFDVTCVy_PqiQ@mail.gmail.com> (raw)
In-Reply-To: <Y90phLJfnz06Ilb+@FVFF77S0Q05N>

On Fri, Feb 3, 2023 at 4:34 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, Feb 01, 2023 at 05:34:19PM +0100, Florent Revest wrote:
> > This builds up on the CALL_OPS work which extends the ftrace patchsite
> > on arm64 with an ops pointer usable by the ftrace trampoline.
> >
> > This ops pointer is valid at all time. Indeed, it is either pointing to
> > ftrace_list_ops or to the single ops which should be called from that
> > patchsite.
> >
> > There are a few cases to distinguish:
> > - If a direct call ops is the only one tracing a function:
> >   - If the direct called trampoline is within the reach of a BL
> >     instruction
> >      -> the ftrace patchsite jumps to the trampoline
> >   - Else
> >      -> the ftrace patchsite jumps to the ftrace_caller trampoline which
> >         reads the ops pointer in the patchsite and jumps to the direct
> >         call address stored in the ops
> > - Else
> >   -> the ftrace patchsite jumps to the ftrace_caller trampoline and its
> >      ops literal points to ftrace_list_ops so it iterates over all
> >      registered ftrace ops, including the direct call ops and calls its
> >      call_direct_funcs handler which stores the direct called
> >      trampoline's address in the ftrace_regs and the ftrace_caller
> >      trampoline will return to that address instead of returning to the
> >      traced function
>
> This looks pretty good!
>
> Overall I think this is the right shape, but I have a few minor comments that
> lead to a bit of churn. I've noted those below, and I've also pushed out a
> branch with suggested fixups (as discrete patches) to my
> arm64/ftrace/direct-call-testing branch, which you can find at:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
>   git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git
>
> Note that's based on a merge of the arm64 tree's for-next/ftrace branch and the
> trace tree's trace/for-next branch, and there were a couple of trivial
> conflicts I had to fix up when first picking this series (which I've noted in
> the affected patches)
>
> Those trees are at:
>
>   # arm64
>   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
>   git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
>
>   # trace
>   https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
>   git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
>

Thanks for taking the time to publish these, that helps a lot.

> > @@ -80,6 +80,10 @@ struct ftrace_regs {
> >
> >       unsigned long sp;
> >       unsigned long pc;
> > +
> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> > +     unsigned long custom_tramp;
>
> Minor nit, but could we please s/custom_tramp/direct_tramp/?

Yes, that's better

> I forgot to add a comment here, but we require that sizeof(struct ftrace_regs)
> is a multiple of 16, as the AAPCS requires that the stack is 16-byte aligned at
> function call boundaries (and on arm64 we currently assume that it's *always*
> aligned).
>
> I'd suggest we make this:
>
> | /*
> |  * Note: sizeof(struct ftrace_regs) must be a multiple of 16 to ensure correct
> |  * stack alignment
> |  */
> | struct ftrace_regs {
> |       /* x0 - x8 */
> |       unsigned long regs[9];
> |
> | #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> |       unsigned long direct_tramp;
> | #else
> |       unsigned long __unused;
> | #endif
> |
> |       unsigned long fp;
> |       unsigned long lr;
> |
> |       unsigned long sp;
> |       unsigned long pc;
> | };

Oh good catch, that was easy to miss. I'm surprised I didn't hit
horrible bugs when testing this in qemu.

> For branching to the direct call trampoline, it would be better to use a
> forward branch (i.e. BR), as that won't unbalance return stack predictors.
> That'll need to use x16 or x17 to be compatible with a `BTI C` landing pad.
>
> I'd also prefer if we could move the direct call invocation to a stub at the
> end of ftrace_caller, e.g.
>
> | SYM_CODE_START(ftrace_caller)
> |       ... discover ops pointer here ...
> |
> |       ldr     w17, [x11, #FTRACE_OPS_DIRECT_CALL]
> |       cbnz    ftrace_caller_direct
> |
> |       ... usual save and handling logic here ...
> |
> |       // just prior to return
> |       ldr     x17, [sp, #FREGS_DIRECT_CALL]
> |       cbnz    ftrace_caller_direct_late
> |
> |       ...  usual restore logic here ...
> |
> |       ret     x9
> |
> | SYM_INNER_LABEL(ftrace_caller_direct_late, SYM_L_LOCAL)
> |
> |       ... shuffle registers for the trampoline here ...
> |
> |       // fallthrough to the usual invocation
> | SYM_INNER_LABEL(ftrace_caller_direct, SYM_L_LOCAL)
> |       br      x17
> | SYM_CODE_END(ftrace_caller)

Agreed, it makes things a lot easier to read and your branch makes
restoring LR and PC more sane too. I squashed your change in and will
send it as part of v2.

> > @@ -220,14 +227,21 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
> >                                     unsigned long *addr)
> >  {
> >       unsigned long pc = rec->ip;
> > -     long offset = (long)*addr - (long)pc;
> >       struct plt_entry *plt;
> >
> > +     /*
> > +      * If a custom trampoline is unreachable, rely on the ftrace_caller
> > +      * trampoline which knows how to indirectly reach that trampoline
> > +      * through ops->direct_call.
> > +      */
> > +     if (*addr != FTRACE_ADDR && !reachable_by_bl(*addr, pc))
> > +             *addr = FTRACE_ADDR;
> > +
>
> With this, the check in get_ftrace_plt() is now redundant, since that's always
> called with FTRACE_ADDR as the 'addr' argument. We could probably clean that
> up, but I'm happy to leave that for now and handle that as a follow-up, since
> I think that'll imply making some other structural changes, which qould obscure
> the bulk of this patch.

Fair enough, I'll add an extra patch for this in v2. It's fairly
simple and if other maintainers think it's too much for the scope of
the series, we can always pick it up later and merge it separately.

  reply	other threads:[~2023-02-06 16:26 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-01 16:34 [PATCH 0/8] Add ftrace direct call for arm64 Florent Revest
2023-02-01 16:34 ` [PATCH 1/8] ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi Florent Revest
2023-02-02 15:01   ` Mark Rutland
2023-02-02 17:37     ` Florent Revest
2023-02-07 15:21       ` Florent Revest
2023-02-07 15:35         ` Steven Rostedt
2023-02-07 16:19           ` Florent Revest
2023-02-01 16:34 ` [PATCH 2/8] ftrace: Remove the legacy _ftrace_direct API Florent Revest
2023-02-02 15:11   ` Mark Rutland
2023-02-01 16:34 ` [PATCH 3/8] ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs Florent Revest
2023-02-02 15:17   ` Mark Rutland
2023-02-01 16:34 ` [PATCH 4/8] ftrace: Store direct called addresses in their ops Florent Revest
2023-02-02 15:29   ` Mark Rutland
2023-02-02 17:41     ` Florent Revest
2023-02-01 16:34 ` [PATCH 5/8] ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS Florent Revest
2023-02-02 15:54   ` Mark Rutland
2023-02-02 16:56     ` Mark Rutland
2023-02-02 18:19       ` Florent Revest
2023-02-03 10:03         ` Mark Rutland
2023-02-03 11:01           ` Florent Revest
2023-02-02 18:18     ` Florent Revest
2023-02-01 16:34 ` [PATCH 6/8] ftrace: Fix dead loop caused by direct call in ftrace selftest Florent Revest
2023-02-02 19:03   ` Mark Rutland
2023-02-03 12:35     ` Florent Revest
2023-02-03 15:37       ` Mark Rutland
2023-02-06 16:25         ` Florent Revest
2023-02-01 16:34 ` [PATCH 7/8] arm64: ftrace: Add direct call support Florent Revest
2023-02-03 15:34   ` Mark Rutland
2023-02-06 16:25     ` Florent Revest [this message]
2023-02-01 16:34 ` [PATCH 8/8] arm64: ftrace: Add direct called trampoline samples support Florent Revest
2023-02-02  8:36 ` [PATCH 0/8] Add ftrace direct call for arm64 Xu Kuohai
2023-02-02 10:50   ` Daniel Borkmann
2023-02-02 17:32     ` Florent Revest
2023-02-02 20:06 ` Steven Rostedt
2023-02-03  9:49   ` Mark Rutland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABRcYmJ7v0r+Pi5podwX0=zJQPY3S1mWWAdTdtFDVTCVy_PqiQ@mail.gmail.com' \
    --to=revest@chromium.org \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=daniel@iogearbox.net \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=will@kernel.org \
    --cc=xukuohai@huaweicloud.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).