From: Ard Biesheuvel <ardb@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	James Morse <james.morse@arm.com>,
	David Laight <David.Laight@aculab.com>,
	Quentin Perret <qperret@google.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>
Subject: Re: [PATCH 2/4] arm64: implement support for static call trampolines
Date: Mon, 25 Oct 2021 16:55:17 +0200
Message-ID: <CAMj1kXEKASsYJMHHNA=uNGTnLMoXO_4BP0--1k7cEfZZupdsog@mail.gmail.com>
In-Reply-To: <YXbC3NRWDDfsW6DG@hirez.programming.kicks-ass.net>

On Mon, 25 Oct 2021 at 16:47, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Oct 25, 2021 at 04:19:16PM +0200, Peter Zijlstra wrote:
> > On Mon, Oct 25, 2021 at 04:08:37PM +0200, Ard Biesheuvel wrote:
>
> > > > Ooohh, but what if you go from !func to NOP.
> > > >
> > > > assuming:
> > > >
> > > >         .literal = 0
> > > >         BTI C
> > > >         RET
> > > >
> > > > Then
> > > >
> > > >         CPU0                    CPU1
> > > >
> > > >         [S] literal = func      [I] NOP
> > > >         [S] insn[1] = NOP       [L] x16 = literal (NULL)
> > > >                                 b x16
> > > >                                 *BANG*
> > > >
> > > > Is that possible? (total lack of memory ordering etc..)
> > > >
> > >
> > > The CBZ will branch to the RET instruction if x16 == 0x0, so this
> > > should not happen.
> >
> > Oooh, I missed that :/ I was about to suggest writing the address of a
> > bare 'ret' trampoline instead of NULL into the literal.
>
> Perhaps a little something like so.. Shaves 2 instructions off each
> trampoline.
>
> --- a/arch/arm64/include/asm/static_call.h
> +++ b/arch/arm64/include/asm/static_call.h
> @@ -11,9 +11,7 @@
>             "   hint    34      /* BTI C */                             \n" \
>                 insn "                                                  \n" \
>             "   ldr     x16, 0b                                         \n" \
> -           "   cbz     x16, 1f                                         \n" \
>             "   br      x16                                             \n" \
> -           "1: ret                                                     \n" \
>             "   .popsection                                             \n")
>
>  #define ARCH_DEFINE_STATIC_CALL_TRAMP(name, func)                      \
> --- a/arch/arm64/kernel/patching.c
> +++ b/arch/arm64/kernel/patching.c
> @@ -90,6 +90,11 @@ int __kprobes aarch64_insn_write(void *a
>         return __aarch64_insn_write(addr, &i, AARCH64_INSN_SIZE);
>  }
>
> +asm("__static_call_ret:                \n"
> +    "  ret                     \n")
> +

This breaks BTI: the stub will be reached via an indirect branch ('br x16'),
but it has no BTI C landing pad.
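
Something along these lines would be needed instead (rough, untested
sketch; the .pushsection/.align directives are my addition, the 'hint 34'
encoding is the same one the trampoline macro already uses):

asm("	.pushsection	.text, \"ax\"		\n"
    "	.align		2			\n"
    "__static_call_ret:				\n"
    "	hint	34	/* BTI C */		\n"
    "	ret					\n"
    "	.popsection				\n");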

> +extern void __static_call_ret(void);
> +

Better to have an ordinary C function here (with consistent linkage),
but we need to take the address in a way that works with Clang CFI.
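
E.g., something like the below (only a sketch; whether function_nocfi()
is sufficient here to get the actual function address rather than the
CFI jump table entry, and whether __used is needed, would have to be
confirmed):

/*
 * The compiler emits the BTI C landing pad for us when the kernel is
 * built with branch protection; __used keeps the symbol around in case
 * the only reference to it ends up coming from asm.
 */
static void __used __static_call_ret(void)
{
}

and then in arch_static_call_transform():

	if (!func) {
		insns.literal = (unsigned long)function_nocfi(__static_call_ret);
		insn = aarch64_insn_gen_branch_reg(AARCH64_INSN_REG_LR,
						   AARCH64_INSN_BRANCH_RETURN);
	} else {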

As the two additional instructions are on an ice-cold path anyway, I'm
not sure this is an obvious improvement, to be honest.

>  void arch_static_call_transform(void *site, void *tramp, void *func, bool tail)
>  {
>         /*
> @@ -97,9 +102,7 @@ void arch_static_call_transform(void *si
>          *  0x0 bti c           <--- trampoline entry point
>          *  0x4 <branch or nop>
>          *  0x8 ldr x16, <literal>
> -        *  0xc cbz x16, 20
> -        * 0x10 br x16
> -        * 0x14 ret
> +        *  0xc br x16
>          */
>         struct {
>                 u64     literal;
> @@ -113,6 +116,7 @@ void arch_static_call_transform(void *si
>         insns.insn[0] = cpu_to_le32(insn);
>
>         if (!func) {
> +               insns.literal = (unsigned long)&__static_call_ret;
>                 insn = aarch64_insn_gen_branch_reg(AARCH64_INSN_REG_LR,
>                                                    AARCH64_INSN_BRANCH_RETURN);
>         } else {
