linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Nadav Amit <namit@vmware.com>, Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Jason Baron <jbaron@akamai.com>, Jiri Kosina <jkosina@suse.cz>,
	David Laight <David.Laight@aculab.com>,
	Borislav Petkov <bp@alien8.de>, Julia Cartwright <julia@ni.com>,
	Jessica Yu <jeyu@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Edward Cree <ecree@solarflare.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>
Subject: Re: [PATCH v3 0/6] Static calls
Date: Fri, 11 Jan 2019 11:03:30 -0800	[thread overview]
Message-ID: <CAHk-=wjJm8DpCsw=Wno01q4VFqUeiLKE8QmbAtUJurYhn3jRqA@mail.gmail.com> (raw)
In-Reply-To: <20190111151525.tf7lhuycyyvjjxez@treble>

On Fri, Jan 11, 2019 at 7:15 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> >
> > Now, in the int3 handler can you take the faulting RIP and search for it in
> > the “static-calls” table, writing the RIP+5 (offset) into R10 (return
> > address) and the target into R11. You make the int3 handler to divert the
> > code execution by changing pt_regs->rip to point to a new function that does:
> >
> >       push R10
> >       jmp __x86_indirect_thunk_r11
> >
> > And then you are done. No?
>
> IIUC, that sounds pretty much like what Steven proposed:
>
>   https://lkml.kernel.org/r/20181129122000.7fb4fb04@gandalf.local.home
>
> I liked the idea, BUT, how would it work for callee-saved PV ops?  In
> that case there's only one clobbered register to work with (rax).

Actually, there's a much simpler model now that I think about it.

The BP fixup just fixes up %rip to to point to "bp_int3_handler".

And that's just a random text address set up by "text_poke_bp()".

So how about the static call rewriting simply do this:

 - for each static call:

 1)   create a fixup code stub that does

        push $returnaddressforTHIScall
        jmp targetforTHIScall

 2) do

        on_each_cpu(do_sync_core, NULL, 1);

     to make sure all CPU's see this generated code

  3) do

        text_poke_bp(addr, newcode, newlen, generatedcode);

Ta-daa! Done.

In fact, it turns out that even the extra "do_sync_core()" in #2 is
unnecessary, because taking the BP will be serializing on the CPU that
takes it, so we can skip it.

End result: the text_poke_bp() function will do the two do_sync_core
IPI's that guarantee that by the time it returns, no other CPU is
using the generated code any more, so it can be re-used for the next
static call fixup.

Notice? No odd emulation, no need to adjust the stack in the BP
handler, just the regular "return to a different IP".

Now, there is a nasty special case with that stub, though.

So nasty thing with the whole "generate a stub for each call" case:
because it's dynamic and because of the re-use of the stub, you could
be in the situation where:

  CPU1                  CPU2
  ----                  ----

  generate a stub
  on_each_cpu(do_sync_core..)
  text_poke_bp()
  ...

  rewrite to BP
                        trigger the BP
                        return to the stub
                        fun the first instruction of the stub
                        *INTERRUPT causes rescheduling*

  on_each_cpu(do_sync_core..)
  rewrite to good instruction
  on_each_cpu(do_sync_core..)

  free or re-generate the stub

                        !! The stub is still in use !!

So that simple "just generate the stub dynamically" isn't so simple after all.

But it turns out that that is really simple to handle too. How do we do that?

We do that by giving the BP handler *two* code sequences, and we make
the BP handler pick one depending on whether it is returning to a
"interrupts disabled" or "interrupts enabled" case.

So the BP handler does this:

 - if we're returning with interrupts disabled, pick the simple stub

 - if we're returning with interrupts enabled, clkear IF in the return
%rflags, and pick a *slightly* more complex stub:

        push $returnaddressforTHIScall
        sti
        jmp targetforTHIScall

and now the STI shadow will mean that this sequence is uninterruptible.

So we'd not do complex emulation of the call instruction at BP time,
but we'd do that *trivial* change at BP time.

This seems simple, doesn't need any temporary registers at all, and
doesn't need any extra stack magic. It literally needs just a trivial
sequence in poke_int3_handler().

The we'd change the end of poke_int3_handler() to do something like
this instead:

        void *newip = bp_int3_handler;
        ..
        if (new == magic_static_call_bp_int3_handler) {
                if (regs->flags &X86_FLAGS_IF) {
                        newip = magic_static_call_bp_int3_handler_sti;
                        regs->flags &= ~X86_FLAGS_IF;
        }
        regs->ip = (unsigned long) newip;
        return 1;

AAND now we're *really* done.

Does anybody see any issues in this?

              Linus

  parent reply	other threads:[~2019-01-11 19:03 UTC|newest]

Thread overview: 90+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-09 22:59 [PATCH v3 0/6] Static calls Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 1/6] compiler.h: Make __ADDRESSABLE() symbol truly unique Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 2/6] static_call: Add basic static call infrastructure Josh Poimboeuf
2019-01-10 14:03   ` Edward Cree
2019-01-10 18:37     ` Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 3/6] x86/static_call: Add out-of-line static call implementation Josh Poimboeuf
2019-01-10  0:16   ` Nadav Amit
2019-01-10 16:28     ` Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 4/6] static_call: Add inline static call infrastructure Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 5/6] x86/alternative: Use a single access in text_poke() where possible Josh Poimboeuf
2019-01-10  9:32   ` Nadav Amit
2019-01-10 17:20     ` Josh Poimboeuf
2019-01-10 17:29       ` Nadav Amit
2019-01-10 17:32       ` Steven Rostedt
2019-01-10 17:42         ` Sean Christopherson
2019-01-10 17:57           ` Steven Rostedt
2019-01-10 18:04             ` Sean Christopherson
2019-01-10 18:21               ` Josh Poimboeuf
2019-01-10 18:24               ` Steven Rostedt
2019-01-11 12:10               ` Alexandre Chartre
2019-01-11 15:28                 ` Josh Poimboeuf
2019-01-11 16:46                   ` Alexandre Chartre
2019-01-11 16:57                     ` Josh Poimboeuf
2019-01-11 17:41                       ` Jason Baron
2019-01-11 17:54                         ` Nadav Amit
2019-01-15 11:10                       ` Alexandre Chartre
2019-01-15 16:19                         ` Steven Rostedt
2019-01-15 16:45                           ` Alexandre Chartre
2019-01-11  0:59           ` hpa
2019-01-11  1:34             ` Sean Christopherson
2019-01-11  8:13               ` hpa
2019-01-09 22:59 ` [PATCH v3 6/6] x86/static_call: Add inline static call implementation for x86-64 Josh Poimboeuf
2019-01-10  1:21 ` [PATCH v3 0/6] Static calls Nadav Amit
2019-01-10 16:44   ` Josh Poimboeuf
2019-01-10 17:32     ` Nadav Amit
2019-01-10 18:18       ` Josh Poimboeuf
2019-01-10 19:45         ` Nadav Amit
2019-01-10 20:32           ` Josh Poimboeuf
2019-01-10 20:48             ` Nadav Amit
2019-01-10 20:57               ` Josh Poimboeuf
2019-01-10 21:47                 ` Nadav Amit
2019-01-10 17:31 ` Linus Torvalds
2019-01-10 20:51   ` H. Peter Anvin
2019-01-10 20:30 ` Peter Zijlstra
2019-01-10 20:52   ` Josh Poimboeuf
2019-01-10 23:02     ` Linus Torvalds
2019-01-11  0:56       ` Andy Lutomirski
2019-01-11  1:47         ` Nadav Amit
2019-01-11 15:15           ` Josh Poimboeuf
2019-01-11 15:48             ` Nadav Amit
2019-01-11 16:07               ` Josh Poimboeuf
2019-01-11 17:23                 ` Nadav Amit
2019-01-11 19:03             ` Linus Torvalds [this message]
2019-01-11 19:17               ` Nadav Amit
2019-01-11 19:23               ` hpa
2019-01-11 19:33                 ` Nadav Amit
2019-01-11 19:34                 ` Linus Torvalds
2019-01-13  0:34                   ` hpa
2019-01-13  0:36                   ` hpa
2019-01-11 19:39                 ` Jiri Kosina
2019-01-14  2:31                   ` H. Peter Anvin
2019-01-14  2:40                     ` H. Peter Anvin
2019-01-14 20:11                       ` Andy Lutomirski
2019-01-14 22:00                       ` H. Peter Anvin
2019-01-14 22:54                         ` H. Peter Anvin
2019-01-15  3:05                           ` Andy Lutomirski
2019-01-15  5:01                             ` H. Peter Anvin
2019-01-15  5:37                               ` H. Peter Anvin
2019-01-14 23:27                         ` Andy Lutomirski
2019-01-14 23:51                           ` Nadav Amit
2019-01-15  2:28                           ` hpa
2019-01-11 20:04               ` Josh Poimboeuf
2019-01-11 20:12                 ` Linus Torvalds
2019-01-11 20:31                   ` Josh Poimboeuf
2019-01-11 20:46                     ` Linus Torvalds
2019-01-11 21:05                       ` Andy Lutomirski
2019-01-11 21:10                         ` Linus Torvalds
2019-01-11 21:32                           ` Josh Poimboeuf
2019-01-14 12:28                         ` Peter Zijlstra
2019-01-11 21:22                       ` Josh Poimboeuf
2019-01-11 21:23                         ` Josh Poimboeuf
2019-01-11 21:25                         ` Josh Poimboeuf
2019-01-11 21:36                         ` Nadav Amit
2019-01-11 21:41                           ` Josh Poimboeuf
2019-01-11 21:55                             ` Steven Rostedt
2019-01-11 21:59                               ` Nadav Amit
2019-01-11 21:56                             ` Nadav Amit
2019-01-12 23:54                         ` Andy Lutomirski
2020-02-17 21:10     ` Jann Horn
2020-02-17 21:57       ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wjJm8DpCsw=Wno01q4VFqUeiLKE8QmbAtUJurYhn3jRqA@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=David.Laight@aculab.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=bp@alien8.de \
    --cc=bristot@redhat.com \
    --cc=ecree@solarflare.com \
    --cc=hpa@zytor.com \
    --cc=jbaron@akamai.com \
    --cc=jeyu@kernel.org \
    --cc=jkosina@suse.cz \
    --cc=jpoimboe@redhat.com \
    --cc=julia@ni.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=luto@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=namit@vmware.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).