From: Andy Lutomirski <luto@amacapital.net>
To: Nadav Amit <namit@vmware.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andrew Lutomirski <luto@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Masami Hiramatsu <mhiramat@kernel.org>
Subject: Re: [PATCH v3 2/7] x86/jump_label: Use text_poke_early() during early_init
Date: Mon, 5 Nov 2018 12:05:57 -0800
Message-ID: <CALCETrXwci8L=53UTq8=LEs+5yPgdHnidFf6iz3wZ0SkZwd1Eg@mail.gmail.com>
In-Reply-To: <8DF7BED8-F1B2-4102-9452-46437D3E4FC6@vmware.com>

On Mon, Nov 5, 2018 at 11:25 AM Nadav Amit <namit@vmware.com> wrote:
>
> From: Andy Lutomirski
> Sent: November 5, 2018 at 7:03:49 PM GMT
> > To: Nadav Amit <namit@vmware.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>, H. Peter Anvin <hpa@zytor.com>, Thomas Gleixner <tglx@linutronix.de>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, Andy Lutomirski <luto@kernel.org>, Kees Cook <keescook@chromium.org>, Dave Hansen <dave.hansen@intel.com>, Masami Hiramatsu <mhiramat@kernel.org>
> > Subject: Re: [PATCH v3 2/7] x86/jump_label: Use text_poke_early() during early_init
> >
> >
> >
> >
> >> On Nov 5, 2018, at 9:49 AM, Nadav Amit <namit@vmware.com> wrote:
> >>
> >> From: Andy Lutomirski
> >> Sent: November 5, 2018 at 5:22:32 PM GMT
> >>> To: Peter Zijlstra <peterz@infradead.org>
> >>> Cc: Nadav Amit <namit@vmware.com>, Ingo Molnar <mingo@redhat.com>, linux-kernel@vger.kernel.org, x86@kernel.org, H. Peter Anvin <hpa@zytor.com>, Thomas Gleixner <tglx@linutronix.de>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, Andy Lutomirski <luto@kernel.org>, Kees Cook <keescook@chromium.org>, Dave Hansen <dave.hansen@intel.com>, Masami Hiramatsu <mhiramat@kernel.org>
> >>> Subject: Re: [PATCH v3 2/7] x86/jump_label: Use text_poke_early() during early_init
> >>>
> >>>
> >>>
> >>>>> On Nov 5, 2018, at 6:09 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >>>>>
> >>>>> On Fri, Nov 02, 2018 at 04:29:41PM -0700, Nadav Amit wrote:
> >>>>> diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
> >>>>> index aac0c1f7e354..367c1d0c20a3 100644
> >>>>> --- a/arch/x86/kernel/jump_label.c
> >>>>> +++ b/arch/x86/kernel/jump_label.c
> >>>>> @@ -52,7 +52,13 @@ static void __ref __jump_label_transform(struct jump_entry *entry,
> >>>>> jmp.offset = jump_entry_target(entry) -
> >>>>>          (jump_entry_code(entry) + JUMP_LABEL_NOP_SIZE);
> >>>>>
> >>>>> -    if (early_boot_irqs_disabled)
> >>>>> +    /*
> >>>>> +     * As long as we are in early boot, we can use text_poke_early(), which
> >>>>> +     * is more efficient: the memory was still not marked as read-only (it
> >>>>> +     * is only marked after poking_init()). This also prevents us from using
> >>>>> +     * text_poke() before poking_init() is called.
> >>>>> +     */
> >>>>> +    if (!early_boot_done)
> >>>>>     poker = text_poke_early;
> >>>>>
> >>>>> if (type == JUMP_LABEL_JMP) {
> >>>>
> >>>> It took me a while to untangle init/maze^H^Hin.c... but I think this
> >>>> is all we need:
> >>>>
> >>>> diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
> >>>> index aac0c1f7e354..ed5fe274a7d8 100644
> >>>> --- a/arch/x86/kernel/jump_label.c
> >>>> +++ b/arch/x86/kernel/jump_label.c
> >>>> @@ -52,7 +52,12 @@ static void __ref __jump_label_transform(struct jump_entry *entry,
> >>>> jmp.offset = jump_entry_target(entry) -
> >>>>          (jump_entry_code(entry) + JUMP_LABEL_NOP_SIZE);
> >>>>
> >>>> -    if (early_boot_irqs_disabled)
> >>>> +    /*
> >>>> +     * As long as we're UP and not yet marked RO, we can use
> >>>> +     * text_poke_early; SYSTEM_BOOTING guarantees both, as we switch to
> >>>> +     * SYSTEM_SCHEDULING before going either.
> >>>> +     */
> >>>> +    if (system_state == SYSTEM_BOOTING)
> >>>>     poker = text_poke_early;
> >>>>
> >>>> if (type == JUMP_LABEL_JMP) {
> >>>
> >>> Can we move this logic into text_poke() and get rid of text_poke_early()?
> >>
> >> This will negatively affect poking of modules during module loading, e.g.,
> >> apply_paravirt(). This can be resolved by keeping track of when the module
> >> is write-protected and passing a module parameter to text_poke(). Is it
> >> worth the complexity?
> >
> > Probably not.
> >
> > OTOH, why does alternative patching need text_poke() at all? Can’t it just
> > write to the text?
>
> Good question. According to my understanding, these games of
> text_poke_early() are not needed, at least for modules (on Intel).
>
> Intel SDM 11.6 "SELF-MODIFYING CODE" says:
>
> "A write to a memory location in a code segment that is currently cached in
> the processor causes the associated cache line (or lines) to be invalidated.
> This check is based on the physical address of the instruction."
>
> Then the manual talks about prefetched instructions, but the module's code
> is presumably not "prefetchable" at this point. So I think it should be
> safe, but I guess that you reviewed the Intel/AMD manuals more closely when
> you wrote sync_core().

Beats the heck out of me.

Linus, hpa, or Dave, a question for you: suppose I map some page
writably, write to it, then upgrade permissions to allow execute.
Must I then force every CPU that might execute from that page, and
that would not otherwise serialize first, to serialize?  I suspect
this doesn't really affect user code, but it may affect the module
loader.

To be safe, shouldn't the module loader broadcast an IPI to
sync_core() everywhere after loading a module and before making it
runnable, regardless of alternative patching?

IOW, the right sequence of events probably ought to be:

1. Allocate the memory and map it.
2. Copy in the text.
3. Patch alternatives, etc.  This is logically just like (2) from an
architectural perspective -- we're just writing to memory that won't
be executed.
4. Serialize everything.
5. Run it!
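
As a rough user-space analogy of the sequence above (a sketch only, not
module-loader code): the helper name load_patch_and_run() and the
six-byte stub are made up for illustration, and the example assumes an
x86 Linux-like system with mmap()/mprotect().  It does not answer the
cross-CPU serialization question; step 4 here is only the permission
upgrade, and whatever serialization happens is left to the OS and the
hardware.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map RW, copy text, patch it, flip to RX, run it.
 * Returns 0 and stores the stub's return value in *out on success,
 * -1 if this is not x86 or the system refuses one of the mappings.
 */
int load_patch_and_run(unsigned int imm, unsigned int *out)
{
	assert(out != NULL);
#if defined(__x86_64__) || defined(__i386__)
	/* mov $0, %eax ; ret -- the immediate is filled in at step 3 */
	static const unsigned char stub[] = { 0xb8, 0, 0, 0, 0, 0xc3 };
	const size_t len = 4096;
	unsigned int (*fn)(void);
	unsigned char *text;

	/* 1. Allocate the memory and map it (writable, not executable). */
	text = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (text == MAP_FAILED)
		return -1;

	/* 2. Copy in the text. */
	memcpy(text, stub, sizeof(stub));

	/* 3. "Patch": a plain write to memory that nothing executes yet. */
	memcpy(text + 1, &imm, sizeof(imm));

	/*
	 * 4. Upgrade permissions to read+execute.  Assumption: no explicit
	 *    serialization is issued here; a kernel doing this for all CPUs
	 *    is the open question in this thread.
	 */
	if (mprotect(text, len, PROT_READ | PROT_EXEC) != 0) {
		munmap(text, len);
		return -1;
	}

	/* 5. Run it. */
	memcpy(&fn, &text, sizeof(fn));
	*out = fn();

	munmap(text, len);
	return 0;
#else
	(void)imm;
	return -1;
#endif
}
```

Calling load_patch_and_run(42, &v) should leave 42 in v when the call
returns 0; a return of -1 just means the platform would not play along.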

>
> Anyhow, there should be a function that wraps the memcpy() to keep track
> when someone changes the text (for potential future use).
>
> Does it make sense? Do you want me to give it a spin?

Sure, I guess.  Linus, what do you think?
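
For concreteness, such a wrapper could be as small as the sketch below.
Everything in it is hypothetical: text_copy_tracked() and struct
text_write_log are names invented here, not existing kernel interfaces,
and what exactly should be tracked is an open design choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bookkeeping; none of these names exist in the kernel. */
struct text_write_log {
	unsigned long nr_writes;	/* how many times text was modified */
	const void *last_addr;		/* destination of the latest write */
	size_t last_len;		/* length of the latest write */
};

struct text_write_log text_log;

/* Wrap memcpy() so every write to not-yet-live text is recorded. */
void *text_copy_tracked(void *addr, const void *opcode, size_t len)
{
	assert(addr != NULL && opcode != NULL);

	text_log.nr_writes++;
	text_log.last_addr = addr;
	text_log.last_len = len;

	return memcpy(addr, opcode, len);
}
```

Callers that currently memcpy() into module text would call the wrapper
instead, which gives one place to hang future checks or tracing.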

>
> Thanks,
> Nadav



-- 
Andy Lutomirski
AMA Capital Management, LLC
