linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "H. Peter Anvin" <hpa@zytor.com>
To: Jiri Kosina <jikos@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Nadav Amit <namit@vmware.com>, Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Jason Baron <jbaron@akamai.com>,
	David Laight <David.Laight@aculab.com>,
	Borislav Petkov <bp@alien8.de>, Julia Cartwright <julia@ni.com>,
	Jessica Yu <jeyu@kernel.org>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Edward Cree <ecree@solarflare.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>
Subject: Re: [PATCH v3 0/6] Static calls
Date: Mon, 14 Jan 2019 14:00:20 -0800	[thread overview]
Message-ID: <9f60be8c-47fb-195b-fdb4-4098f1df3dc2@zytor.com> (raw)
In-Reply-To: <207c865e-a92a-1647-b1b0-363010383cc3@zytor.com>

So I was already in the middle of composing this message when Andy posted:

> I don't even think this is sufficient.  I think we also need everyone
> who clears the bit to check if all bits are clear and, if so, remove
> the breakpoint.  Otherwise we have a situation where, if you are in
> text_poke_bp() and you take an NMI (or interrupt or MCE or whatever)
> and that interrupt then hits the breakpoint, then you deadlock because
> no one removes the breakpoint.
> 
> If we do this, and if we can guarantee that all CPUs make forward
> progress, then maybe the problem is solved. Can we guarantee something
> like all NMI handlers that might wait in a spinlock or for any other
> reason will periodically check if a sync is needed while they're
> spinning?

So the really, really nasty case is when an asynchronous event on the
*patching* processor gets stuck spinning on a resource which is
unavailable due to another processor spinning on the #BP. We can disable
interrupts, but we can't stop NMIs from coming in (although we could
test in the NMI handler if we are in that condition and return
immediately; I'm not sure we want to do that, and we still have to deal
with #MC and what not.)

The fundamental problem here is that we don't see the #BP on the
patching processor, in which case we could simply complete the patching
from the #BP handler on that processor.

On 1/13/19 6:40 PM, H. Peter Anvin wrote:
> On 1/13/19 6:31 PM, H. Peter Anvin wrote:
>>
>> static cpumask_t text_poke_cpumask;
>>
>> static void text_poke_sync(void)
>> {
>> 	smp_wmb();
>> 	text_poke_cpumask = cpu_online_mask;
>> 	smp_wmb();	/* Should be optional on x86 */
>> 	cpumask_clear_cpu(&text_poke_cpumask, smp_processor_id());
>> 	on_each_cpu_mask(&text_poke_cpumask, text_poke_sync_cpu, NULL, false);
>> 	while (!cpumask_empty(&text_poke_cpumask)) {
>> 		cpu_relax();
>> 		smp_rmb();
>> 	}
>> }
>>
>> static void text_poke_sync_cpu(void *dummy)
>> {
>> 	(void)dummy;
>>
>> 	smp_rmb();
>> 	cpumask_clear_cpu(&poke_bitmask, smp_processor_id());
>> 	/*
>> 	 * We are guaranteed to return with an IRET, either from the
>> 	 * IPI or the #BP handler; this provides serialization.
>> 	 */
>> }
>>
> 
> The invariants here are:
> 
> 1. The patching routine must set each bit in the cpumask after each event
>    that requires synchronization is complete.
> 2. The bit can be (atomically) cleared on the target CPU only, and only in a
>    place that guarantees a synchronizing event (e.g. IRET) before it may
>    reaching the poked instruction.
> 3. At a minimum the IPI handler and #BP handler needs to clear the bit. It
>    *is* also possible to clear it in other places, e.g. the NMI handler, if
>    necessary as long as condition 2 is satisfied.
> 

OK, so with interrupts enabled *on the processor doing the patching* we
still have a problem if it takes an interrupt which in turn takes a #BP.
 Disabling interrupts would not help, because but an NMI and #MC could
still cause problems unless we can guarantee that no path which may be
invoked by NMI/#MC can do text_poke, which seems to be a very aggressive
assumption.

Note: I am assuming preemption is disabled.

The easiest/sanest way to deal with this might be to switch the IDT (or
provide a hook in the generic exception entry code) on the patching
processor, such that if an asynchronous event comes in, we either roll
forward or revert. This is doable because the second sync we currently
do is not actually necessary per the hardware guys.

If we take that #BP during the breakpoint deployment phase -- that is,
before the first sync has completed -- restore the previous value of the
breakpoint byte. Upon return text_poke_bp() will then have to loop back
to the beginning and do it again.

If we take the #BP after that point, we can complete the patch in the
normal manner, by writing the rest of the instruction and then removing
the #BP. text_poke_bp() will complete the synchronization sequence on
return, but if another processor is spinning and sees the breakpoint
having been removed, it is good to go.

Rignt now we do completely unnecessary setup and teardown of the PDT
entries for each phase of the patching. This would have to be removed,
so that the asynchronous event handler will always be able to do the
roll forward/roll back as required.

If this is unpalatable, the solution you touched on is probably also
doable, but I need to think *really* carefully about the sequencing
constraints, because now you also have to worry about events
interrupting a patch in progress but not completed. It would however
have the advantage that an arbitrary interrupt on the patching processor
is unlikely to cause a rollback, and so would be safer to execute with
interrupts enabled without causing a livelock.

Now, you can't just remove the breakpoint; you have to make sure that if
you do, during the first phase text_poke_bp() will loop, and in the
second phase complete the whole patching sequence.

Let me think about the second solution, the one you proposed, and what
it would require to avoid any possible race condition. If it is
practical, then I think it is probably the better option.

	-hpa

  parent reply	other threads:[~2019-01-14 22:01 UTC|newest]

Thread overview: 90+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-09 22:59 [PATCH v3 0/6] Static calls Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 1/6] compiler.h: Make __ADDRESSABLE() symbol truly unique Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 2/6] static_call: Add basic static call infrastructure Josh Poimboeuf
2019-01-10 14:03   ` Edward Cree
2019-01-10 18:37     ` Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 3/6] x86/static_call: Add out-of-line static call implementation Josh Poimboeuf
2019-01-10  0:16   ` Nadav Amit
2019-01-10 16:28     ` Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 4/6] static_call: Add inline static call infrastructure Josh Poimboeuf
2019-01-09 22:59 ` [PATCH v3 5/6] x86/alternative: Use a single access in text_poke() where possible Josh Poimboeuf
2019-01-10  9:32   ` Nadav Amit
2019-01-10 17:20     ` Josh Poimboeuf
2019-01-10 17:29       ` Nadav Amit
2019-01-10 17:32       ` Steven Rostedt
2019-01-10 17:42         ` Sean Christopherson
2019-01-10 17:57           ` Steven Rostedt
2019-01-10 18:04             ` Sean Christopherson
2019-01-10 18:21               ` Josh Poimboeuf
2019-01-10 18:24               ` Steven Rostedt
2019-01-11 12:10               ` Alexandre Chartre
2019-01-11 15:28                 ` Josh Poimboeuf
2019-01-11 16:46                   ` Alexandre Chartre
2019-01-11 16:57                     ` Josh Poimboeuf
2019-01-11 17:41                       ` Jason Baron
2019-01-11 17:54                         ` Nadav Amit
2019-01-15 11:10                       ` Alexandre Chartre
2019-01-15 16:19                         ` Steven Rostedt
2019-01-15 16:45                           ` Alexandre Chartre
2019-01-11  0:59           ` hpa
2019-01-11  1:34             ` Sean Christopherson
2019-01-11  8:13               ` hpa
2019-01-09 22:59 ` [PATCH v3 6/6] x86/static_call: Add inline static call implementation for x86-64 Josh Poimboeuf
2019-01-10  1:21 ` [PATCH v3 0/6] Static calls Nadav Amit
2019-01-10 16:44   ` Josh Poimboeuf
2019-01-10 17:32     ` Nadav Amit
2019-01-10 18:18       ` Josh Poimboeuf
2019-01-10 19:45         ` Nadav Amit
2019-01-10 20:32           ` Josh Poimboeuf
2019-01-10 20:48             ` Nadav Amit
2019-01-10 20:57               ` Josh Poimboeuf
2019-01-10 21:47                 ` Nadav Amit
2019-01-10 17:31 ` Linus Torvalds
2019-01-10 20:51   ` H. Peter Anvin
2019-01-10 20:30 ` Peter Zijlstra
2019-01-10 20:52   ` Josh Poimboeuf
2019-01-10 23:02     ` Linus Torvalds
2019-01-11  0:56       ` Andy Lutomirski
2019-01-11  1:47         ` Nadav Amit
2019-01-11 15:15           ` Josh Poimboeuf
2019-01-11 15:48             ` Nadav Amit
2019-01-11 16:07               ` Josh Poimboeuf
2019-01-11 17:23                 ` Nadav Amit
2019-01-11 19:03             ` Linus Torvalds
2019-01-11 19:17               ` Nadav Amit
2019-01-11 19:23               ` hpa
2019-01-11 19:33                 ` Nadav Amit
2019-01-11 19:34                 ` Linus Torvalds
2019-01-13  0:34                   ` hpa
2019-01-13  0:36                   ` hpa
2019-01-11 19:39                 ` Jiri Kosina
2019-01-14  2:31                   ` H. Peter Anvin
2019-01-14  2:40                     ` H. Peter Anvin
2019-01-14 20:11                       ` Andy Lutomirski
2019-01-14 22:00                       ` H. Peter Anvin [this message]
2019-01-14 22:54                         ` H. Peter Anvin
2019-01-15  3:05                           ` Andy Lutomirski
2019-01-15  5:01                             ` H. Peter Anvin
2019-01-15  5:37                               ` H. Peter Anvin
2019-01-14 23:27                         ` Andy Lutomirski
2019-01-14 23:51                           ` Nadav Amit
2019-01-15  2:28                           ` hpa
2019-01-11 20:04               ` Josh Poimboeuf
2019-01-11 20:12                 ` Linus Torvalds
2019-01-11 20:31                   ` Josh Poimboeuf
2019-01-11 20:46                     ` Linus Torvalds
2019-01-11 21:05                       ` Andy Lutomirski
2019-01-11 21:10                         ` Linus Torvalds
2019-01-11 21:32                           ` Josh Poimboeuf
2019-01-14 12:28                         ` Peter Zijlstra
2019-01-11 21:22                       ` Josh Poimboeuf
2019-01-11 21:23                         ` Josh Poimboeuf
2019-01-11 21:25                         ` Josh Poimboeuf
2019-01-11 21:36                         ` Nadav Amit
2019-01-11 21:41                           ` Josh Poimboeuf
2019-01-11 21:55                             ` Steven Rostedt
2019-01-11 21:59                               ` Nadav Amit
2019-01-11 21:56                             ` Nadav Amit
2019-01-12 23:54                         ` Andy Lutomirski
2020-02-17 21:10     ` Jann Horn
2020-02-17 21:57       ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9f60be8c-47fb-195b-fdb4-4098f1df3dc2@zytor.com \
    --to=hpa@zytor.com \
    --cc=David.Laight@aculab.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=bp@alien8.de \
    --cc=bristot@redhat.com \
    --cc=ecree@solarflare.com \
    --cc=jbaron@akamai.com \
    --cc=jeyu@kernel.org \
    --cc=jikos@kernel.org \
    --cc=jpoimboe@redhat.com \
    --cc=julia@ni.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=luto@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=namit@vmware.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).