linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nadav Amit <namit@vmware.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@redhat.com>,
	Andrew Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>,
	Borislav Petkov <bp@alien8.de>,
	"Woodhouse, David" <dwmw@amazon.co.uk>
Subject: Re: [RFC PATCH 1/5] x86: introduce preemption disable prefix
Date: Thu, 18 Oct 2018 17:42:12 +0000	[thread overview]
Message-ID: <1D3E9B8A-9D90-49D9-87D0-185908B425E1@vmware.com> (raw)
In-Reply-To: <CALCETrUQwHysVOJ2G8AQF25TRVaqUi7k93K8nks4Yp+fFehVBA@mail.gmail.com>

at 10:29 AM, Andy Lutomirski <luto@amacapital.net> wrote:

> On Thu, Oct 18, 2018 at 10:25 AM Nadav Amit <namit@vmware.com> wrote:
>> at 10:00 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>> 
>>>> On Oct 18, 2018, at 9:47 AM, Nadav Amit <namit@vmware.com> wrote:
>>>> 
>>>> at 8:51 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>> 
>>>>>> On Wed, Oct 17, 2018 at 8:12 PM Nadav Amit <namit@vmware.com> wrote:
>>>>>> at 6:22 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>>> 
>>>>>>>> On Oct 17, 2018, at 5:54 PM, Nadav Amit <namit@vmware.com> wrote:
>>>>>>>> 
>>>>>>>> It is sometimes beneficial to prevent preemption for very few
>>>>>>>> instructions, or prevent preemption for some instructions that precede
>>>>>>>> a branch (this latter case will be introduced in the next patches).
>>>>>>>> 
>>>>>>>> To provide such functionality on x86-64, we use an empty REX-prefix
>>>>>>>> (opcode 0x40) as an indication that preemption is disabled for the
>>>>>>>> following instruction.
>>>>>>> 
>>>>>>> Nifty!
>>>>>>> 
>>>>>>> That being said, I think you have a few bugs. First, you can’t just ignore
>>>>>>> a rescheduling interrupt, as you introduce unbounded latency when this
>>>>>>> happens — you’re effectively emulating preempt_enable_no_resched(), which
>>>>>>> is not a drop-in replacement for preempt_enable(). To fix this, you may
>>>>>>> need to jump to a slow-path trampoline that calls schedule() at the end or
>>>>>>> consider rewinding one instruction instead. Or use TF, which is only a
>>>>>>> little bit terrifying…
>>>>>> 
>>>>>> Yes, I didn’t pay enough attention here. For my use-case, I think that the
>>>>>> easiest solution would be to make synchronize_sched() ignore preemptions
>>>>>> that happen while the prefix is detected. It would slightly change the
>>>>>> meaning of the prefix.
>>>> 
>>>> So thinking about it further, rewinding the instruction seems the easiest
>>>> and most robust solution. I’ll do it.
>>>> 
>>>>>>> You also aren’t accounting for the case where you get an exception that
>>>>>>> is, in turn, preempted.
>>>>>> 
>>>>>> Hmm.. Can you give me an example for such an exception in my use-case? I
>>>>>> cannot think of an exception that might be preempted (assuming #BP, #MC
>>>>>> cannot be preempted).
>>>>> 
>>>>> Look for cond_local_irq_enable().
>>>> 
>>>> I looked at it. Yet, I still don’t see how exceptions might happen in my
>>>> use-case, but having said that - this can be fixed too.
>>> 
>>> I’m not totally certain there’s a case that matters.  But it’s worth checking
>>> 
>>>> To be frank, I paid relatively little attention to this subject. Any
>>>> feedback about the other parts and especially on the high-level approach? Is
>>>> modifying the retpolines in the proposed manner (assembly macros)
>>>> acceptable?
>>> 
>>> It’s certainly a neat idea, and it could be a real speedup.
>> 
>> Great. So I’ll try to shape things up, and I still wait for other comments
>> (from others).
>> 
>> I’ll just mention two more patches I need to cleanup (I know I still owe you some
>> work, so obviously it will be done later):
>> 
>> 1. Seccomp trampolines. On my Ubuntu, when I run Redis, systemd installs 17
>> BPF filters on the Redis server process that are invoked on each
>> system-call. Invoking each one requires an indirect branch. The patch keeps
>> a per-process kernel code-page that holds trampolines for these functions.
> 
> I wonder how many levels of branches are needed before the branches
> involved exceed the retpoline cost.

In this case there is no hierarchy, but a list of trampolines that are
called one after the other, as the seccomp filter order is predefined. It
does not work if different threads of the same process have different
filters.

>> 2. Binary-search for system-calls. Use the per-process kernel code-page also
>> to hold multiple trampolines for the 16 common system calls of a certain
>> process. The patch uses an indirection table and a binary-search to find the
>> proper trampoline.
> 
> Same comment applies here.

Branch misprediction wastes ~7 cycles and a retpoline takes at least 30. So
assuming the branch predictor is not completely stupid 3-4 levels should not
be too much.


  reply	other threads:[~2018-10-18 17:42 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-18  0:54 [RFC PATCH 0/5] x86: dynamic indirect call promotion Nadav Amit
2018-10-18  0:54 ` [RFC PATCH 1/5] x86: introduce preemption disable prefix Nadav Amit
2018-10-18  1:22   ` Andy Lutomirski
2018-10-18  3:12     ` Nadav Amit
2018-10-18  3:26       ` Nadav Amit
2018-10-18  3:51       ` Andy Lutomirski
2018-10-18 16:47         ` Nadav Amit
2018-10-18 17:00           ` Andy Lutomirski
2018-10-18 17:25             ` Nadav Amit
2018-10-18 17:29               ` Andy Lutomirski
2018-10-18 17:42                 ` Nadav Amit [this message]
2018-10-19  1:08             ` Nadav Amit
2018-10-19  4:29               ` Andy Lutomirski
2018-10-19  4:44                 ` Nadav Amit
2018-10-20  1:22                   ` Masami Hiramatsu
2018-10-19  5:00                 ` Alexei Starovoitov
2018-10-19  8:22                   ` Peter Zijlstra
2018-10-19 14:47                     ` Alexei Starovoitov
2018-10-19  8:19                 ` Peter Zijlstra
2018-10-19 10:38                 ` Oleg Nesterov
2018-10-19  8:33               ` Peter Zijlstra
2018-10-19 14:29                 ` Andy Lutomirski
2018-11-29  9:46                   ` Peter Zijlstra
2018-10-18  7:54     ` Peter Zijlstra
2018-10-18 18:14       ` Nadav Amit
2018-10-18  0:54 ` [RFC PATCH 2/5] x86: patch indirect branch promotion Nadav Amit
2018-10-18  0:54 ` [RFC PATCH 3/5] x86: interface for accessing indirect branch locations Nadav Amit
2018-10-18  0:54 ` [RFC PATCH 4/5] x86: learning and patching indirect branch targets Nadav Amit
2018-10-18  0:54 ` [RFC PATCH 5/5] x86: relpoline: disabling interface Nadav Amit
2018-10-23 18:36 ` [RFC PATCH 0/5] x86: dynamic indirect call promotion Dave Hansen
2018-10-23 20:32   ` Nadav Amit
2018-10-23 20:37     ` Dave Hansen
2018-11-28 16:08 ` Josh Poimboeuf
2018-11-28 19:34   ` Nadav Amit
2018-11-29  0:38     ` Josh Poimboeuf
2018-11-29  1:40       ` Andy Lutomirski
2018-11-29  2:06         ` Nadav Amit
2018-11-29  3:24           ` Andy Lutomirski
2018-11-29  4:36             ` Josh Poimboeuf
2018-11-29  6:06             ` Andy Lutomirski
2018-11-29 15:19               ` Josh Poimboeuf
2018-12-01  6:52                 ` Nadav Amit
2018-12-01 14:25                   ` Josh Poimboeuf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1D3E9B8A-9D90-49D9-87D0-185908B425E1@vmware.com \
    --to=namit@vmware.com \
    --cc=bp@alien8.de \
    --cc=dwmw@amazon.co.uk \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).