From: Paul Turner <pjt@google.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andi Kleen <ak@linux.intel.com>,
LKML <linux-kernel@vger.kernel.org>,
Greg Kroah-Hartman <gregkh@linux-foundation.org>,
Tim Chen <tim.c.chen@linux.intel.com>,
Dave Hansen <dave.hansen@intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Kees Cook <keescook@google.com>, Rik van Riel <riel@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Andy Lutomirski <luto@amacapital.net>,
Jiri Kosina <jikos@kernel.org>,
One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Subject: Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support
Date: Fri, 5 Jan 2018 02:28:24 -0800
Message-ID: <20180105102824.GA247671@google.com>
In-Reply-To: <1515094078.29312.17.camel@infradead.org>
On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote:
> On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote:
> >
> > Pretty much.
> > Paul's writeup: https://support.google.com/faqs/answer/7625886
> > tldr: jmp *%r11 gets converted to:
> > call set_up_target;
> > capture_spec:
> > pause;
> > jmp capture_spec;
> > set_up_target:
> > mov %r11, (%rsp);
> > ret;
> > where capture_spec part will be looping speculatively.
>
> That is almost identical to what's in my latest patch set, except that
> the capture_spec loop has 'lfence' instead of 'pause'.
When choosing this sequence I benchmarked several alternatives, including
nothing, nops, fences, and other serializing instructions such as cpuid.
The "pause; jmp" sequence proved minutely faster than "lfence; jmp", which is
why it was chosen.

  "pause; jmp"     33.231 cycles/call    9.517 ns/call
  "lfence; jmp"    33.354 cycles/call    9.552 ns/call

(Timings are for a complete retpolined indirect branch.)
>
> As Andi says, I'd want to see explicit approval from the CPU architects
> for making that change.
Beyond guaranteeing that speculative execution is constrained, the choice of
sequence here is a performance detail and not one of correctness.
>
> We've already had false starts there — for a long time, Intel thought
> that a much simpler option with an lfence after the register load was
> sufficient, and then eventually worked out that in some rare cases it
> wasn't. While AMD still seem to think it *is* sufficient for them,
> apparently.
As an interesting aside, that speculation proceeds beyond lfence can be
trivially proven using the timings above. In fact, if we substitute only:

  "lfence" (with no jmp)

we see:

  29.573 cycles/call    8.469 ns/call

Now, the only way for this timing to differ is if speculation beyond the
lfence was executed differently.
That said, while this is a negative result, it does suggest that the jmp is
contributing a larger-than-realized cost to our speculative loop. We can likely
shave off some additional time with some unrolling. I did try this previously,
but did not see results above the noise floor. It seems worth trying again;
I will take a look tomorrow.