From: Paul Turner <pjt@google.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andi Kleen <ak@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Greg Kroah-Hartman <gregkh@linux-foundation.org>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Kees Cook <keescook@google.com>, Rik van Riel <riel@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Jiri Kosina <jikos@kernel.org>,
	One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Subject: Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support
Date: Fri, 5 Jan 2018 02:28:24 -0800
Message-ID: <20180105102824.GA247671@google.com>
In-Reply-To: <1515094078.29312.17.camel@infradead.org>

On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote:
> On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote:
> > 
> > Pretty much.
> > Paul's writeup: https://support.google.com/faqs/answer/7625886
> > tldr: jmp *%r11 gets converted to:
> > call set_up_target;
> > capture_spec:
> >   pause;
> >   jmp capture_spec;
> > set_up_target:
> >   mov %r11, (%rsp);
> >   ret;
> > where capture_spec part will be looping speculatively.
> 
> That is almost identical to what's in my latest patch set, except that
> the capture_spec loop has 'lfence' instead of 'pause'.

When choosing this sequence I benchmarked several alternatives, including
nothing, nops, fences, and other serializing instructions such as cpuid.

The "pause; jmp" sequence proved minutely faster than "lfence; jmp", which is
why it was chosen.

  "pause; jmp"   33.231 cycles/call  9.517 ns/call
  "lfence; jmp"  33.354 cycles/call  9.552 ns/call

(Timings are for a complete retpolined indirect branch.)
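
As a rough sanity check on how small that gap is, the two rows can be compared
directly. The sketch below is illustrative Python using only the figures quoted
in this message; the names (timings, implied_ghz, relative_gap_percent) are
made up for the example and are not part of the benchmark itself.

```python
# Figures copied from the timings above: (cycles/call, ns/call).
timings = {
    "pause; jmp": (33.231, 9.517),
    "lfence; jmp": (33.354, 9.552),
}


def implied_ghz(cycles, ns):
    """Cycles per nanosecond, which equals the clock rate in GHz."""
    return cycles / ns


def relative_gap_percent(base, other):
    """How much larger `other` is than `base`, as a percentage of `base`."""
    return (other - base) / base * 100.0


pause_cycles = timings["pause; jmp"][0]
lfence_cycles = timings["lfence; jmp"][0]
gap = relative_gap_percent(pause_cycles, lfence_cycles)

for name, (cycles, ns) in timings.items():
    print(f"{name:12s} {implied_ghz(cycles, ns):.3f} cycles/ns implied")
print(f"lfence; jmp costs {gap:.2f}% more cycles than pause; jmp")
```

Both rows work out to roughly 3.49 cycles per nanosecond, so the cycle and
nanosecond columns agree with each other, and the gap between the two
sequences comes to about 0.37% of a call.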
> 
> As Andi says, I'd want to see explicit approval from the CPU architects
> for making that change.

Beyond guaranteeing that speculative execution is constrained, the choice of
sequence here is a performance detail and not one of correctness.

> 
> We've already had false starts there — for a long time, Intel thought
> that a much simpler option with an lfence after the register load was
> sufficient, and then eventually worked out that in some rare cases it
> wasn't. While AMD still seem to think it *is* sufficient for them,
> apparently.

As an interesting aside, that speculation proceeds beyond lfence can be
trivially proven using the timings above.  In fact, if we substitute only:
  "lfence" (with no jmp)

We see:
  29.573 cycles/call 8.469 ns/call

Now, the only way for this timing to be different is if speculation beyond the
lfence was executed differently.
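
To put a number on that difference, the "lfence; jmp" and bare "lfence"
measurements can be subtracted. Again, this is only an illustrative Python
sketch over the figures quoted in this message; the names (LFENCE_JMP,
LFENCE_ONLY, delta) are hypothetical and chosen for the example.

```python
# Figures copied from this message: (cycles/call, ns/call).
LFENCE_JMP = (33.354, 9.552)   # "lfence; jmp" capture loop
LFENCE_ONLY = (29.573, 8.469)  # "lfence" with no jmp


def delta(with_jmp, without_jmp):
    """Per-call difference between two (cycles, ns) measurements."""
    return (with_jmp[0] - without_jmp[0], with_jmp[1] - without_jmp[1])


cycles_saved, ns_saved = delta(LFENCE_JMP, LFENCE_ONLY)
percent_saved = cycles_saved / LFENCE_JMP[0] * 100.0

print(f"dropping the jmp saves {cycles_saved:.3f} cycles/call "
      f"({ns_saved:.3f} ns/call, {percent_saved:.1f}%)")
```

That is a difference of roughly 3.8 cycles per call, far larger than the gap
between the "pause; jmp" and "lfence; jmp" variants.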

That said, although this is a negative result, it does suggest that the jmp is
contributing a larger cost to our speculative loop than previously realized.
We can likely shave off some additional time with some unrolling.  I did try
this previously but did not see results above the noise floor; it seems worth
trying again, and I will take a look tomorrow.
