linux-kernel.vger.kernel.org archive mirror
From: David Laight <David.Laight@ACULAB.COM>
To: 'Peter Zijlstra' <peterz@infradead.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"jpoimboe@redhat.com" <jpoimboe@redhat.com>,
	"jgross@suse.com" <jgross@suse.com>,
	"mbenes@suse.com" <mbenes@suse.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH v2 03/14] x86/retpoline: Simplify retpolines
Date: Fri, 19 Mar 2021 17:18:14 +0000
Message-ID: <f7a36237052f4c09ad101431653038a5@AcuMS.aculab.com>
In-Reply-To: <20210318171919.580212227@infradead.org>

From: Peter Zijlstra
> Sent: 18 March 2021 17:11
> 
> Due to commit c9c324dc22aa ("objtool: Support stack layout changes
> in alternatives"), it is possible to simplify the retpolines.
> 
...
> Notice that since the longest alternative sequence is now:
> 
>    0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
>    5:   f3 90                   pause
>    7:   0f ae e8                lfence
>    a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
>    c:   48 89 04 24             mov    %rax,(%rsp)
>   10:   c3                      retq
> 
> 17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW,
> if we can shrink the retpoline by 1 byte we can pack it more dense)

I'm intrigued about the lfence after the pause.
Clearly this is for very warped cpu behaviour.
To get to the pause you have to be speculating past an
unconditional call.

To get to the lfence you have to (mostly) have ignored the pause.
Which is commented:
	_mm_pause(); /* Abort speculation */
in a couple of examples in 248966-033.

I wonder what effect replacing the lfence with hlt would have?
It would certainly save 2 bytes and allow the entire retpoline
to be put into a single 16-byte code fetch block.
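
To make the byte counting concrete, here is a rough sketch in C. The first
array is the sequence quoted above; the second is a hypothetical hlt variant
with the call and jmp displacements adjusted by hand. The array and helper
names are made up for illustration, and the sketch says nothing about whether
hlt actually stops speculation on any particular cpu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The sequence quoted above, byte for byte. */
static const uint8_t retpoline_lfence[] = {
	0xe8, 0x07, 0x00, 0x00, 0x00,	/*  0: call c          */
	0xf3, 0x90,			/*  5: pause           */
	0x0f, 0xae, 0xe8,		/*  7: lfence          */
	0xeb, 0xf9,			/*  a: jmp 5           */
	0x48, 0x89, 0x04, 0x24,		/*  c: mov %rax,(%rsp) */
	0xc3,				/* 10: ret             */
};

/* Hypothetical variant: lfence (3 bytes) replaced by hlt (0xf4, 1 byte). */
static const uint8_t retpoline_hlt[] = {
	0xe8, 0x05, 0x00, 0x00, 0x00,	/*  0: call a          */
	0xf3, 0x90,			/*  5: pause           */
	0xf4,				/*  7: hlt             */
	0xeb, 0xfb,			/*  8: jmp 5           */
	0x48, 0x89, 0x04, 0x24,		/*  a: mov %rax,(%rsp) */
	0xc3,				/*  e: ret             */
};

/* Target of a 'call rel32' at offset off: end of instruction plus rel32. */
static ptrdiff_t call_target(const uint8_t *code, size_t off)
{
	uint32_t raw = (uint32_t)code[off + 1] |
		       (uint32_t)code[off + 2] << 8 |
		       (uint32_t)code[off + 3] << 16 |
		       (uint32_t)code[off + 4] << 24;

	return (ptrdiff_t)off + 5 + (int32_t)raw;
}

/* Target of a 'jmp rel8' at offset off: end of instruction plus rel8. */
static ptrdiff_t jmp8_target(const uint8_t *code, size_t off)
{
	return (ptrdiff_t)off + 2 + (int8_t)code[off + 1];
}

/* Sanity checks on sizes and branch targets of both sequences. */
static void check_retpolines(void)
{
	assert(sizeof(retpoline_lfence) == 17);
	assert(sizeof(retpoline_hlt) == 15);	/* fits one 16-byte block */
	assert(call_target(retpoline_lfence, 0) == 0xc);
	assert(jmp8_target(retpoline_lfence, 0xa) == 5);
	assert(call_target(retpoline_hlt, 0) == 0xa);
	assert(jmp8_target(retpoline_hlt, 8) == 5);
}
```

Calling check_retpolines() from a test program confirms that the variant is
15 bytes and that both branches still land on the pause and the mov.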

248966-033 also contains a note that the instruction(s) after
an indirect jump may get executed.
It suggests adding a pause (or illegal instruction) to stop
anything odd happening (they knew it could be horrid in June 2016).
But they then go on to say that adding a pause may be a performance issue.
(Presumably because if it is speculatively executed it takes ages.)

I do remember something from even longer ago about trying to never
speculate any of the trig opcodes - because at least some cpus couldn't
abort the instruction.

This may also mean that a big pile of 0x90 nops after the jmp (%eax)
is actually better than 2 or 3 'big' nops.
Of course, if you execute the nops you always want the big ones.
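
As a rough illustration of the two padding styles, the sketch below fills a
buffer either with single-byte 0x90 nops or with the recommended multi-byte
nop encodings of up to 8 bytes, and returns how many instructions the padding
decodes into. The table and function names are invented for this example, and
it makes no claim about which style a given cpu handles better when
speculating.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Recommended multi-byte nop encodings, indexed by length in bytes. */
static const uint8_t long_nops[9][8] = {
	{ 0 },						/* unused */
	{ 0x90 },
	{ 0x66, 0x90 },
	{ 0x0f, 0x1f, 0x00 },
	{ 0x0f, 0x1f, 0x40, 0x00 },
	{ 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	{ 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/*
 * Fill buf[0..len) with nops no longer than max_nop bytes (clamped to 1..8).
 * Returns the number of nop instructions written.
 */
static size_t fill_nops(uint8_t *buf, size_t len, size_t max_nop)
{
	size_t count = 0;

	if (max_nop < 1)
		max_nop = 1;
	if (max_nop > 8)
		max_nop = 8;

	while (len > 0) {
		size_t n = len < max_nop ? len : max_nop;

		memcpy(buf, long_nops[n], n);
		buf += n;
		len -= n;
		count++;
	}
	return count;
}
```

For the 15 bytes of padding mentioned above, fill_nops(buf, 15, 1) writes 15
instructions while fill_nops(buf, 15, 8) writes 2 (one 8-byte and one 7-byte
nop).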

	David

