linux-kernel.vger.kernel.org archive mirror
* [RFC 0/2] Get rid of the entry trampoline
@ 2018-07-22 17:45 Andy Lutomirski
  2018-07-22 17:45 ` [RFC 1/2] x86/entry/64: Use the TSS sp2 slot for rsp_scratch Andy Lutomirski
  2018-07-22 17:45 ` [RFC 2/2] x86/pti/64: Remove the SYSCALL64 entry trampoline Andy Lutomirski
  0 siblings, 2 replies; 9+ messages in thread
From: Andy Lutomirski @ 2018-07-22 17:45 UTC
  To: x86, LKML; +Cc: Borislav Petkov, Linus Torvalds, Dave Hansen, Andy Lutomirski

Hi all-

I think there's general agreement that the current entry trampoline
sucks, mainly because it's dog-slow.  Thanks, Spectre.

There are three possible fixes I know of:

a) Linus' hack: use R11 for scratch space.  This doesn't actually
   speed it up, but it improves the addressing situation a bit.
   I don't like it, though: it causes the SYSCALL64 path to forget
   the arithmetic flags and all of the MSR_SYSCALL_MASK flags.  The
   latter may be a showstopper, given that we've seen some evidence
   of nasty Wine use cases that expect setting EFLAGS.NT and doing
   a system call to actually do something intelligent.  Similarly,
   there could easily be user programs out there that set AC because
   they want alignment checking and expect AC to remain set across
   system calls.

b) Move the trampoline within 2G of the entry text and copy it for
   each CPU.  This is certainly possible, but it's a bit gross,
   and it uses num_possible_cpus() * 64 bytes of memory (rounded
   up to a page).  It will also result in more complicated code.

c) This series.  Just make %gs work in the entry trampoline.  It's
   actually a net code deletion.
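
To make the objection to (a) concrete, here is a rough userspace model
in C of what SYSCALL does to the flags: it saves the user RFLAGS in R11
and then clears every RFLAGS bit that is set in MSR_SYSCALL_MASK.  The
struct, the function names, and the example mask are made up for
illustration and are not kernel code; only the flag bit positions are
architectural.  Once R11 is reused as scratch, the masked bits that
user code had set can no longer be recovered from the live RFLAGS.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural RFLAGS bit positions. */
#define FLAG_CF (1ULL << 0)   /* carry, one of the arithmetic flags */
#define FLAG_NT (1ULL << 14)  /* nested task */
#define FLAG_AC (1ULL << 18)  /* alignment check */

/* Hypothetical model of the state visible right after SYSCALL. */
struct syscall_entry_state {
	uint64_t r11;     /* user RFLAGS as saved by SYSCALL */
	uint64_t rflags;  /* live RFLAGS after masking */
};

/* Model of SYSCALL's flag handling: R11 = RFLAGS; RFLAGS &= ~mask. */
static struct syscall_entry_state model_syscall(uint64_t user_rflags,
						uint64_t syscall_mask)
{
	struct syscall_entry_state s;

	s.r11 = user_rflags;
	s.rflags = user_rflags & ~syscall_mask;
	return s;
}

/*
 * User flag bits that cannot be reconstructed if R11 is overwritten
 * before being saved: exactly the user-set bits that the mask cleared.
 * (Arithmetic flags are lost too once kernel code executes anything
 * that modifies them; this model does not track that.)
 */
static uint64_t flags_lost_if_r11_clobbered(uint64_t user_rflags,
					    uint64_t syscall_mask)
{
	struct syscall_entry_state s = model_syscall(user_rflags, syscall_mask);

	/* After SYSCALL, no masked bit is still set in the live RFLAGS. */
	assert((s.rflags & syscall_mask) == 0);
	return user_rflags & ~s.rflags;
}
```

With an assumed mask of NT|AC and a user program that set CF, NT and
AC, the model reports NT and AC as unrecoverable, which is the Wine and
alignment-checking concern described in (a).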

I suspect that (b) would be faster in code that does a lot of system
calls and doesn't totally blow away the cache or the TLB between
system calls.  I suspect that (c) is faster in code that does
cache-cold system calls.
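
For a sense of scale, the memory cost quoted for (b) can be worked out
with a small helper.  This is only a sketch: the function name is
hypothetical, the CPU count and page size are passed in as plain
numbers rather than read from the kernel, and it assumes "rounded up to
a page" means the total is rounded up to a whole number of pages.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes needed for one 64-byte trampoline copy per possible CPU,
 * rounded up to a multiple of page_size.
 */
static size_t trampoline_copy_bytes(size_t possible_cpus, size_t page_size)
{
	size_t raw;

	assert(page_size > 0);
	raw = possible_cpus * 64;
	/* Round up to the next page boundary. */
	return (raw + page_size - 1) / page_size * page_size;
}
```

Under these assumptions, with 4096-byte pages anything up to 64
possible CPUs fits in a single page, and a 65th CPU pushes the total to
two pages.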

Andy Lutomirski (2):
  x86/entry/64: Use the TSS sp2 slot for rsp_scratch
  x86/pti/64: Remove the SYSCALL64 entry trampoline

 arch/x86/entry/entry_64.S          | 66 +-----------------------------
 arch/x86/include/asm/processor.h   |  5 +++
 arch/x86/include/asm/thread_info.h |  1 +
 arch/x86/kernel/asm-offsets_64.c   |  1 +
 arch/x86/kernel/cpu/common.c       | 11 +----
 arch/x86/kernel/kprobes/core.c     | 10 +----
 arch/x86/kernel/process_64.c       |  2 -
 arch/x86/kernel/vmlinux.lds.S      | 10 -----
 arch/x86/mm/cpu_entry_area.c       |  5 ---
 arch/x86/mm/pti.c                  | 24 ++++++++++-
 10 files changed, 33 insertions(+), 102 deletions(-)

-- 
2.17.1



Thread overview: 9+ messages
2018-07-22 17:45 [RFC 0/2] Get rid of the entry trampoline Andy Lutomirski
2018-07-22 17:45 ` [RFC 1/2] x86/entry/64: Use the TSS sp2 slot for rsp_scratch Andy Lutomirski
2018-07-22 20:12   ` Ingo Molnar
2018-07-23 12:38   ` Dave Hansen
2018-07-24  2:36     ` Andy Lutomirski
2018-07-22 17:45 ` [RFC 2/2] x86/pti/64: Remove the SYSCALL64 entry trampoline Andy Lutomirski
2018-07-22 18:27   ` Linus Torvalds
2018-07-22 20:59     ` Andy Lutomirski
2018-07-23 12:59   ` Dave Hansen
