All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Oleg Nesterov <oleg@redhat.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Alexei Starovoitov <ast@plumgrid.com>,
	Will Drewry <wad@chromium.org>, Kees Cook <keescook@chromium.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/5] x86/asm/entry/32: Replace RESTORE_RSI_RDI[_RDX] with open-coded 32-bit reads
Date: Thu, 18 Jun 2015 11:31:34 +0200	[thread overview]
Message-ID: <20150618093134.GA1094@gmail.com> (raw)
In-Reply-To: <557F6CC3.7070709@redhat.com>


* Denys Vlasenko <dvlasenk@redhat.com> wrote:

> On 06/15/2015 10:20 PM, Ingo Molnar wrote:
> >> Actually, ecx and r11 need to be loaded first. They are not so much "restored" 
> >> as "prepared for SYSRET insn". Every cycle lost in loading these delays SYSRET. 
> >> [...]
> > 
> > So in the typical case they will still be cached, and so their max latency should 
> > be around 3 cycles.
> 
> If syscall flushes caches (say, a large read), or sleeps
> and CPU schedules away, then pt_regs->ip,flags are evicted
> and need to be reloaded.
> 
> > In fact because they are memory loads, they don't really have dependencies,
> > they should be available to SYSRET almost immediately,
> 
> They depend on the memory data.
> 
> > i.e. within a cycle - and 
> > there's no reason to believe why these loads wouldn't pipeline properly and 
> > parallelize with the many other things SYSRET has to do to organize a return to 
> > user-space, before it can actually use the target RIP and RFLAGS.
> 
> This does not sound right.
> 
> If it takes, say, 20 cycles to pull data from e.g. L3 cache to ECX,
> then SYSRET can't possibly complete sooner than in 20 cycles.

Yeah, that's true, but my point is: SYSRET has to do a lot of other things 
(permission checks, loading the user mode state - most of which are unrelated to 
R11/RCX), which take dozens of cycles, and which are probably overlapped with any 
cache misses on arguments such as R11/RCX.

It's not impossible that reordering helps, for example if SYSRET has some internal 
dependencies that makes it parallelism worse than ideal - but I'd complicate this 
code only if it gives a measurable improvement for cache-cold syscall performance.

Thanks,

	Ingo

  reply	other threads:[~2015-06-18  9:31 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-09 18:54 [PATCH 1/5] x86/asm/entry/32: Fix fallout from r9 trick removal in SYSCALL code Denys Vlasenko
2015-06-09 18:54 ` [PATCH 2/5] x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry Denys Vlasenko
2015-06-10  7:09   ` [tip:x86/asm] x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry() tip-bot for Denys Vlasenko
2015-06-09 18:54 ` [PATCH 3/5] x86/asm/entry/32: Shorten __audit_syscall_entry args preparation Denys Vlasenko
2015-06-10  6:21   ` Ingo Molnar
2015-06-12 23:28     ` Andy Lutomirski
2015-06-10  7:10   ` [tip:x86/asm] x86/asm/entry/32: Shorten __audit_syscall_entry() " tip-bot for Denys Vlasenko
2015-06-09 18:54 ` [PATCH 4/5] x86/asm/entry/32: Replace RESTORE_RSI_RDI[_RDX] with open-coded 32-bit reads Denys Vlasenko
2015-06-09 19:01   ` Andy Lutomirski
2015-06-09 19:03     ` Denys Vlasenko
2015-06-09 19:11       ` Andy Lutomirski
2015-06-09 19:18         ` Denys Vlasenko
2015-06-09 19:27           ` Andy Lutomirski
2015-06-14  8:40   ` Ingo Molnar
2015-06-14 15:21     ` Denys Vlasenko
2015-06-15 20:20       ` Ingo Molnar
2015-06-16  0:24         ` Denys Vlasenko
2015-06-18  9:31           ` Ingo Molnar [this message]
2015-06-18 10:59             ` Denys Vlasenko
2015-06-09 18:54 ` [PATCH 5/5] x86/asm/entry/32: Simplify ptrace register shuffling Denys Vlasenko
2015-06-09 18:59   ` Andy Lutomirski
2015-06-09 19:14     ` Denys Vlasenko
2015-06-18  9:33   ` Ingo Molnar
2015-06-10  7:09 ` [tip:x86/asm] x86/asm/entry/32: Fix fallout from the R9 trick removal in the SYSCALL code tip-bot for Denys Vlasenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150618093134.GA1094@gmail.com \
    --to=mingo@kernel.org \
    --cc=ast@plumgrid.com \
    --cc=bp@alien8.de \
    --cc=dvlasenk@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=oleg@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    --cc=wad@chromium.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.