From: Linus Torvalds <torvalds@linux-foundation.org>
To: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Andy Lutomirski <luto@amacapital.net>,
	Borislav Petkov <bp@alien8.de>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Oleg Nesterov <oleg@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Frederic Weisbecker <fweisbec@gmail.com>, X86 ML <x86@kernel.org>,
	Alexei Starovoitov <ast@plumgrid.com>,
	Will Drewry <wad@chromium.org>, Kees Cook <keescook@chromium.org>
Subject: Re: [PATCH 3/4] x86: open-code register save/restore in trace_hardirqs thunks
Date: Sat, 10 Jan 2015 13:27:20 -0800
Message-ID: <CA+55aFy-uSod4dndhjZ2VKgLkG42hajk7HLJwuNdv_eHfTW-=Q@mail.gmail.com>
In-Reply-To: <CAK1hOcPAxvuPRB-+z2kob2=sF1C+aftH4M9JL2MaA5VcQDzuNA@mail.gmail.com>

On Sat, Jan 10, 2015 at 1:09 PM, Denys Vlasenko
<vda.linux@googlemail.com> wrote:
>
> I think using push/pop is okay. In the very hottest code paths
> you may want to prefer mov's.

For kernel entrypoints in particular, the code sequence is quite
possibly constrained by the decoder and instruction fetch rather than
the execution engine. Even if the entrypoint were to be in the L1 I$
(which is not generally the case except in microbenchmarks), I am
pretty sure that even Intel doesn't actually speculatively decode
across system call boundaries, so unlike normal nice code, you don't
have the front end running ahead of the execution engine.

Looking at the system call hotpath, for example, it looks like we
save/restore 8 registers. So 16 instructions or about 80 bytes of
code. I could easily imagine us avoiding one cacheline access by using
shorter 1- and 2-byte push/pop instructions (depending a bit on how
the cacheline alignment works out, of course).
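
The arithmetic above can be sketched roughly as follows. This is a
back-of-the-envelope model, not a measurement: the register list, the
64-byte cacheline size and the starting offset are assumptions chosen
for illustration, and every mov is assumed to use a one-byte
displacement off %rsp (5 bytes each). The helper names are hypothetical.

```python
# Rough code-size model for saving/restoring registers on x86-64.
# Instruction lengths are standard x86-64 encodings; the register set
# below is an ASSUMED example, not the exact set the entry code saves.

CACHELINE = 64  # bytes; assumed, typical for x86 CPUs

# Registers whose push/pop encodes in one byte (no REX prefix needed).
LEGACY = {"rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"}

# movq %reg, disp8(%rsp): REX + opcode + ModRM + SIB + disp8
MOV_LEN = 5

# Assumed example: eight registers, four legacy and four extended.
REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11"]


def push_pop_len(reg):
    """Length of push/pop for one register: 1 byte, or 2 for r8-r15."""
    return 1 if reg in LEGACY else 2


def code_bytes(regs, use_push_pop):
    """Total bytes for one save sequence plus one restore sequence."""
    one_way = sum(push_pop_len(r) if use_push_pop else MOV_LEN for r in regs)
    return 2 * one_way


def cachelines_touched(start_offset, nbytes):
    """Cachelines spanned by nbytes of contiguous code at start_offset."""
    first = start_offset // CACHELINE
    last = (start_offset + nbytes - 1) // CACHELINE
    return last - first + 1


if __name__ == "__main__":
    for name, flag in (("mov", False), ("push/pop", True)):
        total = code_bytes(REGS, flag)
        # Save and restore are separate sequences, so look at one of them,
        # assumed here to start halfway into a cacheline.
        lines = cachelines_touched(32, total // 2)
        print(f"{name}: {total} bytes total, save sequence spans {lines} line(s)")
```

Under these assumptions the mov variant comes to 80 bytes and the
push/pop variant to 24 bytes. A 40-byte save sequence starting 32 bytes
into a cacheline spills into the next line, while the 12-byte push
sequence does not; a different alignment changes that result.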

Depending on how well it prefetches from L2 and/or exact decoder
details, that kind of issue *can* overshadow the actual execution
costs. Of course, on microbenchmarks (e.g. some system call benchmark
that does "getppid()" in a loop), even the kernel side stays in the
L1, so those might show possible execution issues more. And
macrobenchmarks probably won't show a cycle or two in the system call
or fault path anyway.

                    Linus
