From: Denys Vlasenko <vda.linux@googlemail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Frederic Weisbecker <fweisbec@gmail.com>, X86 ML <x86@kernel.org>,
	Alexei Starovoitov <ast@plumgrid.com>,
	Will Drewry <wad@chromium.org>, Kees Cook <keescook@chromium.org>
Subject: Re: [PATCH 3/4] x86: open-code register save/restore in trace_hardirqs thunks
Date: Sat, 10 Jan 2015 21:43:37 +0100
Message-ID: <CAK1hOcNeAytfd=tMSeCXV+FDnf16qb6wfO5=n94z3jnEFk=JKQ@mail.gmail.com>
In-Reply-To: <CALCETrXS7BvtS3P3J8hnbt3GGvezx-935uxanSN_a=CU6s1d3g@mail.gmail.com>

On Sat, Jan 10, 2015 at 9:17 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> After I've seen the disassembly I myself posted, I can't help but wonder
>> why we use 5-byte instructions to store and load regs on stack when
>> pushes and pops are 1 or 2-byte long.
>
> I asked this once, and someone told me that push/pop has lower
> throughput.  I find this surprising.

Theoretically, yes.
In practice, AMD K7 and K8 seem to be able to execute two movq's
per cycle, but only one push.

All other processors I looked at have the same throughput for both:
K10 can do two movq's in one cycle, but also two pushes.
Bulldozer..Steamroller: can do one insn per cycle.
Bobcat..Jaguar: can do one insn per cycle.
Core 2: can do one insn per cycle.
Nehalem: can do one insn per cycle.

The above was microbenchmarked with long sequences
of similar instructions, in which case the store unit gets saturated
and becomes the bottleneck.

Here's the document the numbers come from:

http://www.agner.org/optimize/instruction_tables.pdf
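
To put rough numbers on the size-versus-throughput trade-off discussed above, here is a minimal sketch. It rests on assumptions that are not taken from the patch: the register list (the nine caller-saved integer registers) is illustrative, each movq is assumed to use an 8-bit displacement off %rsp (the 5-byte form mentioned in the quoted text), and the cycle counts are only the lower bound implied by the per-cycle store rates listed in this message, with the store unit as the sole bottleneck.

```python
import math

# Assumption: the nine caller-saved integer registers; illustrative only.
REGS = ["rdi", "rsi", "rdx", "rcx", "rax", "r8", "r9", "r10", "r11"]

# (movq stores per cycle, pushes per cycle), as listed in this message.
STORES_PER_CYCLE = {
    "K7/K8": (2, 1),
    "K10": (2, 2),
    "Bulldozer..Steamroller": (1, 1),
    "Bobcat..Jaguar": (1, 1),
    "Core 2": (1, 1),
    "Nehalem": (1, 1),
}


def push_bytes(reg):
    # push of rax..rdi is one opcode byte; r8..r15 need a REX prefix as well.
    return 2 if reg[1:].isdigit() else 1


def mov_bytes(reg):
    # movq %reg, disp8(%rsp): REX.W + opcode + ModRM + SIB + disp8.
    return 5


def save_size(regs):
    # Total encoded bytes for saving regs with movq vs. with push.
    return (sum(mov_bytes(r) for r in regs), sum(push_bytes(r) for r in regs))


def save_cycles(regs, per_cycle):
    # Lower bound on cycles when only the store rate limits execution.
    return math.ceil(len(regs) / per_cycle)


if __name__ == "__main__":
    mov_total, push_total = save_size(REGS)
    print(f"save sequence: movq {mov_total} bytes, push {push_total} bytes")
    for cpu, (mov_rate, push_rate) in STORES_PER_CYCLE.items():
        print(f"{cpu}: movq >= {save_cycles(REGS, mov_rate)} cycles, "
              f"push >= {save_cycles(REGS, push_rate)} cycles")
```

Under these assumptions the push variant of the save sequence is less than a third of the size, and only K7/K8 would pay for it in store throughput. The size figure ignores the stack pointer adjustment the movq variant also needs.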

