From: Andy Lutomirski <luto@MIT.EDU>
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
Andi Kleen <andi@firstfloor.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <eric.dumazet@gmail.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@amd64.org>, Andy Lutomirski <luto@MIT.EDU>
Subject: [PATCH v5 0/8] vDSO time changes for 2.6.40
Date: Mon, 23 May 2011 09:31:23 -0400
Message-ID: <cover.1306156808.git.luto@mit.edu>
[Patch 8/8 is brand-new and optional. If anyone objects to it,
please just drop it for 2.6.40 and I'll fix it for 2.6.41.]
This series speeds up vclock_gettime(CLOCK_MONOTONIC) by almost 30%
(tested on Sandy Bridge). It also adds time() to the vDSO so that we
can deprecate the vsyscall entry points later on. These patches are
intended for 2.6.40, and if I'm feeling really ambitious I'll try to
shave a few more ns off for 2.6.41. (There are lots more optimization
opportunities in there.)
x86-64: Clean up vdso/kernel shared variables
Because vsyscall_gtod_data's address isn't known until load time,
the code contains unnecessary address calculations. The code is
also rather complicated. Clean it up and use addresses that are
known at compile time.
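A rough userspace sketch of the idea, not the code from the patch: if the
shared variables sit at fixed offsets inside one statically placed object,
every access resolves to a constant address and no pointer has to be
computed at run time. The struct, its field names, the 64-byte cache line
size, and this VVAR() macro are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64	/* assumed cache line size */

/* Hypothetical layout of the shared data; field names are illustrative. */
struct vvar_page_sketch {
	_Alignas(CACHELINE) volatile unsigned long jiffies;
	int vgetcpu_mode;			/* shares the cache line of jiffies */
	_Alignas(CACHELINE) long wall_time_sec;	/* starts the next cache line */
};

/* The offsets are constants, so the layout can be checked at build time. */
_Static_assert(offsetof(struct vvar_page_sketch, vgetcpu_mode) < CACHELINE,
	       "jiffies and vgetcpu_mode should share a cache line");
_Static_assert(offsetof(struct vvar_page_sketch, wall_time_sec) == CACHELINE,
	       "wall_time_sec should start the second cache line");

/* One statically placed object: its address is fixed at link time. */
static struct vvar_page_sketch vvar_page;

/* Accessor in the spirit of a "vvar": object address plus constant offset. */
#define VVAR(name) (vvar_page.name)

static void vvar_example(void)
{
	VVAR(jiffies) = 42;
	VVAR(vgetcpu_mode) = 1;
	assert(VVAR(jiffies) == 42);
	assert(VVAR(vgetcpu_mode) == 1);
}
```

In the kernel the placement is done by the linker script rather than by a
C struct, so treat this only as an analogy for why the address arithmetic
goes away.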
x86-64: Remove unnecessary barrier in vread_tsc
A fair amount of testing on lots of machines has failed to find a
single example in which the barrier *after* rdtsc is needed. So
remove it. (The manuals give no real justification for it, and
rdtsc has no dependencies so there's no sensible reason for a CPU to
delay it.)
x86-64: Don't generate cmov in vread_tsc
GCC likes to generate a cmov on a branch that's almost completely
predictable. Force it to generate a real branch instead.
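One common way to get a real branch with GCC, sketched here under the
assumption that the hot path is "return the fresh reading unless it is
older than the last published value": mark the common case as likely and
put an empty asm statement on the rare path. The helper name
clamp_to_last() and the LIKELY() macro are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Same idea as the kernel's likely(): tell GCC the condition is usually true. */
#define LIKELY(x) __builtin_expect(!!(x), 1)

/*
 * Hypothetical helper: return the fresh counter reading unless it is older
 * than the last published value, so the result never goes backwards.
 */
static uint64_t clamp_to_last(uint64_t now, uint64_t last)
{
	if (LIKELY(now >= last))
		return now;

	/*
	 * An empty asm statement is opaque to the optimizer. Putting it on
	 * the rare path discourages GCC from folding both paths into a
	 * conditional move, so the common path remains a predictable branch.
	 */
	__asm__ volatile ("");
	return last;
}

static void clamp_to_last_example(void)
{
	assert(clamp_to_last(10, 5) == 10);	/* common case: time moved forward */
	assert(clamp_to_last(5, 10) == 10);	/* rare case: clamp to last value */
}
```

Whether the compiler actually emits a branch depends on the GCC version
and flags, so the generated assembly is worth checking.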
x86-64: vclock_gettime(CLOCK_MONOTONIC) can't ever see nsec < 0
vset_normalize_timespec was more general than necessary. Open-code
the appropriate normalization loops. This is a big win for
CLOCK_MONOTONIC_COARSE.
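A minimal sketch of what an open-coded normalization can look like once
nsec is known to be non-negative: only the overflow direction needs a
loop. The struct and the function name add_ns() are hypothetical and not
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for struct timespec with fixed-width fields. */
struct timespec_sketch {
	int64_t sec;
	int64_t nsec;
};

/* Add a non-negative nanosecond delta and carry whole seconds into sec. */
static void add_ns(struct timespec_sketch *ts, int64_t delta_ns)
{
	/* The invariant that makes the "nsec < 0" loop unnecessary. */
	assert(ts->nsec >= 0 && delta_ns >= 0);

	ts->nsec += delta_ns;
	while (ts->nsec >= NSEC_PER_SEC) {
		ts->nsec -= NSEC_PER_SEC;
		ts->sec++;
	}
}
```

A loop rather than a division is reasonable here when the sum is expected
to exceed one second by at most a small multiple.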
x86-64: Move vread_tsc into a new file with sensible options
This way vread_tsc doesn't have a frame pointer, which saves about
0.3ns. I guess that the CPU's stack frame optimizations aren't quite
as good as I thought.
x86-64: Turn off -pg and turn on -foptimize-sibling-calls for vDSO
We're building the vDSO with optimizations disabled that were meant
for kernel code. Override that, except for -fno-omit-frame-pointer,
which might make userspace debugging harder.
x86-64: Add time to vDSO
x86-64: Optimize vDSO time()
These aren't strictly related, but they depend on the vvar cleanup.
They will allow us to deprecate all of the vsyscall entries.
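To illustrate why a vDSO time() can be cheap, here is a sketch that
assumes the kernel keeps the current wall-clock second in memory that
userspace can read. The struct shared_time_sketch, its field, and
time_sketch() are hypothetical names, not the interface added by these
patches.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the data the kernel shares with the vDSO. */
struct shared_time_sketch {
	volatile time_t wall_time_sec;
};

/* Same contract as time(2): return the seconds and optionally store them. */
static time_t time_sketch(const struct shared_time_sketch *shared, time_t *out)
{
	assert(shared != NULL);

	/* A single read of the seconds field; no system call is made. */
	time_t now = shared->wall_time_sec;

	if (out)
		*out = now;
	return now;
}
```

Callers would keep using the ordinary time() entry point; the sketch only
shows the shape of the fast path.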
Changes from v4:
- Rebase to 2.6.39.
- Add a missing Signed-off-by.
- Add time() to vDSO.
- Add optional patch 8/8 to optimize vDSO time() as an extra
incentive to use it over the old time() vsyscall.
Changes from v3:
- Put jiffies and vgetcpu_mode into the same cacheline. I folded it
into the vsyscall cleanup patch because it's literally just changing
a number. (In theory one more cacheline could be saved by putting
jiffies and vgetcpu_mode at the end of gtod_data, but that would be
annoying to maintain and would, I think, have little benefit.)
- Don't turn off frame pointers in vDSO code.
Changes from v2:
- Just remove the second barrier instead of hacking it. Tests
still pass.
Changes from v1:
- Redo the vsyscall_gtod_data address patch to make the code
cleaner instead of uglier and to make it work for all the
vsyscall variables.
- Improve the comments for clarity and formatting.
- Fix up the changelog for the nsec < 0 tweak (the normalization
code can't be inline because the two callers are different).
- Move vread_tsc into its own file, removing a GCC version
dependence and making it more maintainable.
Andy Lutomirski (8):
x86-64: Clean up vdso/kernel shared variables
x86-64: Remove unnecessary barrier in vread_tsc
x86-64: Don't generate cmov in vread_tsc
x86-64: vclock_gettime(CLOCK_MONOTONIC) can't ever see nsec < 0
x86-64: Move vread_tsc into a new file with sensible options
x86-64: Turn off -pg and turn on -foptimize-sibling-calls for vDSO
x86-64: Add time to vDSO
x86-64: Optimize vDSO time()
arch/x86/include/asm/tsc.h | 4 ++
arch/x86/include/asm/vdso.h | 14 -------
arch/x86/include/asm/vgtod.h | 2 -
arch/x86/include/asm/vsyscall.h | 12 +-----
arch/x86/include/asm/vvar.h | 52 +++++++++++++++++++++++++++
arch/x86/kernel/Makefile | 8 +++--
arch/x86/kernel/time.c | 2 +-
arch/x86/kernel/tsc.c | 19 ----------
arch/x86/kernel/vmlinux.lds.S | 34 ++++++------------
arch/x86/kernel/vread_tsc_64.c | 36 +++++++++++++++++++
arch/x86/kernel/vsyscall_64.c | 46 ++++++++++--------------
arch/x86/vdso/Makefile | 17 ++++++++-
arch/x86/vdso/vclock_gettime.c | 74 ++++++++++++++++++++++++++++-----------
arch/x86/vdso/vdso.lds.S | 9 +----
arch/x86/vdso/vextern.h | 16 --------
arch/x86/vdso/vgetcpu.c | 3 +-
arch/x86/vdso/vma.c | 27 --------------
arch/x86/vdso/vvar.c | 12 ------
18 files changed, 202 insertions(+), 185 deletions(-)
create mode 100644 arch/x86/include/asm/vvar.h
create mode 100644 arch/x86/kernel/vread_tsc_64.c
delete mode 100644 arch/x86/vdso/vextern.h
delete mode 100644 arch/x86/vdso/vvar.c
--
1.7.5.1