From: Linus Torvalds <torvalds@osdl.org>
To: "Martin J. Bligh" <mbligh@mbligh.org>
Cc: Matt Mackall <mpm@selenic.com>, Andrew Morton <akpm@osdl.org>,
Adrian Bunk <bunk@stusta.de>,
mingo@elte.hu, tim@physik3.uni-rostock.de, arjan@infradead.org,
davej@redhat.com, linux-kernel@vger.kernel.org,
Zwane Mwaikambo <zwane@linuxpower.ca>
Subject: Re: [patch 00/2] improve .text size on gcc 4.0 and newer compilers
Date: Wed, 4 Jan 2006 14:37:19 -0800 (PST) [thread overview]
Message-ID: <Pine.LNX.4.64.0601041356200.3668@g5.osdl.org> (raw)
In-Reply-To: <43BB6255.2030903@mbligh.org>
On Tue, 3 Jan 2006, Martin J. Bligh wrote:
>
> Well, it's not just one function, is it? It'd seem that if you unlined
> a whole bunch of stuff (according to this theory) then normal
> macro-benchmarks would go faster? Otherwise it's all just rather
> theoretical, is it not?
One of the problems with code size optimizations is that they
fundamentally show much less impact under pretty much any traditional
benchmark.
In user space, for example, the biggest impact of size optimization tends
to be in things like load time and low-memory situations. Yet, that's
never what is benchmarked (yeah, people will benchmark some low-memory
kernel situations by compating different swap-out algorithms etc against
each other - but they'll almost never compare the perceived speed of your
desktop when you start swapping).
Now, I'll happily also admit that code and data _layout_ is often a lot
more effective than just code size optimizations. That's especially true
with low-memory situations where the page size being larger than many of
the code sequences, you can make a bigger impact by changing utilization
than by changing absolute size.
But even then it's actually really hard to measure. Cache effects tend to
be hard _anyway_ to measure, it's really hard when the interesting case is
the cold-cache case and can't even do simple microbenchmarks that repeat a
million times to get stable results.
So the best we can usually do is "microbenchmarks don't show any
noticeable _worse_ behaviour, and code size went down by 10%".
Just as an example: there's an old paper on the impact of the OS design on
the memory subsystem, and one of the tables is about how many cycles per
instruction is spent on cache effects (this was Ultrix vs Mach, the load
was the average of a workload that was either specint or looked a lot
like it).
I$ D$
Ultrix user: 0.07 0.08
Mach user: 0.07 0.08
Ultrix system: 0.43 0.23
Mach system: 0.57 0.29
Now, that's an oldish paper, and we really don't care about Ultrix vs Mach
nor about the particular hw (Decstation), but the same kind of basic
numbers have been repeated over an over. Namely that system code tends to
have very different behaviour from user code wrt caches. Caches may have
gotten better, but code has gotten bigger, and cores have gotten faster.
And in particular, they tend to be _much_ worse. Something you'll seldom
see as clearly in micro-benchmarks, if only because under those benchmarks
things will generally be more cached - especially on the I$ side.
So I should probably try to find something slightly more modern, but what
it boils down to is that at least one well-cited paper that is fairly easy
to find claims that about _half_a_cycle_ for each instruction was spent on
I$ misses in system code on a perfectly regular mix of programs. And that
the cost of I$ was actually higher than the cost of D$ misses.
Now, most of the _time_ tends to be spent in user mode. That is probably
part of the reason for why system mode gets higher cache misses (another
is that obviously you'd hope that the user program can optimize for its
particular memory usage, while the kernel can't). But the result remains:
I$ miss time is a noticeable portion of system time.
Now, I claim that I$ miss overhead on a system would probably tend to have
a pretty linear relationship with the size of the static code. There
aren't that many big hot loops or small code that fits in the I$ to skew
that trivial rule.
So at a total guess, and taking the above numbers (that are questionable,
but hey, they should be ok as a starting point for WAGging), reducing code
size by 10% should give about 0.007 cycles per instruction in user space.
Whee. Not very noticeable. But on system code (the only thing the kernel
can help), it's actually a much more visible 0.05 CPI.
Yeah, the math is bogus, the numbers may be irrelevant, but it does show
why I$ should matter, even though it's much harder to measure.
Linus
next prev parent reply other threads:[~2006-01-04 22:43 UTC|newest]
Thread overview: 210+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-12-28 11:46 [patch 00/2] improve .text size on gcc 4.0 and newer compilers Ingo Molnar
2005-12-28 19:17 ` Linus Torvalds
2005-12-28 19:34 ` Arjan van de Ven
2005-12-28 21:02 ` Linus Torvalds
2005-12-28 21:17 ` Arjan van de Ven
2005-12-28 21:23 ` Ingo Molnar
2005-12-28 21:48 ` Ingo Molnar
2005-12-28 23:56 ` Krzysztof Halasa
2005-12-29 7:41 ` Ingo Molnar
2005-12-29 8:02 ` Dave Jones
2005-12-29 19:44 ` Krzysztof Halasa
2005-12-29 4:11 ` Andrew Morton
2005-12-29 7:32 ` Ingo Molnar
2005-12-29 14:58 ` Horst von Brand
2005-12-29 15:40 ` Adrian Bunk
2005-12-29 17:41 ` Linus Torvalds
2005-12-29 18:42 ` Arjan van de Ven
2005-12-29 18:45 ` Arjan van de Ven
2005-12-29 20:19 ` Ingo Molnar
2005-12-29 22:20 ` Matt Mackall
2005-12-29 20:28 ` Dave Jones
2005-12-29 20:49 ` Linus Torvalds
2005-12-29 21:25 ` Linus Torvalds
[not found] ` <20051229224839.GA12247@elte.hu>
2005-12-29 22:58 ` Arjan van de Ven
2005-12-30 2:03 ` Tim Schmielau
2005-12-30 2:15 ` Tim Schmielau
2005-12-30 7:49 ` Ingo Molnar
2005-12-31 14:38 ` Adrian Bunk
2005-12-31 14:45 ` Ingo Molnar
2005-12-31 15:08 ` Adrian Bunk
2006-01-02 10:37 ` Ingo Molnar
2006-01-02 10:48 ` Arjan van de Ven
2006-01-02 13:43 ` Adrian Bunk
2006-01-02 14:05 ` Ingo Molnar
2006-01-02 15:01 ` Adrian Bunk
2006-01-02 18:44 ` Krzysztof Halasa
2006-01-02 18:51 ` Arjan van de Ven
2006-01-02 19:49 ` Krzysztof Halasa
2006-01-02 19:54 ` Arjan van de Ven
2006-01-02 20:05 ` Krzysztof Halasa
2006-01-02 20:18 ` Jörn Engel
2006-01-02 22:23 ` Russell King
2006-01-02 23:55 ` Alan Cox
2006-01-03 3:59 ` Daniel Jacobowitz
2006-01-03 8:53 ` Russell King
2006-01-03 8:56 ` Arjan van de Ven
2006-01-03 9:00 ` Russell King
2006-01-03 9:10 ` Arjan van de Ven
2006-01-03 9:14 ` Vitaly Wool
2006-01-02 19:03 ` Andrew Morton
2006-01-02 19:17 ` Jakub Jelinek
2006-01-02 19:30 ` Andrew Morton
2006-01-02 19:41 ` Linus Torvalds
2006-01-02 19:53 ` Ingo Molnar
2006-01-02 20:28 ` Jakub Jelinek
2006-01-02 20:09 ` Ingo Molnar
2006-01-02 20:24 ` Andrew Morton
2006-01-02 20:40 ` Ingo Molnar
2006-01-02 20:30 ` Ingo Molnar
2006-01-02 19:12 ` Linus Torvalds
2006-01-02 19:59 ` Krzysztof Halasa
2006-01-02 20:13 ` Ingo Molnar
2006-01-02 21:00 ` Jan Engelhardt
2006-01-02 22:43 ` Linus Torvalds
2006-01-02 13:42 ` Adrian Bunk
2006-01-02 18:28 ` Andrew Morton
2006-01-02 18:49 ` Arjan van de Ven
2006-01-02 19:26 ` Jörn Engel
2006-01-02 21:51 ` Grant Coady
2006-01-02 22:03 ` Antonio Vargas
2006-01-02 22:56 ` Arjan van de Ven
2006-01-02 23:10 ` Grant Coady
2006-01-02 23:57 ` Alan Cox
2006-01-02 23:58 ` Grant Coady
2006-01-03 5:31 ` Nick Piggin
2006-01-03 23:40 ` Martin J. Bligh
2006-01-04 4:28 ` Matt Mackall
2006-01-04 5:51 ` Martin J. Bligh
2006-01-04 17:10 ` Matt Mackall
2006-01-04 22:37 ` Linus Torvalds [this message]
2006-01-05 0:55 ` Martin Bligh
2006-01-04 17:36 ` Zwane Mwaikambo
2005-12-31 3:51 ` Kurt Wall
2005-12-30 3:31 ` Nicolas Pitre
2005-12-29 22:41 ` userspace breakage Dave Jones
2005-12-29 21:23 ` Jeff V. Merkey
2005-12-29 23:42 ` Linus Torvalds
2005-12-29 22:17 ` Jeff V. Merkey
2005-12-30 0:10 ` Linus Torvalds
2005-12-29 23:54 ` Jeff V. Merkey
2005-12-30 0:32 ` Linus Torvalds
2005-12-30 14:45 ` Jeff V. Merkey
2005-12-30 16:17 ` Rik van Riel
2005-12-30 15:01 ` Jeff V. Merkey
2005-12-30 11:17 ` Bernd Petrovitsch
2005-12-30 14:59 ` Jeff V. Merkey
2005-12-30 20:22 ` Steven Rostedt
2005-12-30 21:26 ` Linus Torvalds
2005-12-30 23:49 ` Andrew Morton
2005-12-31 8:33 ` Arjan van de Ven
2005-12-31 15:30 ` Steven Rostedt
2005-12-31 16:40 ` Francois Romieu
2005-12-30 11:19 ` Bernd Petrovitsch
2005-12-30 14:53 ` Jeff V. Merkey
2005-12-30 17:17 ` Bernd Petrovitsch
2005-12-29 22:47 ` Ismail Donmez
2005-12-29 22:56 ` Linus Torvalds
2005-12-29 23:03 ` Dave Jones
2006-01-03 20:28 ` Greg KH
2006-01-03 20:37 ` Dave Jones
2006-01-04 23:00 ` Paolo Ciarrocchi
2006-01-05 4:42 ` Greg KH
2005-12-29 23:25 ` Adrian Bunk
2005-12-29 23:52 ` Linus Torvalds
2005-12-30 8:09 ` Arjan van de Ven
2006-01-03 20:28 ` Greg KH
2005-12-29 23:07 ` Dmitry Torokhov
2005-12-30 0:38 ` Ryan Anderson
2005-12-30 0:46 ` Dave Jones
2005-12-30 1:05 ` Linus Torvalds
2005-12-30 1:19 ` Dave Airlie
2005-12-30 1:21 ` Dave Jones
2005-12-30 2:10 ` Jiri Slaby
2005-12-30 2:14 ` Dave Jones
2005-12-30 2:35 ` Jiri Slaby
2005-12-30 5:14 ` Theodore Ts'o
2005-12-30 0:49 ` Linus Torvalds
2005-12-30 3:47 ` [patch 00/2] improve .text size on gcc 4.0 and newer compilers Mark Lord
2005-12-30 3:56 ` Dave Jones
2005-12-30 3:57 ` Mark Lord
2005-12-30 4:02 ` Dave Jones
2005-12-30 4:11 ` Mark Lord
2005-12-30 4:14 ` Mark Lord
2005-12-30 4:20 ` Mark Lord
2005-12-30 5:04 ` Dave Jones
2005-12-29 23:16 ` Willy Tarreau
2005-12-30 8:05 ` Arjan van de Ven
2005-12-30 8:15 ` Willy Tarreau
2005-12-30 8:24 ` Arjan van de Ven
2005-12-30 9:20 ` Willy Tarreau
2005-12-30 13:38 ` Adrian Bunk
2005-12-30 8:33 ` Jesper Juhl
2005-12-30 9:28 ` Willy Tarreau
2005-12-30 9:37 ` Jesper Juhl
2005-12-30 9:38 ` Willy Tarreau
2005-12-30 19:53 ` Alistair John Strachan
2005-12-29 7:49 ` Arjan van de Ven
2005-12-29 15:01 ` Horst von Brand
2005-12-30 15:28 ` Alan Cox
2005-12-30 20:59 ` Adrian Bunk
2005-12-30 22:12 ` Matt Mackall
2005-12-30 23:54 ` Adrian Bunk
2005-12-31 9:20 ` Arjan van de Ven
2005-12-29 14:38 ` Christoph Hellwig
2005-12-29 14:54 ` Arjan van de Ven
2005-12-29 15:35 ` Adrian Bunk
2005-12-29 15:38 ` Arjan van de Ven
2005-12-29 15:42 ` Jakub Jelinek
2005-12-29 19:14 ` Adrian Bunk
2005-12-30 9:28 ` Andi Kleen
2005-12-30 9:40 ` Ingo Molnar
2005-12-30 10:14 ` Ingo Molnar
2005-12-30 13:31 ` Adrian Bunk
2005-12-30 14:08 ` Christian Trefzer
2005-12-30 10:25 ` Andi Kleen
2005-12-29 0:37 ` Rogério Brito
2006-01-03 3:36 ` Daniel Jacobowitz
2005-12-29 4:38 ` Adrian Bunk
2005-12-29 7:59 ` Ingo Molnar
2005-12-29 13:52 ` Adrian Bunk
2005-12-29 19:57 ` Horst von Brand
2005-12-29 20:25 ` Ingo Molnar
2005-12-31 15:22 ` Adrian Bunk
2006-01-05 0:55 Chuck Ebbert
2006-01-05 1:07 ` Martin Bligh
2006-01-05 12:19 ` Arjan van de Ven
2006-01-05 14:30 ` Jakub Jelinek
2006-01-05 16:55 ` Linus Torvalds
2006-01-05 18:42 ` Daniel Jacobowitz
2006-01-05 17:02 ` Matt Mackall
2006-01-05 17:59 ` Martin Bligh
2006-01-05 18:09 ` Arjan van de Ven
2006-01-05 18:43 ` Daniel Jacobowitz
2006-01-05 19:17 ` Linus Torvalds
2006-01-05 19:40 ` Linus Torvalds
2006-01-05 19:49 ` Martin Bligh
2006-01-05 20:13 ` Linus Torvalds
2006-01-05 20:15 ` Linus Torvalds
2006-01-05 23:30 ` Ingo Molnar
2006-01-05 23:54 ` Linus Torvalds
2006-01-06 0:15 ` Ingo Molnar
2006-01-06 0:27 ` Linus Torvalds
2006-01-06 0:54 ` Ingo Molnar
2006-01-06 0:02 ` Martin Bligh
2006-01-06 0:40 ` Ingo Molnar
2006-01-06 0:55 ` Martin Bligh
2006-01-06 1:48 ` Mitchell Blank Jr
2006-01-06 0:50 ` Mitchell Blank Jr
2006-01-06 0:58 ` Ingo Molnar
2006-01-06 1:22 ` Mitchell Blank Jr
2006-01-05 21:34 ` Matt Mackall
2006-01-05 22:08 ` Linus Torvalds
2006-01-05 22:36 ` Matt Mackall
2006-01-05 22:49 ` Martin Bligh
2006-01-05 23:02 ` Matt Mackall
2006-01-05 22:55 ` Ingo Molnar
2006-01-05 23:11 ` Matt Mackall
2006-01-05 23:27 ` Jesse Barnes
2006-01-05 23:58 ` Ingo Molnar
2006-01-05 21:32 ` Grzegorz Kulewski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.LNX.4.64.0601041356200.3668@g5.osdl.org \
--to=torvalds@osdl.org \
--cc=akpm@osdl.org \
--cc=arjan@infradead.org \
--cc=bunk@stusta.de \
--cc=davej@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mbligh@mbligh.org \
--cc=mingo@elte.hu \
--cc=mpm@selenic.com \
--cc=tim@physik3.uni-rostock.de \
--cc=zwane@linuxpower.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).