From: Linus Torvalds <>
To: Ingo Molnar <>
Cc: Pekka Enberg <>, Jesper Juhl <>,,
	Andrew Morton <>,
	"Paul E. McKenney" <>,
	Daniel Lezcano <>,
	Eric Paris <>,
	Roman Zippel <>,,
	Steven Rostedt <>
Subject: Re: PATCH][RFC][resend] CC_OPTIMIZE_FOR_SIZE should default to N
Date: Tue, 22 Mar 2011 09:59:59 -0700
Message-ID: <>
In-Reply-To: <>

On Tue, Mar 22, 2011 at 3:27 AM, Ingo Molnar <> wrote:
> If that situation has changed - if GCC has regressed in this area then a commit
> changing the default IMHO gains a lot of credibility if it is backed by careful
> measurements using perf stat --repeat or similar tools.

Also, please don't back up any numbers for the "-O2 is faster than
-Os" case with some benchmark that is hot in the caches.

The thing is, many optimizations that make the code larger look really
good if there are no cache misses, and the code is run a million times
in a tight loop.

But kernel code in particular tends to not be like that. Yes, there
are cases where we spend 75% of the time in the kernel (my own
personal favorite is "git diff") basically having user space loop
around just one single operation. But it is _really_ quite rare in
real life. Most of the time, user space will blow the kernel caches
out of the water, and the kernel loops will be on the order of a few
entries (e.g. a "loop" may be the loop around a pathname lookup, and
loops over three path components). Not millions.

The rule-of-thumb should be simple: 10% larger code likely means 10%
more I$ misses. Does the larger -O2 code make up for it?
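
To put rough numbers on that question, here is a minimal sketch of the
break-even arithmetic. The cost model (total cycles = execution cycles
plus I$ misses times a fixed miss penalty), the function names, and any
inputs you feed it are assumptions for illustration, not measurements.

```c
/*
 * Hypothetical cost model: total cycles = execution cycles
 * + instruction-cache misses * cycles lost per miss.
 */
long total_cycles(long exec_cycles, long icache_misses, long miss_penalty)
{
    return exec_cycles + icache_misses * miss_penalty;
}

/*
 * Execution cycles the larger build must stay under to tie the smaller
 * build, assuming I$ misses grow by the same percentage as code size
 * (the rule of thumb above).
 */
long breakeven_exec_cycles(long small_exec, long small_misses,
                           long miss_penalty, long growth_pct)
{
    long small_total = total_cycles(small_exec, small_misses, miss_penalty);
    long big_misses = small_misses * (100 + growth_pct) / 100;

    return small_total - big_misses * miss_penalty;
}
```

With made-up inputs of 1000 execution cycles, 20 misses and a 30-cycle
miss penalty for the smaller build, 10% growth gives the larger build
22 misses, so it has to execute in under 940 cycles (a 6% speedup)
just to break even.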

Now, the downside of -Os has always been that it's not all that widely
used, so we've hit compiler bugs several times. That's been almost
enough to make me think that it's not worth it. But currently I don't
think we have any known issues, and probably exactly _because_ we use
-Os it seems that gcc hasn't that many regressions. It was much more
painful when we started trying to use -Os.

(That said, gcc -Os isn't all that wonderful. It tends to sometimes
generate really crappy code just because it's smaller, i.e. using a
multiply instruction in a critical code window just because doing a
few shifts and adds is larger. And that can be _so_ much slower that
it really hurts. So we might be better off with a model where we can
say "this code is important and really core kernel code that everybody
uses, do -O2 for this", and just compile _most_ of the kernel with
-Os.)
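
The multiply-versus-shifts trade-off can be sketched in C. The function
names are hypothetical, and the compiler is free to emit either
instruction sequence for either function; the source only shows the two
shapes being compared.

```c
#include <stdint.h>

/* Compact form: a single multiply, the kind of choice a
 * size-optimizing build may prefer. */
uint32_t times10_mul(uint32_t x)
{
    return x * 10u;
}

/* Larger form: 10x computed as 8x + 2x with two shifts and an add. */
uint32_t times10_shift_add(uint32_t x)
{
    return (x << 3) + (x << 1);
}
```

Both functions return the same value for every input, including on
unsigned wraparound. Which one is faster depends on the CPU's multiply
latency, so it is worth comparing the generated assembly at -Os and -O2
before drawing conclusions.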

