linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: torvalds@transmeta.com (Linus Torvalds)
To: linux-kernel@vger.kernel.org
Subject: Re: Why Plan 9 C compilers don't have asm("")
Date: Wed, 4 Jul 2001 17:22:44 +0000 (UTC)	[thread overview]
Message-ID: <9hvjd4$1ok$1@penguin.transmeta.com> (raw)
In-Reply-To: <200107040337.XAA00376@smarty.smart.net> <20010703233605.A1244@zalem.puupuu.org> <20010704002436.C1294@ftsoj.fsmlabs.com>

In article <20010704002436.C1294@ftsoj.fsmlabs.com>,
Cort Dougan  <cort@fsmlabs.com> wrote:
>
>There isn't such a crippling difference between straight-line and code with
>unconditional branches in it with modern processors.  In fact, there's very
>little measurable difference.

Oh, the small details get to you eventually.

And it's not just the "call" and "ret" instructions.  They _do_ hurt,
even on modern CPU's, btw.  They tend to break up the prefetching, and
often mean that you cannot do as good of a instruction mix. 

But there's an even more serious problem: a function call in C is a
VERY heavy operation as far as the compiler is concerned. It's a major
sequence point, and the compiler doesn't know what memory locations are
potentially dead etc.

Which means that the compiler has to save everything that might be
relevant to memory, and depending on the calling convention has to
assume that registers are trashed.  And when you come back, you have to
re-load everything again, on the assumption that the function might have
changed state. 

You also often have issues like reloading the gp pointer on many 64-bit
architectures, where functions can be in different "domains", and
returning from an unknown function means that you have to do other nasty
setup in order to get at your global data.

And trust me, it's noticeable. On alpha, a fast function call _should_
be a a simple two-cycle thing - branch and return. But because of
practical linker issues, what the compiler ends up having to generate
for calls to targets that it doesn't know where they are is 

 - load a 64-bit address off the GP area that the linker will have fixed
   up.
 - do an indirect branch to that address
 - the callee re-loads the GP with _its_ copy of the GP if it needs any
   global data or needs to call anybody else.
 - we return to the caller
 - the caller reloads its GP.

Your theoretical two cycles that the CPU could follow in the front end
and speculate around turns into multiple loads, a indirect branch and
about 10 instructions.  And that's without any of the other effects even
being taken into account.  No matter _how_ good the CPU is, that's going
to be slower than not doing it. 

[ And yes, I know there are optimizing linkers for the alpha around that
  improve this and notice when they don't need to change GP and can do a
  straight branch etc.  I don't think GNU ld _still_ does that, but who
  knows. Even the "good" Digital compilers tended to nop out unnecessary
  instructions rather than remove them, causing more icache pressure on
  a CPU that was already famous for needing tons of icache ]

Now, you could get around a bit of this by allowing for special calling
conventions.  Gcc actually has this for some details - namely the
"register arguments" part, which actually makes for much more readable
code (that's my main personal use for it - never mind the fact that it
is probably faster _too_). 

But gcc doesn't have a good "you can re-order this call wrt other stuff"
setup, and gcc lacks the ability to change the calling convention
on-the-fly ("this function will not clobber any registers"). 

Try it and see. There are good reasons for "inline asm", not the least
of which is that it often makes the produced code much more readable.

And if you never look at the produced assembler code, then you'll never
have a fast system. Really. Compilers can do only so much. People who
understand what the end result is makes for a difference.

Now, you could probably argue that instead of inline asms we should have
more flexibility in doing a per-callee calling convention. That would be
good too, no question about it.

			Linus

  parent reply	other threads:[~2001-07-04 17:23 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-07-04  3:37 Why Plan 9 C compilers don't have asm("") Rick Hohensee
2001-07-04  3:36 ` Olivier Galibert
2001-07-04  6:24   ` Cort Dougan
2001-07-04  8:03     ` H. Peter Anvin
2001-07-04 17:22     ` Linus Torvalds [this message]
2001-07-06  8:38       ` Cort Dougan
2001-07-06 18:44         ` Linus Torvalds
2001-07-06 20:02           ` Cort Dougan
2001-07-08 21:55           ` Victor Yodaiken
2001-07-08 22:28             ` Alan Cox
2001-07-09  1:22             ` Johan Kullstam
2001-07-08 22:29           ` David S. Miller
2001-07-06 11:43       ` David S. Miller
2001-07-21 22:10       ` Richard Henderson
2001-07-22  3:43         ` Linus Torvalds
2001-07-22  3:59           ` Mike Castle
2001-07-22  6:49           ` Richard Henderson
2001-07-22  7:44             ` Linus Torvalds
2001-07-22 15:53               ` Richard Henderson
2001-07-22 19:08                 ` Linus Torvalds
2001-07-04  7:15 ` pazke
2001-07-04 17:32 ` Don't feed the trooll [offtopic] " Ben LaHaise
2001-07-05  1:02 ` Michael Meissner
2001-07-05  1:54   ` Rick Hohensee
2001-07-05 16:54     ` Michael Meissner
2001-07-04 10:10 Rick Hohensee
2001-07-05  3:26 Rick Hohensee
2001-07-06 17:24 Rick Hohensee
2001-07-06 23:54 ` David S. Miller
2001-07-07  0:16   ` H. Peter Anvin
2001-07-07  0:37   ` David S. Miller
2001-07-07  6:16 Rick Hohensee
     [not found] <mailman.994629840.17424.linux-kernel2news@redhat.com>
2001-07-09  0:08 ` Pete Zaitcev
2001-07-09  0:28   ` Victor Yodaiken
2001-07-09  3:03 Rick Hohensee
2001-07-23  4:39 Rick Hohensee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='9hvjd4$1ok$1@penguin.transmeta.com' \
    --to=torvalds@transmeta.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).