From: Linus Torvalds <torvalds@transmeta.com>
To: Richard Henderson <rth@twiddle.net>
Cc: <linux-kernel@vger.kernel.org>
Subject: Re: Why Plan 9 C compilers don't have asm("")
Date: Sat, 21 Jul 2001 20:43:43 -0700 (PDT)
Message-ID: <Pine.LNX.4.33.0107212017470.6166-100000@penguin.transmeta.com>
In-Reply-To: <20010721151055.A3676@twiddle.net>


On Sat, 21 Jul 2001, Richard Henderson wrote:
>
> >   Even the "good" Digital compilers tended to nop out unnecessary
> >   instructions rather than remove them, causing more icache pressure on
> >   a CPU that was already famous for needing tons of icache ]
>
> But you're absolutely right about the nopping -- removing the nops would
> require debug info and EH info to be re-coded.  The latter being a matter
> of correctness.  This is a bit nastier than I ever cared to deal with.

I don't see it as being all that nasty.  It's only nasty if you make the
compiler default to the long form - because it's so hard to reduce the
size later (branches around call-sites etc). But if you make the compiler
default to the short form, you're ok - you can trivially expand the short
form later without any of the same problems.

Isn't this what mips32 used to do too - it didn't have the GP reload
issue, but it has 16-bit branch offsets if I remember correctly. So they
ended up adding trampoline branches on demand later. And a 16-bit branch
offset is a lot more constraining than a 20-bit one. 128kB is not a huge
jump, but 2MB is getting to be pretty far in most applications..
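
The 128kB and 2MB figures follow from treating the branch offset as a signed
count of 4-byte instructions. A minimal sketch of that arithmetic, assuming
exactly that encoding (the function name and parameters are made up for
illustration and are not taken from any toolchain):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reach, in bytes, in one direction of a PC-relative branch whose
 * displacement is a signed field of disp_bits bits, counted in units
 * of insn_bytes. Hypothetical helper, for illustration only.
 */
static int64_t branch_reach_bytes(unsigned disp_bits, unsigned insn_bytes)
{
    assert(disp_bits >= 1 && disp_bits <= 32);
    /* A signed field of disp_bits bits spans 2^(disp_bits-1) units each way. */
    return ((int64_t)1 << (disp_bits - 1)) * insn_bytes;
}
```

With 4-byte instructions, branch_reach_bytes(16, 4) gives 131072 (128kB) and
branch_reach_bytes(20, 4) gives 2097152 (2MB).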

So:
 - you _always_ generate the fast case. A call is always considered to be
   a short one, simple "bsr", no GP change, no nothing.
 - you generate a trampoline as well, and teach the linker to go through
   the trampoline if it has to do a far call (one trampoline per target,
   not per caller). Think of it as an "overflow" case for a .rel20.
 - If it's not a weak reference and you can satisfy it at link-time, you
   can obviously just get rid of the trampoline then and there. This takes
   care of all the normal "intra-GP" things.

Sure, if you want to be fancy, you also drop unused GOT entries for
anything that ends up not having a trampoline.
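
The scheme above can be sketched as a per-call-site decision in the linker.
This is only an illustration under stated assumptions: struct target,
resolve_call(), can_drop_got_entry() and SHORT_REACH are hypothetical names,
the 2MB reach is the figure used in this message, and placing the trampoline
within reach of its callers is left to the caller of resolve_call().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum call_kind { CALL_DIRECT, CALL_VIA_TRAMPOLINE };

/* Hypothetical per-symbol linker state. */
struct target {
    uint64_t addr;           /* final address, meaningful only if resolved */
    bool     resolved;       /* false for weak or not-yet-bound symbols */
    bool     has_trampoline; /* set once any caller needed the far path */
    uint64_t tramp_addr;     /* address of the single shared trampoline */
};

/* Assumed reach of the short call form, each way (2MB, as in the text). */
#define SHORT_REACH ((int64_t)2 * 1024 * 1024)

static bool in_short_reach(uint64_t from, uint64_t to)
{
    int64_t delta = (int64_t)(to - from);
    return delta >= -SHORT_REACH && delta < SHORT_REACH;
}

/*
 * Decide how one call site reaches its target. new_tramp_addr is where a
 * trampoline would be placed if this target does not have one yet.
 */
static enum call_kind resolve_call(uint64_t call_site, struct target *t,
                                   uint64_t new_tramp_addr)
{
    assert(t != NULL);

    /* Resolved at link time and close enough: keep the plain short call. */
    if (t->resolved && in_short_reach(call_site, t->addr))
        return CALL_DIRECT;

    /* Far or weak: go through the trampoline, created once per target. */
    if (!t->has_trampoline) {
        t->has_trampoline = true;
        t->tramp_addr = new_tramp_addr;
    }
    return CALL_VIA_TRAMPOLINE;
}

/* A target that never needed a trampoline needs no GOT entry for calls. */
static bool can_drop_got_entry(const struct target *t)
{
    return !t->has_trampoline;
}
```

Two far callers of the same target share one trampoline: the second call to
resolve_call() leaves tramp_addr unchanged.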

So the above takes care of correctness. For bonus points, you allow the
user to specify "this will be a far call (or weak)" as an attribute, which
you use on inter-module calls. Which is almost entirely library
interfaces, so you'd have the system header files use this so that shared
library calls don't get the hit of the trampoline.
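
A header using such an annotation might look roughly like the sketch below.
FAR_CALL is a hypothetical macro standing in for the proposed attribute; here
it expands to nothing so the fragment compiles with any C compiler, and
lib_entry() and local_helper() are made-up names.

```c
/* Hypothetical marker for "this will be a far call (or weak)". */
#ifndef FAR_CALL
#define FAR_CALL /* would expand to the proposed attribute */
#endif

/* What a system header would declare for a shared-library entry point. */
FAR_CALL int lib_entry(int x);

/* Stand-in definition so the sketch links on its own. */
int lib_entry(int x)
{
    return x + 1;
}

/* An unannotated caller: with the attribute in place, the compiler could
 * emit the long form for this call directly and skip the trampoline. */
static int local_helper(int x)
{
    return lib_entry(x) * 2;
}
```

Only the declaration in the header carries the annotation; callers and
unannotated local functions keep the short form by default.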

As far as I can see, this should take care of about 99% of all static
jumps.  Most applications have less than 2MB code-space, and the only real
reason for the long form is for inter-module calls which tend to be fairly
well specified (ie they are declared in standard headers etc).

Sure, you could sometimes get the slower case: more than a 2MB offset
within a module, so that you'd have to use the trampoline, or if you're
lazy and don't update the headers for dynamically linked libraries. But
even then there would be the potential for icache win. And you could
always have a "-mlarge-model" compiler option for those cases, so if you
notice that you lose on this optimization, you just disable it.

No?

		Linus

