linux-kernel.vger.kernel.org archive mirror
From: Rob Landley <landley@webofficenow.com>
To: Alan Cox <alan@lxorguk.ukuu.org.uk>, mnelson@dynatec.com (Matt Nelson)
Cc: linux-kernel@vger.kernel.org
Subject: Re: Any limitations on bigmem usage?
Date: Tue, 12 Jun 2001 08:46:57 -0400
Message-ID: <01061208465701.00814@webofficenow.com>
In-Reply-To: <E159r2Z-0001b4-00@the-village.bc.nu>

On Tuesday 12 June 2001 12:29, Alan Cox wrote:

> If your algorithm can work well with say 2Gb windows on the data and only
> change window every so often (in computing terms) then it should be ok, if
> its access is random you need to look at a 64bit box like an Alpha, Sparc64
> or eventually IA64

No, eventually Sledgehammer.

You know that IA64 will never ship, because the performance will always suck.
The design is fundamentally flawed.  The real bottleneck is the memory bus.
These days they're clock multiplying the inside of the processor by a factor
of what, twelve?  Fun as long as you're entirely in L1 cache and aren't task
switching.  But that's basically an embedded system.

CISC instructions are effectively compressed.  When you exhaust your cache,
your next 64 bit gulp from memory can hold more than 2 instructions; it's more
like 5 (which still means you're getting about 1/4 the performance of in-cache
execution, although L2 and L3 caches help here, of course).

But optimizing for something that isn't actually your bottleneck is just 
dumb, and that's exactly what Intel did with the move to VLIW/EPIC/IA64.
Three 64 bit instructions is 192 bits, whereas Pentium can fit more than three
instructions in a single 64 bit chunk.  Brilliant.  You need what, a 6x larger
cache just to
break even with the amount of time you're running in-cache?  And of course 
the compiler is GOING to put NOPs in that because it won't always be able to 
figure out something for the second and third cores to do this clock, 
regardless of how good a compiler it is.  That's just beautiful.

This is why they were so desperate for RAMBUS.  They KNOW the memory bus is 
killing them, they were grasping at straws to fix it.  (Currently they're 
saying that a future vaporware version of Itanium will fix everything, but
it's a fundamental design flaw: the new instruction set they've chosen is WAY 
less efficient going across the memory bus, and that's the real limiting 
factor in performance.)

Transmeta at least sucks CISC in from main memory to keep the northbridge 
bottleneck optimized.  And they have a big cache, and they're still using 32
bit instructions so they only need 96 bits per chunk (or atom or whatever
they're calling it these days).

Sledgehammer is the interesting x86 64 bit instruction set.  You still have
the CISC/RISC hybrid that made Pentium work.  (And, of course, backwards
compatibility that's inherent in the design rather than bolted onto the side
with duct tape and crazy glue.)  Yeah, the circuitry to make it work is
probably fairly insane, but at least the Athlon design team made all the racy 
logic go away so they can migrate the design to new manufacturing processes 
without four months of redesign work.  (And no, making insanely long 
pipelines in Pentium 4 is not a major improvement there either.  Cache 
exhaustion still kills you, so does branch misprediction.  Stalling a longer 
pipeline is more expensive.)
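
To put a rough number on "stalling a longer pipeline is more expensive", here
is a minimal sketch of the usual back-of-the-envelope model: effective cycles
per instruction is the base rate plus mispredictions per instruction times the
flush penalty.  Every number in it is an assumption made up for illustration
(one branch per five instructions, a 10% misprediction rate, flush penalties
of 10 and 20 cycles standing in for a shorter and a longer pipeline).  None of
them come from this thread or from a benchmark.

```python
from fractions import Fraction

def effective_cpi(base_cpi, mispredicts_per_insn, flush_penalty_cycles):
    """Cycles per instruction once branch misprediction flushes are counted."""
    return base_cpi + mispredicts_per_insn * flush_penalty_cycles

# Hypothetical workload figures, for illustration only.
BRANCH_FRACTION = Fraction(1, 5)    # one instruction in five is a branch
MISPREDICT_RATE = Fraction(1, 10)   # one branch in ten is mispredicted
mispredicts_per_insn = BRANCH_FRACTION * MISPREDICT_RATE

# Hypothetical flush penalties, roughly proportional to pipeline depth.
short_pipe = effective_cpi(1, mispredicts_per_insn, 10)
long_pipe = effective_cpi(1, mispredicts_per_insn, 20)

print("shorter pipeline CPI:", float(short_pipe))
print("longer pipeline CPI: ", float(long_pipe))
```

Under these assumptions the longer pipeline has to win back the extra cycles
through clock speed alone before it breaks even, and cache misses add a
further penalty on top that this sketch does not model.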

I saw an 800 MHz Itanium which benchmarked about the same (running 64 bit
code) as a Pentium III 300 MHz running 32 bit code.  That's just sad.  Go
ahead and blame the compiler.  That's still just sad.  And a design flaw in 
the new instruction set is not something that can be easily optimized away.

Rob

Thread overview: 9+ messages
2001-06-12  0:18 Matt Nelson
2001-06-12  0:31 ` Doug McNaught
2001-06-12  0:38   ` Doug McNaught
2001-06-12 14:03 ` Matt Nelson
2001-06-12 16:29 ` Alan Cox
2001-06-12 12:46   ` Rob Landley [this message]
2001-06-12 18:16     ` Rik van Riel
2001-06-12 18:34 Holzrichter, Bruce
2001-06-13 10:39 ` Ralf Baechle
