From: Rob Landley <rob@landley.net>
To: Alan Cox <alan@lxorguk.ukuu.org.uk>, Pavel Machek <pavel@suse.cz>
Cc: CaT <cat@zip.com.au>, Larry McVoy <lm@bitmover.com>,
Anton Blanchard <anton@samba.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Scaling noise
Date: Tue, 9 Sep 2003 02:11:15 -0400
Message-ID: <200309090211.16136.rob@landley.net>
In-Reply-To: <1063028321.21050.28.camel@dhcp23.swansea.linux.org.uk>
On Monday 08 September 2003 09:38, Alan Cox wrote:
> On Sad, 2003-09-06 at 16:08, Pavel Machek wrote:
> > Hi!
> >
> > > Maybe this is a better way to get my point across. Think about more
> > > CPUs on the same memory subsystem. I've been trying to make this
> > > scaling point
> >
> > The point of hyperthreading is that more virtual CPUs on same memory
> > subsystem can actually help stuff.
>
> Its a way of exposing asynchronicity keeping the old instruction set.
> Its trying to make better use of the bandwidth available by having
> something else to schedule into stalls. Thats why HT is really good for
> code which is full of polling I/O, badly coded memory accesses but is
> worthless on perfectly tuned hand coded stuff which doesnt stall.
<rant>
I wouldn't call it worthless. "Proof of concept", maybe.
Modern processors (Athlon and P4 both, I believe) have three execution cores,
and so are trying to dispatch three instructions per clock. With
speculation, lookahead, branch prediction, register renaming, instruction
reordering, magic pixie dust, happy thoughts, a tailwind, and 8 zillion other
related things, they can just about do it too, but not even close to 100% of
the time. Extracting three parallel instructions from one instruction stream
is doable, but not fun, and not consistent.
The third core is unavoidably idle some of the time. Trying to keep four
cores busy would be a nightmare. (All the VLIW guys keep trying to unload
this on the compiler. Don't ask me how a compiler is supposed to do branch
prediction and speculative execution. I suppose having to recompile your
binaries for more cores isn't TOO big a problem these days, but the boxed
mainstream desktop apps people wouldn't like it at all.)
Transistor budgets keep going up as manufacturing feature sizes shrink, and the
engineers keep wanting to throw transistors at the problem. The first really
easy way to turn transistors into performance is a bigger L1 cache, but
somewhere between 256k and one megabyte per running process you hit some
serious diminishing returns since your working set is in cache and your far
accesses to big datasets (or streaming data) just aren't going to be helped
by more L1 cache.
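To put a number on those diminishing returns, here's a back-of-the-envelope average-memory-access-time model (the cycle counts and miss rates are invented for illustration, not measurements of any real chip): once the working set fits, extra cache barely moves the miss rate, so the payoff collapses.

```python
# Toy AMAT model: AMAT = hit_time + miss_rate * miss_penalty.
# All numbers below are illustrative assumptions, not real measurements.
def amat(hit_cycles, miss_rate, miss_penalty_cycles):
    return hit_cycles + miss_rate * miss_penalty_cycles

# Growing the cache until the working set fits pays off hugely...
before = amat(2, 0.10, 100)   # small cache, 10% misses -> 12 cycles average
after  = amat(2, 0.01, 100)   # working set now fits    -> 3 cycles

# ...but past that, the remaining misses are streaming/far accesses
# that no amount of extra L1 will catch:
bigger = amat(2, 0.009, 100)  # 4x more cache: barely moves
print(before, after, bigger)  # 12.0 3.0 2.9
```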
The other obvious way to turn transistors into performance is to build
execution cores out of them. (Yeah, you can also pipeline yourself to death
to do less per clock for marketing reasons, but there's serious diminishing
returns there too.) With more execution cores, you can (theoretically)
execute more instructions per clock. Except that keeping 3 cores busy out of
one instruction stream is really hard, and 4 would be a nightmare...
Hyperthreading is just a neat hack to keep multiple cores busy. Having
another point of execution to schedule instructions from means you're
guaranteed to keep 1 core busy all the time for each point of execution
(barring memory access latency on "branch to mars" conditions), and with 3
cores and 2 points of execution they can fight over the middle core, which
should just about never be idle when the system is loaded.
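The slot-fighting argument can be sketched as a toy issue-slot simulation (purely illustrative: the issue width, per-thread ILP limit, and stall pattern are all made up, nothing measured from real silicon). One thread that can only find two instructions per cycle leaves the third slot idle; two such threads soak it up between them.

```python
# Toy model of a 3-wide core. Each thread can issue at most 2
# instructions per cycle (extracting 3 from one stream is hard),
# and stalls completely every 4th cycle (a cache miss, say).
ISSUE_WIDTH = 3
PER_THREAD_ILP = 2

def issued(nthreads, cycles):
    total = 0
    for c in range(cycles):
        # Each non-stalled thread offers PER_THREAD_ILP instructions...
        want = sum(PER_THREAD_ILP for t in range(nthreads)
                   if (c + t) % 4 != 0)     # thread t stalls on its off cycle
        # ...capped at what the core can actually dispatch.
        total += min(want, ISSUE_WIDTH)
    return total

one = issued(1, 100)   # single thread: idle third slot, plus dead stall cycles
two = issued(2, 100)   # SMT: the threads fight over the middle slot
print(one, two)        # 150 250
```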
With hyperthreading (SMT, whatever you wanna call it), the move to 4 execution
cores becomes a no-brainer (keeping 2 cores busy from one instruction
stream is relatively trivial), and even 5 (since keeping 3 cores busy is a
solved problem, even if the third isn't busy all the time, and the two threads
can fight for the extra core when they actually have something for it to do...)
And THAT is where SMT starts showing real performance benefits, when you get
to 4 or 5 cores. It's cheaper than SMP on a die because they can share all
sorts of hardware (not the least of which being L1 cache, and you can even
expand L1 cache a bit because you now have the working sets of 2 processes to
stick in it)...
Intel's been desperate for a way to make use of its transistor budget for a
while; manufacturing is what it does better than AMD, not clever processor
design. The original Itanic, case in point, had more than 3 instruction
execution cores in each chip: 3 VLIW, an HP PA-RISC, and a brain-damaged
Pentium (which itself had a couple execution cores)... The long list of
reasons Itanic sucked started with the fact that it had 3 different modes and
whichever one you were in circuitry for the other 2 wouldn't contribute a
darn thing to your performance (although it did not stop there, and in fact
didn't even slow down...)
Of course since power is now the third variable along with price/performance,
sooner or later you'll see chips that individually power down cores as they
go dormant. Possibly even a banked L1 cache; who knows? (It's another
alternative to clocking down the whole chip; power down individual functional
units of the chip. Dunno who might actually do that, or when, but it's nice
to have options...)
</rant>
In brief: hyperthreading is cool.
> Its great feature is that HT gets *more* not less useful as the CPU gets
> faster..
Execution point 1 stalls waiting for memory, so execution point 2 gets extra
cores. The classic tale of overlapping processing and I/O, only this time
with the memory bus being the slow device you have to wait for...
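The overlap arithmetic works exactly like the classic compute/I/O case. A toy timeline simulation (invented latencies: 20 cycles of compute followed by an 80-cycle memory stall per request) shows how a second thread hides most of the dead time:

```python
# Classic overlap arithmetic, with the memory bus as the slow "device".
# Latencies are invented for illustration.
COMPUTE, STALL = 20, 80

def run(nthreads, nreq):
    # One core; each thread repeats: compute COMPUTE cycles, stall STALL.
    ready = [0] * nthreads    # when each thread's next compute leg may start
    done = [0] * nthreads     # requests completed per thread
    now = 0
    while min(done) < nreq:
        runnable = [t for t in range(nthreads)
                    if ready[t] <= now and done[t] < nreq]
        if not runnable:
            # Everyone is stalled on memory: the core sits idle.
            now = min(ready[t] for t in range(nthreads) if done[t] < nreq)
            continue
        t = runnable[0]
        now += COMPUTE            # the core executes t's compute leg
        ready[t] = now + STALL    # then t goes off to wait on memory
        done[t] += 1
    return max(ready)             # time the last stall resolves

# Same total work (4 requests), with and without a second thread:
print(run(1, 4), run(2, 2))   # 400 220
```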
Rob