linux-kernel.vger.kernel.org archive mirror
From: Daniel Phillips <phillips@bonn-fries.net>
To: Linus Torvalds <torvalds@transmeta.com>,
	Daniel Phillips <phillips@bonn-fries.net>
Cc: <linux-kernel@vger.kernel.org>
Subject: Re: [Lse-tech] Re: 10.31 second kernel compile
Date: Sat, 16 Mar 2002 23:25:45 +0100
Message-ID: <E16mMcT-0000m9-00@starship>
In-Reply-To: <Pine.LNX.4.33.0203161046400.31913-100000@penguin.transmeta.com>

On March 16, 2002 08:01 pm, Linus Torvalds wrote:
> On Sat, 16 Mar 2002, Daniel Phillips wrote:
> > It could be a lot more abstract than that.  Chuck Cranor's UVM (which 
> > seems to bear some sort of filial relationship to the FreeBSD VM) buries 
> > all accesses to the page table behind a 'pmap' API, and implements the 
> > standard Unix VM semantics at the 'memory object' level.
> 
> Who knows, maybe we'll change the abstraction in Linux some day too.. 
> However, I personally tend to prefer "thin" abstractions that don't hide 
> details.
> 
> The problem with the thick abstractions ("high level") is that they often
> lead you down the wrong path. You start thinking that it's really cheap to
> share partial address spaces etc ("hey, I just map this 'memory object'
> into another process, and it's just a matter of one linked list operation
> and incrementing a reference count").

My opinion, which I implied in the previous post but didn't state in so many 
words, is that the whole Real Unix crowd - Chuck Cranor and Matt Dillon, Sun, 
SGI and IBM etc - got off on the wrong track with respect to implementing 
Unix VM semantics, and that we will achieve all the design goals they set for 
themselves in a simpler, more efficient way.  (That is, assuming I ever 
finish debugging the page table sharing[1] and extend it to shared mmaps.)  I 
attribute that whole wrong turn to a too-heavy abstraction of the page table, 
distracting the eye from the observation that the page table itself provides 
sufficient state to do the same job as memory objects.  I'm curious to hear 
Matt's opinion on that, by the way; I have to go bother him about this.

> Until you realize that the actual sharing still implies a TLB switch 
> between the two "threads", and that you need to instantiate the TLB in 
> both processes etc. And suddenly that instantiation is actually the _real_ 
> cost - and your clever highlevel abstraction was actually a lot more 
> expensive than you realized.

Well, I don't have any problem with the TLB cost being hidden; what bothers me 
is the complexity of the mechanism required to make the abstraction work.  
Sort-of work, I mean: just google 'all-shadowed case' to see one nasty 
difficulty.

> [ Side note: I'm very biased by reality. In theory, a non-page-table based 
>   approach which used only a front-side TLB and a fast lookup into higher- 
>   level abstractions might be a really nice setup. However, in practice, 
>   the world is 99%+ based on hardware that natively looks up the TLB in a 
>   tree, and is really good at it too.  So I'm biased. I'd rather do good 
>   on the 99% than care about some theoretical 1% ]

It breaks down somewhat as the virtual memory range goes way beyond 4 GB.  
There's the relatively minor issue of extra levels of tree traversal, 
currently limited to 4 by AMD's architecture but not so limited on other 
architectures.  A bigger problem is what to do about internal fragmentation 
in the page table tree, say if somebody mmaps a 2 TB sparse file, then writes 
one byte every 2 MB.  Bang, 4 GB worth of page tables, which is probably not 
what we want.  IMHO, 'don't do that then' isn't a reasonable response.
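
To spell out the arithmetic behind that 4 GB figure, here is a rough sketch 
in C.  It assumes an x86-64 style layout (4 KB pages, 512 eight-byte entries 
per table, so one leaf page table maps 2 MB).  The function name 
leaf_table_bytes and the PT_* constants are made up for illustration; this 
is not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry (x86-64 style): 4 KB pages, 512 eight-byte entries per table. */
#define PT_PAGE_SIZE      4096ULL
#define PT_ENTRIES        512ULL
#define PT_LEAF_COVERAGE  (PT_PAGE_SIZE * PT_ENTRIES)  /* 2 MB mapped per leaf table */

/*
 * Bytes of leaf page tables instantiated when one byte is written every
 * `stride` bytes across a mapping of `span` bytes.  Upper tree levels are
 * ignored; they add comparatively little.
 */
static uint64_t leaf_table_bytes(uint64_t span, uint64_t stride)
{
    uint64_t tables;

    assert(stride > 0);

    if (stride >= PT_LEAF_COVERAGE)
        tables = span / stride;            /* each write lands in its own leaf table */
    else
        tables = span / PT_LEAF_COVERAGE;  /* denser writes: every leaf table is populated */

    return tables * PT_PAGE_SIZE;
}
```

With span = 2 TB (2ULL << 40) and stride = 2 MB (2ULL << 20), that is 2^20 
writes, each pinning its own 4 KB leaf table, so the function returns 
4294967296 bytes, i.e. 4 GB.  Under the same assumptions the next level up 
adds only 2048 more tables (8 MB), so the leaf level is where the 
fragmentation hurts.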

What we might want to do there is evict some page tables when they start 
proliferating too much, and that's when we find out we have no good model for 
doing that.  I think this needs to be looked at.

[1] I finally got a little more work done on it today

-- 
Daniel
