From: Linus Torvalds <torvalds@transmeta.com>
To: Richard Gooch <rgooch@ras.ucalgary.ca>
Cc: Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Getting FS access events
Date: Mon, 14 May 2001 21:43:18 -0700 (PDT)
Message-ID: <Pine.LNX.4.21.0105142130480.23663-100000@penguin.transmeta.com>
In-Reply-To: <Pine.LNX.4.21.0105142054180.23578-100000@penguin.transmeta.com>


On Mon, 14 May 2001, Linus Torvalds wrote:
> 
> Or rather, there is a fundamental reason why we must NEVER EVER look at
> the buffer cache: it is not coherent with the page cache. 
> 
> And keeping it coherent would be _extremely_ expensive. How do we
> know? Because we used to do that. Remember the small mindcraft
> benchmark? Yup. Double copies all over the place, double lookups, double
> everything.

I think I should explain a bit more.

The current page cache is completely non-coherent (with _anything_: it's
not coherent with other files using a page cache because they have a
different index, and it's not coherent with the buffer cache because that
one isn't even in the same name space).

Now, being non-coherent is always the best option if you can get away with
it. It means that there is no way you can ever have _any_ performance
overhead from maintaining the coherency, and it's 100% reproducible -
there's no question where the page cache gets its data from: the raw disk
device. No ifs, buts or whys.

The disadvantage of virtual caches is that they can have aliases. That's
fine, but you have to be aware of it, and you have to live with the
consequences. That's what we do now. There are no aliases that are worth
worrying about, so virtual caches work perfectly. This is not always true
(virtual CPU data caches tend to be a really bad idea, while virtual CPU
instruction caches tend to work fairly well, although potentially with a
lower utilization ratio than a physical one due to aliasing).

The other alternative is to have a physical cache. That's fine too: you
avoid aliases, but you have to look up the physical address when looking
up the cache. THIS is the real cost of the buffer cache - not the hashing
and the locking, but the fact that you have to know the physical
location. 
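
To make the trade-off between the two indexing schemes concrete, here is a
toy sketch. It is not kernel code: the class names, the block_map table and
the "file"/"rawdev" names are all hypothetical, chosen only to show that a
virtually indexed cache needs no translation but can hold aliases, while a
physically indexed cache avoids aliases at the price of a translation on
every lookup.

```python
# Toy model only. None of these names are kernel interfaces.

class VirtualCache:
    """Indexed by (name, index): no translation step, but aliases are possible."""

    def __init__(self):
        self.pages = {}

    def insert(self, name, index, data):
        self.pages[(name, index)] = data

    def lookup(self, name, index):
        return self.pages.get((name, index))


class PhysicalCache:
    """Indexed by (device, block): no aliases, but every access translates first."""

    def __init__(self, block_map):
        self.block_map = block_map      # (name, index) -> (device, block)
        self.blocks = {}
        self.translations = 0           # counts how often we had to find the location

    def _translate(self, name, index):
        self.translations += 1
        return self.block_map[(name, index)]

    def insert(self, name, index, data):
        self.blocks[self._translate(name, index)] = data

    def lookup(self, name, index):
        return self.blocks.get(self._translate(name, index))


# Assumption for the demo: page 0 of "file" and block 7 of "rawdev" are two
# names for the same on-disk block.
block_map = {("file", 0): ("hda1", 7), ("rawdev", 7): ("hda1", 7)}

virtual = VirtualCache()
virtual.insert("file", 0, b"data")
print(virtual.lookup("rawdev", 7))      # the alias is simply not seen

physical = PhysicalCache(block_map)
physical.insert("file", 0, b"data")
print(physical.lookup("rawdev", 7))     # same block, so the data is found
print(physical.translations)            # one translation per access
```

In this model the virtual cache never pays for a translation, and the
physical cache pays for one on every insert and every lookup, which is the
cost described above.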

A mixed-mode cache is not a good idea. It gets the worst from both worlds,
without getting _any_ of the good qualities. You have the horrible
coherency issue, together with the overhead of having to find out the
physical address. 

You could choose to do "partial coherency", i.e. be coherent only one way,
for example. That would make the coherency overhead much less, but would
also make the caches basically act very unpredictably - you might have
somebody write through the page cache yet on a read actually not _see_
what he wrote, because it got written out to disk and was shadowed by
cached data in the buffer cache that didn't get updated.
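
The stale read described above can be reproduced in a small toy model. This
is a sketch under loud assumptions: the class and method names are
hypothetical, a single file is mapped one-to-one onto disk blocks so both
caches can be keyed by block number, and the one-way rule chosen here is
that page cache misses are filled from the buffer cache when possible while
page cache writes never update the buffer cache.

```python
# Toy model of one-way ("partial") coherency. Hypothetical names throughout.

class PartiallyCoherentStore:
    def __init__(self):
        self.disk = {}           # block number -> data
        self.buffer_cache = {}   # block number -> data (device-level cache)
        self.page_cache = {}     # block number -> data (file-level cache)

    def read_device(self, block):
        # A read through the block device populates the buffer cache.
        if block not in self.buffer_cache:
            self.buffer_cache[block] = self.disk.get(block)
        return self.buffer_cache[block]

    def write_file(self, block, data):
        # A write through the page cache does NOT touch the buffer cache.
        self.page_cache[block] = data

    def writeback_and_evict(self, block):
        # The dirty page goes to disk and is then dropped from the page cache.
        self.disk[block] = self.page_cache.pop(block)

    def read_file(self, block):
        if block in self.page_cache:
            return self.page_cache[block]
        # Page cache miss: prefer the buffer cache copy over the disk.
        if block in self.buffer_cache:
            data = self.buffer_cache[block]
        else:
            data = self.disk.get(block)
        self.page_cache[block] = data
        return data


store = PartiallyCoherentStore()
store.disk[7] = "old"
store.read_device(7)             # buffer cache now holds "old"
store.write_file(7, "new")       # write through the page cache
store.writeback_and_evict(7)     # disk now holds "new"
print(store.disk[7])             # new
print(store.read_file(7))        # old: shadowed by the stale buffer cache copy
```

The disk has the new data, yet the reader gets the old data back, and
whether that happens depends on what was cached and evicted earlier.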

So "partial coherency" might avoid some of the performance issues, but
it's unacceptable to me simply because it's pretty non-repeatable and has
some strange behaviour that can be considered "obviously wrong" (see above
for one example).

Which leaves us with the fact that the page cache is best done the way it
is, and anybody who has coherency concerns might really think about those
concerns another way.

I'm really serious about doing "resume from disk". If you want a fast
boot, I will bet you a dollar that you cannot do it faster than by loading
a contiguous image of several megabytes straight into memory. There is
NO overhead, you're pretty much guaranteed platter speeds, and there are
no issues about trying to order accesses etc. There are also no issues
about messing up any run-time data structures.
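
As a user-space illustration of the access pattern only (this is not how a
kernel resume path is implemented), the sketch below reads one contiguous
image file front to back into a single preallocated buffer. The function
name load_image and the chunk_size parameter are hypothetical.

```python
import os


def load_image(path, chunk_size=1 << 20):
    """Read a whole image file sequentially into one preallocated buffer."""
    size = os.path.getsize(path)
    image = bytearray(size)
    view = memoryview(image)
    offset = 0
    # buffering=0 gives a raw file object, so each readinto() is one read
    # request issued in strictly increasing offset order, with no seeks.
    with open(path, "rb", buffering=0) as f:
        while offset < size:
            n = f.readinto(view[offset:offset + chunk_size])
            if not n:
                break            # unexpected end of file
            offset += n
    return bytes(view[:offset])
```

There is nothing to order or schedule here: the only decision is how large
each sequential read should be.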

Give it some thought.

		Linus

