From: James Bottomley <>
To: Dan Magenheimer <>
Cc: Andrew Morton <>,
	Dave Hansen <>,,,
	Konrad Wilk <>,
	Seth Jennings <>,
	Nitin Gupta <>,
	Nebojsa Trpkovic <>,,
	KAMEZAWA Hiroyuki <>,, Chris Mason <>,
Subject: RE: [PATCH] mm: implement WasActive page flag (for improving cleancache)
Date: Fri, 27 Jan 2012 07:43:07 -0600
Message-ID: <>
In-Reply-To: <7198bfb3-1e32-40d3-8601-d88aed7aabd8@default>

On Thu, 2012-01-26 at 18:43 -0800, Dan Magenheimer wrote:
> > From: Andrew Morton []
> > It really didn't tell us anything, apart from referring to vague
> > "problems on streaming workloads", which forces everyone to go off and
> > do an hour or two's kernel archeology, probably in the area of
> > readahead.
> > 
> > Just describe the problem!  Why is it slow?  Where's the time being
> > spent?  How does the proposed fix (which we haven't actually seen)
> > address the problem?  If you inform us of these things then perhaps
> > someone will have a useful suggestion.  And as a side-effect, we'll
> > understand cleancache better.
> Sorry, I'm often told that my explanations are long-winded so
> as a result I sometimes err on the side of brevity...
> The problem is that if a pageframe is used for a page that is
> very unlikely (or never going) to be used again instead of for
> a page that IS likely to be used again, it results in more
> refaults (which means more I/O which means poorer performance).
> So we want to keep pages that are most likely to be used again.
> And pages that were active are more likely to be used again than
> pages that were never active... at least the post-2.6.27 kernel
> makes that assumption.  A cleancache backend can keep or discard
> any page it pleases... it makes sense for it to keep pages
> that were previously active rather than pages that were never
> active.
> For zcache, we can store twice as many pages per pageframe.
> But if we are storing two pages that are very unlikely
> (or never going) to be used again instead of one page
> that IS likely to be used again, that's probably still a bad choice.
> Further, for every page that never gets used again (or gets reclaimed
> before it can be used again because there's so much data streaming
> through cleancache), zcache wastes the cpu cost of a page compression.
> On newer machines, compression is suitably fast that this additional
> cpu cost is small-ish.  On older machines, it adds up fast and that's
> what Nebojsa was seeing in 
> Page replacement algorithms are all about heuristics and
> heuristics require information.  The WasActive flag provides
> information that has proven useful to the kernel (as proven
> by the 2.6.27 page replacement design rewrite) to cleancache
> backends (such as zcache).

So this sounds very similar to the recent discussion which I cc'd to
this list about readahead:

It sounds like we want to measure something similar (whether a page has
been touched since it was brought in).  It isn't exactly your WasActive
flag, because we want to know whether a page brought in for readahead was
ever actually used, but it's very similar.

What I was wondering was instead of using a flag, could we make the LRU
lists do this for us ... something like have a special LRU list for
pages added to the page cache but never referenced since added?  It
sounds like you can get your WasActive information from the same type of
LRU list tricks (assuming we can do them).
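
To make the idea concrete, here is a minimal userspace sketch of a
dedicated "never referenced" list sitting in front of the usual inactive
and active lists.  It is not kernel code: every toy_* name is hypothetical,
and the promotion rule (first touch moves a page to the inactive list, a
later touch moves it to the active list) is an assumption made for
illustration.  The point is only that list membership can answer "was this
page ever used since it was added?" without a separate page flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list identifiers; the kernel's real LRU lists differ. */
enum toy_lru {
	TOY_LRU_UNREFERENCED,	/* added to the page cache, never touched */
	TOY_LRU_INACTIVE,
	TOY_LRU_ACTIVE,
	TOY_LRU_COUNT
};

struct toy_page {
	struct toy_page *prev, *next;
	enum toy_lru lru;	/* which list the page currently sits on */
};

struct toy_lruvec {
	struct toy_page heads[TOY_LRU_COUNT];	/* circular sentinel per list */
	size_t count[TOY_LRU_COUNT];
};

static void toy_lruvec_init(struct toy_lruvec *v)
{
	for (int i = 0; i < TOY_LRU_COUNT; i++) {
		v->heads[i].prev = v->heads[i].next = &v->heads[i];
		v->count[i] = 0;
	}
}

static void toy_unlink(struct toy_lruvec *v, struct toy_page *p)
{
	assert(v->count[p->lru] > 0);
	p->prev->next = p->next;
	p->next->prev = p->prev;
	v->count[p->lru]--;
}

static void toy_link_head(struct toy_lruvec *v, struct toy_page *p,
			  enum toy_lru lru)
{
	struct toy_page *head = &v->heads[lru];

	p->lru = lru;
	p->next = head->next;
	p->prev = head;
	head->next->prev = p;
	head->next = p;
	v->count[lru]++;
}

/* A page enters the page cache (e.g. via readahead): not yet referenced. */
static void toy_page_add(struct toy_lruvec *v, struct toy_page *p)
{
	toy_link_head(v, p, TOY_LRU_UNREFERENCED);
}

/* Assumed rule: first touch -> inactive list, any later touch -> active. */
static void toy_page_touch(struct toy_lruvec *v, struct toy_page *p)
{
	enum toy_lru target = (p->lru == TOY_LRU_UNREFERENCED)
				? TOY_LRU_INACTIVE : TOY_LRU_ACTIVE;

	toy_unlink(v, p);
	toy_link_head(v, p, target);
}

/* The information comes from list membership, not from a page flag. */
static bool toy_page_was_referenced(const struct toy_page *p)
{
	return p->lru != TOY_LRU_UNREFERENCED;
}
```

Note that this sketch does not model deactivation, so by itself it cannot
tell a page that was once active and later demoted from one that was only
ever inactive; that distinction would need either another list or a flag.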

I think the memory pressure eviction heuristic is: referenced but not
recently used pages first, followed by unreferenced and not recently used
readahead pages.  The key is to keep recently read-in readahead pages
until last, because there's a time between doing readahead and getting
the page accessed, and we don't want to trash a recently read-in readahead
page only to have the process touch it and find it has to be read in
again.
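
The ordering described above can be sketched as a ranking function.  This
is a toy model, not the kernel's reclaim logic: the toy_* names, the tick
counter and the TOY_RECENT_WINDOW threshold are all hypothetical.  It also
assumes that any recently used page (not just recently read-in readahead
pages) is kept until last, which the text above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: how many ticks count as "recent". */
#define TOY_RECENT_WINDOW 100UL

struct toy_candidate {
	int id;
	bool referenced;		/* touched since it entered the page cache */
	unsigned long last_event;	/* tick of last access, or of read-in if
					 * the page was never accessed */
};

/*
 * Lower rank is evicted first:
 *   0: referenced, but not recently used
 *   1: unreferenced and not recent (stale readahead)
 *   2: recent (assumption: covers recent readahead and recent use alike)
 */
static int toy_evict_rank(const struct toy_candidate *c, unsigned long now)
{
	assert(now >= c->last_event);

	if (now - c->last_event < TOY_RECENT_WINDOW)
		return 2;
	return c->referenced ? 0 : 1;
}

/* Pick the lowest-ranked candidate; break ties by the oldest last_event. */
static const struct toy_candidate *
toy_pick_victim(const struct toy_candidate *c, size_t n, unsigned long now)
{
	const struct toy_candidate *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!best) {
			best = &c[i];
			continue;
		}

		int r = toy_evict_rank(&c[i], now);
		int br = toy_evict_rank(best, now);

		if (r < br || (r == br && c[i].last_event < best->last_event))
			best = &c[i];
	}
	return best;	/* NULL when there are no candidates */
}
```

With a freshly read-in readahead page, a stale readahead page and a stale
but referenced page as candidates, this picks the stale referenced page
first and the fresh readahead page last.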



Thread overview: 20+ messages
2012-01-25 21:58 [PATCH] mm: implement WasActive page flag (for improving cleancache) Dan Magenheimer
2012-01-26 17:28 ` Dave Hansen
2012-01-26 21:28   ` Dan Magenheimer
2012-01-27  0:31     ` Andrew Morton
2012-01-27  0:56       ` Dan Magenheimer
2012-01-27  1:15         ` Andrew Morton
2012-01-27  2:43           ` Dan Magenheimer
2012-01-27  3:33             ` Rik van Riel
2012-01-27  5:15               ` Dan Magenheimer
2012-01-30  8:57                 ` KAMEZAWA Hiroyuki
2012-01-30 22:03                   ` Dan Magenheimer
2012-01-27 13:43             ` James Bottomley [this message]
2012-01-27 17:32               ` Dan Magenheimer
2012-01-27 17:54                 ` James Bottomley
2012-01-27 18:46                   ` Dan Magenheimer
2012-01-27 21:49                     ` James Bottomley
2012-01-29  0:50                       ` Rik van Riel
2012-01-29 22:25                         ` James Bottomley
2012-01-27  3:28         ` Rik van Riel
2012-01-27  5:11           ` Dan Magenheimer
