From: Nitin Gupta <ngupta@vflare.org>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: chris.mason@oracle.com, viro@zeniv.linux.org.uk,
	akpm@linux-foundation.org, adilger@sun.com, tytso@mit.edu,
	mfasheh@suse.com, joel.becker@oracle.com, matthew@wil.cx,
	linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	ocfs2-devel@oss.oracle.com, linux-mm@kvack.org, jeremy@goop.org,
	JBeulich@novell.com, kurt.hackel@oracle.com, npiggin@suse.de,
	dave.mccracken@oracle.com, riel@redhat.com, avi@redhat.com,
	konrad.wilk@oracle.com
Subject: Re: [PATCH V3 0/8] Cleancache: overview
Date: Fri, 23 Jul 2010 13:06:43 +0530	[thread overview]
Message-ID: <4C49468B.40307@vflare.org> (raw)
In-Reply-To: <20100621231809.GA11111@ca-server1.us.oracle.com>

On 06/22/2010 04:48 AM, Dan Magenheimer wrote:
> [PATCH V3 0/8] Cleancache: overview
> 
<snip>
> 
>  Documentation/ABI/testing/sysfs-kernel-mm-cleancache |   11 +
>  Documentation/vm/cleancache.txt                      |  194 +++++++++++++++++++
>  fs/btrfs/extent_io.c                                 |    9 
>  fs/btrfs/super.c                                     |    2 
>  fs/buffer.c                                          |    5 
>  fs/ext3/super.c                                      |    2 
>  fs/ext4/super.c                                      |    2 
>  fs/mpage.c                                           |    7 
>  fs/ocfs2/super.c                                     |    3 
>  fs/super.c                                           |    7 
>  include/linux/cleancache.h                           |   88 ++++++++
>  include/linux/fs.h                                   |    5 
>  mm/Kconfig                                           |   22 ++
>  mm/Makefile                                          |    1 
>  mm/cleancache.c                                      |  169 ++++++++++++++++
>  mm/filemap.c                                         |   11 +
>  mm/truncate.c                                        |   10 
>  17 files changed, 548 insertions(+)
> 
> (following is a copy of Documentation/vm/cleancache.txt)
> 
> MOTIVATION
> 
> Cleancache can be thought of as a page-granularity victim cache for clean
> pages that the kernel's pageframe replacement algorithm (PFRA) would like
> to keep around, but can't since there isn't enough memory.  So when the
> PFRA "evicts" a page, it first attempts to put it into a synchronous
> concurrency-safe page-oriented "pseudo-RAM" device (such as Xen's Transcendent
> Memory, aka "tmem", or in-kernel compressed memory, aka "zmem", or other
> RAM-like devices) which is not directly accessible or addressable by the
> kernel and is of unknown and possibly time-varying size.  And when a
> cleancache-enabled filesystem wishes to access a page in a file on disk,
> it first checks cleancache to see if it already contains it; if it does,
> the page is copied into the kernel and a disk access is avoided.
> 


Since zcache is now one of its use cases, I think the major objection that
remains against cleancache is its intrusiveness -- in particular, the need
to change individual filesystems (even though the changes are one-liners).
The changes below should help avoid these per-FS changes and make cleancache
more self-contained. I haven't tested them myself, so there might be missed
cases or other mysterious problems:

1. Cleancache requires filesystem-specific changes primarily to call the
cleancache init hook and store the (per-FS-instance) pool_id. I think we can
get rid of these by directly passing a 'struct super_block' pointer, which
is also sufficient to identify the FS instance a page belongs to. The
cleancache_ops provider would then use this pointer as a 'handle' to find
the corresponding memory pool, or to create a new pool when a new handle is
first encountered.

This leaves out the case of ocfs2, for which cleancache needs a 'uuid' to
decide whether a shared pool should be created. IMHO, this case (and
cleancache.init_shared_fs) should be removed from cleancache_ops since it is
applicable only to Xen's cleancache_ops provider.
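
To make the idea concrete, here is a rough, untested sketch of what the ops
table might look like if the super_block pointer itself served as the
handle; the member names and signatures are illustrative guesses, not the
actual V3 interface:

#include <linux/fs.h>		/* struct super_block, struct address_space */
#include <linux/mm_types.h>	/* struct page */

/*
 * Illustrative only.  The inode and page index are recoverable from
 * page->mapping->host and page->index, so the super_block 'handle'
 * plus the page is enough to key into the right pool -- no pool_id
 * and no per-FS init call.
 */
struct cleancache_ops {
	int  (*get_page)(struct super_block *sb, struct page *page);
	void (*put_page)(struct super_block *sb, struct page *page);
	void (*flush_page)(struct super_block *sb,
			   struct address_space *mapping, pgoff_t index);
	void (*flush_inode)(struct super_block *sb,
			    struct address_space *mapping);
	void (*flush_fs)(struct super_block *sb);
};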

2. I think the change in btrfs can be avoided by moving cleancache_get_page()
from do_mpage_readpage() to filemap_fault(), and this should work for all
filesystems. See:

handle_pte_fault() -> do_(non)linear_fault() -> __do_fault()
						-> vma->vm_ops->fault()

which is defined as filemap_fault() for all filesystems. If some future
filesystem uses its own custom fault handler (why would it?) then it will
have to arrange for a call to cleancache_get_page() itself, if it wants this
feature.
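
As a rough illustration of where such a hook could sit -- untested, with
error handling stripped, and with the helper name fault_page_via_cleancache()
made up for the example -- the lookup would happen once the page-cache
search in the fault path misses:

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/pagemap.h>
#include <linux/cleancache.h>	/* from the proposed series */

/*
 * Untested sketch.  The real filemap_fault() also does readahead,
 * retries and proper error handling; the only point here is where
 * cleancache_get_page() would slot in: after the page has been added
 * to the page cache (so page->mapping and page->index are valid) but
 * before any disk I/O is issued.
 */
static struct page *fault_page_via_cleancache(struct address_space *mapping,
					      pgoff_t offset)
{
	struct page *page = page_cache_alloc_cold(mapping);

	if (!page)
		return NULL;

	if (add_to_page_cache_lru(page, mapping, offset, GFP_KERNEL)) {
		page_cache_release(page);
		return NULL;		/* simplified: real code would retry */
	}

	if (cleancache_get_page(page) == 0) {
		/* hit: contents restored from cleancache, no ->readpage() */
		SetPageUptodate(page);
		unlock_page(page);
		return page;
	}

	/* miss: real code would hand this locked page to ->readpage() */
	unlock_page(page);
	page_cache_release(page);
	return NULL;
}

This keeps the hook in generic mm code, so btrfs (and any other filesystem
that does not go through fs/mpage.c) gets it without touching fs/ at all.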

With the above changes, cleancache will be fairly self-contained (a
provider-side sketch follows the list):
 - cleancache_put_page() when a page is removed from the page cache
 - cleancache_get_page() when a page fault occurs (after the page cache has
   been searched)
 - cleancache_flush_*() on truncate_*()
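
On the provider side (zcache, Xen tmem), dropping the explicit init call
just means resolving the super_block handle to a pool lazily. A minimal,
untested sketch of that lookup, with made-up cc_* names:

#include <linux/fs.h>
#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

/* Hypothetical provider-side mapping from a filesystem instance
 * (identified by its struct super_block pointer) to a per-FS pool.
 * The pool is created the first time a new handle is seen, so no
 * per-filesystem init hook is needed. */
struct cc_pool {
	struct list_head list;
	struct super_block *sb;		/* the 'handle' */
	/* provider-private pool state goes here */
};

static LIST_HEAD(cc_pools);
static DEFINE_SPINLOCK(cc_pools_lock);

static struct cc_pool *cc_find_or_create_pool(struct super_block *sb)
{
	struct cc_pool *pool;

	spin_lock(&cc_pools_lock);
	list_for_each_entry(pool, &cc_pools, list)
		if (pool->sb == sb)
			goto out;

	/* GFP_ATOMIC: we are holding cc_pools_lock */
	pool = kzalloc(sizeof(*pool), GFP_ATOMIC);
	if (pool) {
		pool->sb = sb;
		list_add(&pool->list, &cc_pools);
	}
out:
	spin_unlock(&cc_pools_lock);
	return pool;
}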

Thanks,
Nitin

