From: David Howells <dhowells@redhat.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: dhowells@redhat.com, Peter Staubach <staubach@redhat.com>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	nfsv4@linux-nfs.org, linux-kernel@vger.kernel.org
Subject: Re: How to manage shared persistent local caching (FS-Cache) with NFS?
Date: Sat, 08 Dec 2007 00:52:15 +0000
Message-ID: <16896.1197075135@redhat.com>
In-Reply-To: <DD7C0279-50CF-443B-B61B-D3DD78EE22C5@oracle.com>

Chuck Lever <chuck.lever@oracle.com> wrote:

> Why not encode the local mounted-on directory in the key?

Can't.  Namespaces.  chroot.

> Meaning your cache is at quota all the time, and to continue operation it must
> eject items constantly.

I've thought about that, thank you.  Go and read the documentation.  There's
configurable hysteresis in the culling algorithm.

> This is a scenario where it pays to cache the read-mostly items on disk, and
> leave the frequently changing items in memory.

Currently any file which is opened for writing is automatically ejected from
the cache.

> The economics of disk caches is different than memory caches.  Disk caches are
> much larger and cheaper, but their performance tanks when they have to track
> frequently changing files.  Memory caches are smaller, but tracking
> frequently changing data is only a little more expensive than tracking data
> that doesn't change often.

I'm aware of all that.  My OLS slides and paper can be found here:

	http://people.redhat.com/~dhowells/fscache/fscache-ols2006.odp
	http://people.redhat.com/~dhowells/fscache/FS-Cache.pdf

Lots of small files also hurt worse than fewer big files in some ways.  Lots
more metadata in the cache.  On the other hand, fragmentation is less of a
problem.

Anyway, this is straying off the main topic.

> I think it's key to preventing FS-cache from making performance worse in many
> common scenarios.

Perhaps.  The problem is that NFS doesn't know what the access pattern on a
file is expected to be.  I've been asked to provide fine-grained cache
controls (perhaps directory level), but Al Viro was, erm, lukewarm in his
reception to that idea.

Gathering statistical data dynamically has performance penalties of its own:-/

> Disconnected operation for NFS is fraught with challenges.  Access to data on
> servers is traditionally gated by the client's IP address, for example.  The
> client may disconnect from the network, then reconnect using a different
> address where suddenly all of its accesses are rebuffed.

Agreed, but isn't that one of the design goals for NFS4?

It's also something of interest to other netfs's that might want to use
FS-Cache.  This isn't an NFS-only facility.

> NFS servers, not clients, traditionally determine the file's mtime and ctime,
> and its file handle.  So file updates and file creation become problematic.
> The client has to reconcile the server's file handle, for files created
> offline, with its own when reconnecting.

Yes.  Basically it's a major can of major worms.  Doesn't stop people wanting
it, though.

> And, for disconnected operation, the cache is required to contain every item
> from the remote.  You can't just drop items from the cache because they are
> inconvenient.

Yes.  That's what pinning and reservations are for.

Currently, support for disconnected operation is something I'd like to have,
but it is otherwise mostly non-existent.

> That something might be the pathname of the mounted-on directory or of the
> file itself.

See above.

> Yes, they do.  The combination of mount options and mounted-on directory (or
> local pathname to the file) gives you a unique identity for that view.

See above.

> So an item is cached in memory until space becomes available in the disk
> cache?

The item isn't considered for caching until space becomes available in the
disk cache.  It's put on a queue for potential caching, but won't actually be
cached if it gets discarded from the icache or pagecache before being cached.
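
As a rough illustration of that queueing behaviour (a minimal sketch only, not
the FS-Cache code; the class and method names are hypothetical): items wait on
a queue, are dropped from it if memory discards them first, and are only
written out once the cache reports free space.

```python
from collections import OrderedDict


class PendingCacheQueue:
    """Hypothetical model of 'queued for potential caching'."""

    def __init__(self):
        self._pending = OrderedDict()  # key -> data, in arrival order
        self.cached = {}               # stands in for the on-disk cache

    def enqueue(self, key, data):
        # The item is only a candidate at this point; nothing is stored yet.
        self._pending[key] = data

    def evicted_from_memory(self, key):
        # Discarded from the icache/pagecache before being cached:
        # the candidate is simply forgotten.
        self._pending.pop(key, None)

    def space_available(self, slots):
        # Called when culling has freed room for `slots` more items.
        stored = []
        while slots > 0 and self._pending:
            key, data = self._pending.popitem(last=False)
            self.cached[key] = data
            stored.append(key)
            slots -= 1
        return stored
```

In this model an item that is enqueued and then evicted from memory before
space_available() runs never reaches the cache at all.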

It's unfortunate, but with a fast network you can download data faster than
you can make space in the cache.  unlink() and rmdir() are (a) slow and (b)
synchronous.  Each unlink() or rmdir() operation requires a task to perform
it, and that task is committed until the op finishes.

I could actually improve cachefilesd (the userspace cache culler) by giving it
multiple threads.

However, having cachefilesd doing lots of parallel, synchronous, journalled
disk ops hurts performance in other ways, I've noticed :-/

Again, hysteresis is available.  We stop writing stuff into the cache beyond
a limit until usage drops sufficiently below that limit that we've got a good
go at writing a load of new stuff, rather than just a block here and a block
there.
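
The general shape of that hysteresis can be sketched as below.  This is an
illustration under assumptions, not the cachefilesd algorithm: the class name,
field names and the two thresholds are hypothetical, and the real settings are
described in the cachefilesd documentation.

```python
from dataclasses import dataclass


@dataclass
class HysteresisGate:
    """Two-threshold gate on the fraction of free space in the cache."""

    stop_below: float    # stop writing once free space falls below this
    resume_above: float  # resume only once free space recovers above this
    writing: bool = True

    def __post_init__(self):
        if not 0.0 <= self.stop_below < self.resume_above <= 1.0:
            raise ValueError("need 0 <= stop_below < resume_above <= 1")

    def update(self, free_fraction: float) -> bool:
        """Feed in the current free fraction; return whether writes are allowed."""
        if self.writing and free_fraction < self.stop_below:
            self.writing = False
        elif not self.writing and free_fraction > self.resume_above:
            self.writing = True
        return self.writing
```

Between the two thresholds the gate keeps whatever state it was last in, so
writes resume in a worthwhile batch rather than flapping on and off around a
single limit.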

It's all very icky, and depends as much on the filesystem underlying the cache
(ext3 for example) and *its* configuration, as the characteristics of the
netfs and the network link.  It's all about compromise.

David
