From: Jeff Layton <jlayton@redhat.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: bfields@fieldses.org, linux-nfs@vger.kernel.org
Subject: Re: [PATCH RFC] nfsd: report length of the largest hash chain in reply cache stats
Date: Fri, 15 Feb 2013 17:20:58 -0500
Message-ID: <20130215172058.29941a54@tlielax.poochiereds.net>
In-Reply-To: <299C8DF9-5BFC-4E26-8F7E-CE3415D1140F@oracle.com>

On Fri, 15 Feb 2013 16:14:56 -0500
Chuck Lever <chuck.lever@oracle.com> wrote:

> 
> On Feb 15, 2013, at 3:04 PM, Jeff Layton <jlayton@redhat.com> wrote:
> 
> > So we can get a feel for how effective the hashing function is.
> > 
> > As Chuck Lever pointed out to me, it's generally acceptable to do
> > "expensive" stuff when reading the stats since that's a relatively
> > rare activity.
> 
> A good measure of the efficacy of a hash function is the ratio of the maximum chain length to the optimal chain length (which can be computed by dividing the total number of cache entries by the number of hash chains).
> 

Right, the number of chains is always 64 for now (maybe we should print
that out in the stats too), so you can compute that from the values
provided here.
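
As a rough illustration of that ratio, here is a minimal userspace sketch. The function and parameter names are hypothetical and are not part of the patch; the only value taken from this thread is the chain count of 64.

```c
/* Assumption: the table has 64 chains, as stated in this thread. */
#define HASH_CHAINS 64u

/*
 * Ratio of the longest chain to the optimal chain length, where the
 * optimal length is the entry count divided by the number of chains.
 * A result near 1.0 means the entries are spread evenly; larger values
 * mean at least one chain is much longer than it should be.
 * Returns 0.0 when the cache is empty or there are no chains.
 */
double chain_length_ratio(unsigned int max_chain_len,
                          unsigned int num_entries,
                          unsigned int num_chains)
{
    double optimal;

    if (num_entries == 0 || num_chains == 0)
        return 0.0;

    optimal = (double)num_entries / num_chains;
    return max_chain_len / optimal;
}
```

With 1024 entries spread over HASH_CHAINS chains the optimal length is 16, so a longest chain of 32 gives a ratio of 2.0.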

> If we plan to stick with a hash table for this cache, there should be some indication when the hash function falls over.  This will matter because the DRC can now grow much larger, which is turning out to be the real fundamental change with this work.
> 

That's the kicker. With the patch below, computing the max chain length
on the fly is somewhat expensive since you have to walk every entry.
It's certainly possible though (even likely) that the real max length
will occur at some point when we're not looking at this file. So how to
best gauge that?
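
To make the cost concrete, the walk looks roughly like the sketch below. This is a userspace approximation with a hypothetical `struct entry` and a plain singly linked chain, not the actual nfsd types or the code in the patch, and it ignores locking entirely.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cache entry; not the nfsd structure. */
struct entry {
    struct entry *next;
};

/*
 * Walk every chain in the table and return the length of the longest
 * one. The cost is proportional to the total number of entries, and the
 * result only reflects the state of the table at the time of the walk.
 */
unsigned int max_chain_length(struct entry *const *table, size_t num_chains)
{
    unsigned int longest = 0;
    size_t i;

    for (i = 0; i < num_chains; i++) {
        const struct entry *e;
        unsigned int len = 0;

        for (e = table[i]; e != NULL; e = e->next)
            len++;
        if (len > longest)
            longest = len;
    }
    return longest;
}
```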

Maybe we should just punt and move it all to an rbtree or something. A
self-balancing structure is nice and simple to deal with, even if the
insertion penalty is a bit higher...

> A philosophical question though is "How can we know when the DRC is large enough?"
> 

An excellent question, and not an easy one to answer. Clearly 1024
entries was not enough. We now cap the size as a function of the
available low memory, which I think is a reasonable way to keep it from
ballooning so large that the box falls over. We also have a shrinker
and periodic cache cleaner to prune off entries that have expired.

Of course one thing I haven't really considered enough is the
performance implications of walking the potentially much longer hash
chains here.

If that is a problem, then one way to counter that without moving to a
different structure altogether might be to alter the hash function
based on the max size of the cache. IOW, grow the number of hash buckets
as the max cache size grows?
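
One way this could look, purely as a sketch: pick the bucket count from the maximum cache size so that a full cache averages a fixed number of entries per chain. The target of 16 is an assumption (it is what 1024 entries over 64 chains works out to), and the function name is hypothetical.

```c
#include <limits.h>

/* Assumption: aim for about this many entries per chain at full size. */
#define TARGET_CHAIN_LENGTH 16u
/* Never go below the current fixed table size of 64 chains. */
#define MIN_HASH_CHAINS 64u

/*
 * Choose a number of hash chains for a cache capped at max_entries.
 * The result is a power of two, so a hash value can be reduced to a
 * bucket index with a mask, and it is never smaller than
 * MIN_HASH_CHAINS.
 */
unsigned int hash_chains_for(unsigned int max_entries)
{
    unsigned int wanted = max_entries / TARGET_CHAIN_LENGTH;
    unsigned int chains = MIN_HASH_CHAINS;

    /* Double until there are enough chains; guard against overflow. */
    while (chains < wanted && chains <= UINT_MAX / 2)
        chains <<= 1;

    return chains;
}
```

Under these assumptions a cap of 1024 entries keeps the table at 64 chains, while a cap of 16384 entries gives 1024 chains.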

-- 
Jeff Layton <jlayton@redhat.com>


Thread overview: 17+ messages
2013-02-15 17:23 [PATCH v2 0/3] nfsd: better stats gathering for DRC Jeff Layton
2013-02-15 17:23 ` [PATCH v2 1/3] nfsd: break out comparator into separate function Jeff Layton
2013-02-15 17:23 ` [PATCH v2 2/3] nfsd: track memory utilization in the DRC Jeff Layton
2013-02-15 17:23 ` [PATCH v2 3/3] nfsd: add new reply_cache_stats file in nfsdfs Jeff Layton
2013-02-15 17:29   ` Chuck Lever
2013-02-15 18:34     ` Jeff Layton
2013-02-15 20:04       ` [PATCH RFC] nfsd: report length of the largest hash chain in reply cache stats Jeff Layton
2013-02-15 21:14         ` Chuck Lever
2013-02-15 22:20           ` Jeff Layton [this message]
2013-02-16 13:39             ` J. Bruce Fields
2013-02-16 17:18               ` Chuck Lever
2013-02-17 16:00                 ` J. Bruce Fields
2013-02-17 19:58                   ` Chuck Lever
2013-02-18 14:21                   ` Jeff Layton
2013-02-18 14:30                     ` J. Bruce Fields
2013-02-18 14:39                 ` Jeff Layton
2013-02-18 16:18                   ` Chuck Lever
