linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Andreas Dilger <adilger@dilger.ca>
Cc: Waiman Long <longman@redhat.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Jonathan Corbet <corbet@lwn.net>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Iurii Zaikin <yzaikin@google.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	linux-doc@vger.kernel.org,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	Eric Biggers <ebiggers@google.com>,
	Dave Chinner <david@fromorbit.com>,
	Eric Sandeen <sandeen@redhat.com>
Subject: Re: [PATCH 00/11] fs/dcache: Limit # of negative dentries
Date: Wed, 26 Feb 2020 13:45:07 -0800	[thread overview]
Message-ID: <20200226214507.GE24185@bombadil.infradead.org> (raw)
In-Reply-To: <2EDB6FFC-C649-4C80-999B-945678F5CE87@dilger.ca>

On Wed, Feb 26, 2020 at 02:28:50PM -0700, Andreas Dilger wrote:
> On Feb 26, 2020, at 9:29 AM, Matthew Wilcox <willy@infradead.org> wrote:
> > This is always the wrong approach.  A sysctl is just a way of blaming
> > the sysadmin for us not being very good at programming.
> > 
> > I agree that we need a way to limit the number of negative dentries.
> > But that limit needs to be dynamic and depend on how the system is being
> > used, not on how some overworked sysadmin has configured it.
> > 
> > So we need an initial estimate for the number of negative dentries that
> > we need for good performance.  Maybe it's 1000.  It doesn't really matter;
> > it's going to change dynamically.
> > 
> > Then we need a metric to let us know whether it needs to be increased.
> > Perhaps that's "number of new negative dentries created in the last
> > second".  And we need to decide how much to increase it; maybe it's by
> > 50% or maybe by 10%.  Perhaps somewhere between 10-100% depending on
> > how high the recent rate of negative dentry creation has been.
> > 
> > We also need a metric to let us know whether it needs to be decreased.
> > I'm reluctant to say that memory pressure should be that metric because
> > very large systems can let the number of dentries grow in an unbounded
> > way.  Perhaps that metric is "number of hits in the negative dentry
> > cache in the last ten seconds".  Again, we'll need to decide how much
> > to shrink the target number by.
> 
> OK, so now instead of a single tunable parameter we need three, because
> these numbers are totally made up and nobody knows the right values. :-)
> Defaulting the limit to "disabled/no limit" also has the problem that
> 99.99% of users won't even know this tunable exists, let alone how to
> set it correctly, so they will continue to see these problems, and the
> code may as well not exist (i.e. pure overhead), while Waiman has a
> better idea today of what would be reasonable defaults.

I never said "no limit".  I just said to start at some fairly random
value and not worry about where you start because it'll correct to where
this system needs it to be.  As long as it converges like loadavg does,
it'll be fine.  It needs a fairly large "don't change the target" area,
and it needs to react quickly to real changes in a system's workload.

> I definitely agree that a single fixed value will be wrong for every
> system except the original developer's.  Making the maximum default to
> some reasonable fraction of the system size, rather than a fixed value,
> is probably best to start.  Something like this as a starting point:
> 
> 	/* Allow a reasonable minimum number of negative entries,
> 	 * but proportionately more if the directory/dcache is large.
> 	 */
> 	dir_negative_max = max(num_dir_entries / 16, 1024);
>         total_negative_max = max(totalram_pages / 32, total_dentries / 8);

Those kinds of things are garbage on large machines.  With a terabyte
of RAM, you can end up with tens of millions of dentries clogging up
the system.  There _is_ an upper limit on the useful number of dentries
to keep around.

> (Waiman should decide actual values based on where the problem was hit
> previously), and include tunables to change the limits for testing.
> 
> Ideally there would also be a dir ioctl that allows fetching the current
> positive/negative entry count on a directory (e.g. /usr/bin, /usr/lib64,
> /usr/share/man/man*) to see what these values are.  Otherwise there is
> no way to determine whether the limits used are any good or not.

It definitely needs to be instrumented for testing, but no, not ioctls.
tracepoints, perhaps.

> Dynamic limits are hard to get right, and incorrect state machines can lead
> to wild swings in behaviour due to unexpected feedback.  It isn't clear to
> me that adjusting the limit based on the current rate of negative dentry
> creation even makes sense.  If there are a lot of negative entries being
> created, that is when you'd want to _stop_ allowing more to be added.

That doesn't make sense.  What you really want to know is "If my dcache
had twice as many entries in it, would that significantly reduce the
thrash of new entries being created".  In the page cache, we end up
with a double LRU where once-used entries fall off the list quickly
but twice-or-more used entries get to stay around for a bit longer.
Maybe we could do something like that; keep a victim cache for recently
evicted dentries, and if we get a large hit rate in the victim cache,
expand the size of the primary cache.



  reply	other threads:[~2020-02-26 21:45 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-26 16:13 [PATCH 00/11] fs/dcache: Limit # of negative dentries Waiman Long
2020-02-26 16:13 ` [PATCH 01/11] fs/dcache: Fix incorrect accounting " Waiman Long
2020-02-26 16:13 ` [PATCH 02/11] fs/dcache: Simplify __dentry_kill() Waiman Long
2020-02-26 16:13 ` [PATCH 03/11] fs/dcache: Add a counter to track number of children Waiman Long
2020-02-26 16:13 ` [PATCH 04/11] fs/dcache: Add sysctl parameter dentry-dir-max Waiman Long
2020-02-26 16:13 ` [PATCH 05/11] fs/dcache: Reclaim excessive negative dentries in directories Waiman Long
2020-02-26 16:13 ` [PATCH 06/11] fs/dcache: directory opportunistically stores # of positive dentries Waiman Long
2020-02-26 16:14 ` [PATCH 07/11] fs/dcache: Add static key negative_reclaim_enable Waiman Long
2020-02-26 16:14 ` [PATCH 08/11] fs/dcache: Limit dentry reclaim count in negative_reclaim_workfn() Waiman Long
2020-02-26 16:14 ` [PATCH 09/11] fs/dcache: Don't allow small values for dentry-dir-max Waiman Long
2020-02-26 16:14 ` [PATCH 10/11] fs/dcache: Kill off dentry as last resort Waiman Long
2020-02-26 16:14 ` [PATCH 11/11] fs/dcache: Track # of negative dentries reclaimed & killed Waiman Long
2020-02-26 16:29 ` [PATCH 00/11] fs/dcache: Limit # of negative dentries Matthew Wilcox
2020-02-26 19:19   ` Waiman Long
2020-02-26 21:28     ` Matthew Wilcox
2020-02-26 21:28   ` Andreas Dilger
2020-02-26 21:45     ` Matthew Wilcox [this message]
2020-02-27  8:07       ` Dave Chinner
2020-02-27  9:55     ` Ian Kent
2020-02-28  3:34       ` Matthew Wilcox
2020-02-28  4:16         ` Ian Kent
2020-02-28  4:36           ` Ian Kent
2020-02-28  4:52             ` Al Viro
2020-02-28  4:22         ` Al Viro
2020-02-28  4:52           ` Ian Kent
2020-02-28 15:32         ` Waiman Long
2020-02-28 15:39           ` Matthew Wilcox
2020-02-28 19:32         ` Theodore Y. Ts'o
2020-02-27 19:04   ` Eric Sandeen
2020-02-27 22:39     ` Dave Chinner
2020-02-27  8:30 ` Dave Chinner
2020-02-28 15:47   ` Waiman Long
2020-03-15  3:46 ` Matthew Wilcox
2020-03-21 10:17   ` Konstantin Khlebnikov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200226214507.GE24185@bombadil.infradead.org \
    --to=willy@infradead.org \
    --cc=adilger@dilger.ca \
    --cc=corbet@lwn.net \
    --cc=david@fromorbit.com \
    --cc=ebiggers@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mcgrof@kernel.org \
    --cc=mchehab+samsung@kernel.org \
    --cc=sandeen@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).