linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stephen Brennan <stephen.s.brennan@oracle.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>,
	Matthew Wilcox <willy@infradead.org>,
	Colin Walters <walters@verbum.org>
Cc: Dave Chinner <david@fromorbit.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org,
	Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>,
	khlebnikov@yandex-team.ru
Subject: Re: [LSF/MM TOPIC] Better handling of negative dentries
Date: Thu, 31 Mar 2022 12:27:53 -0700	[thread overview]
Message-ID: <87o81l6hxi.fsf@stepbren-lnx.us.oracle.com> (raw)
In-Reply-To: <f849f7f981ef76b30b4d91457752b3740b1f6d51.camel@HansenPartnership.com>

James Bottomley <James.Bottomley@HansenPartnership.com> writes:
> On Tue, 2022-03-22 at 14:08 -0700, Stephen Brennan wrote:
[snip]
>> If we're looking at issues like [1], then the amount needs to be on a
>> per-directory basis, and maybe roughly based on CPU speed. For other
>> priorities or failure modes, then the policy would need to be
>> completely different. Ideally a solution could work for almost all
>> scenarios, but failing that, maybe it is worth allowing policy to be
>> set by administrators via sysctl or even a BPF?
>
> Looking at [1], you're really trying to contain the parent's child list
> from exploding with negative dentries.  Looking through the patch, it
> still strikes me that dentry_kill/retain_dentry is still a better
> place, because if a negative dentry comes back there, it's unlikely to
> become positive (well, fstat followed by create would be the counter
> example, but it would partly be the app's fault for not doing
> open(O_CREAT)).

I actually like the idea of doing the pruning during d_alloc().
Basically, if you're creating dentries, you should also be working on
the cache management for them.

> If we have the signal for reuse of negative dentry from the cache,
> which would be a fast lookup, we know a newly created negative dentry
> already had a slow lookup, so we can do more processing without
> necessarily slowing down the workload too much.  In particular, we
> could just iterate over the parent's children of this negative dentry
> and start pruning if there are too many (too many being a relative
> term, but I think something like 2x-10x the max positive entries
> wouldn't be such a bad heuristic).

I agree that, on a per-directory basis, 2-10x feels right, though maybe
there needs to be some leeway for empty directories?

Per-directory pruning also makes sense from a concurrency standpoint:
the LRU locks could become a source of contention.

> Assuming we don't allow the
> parent's list to contain too many negative dentries, we might not need
> the sweep negative logic because the list wouldn't be allowed to grow
> overly large.

Seconded, I have no desire to actually try to get that sweep negative
logic merged if we can do a better job handling the dentries in the
first place.

> I think a second heuristic would be prune at least two
> negative dentries from the end of the sb lru list if they've never been
> used for a lookup and were created more than a specified time ago
> (problem is what, but I bet a minute wouldn't be unreasonable).
>
> Obviously, while I think it would work for some of the negative dentry
> induced issues, the above is very heuristic in tuning and won't help
> with any of the other object issues in filesystems.  But on the other
> hand, negative dentries are special in that if they're never used to
> cache an -ENOENT and they never go positive, they're just a waste of
> space.

I took a preliminary stab at some of these ideas in this series:

https://lore.kernel.org/linux-fsdevel/20220331190827.48241-1-stephen.s.brennan@oracle.com/

It doesn't handle pruning from the sb LRU list, nor does it have a good
way to decide which dentry to kill. But it's pretty stable and simple,
and I value that part :) . It still needs some work but I'd welcome
feedback from folks interested in this discussion.

Stephen

>
> James

  reply	other threads:[~2022-03-31 19:28 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-15 19:55 [LSF/MM TOPIC] Better handling of negative dentries Matthew Wilcox
2022-03-15 20:56 ` Roman Gushchin
2022-03-16  2:07   ` Gao Xiang
2022-03-16  2:52     ` Dave Chinner
2022-03-16  3:08       ` Gao Xiang
2022-03-22 15:08       ` Matthew Wilcox
2022-03-22 19:19         ` James Bottomley
2022-03-22 20:17           ` Colin Walters
2022-03-22 20:27             ` James Bottomley
2022-03-22 20:37             ` Matthew Wilcox
2022-03-22 21:08               ` Stephen Brennan
2022-03-29 15:24                 ` James Bottomley
2022-03-31 19:27                   ` Stephen Brennan [this message]
2022-04-01 15:45                     ` James Bottomley
2022-03-22 22:21         ` Dave Chinner
2022-03-22 20:41   ` Matthew Wilcox
2022-03-22 21:19     ` Roman Gushchin
2022-03-22 22:29       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o81l6hxi.fsf@stepbren-lnx.us.oracle.com \
    --to=stephen.s.brennan@oracle.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=david@fromorbit.com \
    --cc=gautham.ananthakrishna@oracle.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lsf-pc@lists.linux-foundation.org \
    --cc=roman.gushchin@linux.dev \
    --cc=walters@verbum.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).