From: Waiman Long <longman@redhat.com>
To: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH RFC 8/8] dcache: prevent flooding with negative dentries
Date: Fri, 8 May 2020 17:07:55 -0400
Message-ID: <b3e29b86-7231-fcd1-3dbf-224bb82b079f@redhat.com>
In-Reply-To: <158894061332.200862.9812452563558764287.stgit@buzz>

On 5/8/20 8:23 AM, Konstantin Khlebnikov wrote:
> Without memory pressure, the count of negative dentries isn't bounded.
> They could consume all memory and drain all other inactive caches.
>
> A typical scenario is an idle system where some process periodically creates
> temporary files and removes them. After some time, memory will be filled
> with negative dentries for these random file names. Reclaiming them takes
> some time because slab frees pages only when all related objects are gone.
> The time of a dentry lookup is usually unaffected because the hash table grows
> along with the size of memory, unless somebody deliberately crafts hash
> collisions. Simple lookups of random names also generate negative dentries
> very fast.
>
> This patch implements a heuristic which detects such scenarios and prevents
> unbounded growth of completely unneeded negative dentries. It keeps up to
> the three latest negative dentries in each bucket unless they were referenced.
>
> At the first dput of a negative dentry, when it is swept to the tail of its
> siblings, we also clear its reference flag and look at the next dentries in
> the chain, then kill the third in a series of negative, unused and
> unreferenced dentries.
>
> This way each hash bucket will preserve three negative dentries to let them
> get referenced and survive. Adding a positive or used dentry into the hash
> chain also protects a few recent negative dentries. As a result, the total size
> of the dcache is asymptotically limited by the count of buckets and of positive
> or used dentries.
>
> Before the patch, the tool 'dcache_stress' could fill the entire memory with dentries.
>
> nr_dentry = 104913261   104.9M
> nr_buckets = 8388608    12.5 avg
> nr_unused = 104898729   100.0%
> nr_negative = 104883218 100.0%
>
> After this patch, the count of dentries saturates at around 3 per bucket:
>
> nr_dentry = 24619259    24.6M
> nr_buckets = 8388608    2.9 avg
> nr_unused = 24605226    99.9%
> nr_negative = 24600351  99.9%
>
> This heuristic isn't bulletproof and solves only the most practical case.
> It's easy to deceive: just touch the same random name twice.
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> ---
>   fs/dcache.c |   54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 54 insertions(+)
>
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 60158065891e..9f3d331b4978 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -632,6 +632,58 @@ static inline struct dentry *lock_parent(struct dentry *dentry)
>   	return __lock_parent(dentry);
>   }
>   
> +/*
> + * Called at first dput of each negative dentry.
> + * Prevents filling cache with never reused negative dentries.
> + *
> + * This clears reference and then looks at following dentries in hash chain.
> + * If they are negative, unused and unreferenced then keep two and kill third.
> + */
> +static void trim_negative(struct dentry *dentry)
> +	__releases(dentry->d_lock)
> +{
> +	struct dentry *victim, *parent;
> +	struct hlist_bl_node *next;
> +	int keep = 2;
> +
> +	rcu_read_lock();
> +
> +	dentry->d_flags &= ~DCACHE_REFERENCED;
> +	spin_unlock(&dentry->d_lock);
> +
> +	next = rcu_dereference_raw(dentry->d_hash.next);
> +	while (1) {
> +		victim = hlist_bl_entry(next, struct dentry, d_hash);
> +
> +		if (!next || d_count(victim) || !d_is_negative(victim) ||
> +		    (victim->d_flags & DCACHE_REFERENCED)) {
> +			rcu_read_unlock();
> +			return;
> +		}
> +
> +		if (!keep--)
> +			break;
> +
> +		next = rcu_dereference_raw(next->next);
> +	}
> +
> +	spin_lock(&victim->d_lock);
> +	parent = lock_parent(victim);
> +
> +	rcu_read_unlock();
> +
> +	if (d_count(victim) || !d_is_negative(victim) ||
> +	    (victim->d_flags & DCACHE_REFERENCED)) {
> +		if (parent)
> +			spin_unlock(&parent->d_lock);
> +		spin_unlock(&victim->d_lock);
> +		return;
> +	}
> +
> +	__dentry_kill(victim);
> +	dput(parent);
> +}

Since you are picking a victim from the hash list, I think it is better
to kill it only if it is already on the LRU list. Otherwise, it could
be in the process of being instantiated or in the middle of some other
operation.
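
To make the LRU suggestion concrete, here is a minimal userspace sketch, not
kernel code and not a proposed replacement for the patch. The toy_* names and
TOY_* flag values are hypothetical; they only mirror the roles that
DCACHE_REFERENCED and DCACHE_LRU_LIST play in the kernel, and all locking and
RCU handling is left out. It walks a chain the same way the quoted
trim_negative() does and, when require_lru is set, declines a candidate that
is not on the LRU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this model; not the kernel's values. */
#define TOY_REFERENCED 0x1u
#define TOY_LRU_LIST   0x2u

struct toy_dentry {
	struct toy_dentry *next;	/* next entry in the same hash chain */
	unsigned int flags;
	unsigned int count;		/* reference count, 0 means unused */
	bool negative;			/* true when no inode is attached */
};

/* True if the entry is negative, unused and unreferenced. */
static bool toy_is_cold_negative(const struct toy_dentry *d)
{
	return d->negative && d->count == 0 && !(d->flags & TOY_REFERENCED);
}

/*
 * Clear the reference flag on 'start', then walk the entries after it.
 * Give up at the first entry that is not a cold negative; otherwise skip
 * 'keep' entries and pick the next one as the candidate. With require_lru
 * set, a candidate that is not on the LRU is rejected (assumed reading of
 * the suggestion above). Returns the candidate or NULL.
 */
static struct toy_dentry *toy_pick_victim(struct toy_dentry *start, int keep,
					  bool require_lru)
{
	struct toy_dentry *d;

	assert(start != NULL && keep >= 0);
	start->flags &= ~TOY_REFERENCED;

	for (d = start->next; d != NULL; d = d->next) {
		if (!toy_is_cold_negative(d))
			return NULL;
		if (keep-- == 0)
			break;
	}

	if (d == NULL)
		return NULL;
	if (require_lru && !(d->flags & TOY_LRU_LIST))
		return NULL;
	return d;
}
```

With keep = 2, as in the patch, the candidate is the third cold negative entry
after the one being released. The only behavioral difference with require_lru
is that such a candidate is left alone until it carries the LRU flag.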

Besides, I feel a bit uneasy about picking a random negative dentry to 
kill like that.

Cheers,
Longman