linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org
Cc: viro@zeniv.linux.org.uk, matthew.wilcox@oracle.com,
	khlebnikov@yandex-team.ru, gautham.ananthakrishna@oracle.com
Subject: [PATCH RFC 6/6] dcache: prevent flooding with negative dentries
Date: Thu, 21 Jan 2021 18:49:45 +0530	[thread overview]
Message-ID: <1611235185-1685-7-git-send-email-gautham.ananthakrishna@oracle.com> (raw)
In-Reply-To: <1611235185-1685-1-git-send-email-gautham.ananthakrishna@oracle.com>

From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

Without memory pressure count of negative dentries isn't bounded.
They could consume all memory and drain all other inactive caches.

Typical scenario is an idle system where some process periodically creates
temporary files and removes them. After some time, memory will be filled
with negative dentries for these random file names. Reclaiming them took
some time because slab frees pages only when all related objects are gone.
Time of dentry lookup is usually unaffected because hash table grows along
with size of memory. Unless somebody especially crafts hash collisions.
Simple lookup of random names also generates negative dentries very fast.

This patch implements heuristic which detects such scenarios and prevents
unbounded growth of completely unneeded negative dentries. It keeps up to
three latest negative dentry in each bucket unless they were referenced.

At first dput of negative dentry when it swept to the tail of siblings
we'll also clear it's reference flag and look at next dentries in chain.
Then kill third in series of negative, unused and unreferenced denries.

This way each hash bucket will preserve three negative dentry to let them
get reference and survive. Adding positive or used dentry into hash chain
also protects few recent negative dentries. In result total size of dcache
asymptotically limited by count of buckets and positive or used dentries.

Before patch: tool 'dcache_stress' could fill entire memory with dentries.

nr_dentry = 104913261   104.9M
nr_buckets = 8388608    12.5 avg
nr_unused = 104898729   100.0%
nr_negative = 104883218 100.0%

After this patch count of dentries saturates at around 3 per bucket:

nr_dentry = 24619259    24.6M
nr_buckets = 8388608    2.9 avg
nr_unused = 24605226    99.9%
nr_negative = 24600351  99.9%

This heuristic isn't bulletproof and solves only most practical case.
It's easy to deceive: just touch same random name twice.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
---
 fs/dcache.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/fs/dcache.c b/fs/dcache.c
index 22c990b..6281938 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -633,6 +633,58 @@ static inline struct dentry *lock_parent(struct dentry *dentry)
 }
 
 /*
+ * Called at first dput of each negative dentry.
+ * Prevents filling cache with never reused negative dentries.
+ *
+ * This clears reference and then looks at following dentries in hash chain.
+ * If they are negative, unused and unreferenced then keep two and kill third.
+ */
+static void trim_negative(struct dentry *dentry)
+	__releases(dentry->d_lock)
+{
+	struct dentry *victim, *parent;
+	struct hlist_bl_node *next;
+	int keep = 2;
+
+	rcu_read_lock();
+
+	dentry->d_flags &= ~DCACHE_REFERENCED;
+	spin_unlock(&dentry->d_lock);
+
+	next = rcu_dereference_raw(dentry->d_hash.next);
+	while (1) {
+		victim = hlist_bl_entry(next, struct dentry, d_hash);
+
+		if (!next || d_count(victim) || !d_is_negative(victim) ||
+		    (victim->d_flags & DCACHE_REFERENCED)) {
+			rcu_read_unlock();
+			return;
+		}
+
+		if (!keep--)
+			break;
+
+		next = rcu_dereference_raw(next->next);
+	}
+
+	spin_lock(&victim->d_lock);
+	parent = lock_parent(victim);
+
+	rcu_read_unlock();
+
+	if (d_count(victim) || !d_is_negative(victim) ||
+	    (victim->d_flags & DCACHE_REFERENCED)) {
+		if (parent)
+			spin_unlock(&parent->d_lock);
+		spin_unlock(&victim->d_lock);
+		return;
+	}
+
+	__dentry_kill(victim);
+	dput(parent);
+}
+
+/*
  * Move cached negative dentry to the tail of parent->d_subdirs.
  * This lets walkers skip them all together at first sight.
  * Must be called at dput of negative dentry.
@@ -654,6 +706,8 @@ static void sweep_negative(struct dentry *dentry)
 		}
 
 		spin_unlock(&parent->d_lock);
+
+		return trim_negative(dentry);
 	}
 out:
 	spin_unlock(&dentry->d_lock);
-- 
1.8.3.1


  parent reply	other threads:[~2021-01-21 15:30 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-21 13:19 [PATCH RFC 0/6] fix the negative dentres bloating system memory usage Gautham Ananthakrishna
2021-01-21 13:19 ` [PATCH RFC 1/6] dcache: sweep cached negative dentries to the end of list of siblings Gautham Ananthakrishna
2021-04-14  3:00   ` Al Viro
2021-04-15 16:50     ` Al Viro
2021-04-14  3:41   ` Al Viro
2021-04-15 16:25     ` Al Viro
2021-01-21 13:19 ` [PATCH RFC 2/6] fsnotify: stop walking child dentries if remaining tail is negative Gautham Ananthakrishna
2021-01-21 13:19 ` [PATCH RFC 3/6] dcache: add action D_WALK_SKIP_SIBLINGS to d_walk() Gautham Ananthakrishna
2021-01-21 13:19 ` [PATCH RFC 4/6] dcache: stop walking siblings if remaining dentries all negative Gautham Ananthakrishna
2021-01-21 13:19 ` [PATCH RFC 5/6] dcache: push releasing dentry lock into sweep_negative Gautham Ananthakrishna
2021-01-21 13:19 ` Gautham Ananthakrishna [this message]
2021-04-14  3:56   ` [PATCH RFC 6/6] dcache: prevent flooding with negative dentries Al Viro
2021-03-31 14:23 ` [PATCH RFC 0/6] fix the negative dentres bloating system memory usage Matthew Wilcox
2021-04-14  2:40 ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1611235185-1685-7-git-send-email-gautham.ananthakrishna@oracle.com \
    --to=gautham.ananthakrishna@oracle.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.wilcox@oracle.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).