From: Matthew Wilcox <willy@infradead.org>
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: Andres Freund <andres@anarazel.de>
Subject: [RFC] page cache drop-behind
Date: Tue, 9 Jun 2020 19:39:39 -0700
Message-ID: <20200610023939.GI19604@bombadil.infradead.org>


Andres reported a problem recently where reading a file several times
the size of memory causes intermittent stalls.  My suspicion is that
page allocation eventually runs into the low watermark and starts to
do reclaim.  Some shrinkers take a long time to run and have a low chance
of actually freeing a page (eg, for the dentry cache to free a pair of
pages, all 21 dentries that happen to sit on those two pages must be freed).
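
For reference, the workload is essentially just a sequential read of a
file much larger than RAM; a minimal reproducer sketch (the file path and
buffer size here are made up) looks something like:

	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		static char buf[1 << 20];	/* 1MB read buffer */
		int fd = open("/path/to/file-larger-than-ram", O_RDONLY);
		ssize_t n;

		if (fd < 0) {
			perror("open");
			return 1;
		}
		/* Stream the whole file; the data itself is discarded. */
		while ((n = read(fd, buf, sizeof(buf))) > 0)
			;
		close(fd);
		return 0;
	}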

This patch attempts to free pages belonging to the file we're currently
reading when no free pages are readily available.  If that doesn't work,
we fall back to running all the shrinkers just as we did before.

This should solve Andres' problem, although it's a bit narrow in scope.
It might be better to look through the inactive page list, regardless of
which file the pages were allocated for.  That could also solve the
"weekly backup" problem, where lots of little files are read once and
push everything else out of the cache.

I'm not really set up to do performance testing at the moment, so this
is just me thinking hard about the problem.

diff --git a/mm/readahead.c b/mm/readahead.c
index 3c9a8dd7c56c..3531e1808e24 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -111,9 +111,24 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 	}
 	return ret;
 }
-
 EXPORT_SYMBOL(read_cache_pages);
 
+/*
+ * Attempt to detect a streaming workload which exceeds memory and
+ * handle it by dropping the page cache behind the active part of the
+ * file.
+ */
+static void discard_behind(struct file *file, struct address_space *mapping)
+{
+	unsigned long keep = file->f_ra.ra_pages * 2;
+
+	if (mapping->nrpages < 1000)
+		return;
+	if (file->f_ra.start < keep)
+		return;
+	invalidate_mapping_pages(mapping, 0, file->f_ra.start - keep);
+}
+
 static void read_pages(struct readahead_control *rac, struct list_head *pages,
 		bool skip_page)
 {
@@ -179,6 +194,7 @@ void page_cache_readahead_unbounded(struct address_space *mapping,
 {
 	LIST_HEAD(page_pool);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
+	gfp_t light_gfp = gfp_mask & ~__GFP_DIRECT_RECLAIM;
 	struct readahead_control rac = {
 		.mapping = mapping,
 		.file = file,
@@ -219,7 +235,11 @@ void page_cache_readahead_unbounded(struct address_space *mapping,
 			continue;
 		}
 
-		page = __page_cache_alloc(gfp_mask);
+		page = __page_cache_alloc(light_gfp);
+		if (!page) {
+			discard_behind(file, mapping);
+			page = __page_cache_alloc(gfp_mask);
+		}
 		if (!page)
 			break;
 		if (mapping->a_ops->readpages) {

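For comparison, an application can approximate this behaviour from
userspace today by dropping the cache behind its own read position with
posix_fadvise(POSIX_FADV_DONTNEED).  A rough sketch (the window size is
chosen arbitrarily):

	#define _POSIX_C_SOURCE 200112L
	#include <fcntl.h>
	#include <unistd.h>

	#define DROP_WINDOW	(8UL << 20)	/* drop in 8MB chunks */

	static void stream_and_drop(int fd)
	{
		static char buf[1 << 16];
		off_t pos = 0, dropped = 0;
		ssize_t n;

		while ((n = read(fd, buf, sizeof(buf))) > 0) {
			pos += n;
			/*
			 * Once we're a full window past what we've already
			 * dropped, tell the kernel it may toss those pages.
			 */
			while (pos - dropped >= (off_t)(2 * DROP_WINDOW)) {
				posix_fadvise(fd, dropped, DROP_WINDOW,
					      POSIX_FADV_DONTNEED);
				dropped += DROP_WINDOW;
			}
		}
	}

The patch above does much the same thing in the kernel, but only when a
page allocation would otherwise have to enter reclaim.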