* [RFC] page cache drop-behind
From: Matthew Wilcox @ 2020-06-10  2:39 UTC
  To: linux-mm, linux-fsdevel; +Cc: Andres Freund


Andres reported a problem recently where reading a file several times
the size of memory causes intermittent stalls.  My suspicion is that
page allocation eventually runs into the low watermark and starts to
do reclaim.  Some shrinkers take a long time to run and have a low chance
of actually freeing a page (eg the dentry cache needs to free 21 dentries
which all happen to be on the same pair of pages to free those two pages).

This patch attempts to free pages from the file that we're currently
reading from if there are no pages readily available.  If that doesn't
work, we'll run all the shrinkers just as we did before.
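
(To put numbers on that: assuming the common 128KB default readahead
window, ra_pages is 32, so we keep the last 64 pages (256KB) behind the
current readahead start; once the file has at least 1000 pages in the
cache (about 4MB with 4KB pages), everything before that point gets
invalidated.)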

This should solve Andres' problem, although it's a bit narrow in scope.
It might be better to look through the inactive page list, regardless of
which file they were allocated for.  That could solve the "weekly backup"
problem with lots of little files.

I'm not really set up to do performance testing at the moment, so this
is just me thinking hard about the problem.

diff --git a/mm/readahead.c b/mm/readahead.c
index 3c9a8dd7c56c..3531e1808e24 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -111,9 +111,24 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 	}
 	return ret;
 }
-
 EXPORT_SYMBOL(read_cache_pages);
 
+/*
+ * Attempt to detect a streaming workload which exceeds memory and
+ * handle it by dropping the page cache behind the active part of the
+ * file.
+ */
+static void discard_behind(struct file *file, struct address_space *mapping)
+{
+	unsigned long keep = file->f_ra.ra_pages * 2;
+
+	if (mapping->nrpages < 1000)
+		return;
+	if (file->f_ra.start < keep)
+		return;
+	invalidate_mapping_pages(mapping, 0, file->f_ra.start - keep);
+}
+
 static void read_pages(struct readahead_control *rac, struct list_head *pages,
 		bool skip_page)
 {
@@ -179,6 +194,7 @@ void page_cache_readahead_unbounded(struct address_space *mapping,
 {
 	LIST_HEAD(page_pool);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
+	gfp_t light_gfp = gfp_mask & ~__GFP_DIRECT_RECLAIM;
 	struct readahead_control rac = {
 		.mapping = mapping,
 		.file = file,
@@ -219,7 +235,11 @@ void page_cache_readahead_unbounded(struct address_space *mapping,
 			continue;
 		}
 
-		page = __page_cache_alloc(gfp_mask);
+		page = __page_cache_alloc(light_gfp);
+		if (!page) {
+			discard_behind(file, mapping);
+			page = __page_cache_alloc(gfp_mask);
+		}
 		if (!page)
 			break;
 		if (mapping->a_ops->readpages) {


* Re: [RFC] page cache drop-behind
From: Andres Freund @ 2020-06-11 20:30 UTC
  To: Matthew Wilcox; +Cc: linux-mm, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 2737 bytes --]

Hi Matthew,

That's great!


On 2020-06-09 19:39:39 -0700, Matthew Wilcox wrote:
> Andres reported a problem recently where reading a file several times
> the size of memory causes intermittent stalls.  My suspicion is that
> page allocation eventually runs into the low watermark and starts to
> do reclaim.  Some shrinkers take a long time to run and have a low chance
> of actually freeing a page (eg the dentry cache needs to free 21 dentries
> which all happen to be on the same pair of pages to free those two pages).

That meshes with some of what I saw in profiles, IIRC.

There's a related issue that I don't think this would solve, but which
could be addressed by some form of write-behind: I've observed that
hitting the dirty data limits often leads to a pretty random selection
of pages being written back.
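
(To sketch what I mean by write-behind from user space: something like
the below, with a made-up chunk size and error handling omitted.  It is
essentially the same sync_file_range()/posix_fadvise() pacing the
attached test program uses, just simplified.)

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>

#define WB_CHUNK (8 * 1024 * 1024)	/* arbitrary pacing granularity */

/* Call after every write(); bytes_written is the running total. */
static void write_behind(int fd, uint64_t bytes_written)
{
	static uint64_t started, dropped;

	/* Start writeback for each chunk as soon as it is fully dirtied. */
	while (started + WB_CHUNK <= bytes_written) {
		sync_file_range(fd, started, WB_CHUNK, SYNC_FILE_RANGE_WRITE);
		started += WB_CHUNK;
	}

	/* A chunk later, wait for that writeback and drop the clean pages. */
	while (dropped + 2 * WB_CHUNK <= bytes_written) {
		sync_file_range(fd, dropped, WB_CHUNK,
				SYNC_FILE_RANGE_WAIT_BEFORE);
		posix_fadvise(fd, dropped, WB_CHUNK, POSIX_FADV_DONTNEED);
		dropped += WB_CHUNK;
	}
}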


> This patch attempts to free pages from the file that we're currently
> reading from if there are no pages readily available.  If that doesn't
> work, we'll run all the shrinkers just as we did before.

> This should solve Andres' problem, although it's a bit narrow in scope.
> It might be better to look through the inactive page list, regardless of
> which file they were allocated for.  That could solve the "weekly backup"
> problem with lots of little files.

I wonder if there are cases where that'd cause problems:
1) If the pages selected are still dirty, doesn't this have a
   significant potential for additional stalls?
2) For some database/postgres tasks it's pretty common to occasionally
   do in-place writes where parts of the data are already in the page
   cache but others aren't.  I'm a bit worried that this'd be more
   aggressive about throwing away pages that were already cached before
   the writes.


> I'm not really set up to do performance testing at the moment, so this
> is just me thinking hard about the problem.

The workload where I was observing the issue was creating backups of
larger postgres databases. I've attached the test program we've used.

gcc -Wall -ggdb ~/tmp/write_and_fsync.c -o /tmp/write_and_fsync && \
  rm -f /srv/dev/bench/test* && \
  echo 3 |sudo tee /proc/sys/vm/drop_caches && \
  perf stat -a -e cpu-clock,ref-cycles,cycles,instructions \
     /tmp/write_and_fsync --blocksize $((128*1024)) --sync_file_range=0 --fallocate=0 --fadvise=0 --sequential=0 --filesize=$((100*1024*1024*1024)) /srv/dev/bench/test{1,2,3,4}

That was the last invocation I found in my shell history :)

I found that sync_file_range=1, fadvise=1 often gave considerably better
performance. Here's a pointer to the thread with more details (see also
my response downthread):
https://postgr.es/m/20200503023643.x2o2244sa4vkikyb%40alap3.anarazel.de

Greetings,

Andres Freund

[-- Attachment #2: write_and_fsync.c --]
[-- Type: text/x-csrc, Size: 5570 bytes --]

#define _GNU_SOURCE

#include <fcntl.h>
#include <getopt.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>
#include <stdbool.h>

/*
 * Writeback pacing: the *_DELAY values are how far the write position
 * must be ahead of a range before we act on it, and the *_SIZE values
 * are the per-call range lengths.  SFR_START_* starts writeback,
 * SFR_WAIT_* waits for writeback started earlier, and
 * FADVISE_DONTNEED_* drops the written-back pages from the page cache.
 */
#define SFR_START_WRITE_DELAY (1 * 1024 * 1024)
#define SFR_WAIT_WRITE_DELAY (8 * 1024 * 1024)
#define SFR_START_SIZE (512 * 1024)
#define SFR_WAIT_SIZE (8 * 1024 * 1024)

#define FADVISE_DONTNEED_SIZE (512 * 1024)
#define FADVISE_DONTNEED_DELAY (SFR_WAIT_WRITE_DELAY + FADVISE_DONTNEED_SIZE)

typedef struct runparams
{
	uint32_t blocksize;
	uint64_t filesize;
	int numprocs;
	int numfiles;
	char **filenames;
	bool fallocate;
	bool fadvise;
	bool sync_file_range;
	bool sequential;
} runparams;

extern void runtest(const runparams *params, char *filename);

const struct option getopt_options[] = {
	{.name = "filesize", .has_arg = required_argument,  .val = 's'},
	{.name = "blocksize", .has_arg = required_argument, .val = 'b'},
	{.name = "fallocate", .has_arg = required_argument, .val = 'a'},
	{.name = "fadvise", .has_arg = required_argument, .val = 'f'},
	{.name = "sync_file_range",  .has_arg = required_argument, .val = 'r'},
	{.name = "sequential",  .has_arg = required_argument, .val = 'q'},
	{}};

static void
helpdie(void)
{
	fprintf(stderr, "\n"
			"Usage: write_and_fsync [OPTIONS] [FILES]\n"
			"--filesize=...\n"
			"--blocksize=...\n"
			"--fallocate=yes/no/0/1\n"
			"--fadvise=yes/no/0/1\n"
			"--sync_file_range=yes/no/0/1\n"
			"--sequential=yes/no/0/1\n");
	exit(1);
}

int
main(int argc, char **argv)
{
	runparams params = {
		.blocksize = 8192,
	};
	int	status;

	while (1)
	{
		int o;

		o = getopt_long(argc, argv, "", getopt_options, NULL);

		if (o == -1)
			break;

		switch (o)
		{
			case 0:
				break;
			case 's':
				params.filesize = strtoull(optarg, NULL, 0);
				break;
			case 'b':
				params.blocksize = strtoul(optarg, NULL, 0);
				break;
			case 'a':
				params.fallocate = strcmp(optarg, "yes") == 0 || strcmp(optarg, "1") == 0;
				break;
			case 'f':
				params.fadvise = strcmp(optarg, "yes") == 0 || strcmp(optarg, "1") == 0;
				break;
			case 'r':
				params.sync_file_range = strcmp(optarg, "yes") == 0 || strcmp(optarg, "1") == 0;
				break;
			case 'q':
				params.sequential = strcmp(optarg, "yes") == 0 || strcmp(optarg, "1") == 0;
				break;
			case '?':
				helpdie();
				break;
			default:
				fprintf(stderr, "huh: %d\n", o);
				helpdie();
		}
	}

	params.filenames = &argv[optind];
	params.numprocs = argc - optind;

	if (params.numprocs <= 0 || params.filesize <= 0)
		helpdie();

	printf("running test with: numprocs=%d filesize=%llu blocksize=%d fallocate=%d sfr=%d fadvise=%d sequential=%d\n",
		   params.numprocs,
		   (unsigned long long) params.filesize, params.blocksize,
		   params.fallocate, params.sync_file_range, params.fadvise, params.sequential);
	fflush(stdout);

	for (int fileno = 0; fileno < params.numprocs; fileno++)
	{
		pid_t	pid = fork();

		if (pid == 0)
		{
			runtest(&params, params.filenames[fileno]);
			exit(0);
		}
		else if (pid < 0)
		{
			perror("fork");
			exit(1);
		}
	}

	while (wait(&status) >= 0)
		;
	sleep(1);

	return 0;
}

/*
 * Write one file of params->filesize bytes in blocksize chunks,
 * optionally pacing writeback with sync_file_range() and dropping
 * written-back pages with posix_fadvise(POSIX_FADV_DONTNEED), then
 * fsync, close, and report per-phase timings.
 */
void
runtest(const runparams* params, char *filename)
{
	const int bs = params->blocksize;
	const uint64_t filesize = params->filesize;
	const bool sfr = params->sync_file_range;
	const bool fadv = params->fadvise;
	char *junk;

	junk = malloc(bs);
	if (!junk) exit(1);

	memset(junk, 'J', params->blocksize);

	time_t	t0 = time(NULL);
	int fd = open(filename, O_CREAT | O_TRUNC | O_WRONLY, 0600);
	if (fd < 0)
	{
		perror("open");
		exit(1);
	}

	time_t	t1 = time(NULL);

	if (params->fallocate)
	{
		if (posix_fallocate(fd, 0, filesize) != 0)
		{
			perror("posix_fallocate");
			exit(1);
		}
	}


	if (params->sequential)
	{
		if (posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL) != 0)
		{
			perror("posix_fallocate");
			exit(1);
		}
	}

	time_t	t2 = time(NULL);
	uint64_t bytes_written = 0;
	uint64_t last_wait_write = 0;
	uint64_t last_start_write = 0;
	uint64_t last_dontneed = 0;

	while (bytes_written + bs < filesize)
	{
		int wc = write(fd, junk, bs);

		if (wc != bs)
		{
			fprintf(stderr, "wc = %d\n", wc);
			perror("write");
			exit(1);
		}
		bytes_written += bs;


		/* wait for writeback started earlier (below) to complete */
		if (sfr)
		{
			if (last_wait_write + SFR_WAIT_WRITE_DELAY + SFR_WAIT_SIZE < bytes_written)
			{
				if (sync_file_range(fd, last_wait_write, SFR_WAIT_SIZE, SYNC_FILE_RANGE_WAIT_BEFORE) != 0)
				{
					perror("sfr(wait_before)");
					exit(1);
				}
				last_wait_write += SFR_WAIT_SIZE;
			}

			/* start writeback of a recently completed chunk, without waiting */
			if (last_start_write + SFR_START_WRITE_DELAY + SFR_START_SIZE < bytes_written)
			{

				if (sync_file_range(fd, last_start_write, SFR_START_SIZE, SYNC_FILE_RANGE_WRITE) != 0)
				{
					perror("sfr(write)");
					exit(1);
				}
				last_start_write += SFR_START_SIZE;
			}
		}

		/* drop pages that should have been written back by now from the cache */
		if (fadv)
		{
			if (last_dontneed + FADVISE_DONTNEED_DELAY + FADVISE_DONTNEED_SIZE < bytes_written)
			{
				if (posix_fadvise(fd, last_dontneed, FADVISE_DONTNEED_SIZE, POSIX_FADV_DONTNEED) != 0)
				{
					perror("fadvise(dontneed)");
					exit(1);
				}
				last_dontneed += FADVISE_DONTNEED_SIZE;
			}
		}
	}

	time_t	t3 = time(NULL);
	if (fsync(fd) != 0)
	{
		perror("fsync");
		exit(1);
	}

	time_t	t4 = time(NULL);
	if (close(fd) != 0)
	{
		perror("close");
		exit(1);
	}

	time_t	t5 = time(NULL);
	printf("[%s][%d] open: %lu, fallocate: %lu write: %lu, fsync: %lu, close: %lu, total: %lu\n",
	       filename, getpid(), t1 - t0, t2 - t1, t3 - t2, t4 - t3, t5 - t4, t5 - t0);
}

