Linux-mm Archive on
 help / color / Atom feed
From: Johannes Weiner <>
To: Vlastimil Babka <>
Cc: Andrew Morton <>,
	Mel Gorman <>,,,
Subject: [PATCH v2] mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED
Date: Wed, 14 Dec 2016 16:00:17 -0500
Message-ID: <> (raw)
In-Reply-To: <>

When FADV_DONTNEED cannot drop all pages in the range, it observes
that some pages might still be on per-cpu LRU caches after recent
instantiation and so initiates remote calls to all CPUs to flush their
local caches. However, in most cases, the fadvise happens from the
same context that instantiated the pages, and any pre-LRU pages in the
specified range are most likely sitting on the local CPU's LRU cache,
and so in many cases this results in unnecessary remote calls, which,
in a loaded system, can hold up the fadvise() call significantly.

[ I didn't record it in the extreme case we observed at Facebook,
  unfortunately. We had a slow-to-respond system and noticed it
  lru_add_drain_all() leading the profile during fadvise calls. This
  patch came out of thinking about the code and how we commonly call

  FWIW, I wrote a silly directory tree walker/searcher that recurses
  through /usr to read and FADV_DONTNEED each file it finds. On a 2
  socket 40 ht machine, over 1% is spent in lru_add_drain_all(). With
  the patch, that cost is gone; the local drain cost shows at 0.09%. ]

Try to avoid the remote call by flushing the local LRU cache before
even attempting to invalidate anything. It's a cheap operation, and
the local LRU cache is the most likely to hold any pre-LRU pages in
the specified fadvise range.

Signed-off-by: Johannes Weiner <>
Acked-by: Vlastimil Babka <>
Acked-by: Mel Gorman <>
 mm/fadvise.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/mm/fadvise.c b/mm/fadvise.c
index 6c707bfe02fd..a43013112581 100644
--- a/mm/fadvise.c
+++ b/mm/fadvise.c
@@ -139,7 +139,20 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
 		if (end_index >= start_index) {
-			unsigned long count = invalidate_mapping_pages(mapping,
+			unsigned long count;
+			/*
+			 * It's common to FADV_DONTNEED right after
+			 * the read or write that instantiates the
+			 * pages, in which case there will be some
+			 * sitting on the local LRU cache. Try to
+			 * avoid the expensive remote drain and the
+			 * second cache tree walk below by flushing
+			 * them out right away.
+			 */
+			lru_add_drain();
+			count = invalidate_mapping_pages(mapping,
 						start_index, end_index);

To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to  For more info on Linux MM,
see: .
Don't email: <a href=mailto:""> </a>

  reply index

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-10 17:26 [PATCH] " Johannes Weiner
2016-12-12  9:21 ` Vlastimil Babka
2016-12-12 15:55   ` Johannes Weiner
2016-12-13 12:32     ` Vlastimil Babka
2016-12-14 21:00       ` Johannes Weiner [this message]
2016-12-15  4:09         ` [PATCH v2] " Hillf Danton
2016-12-12  9:51 ` [PATCH] " Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-mm Archive on

Archives are clonable:
	git clone --mirror linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ \
	public-inbox-index linux-mm

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone