From: Minchan Kim <minchan@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>, Jan Kara <jack@suse.cz>,
Matthew Wilcox <willy@infradead.org>,
Josef Bacik <josef@toxicpanda.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Minchan Kim <minchan@kernel.org>
Subject: [PATCH v2 1/2] mm: make PageReadahead more strict
Date: Fri, 14 Feb 2020 11:29:50 -0800 [thread overview]
Message-ID: <20200214192951.29430-1-minchan@kernel.org> (raw)
Recently, I got some bugreports major page fault takes several seconds
sometime. When I review drop mmap_sem logic, I found several bugs.
CPU 1 CPU 2
mm_populate
for ()
..
ret = populate_vma_page_range
__get_user_pages
faultin_page
handle_mm_fault
filemap_fault
do_async_mmap_readahead
shrink_page_list
pageout
SetPageReclaim(=SetPageReadahead)
writepage
SetPageWriteback
if (PageReadahead(page))
maybe_unlock_mmap_for_io
up_read(mmap_sem)
page_cache_async_readahead()
if (PageWriteback(page))
return;
Here, since ret from populate_vma_page_range is zero, the loop continue
to run with same address with previous iteration. It will repeat the
loop until the page's writeout is done(ie, PG_writeback or PG_reclaim
is clear).
We could fix the above specific case via adding PageWriteback
ret = populate_vma_page_range
...
...
filemap_fault
do_async_mmap_readahead
if (!PageWriteback(page) && PageReadahead(page))
maybe_unlock_mmap_for_io
up_read(mmap_sem)
page_cache_async_readahead()
if (PageWriteback(page))
return;
Furthermore, to prevent potential issues caused by sharing PG_readahead
with PG_reclaim, let's make page flag wrapper for PageReadahead
with description. With that, we could remove PageWriteback check
in page_cache_async_readahead, which is more clear for maintenance/
readability.
Fixes: 6b4c9f446981 ("filemap: drop the mmap_sem for all blocking operations")
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
include/linux/page-flags.h | 28 ++++++++++++++++++++++++++--
mm/readahead.c | 6 ------
2 files changed, 26 insertions(+), 8 deletions(-)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 1bf83c8fcaa7..f91a9b2a49bd 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -363,8 +363,32 @@ PAGEFLAG(MappedToDisk, mappedtodisk, PF_NO_TAIL)
/* PG_readahead is only used for reads; PG_reclaim is only for writes */
PAGEFLAG(Reclaim, reclaim, PF_NO_TAIL)
TESTCLEARFLAG(Reclaim, reclaim, PF_NO_TAIL)
-PAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
- TESTCLEARFLAG(Readahead, reclaim, PF_NO_COMPOUND)
+
+SETPAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
+CLEARPAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
+
+/*
+ * Since PG_readahead is shared with PG_reclaim of the page flags,
+ * PageReadahead should double check whether it's readahead marker
+ * or PG_reclaim. It could be done by PageWriteback check because
+ * PG_reclaim is always with PG_writeback.
+ */
+static inline int PageReadahead(struct page *page)
+{
+ VM_BUG_ON_PGFLAGS(PageCompound(page), page);
+
+ return (page->flags & (1UL << PG_reclaim | 1UL << PG_writeback)) ==
+ (1UL << PG_reclaim);
+}
+
+/* Clear PG_readahead only if it's PG_readahead, not PG_reclaim */
+static inline int TestClearPageReadahead(struct page *page)
+{
+ VM_BUG_ON_PGFLAGS(PageCompound(page), page);
+
+ return !PageWriteback(page) ||
+ test_and_clear_bit(PG_reclaim, &page->flags);
+}
#ifdef CONFIG_HIGHMEM
/*
diff --git a/mm/readahead.c b/mm/readahead.c
index 2fe72cd29b47..85b15e5a1d7b 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -553,12 +553,6 @@ page_cache_async_readahead(struct address_space *mapping,
if (!ra->ra_pages)
return;
- /*
- * Same bit is used for PG_readahead and PG_reclaim.
- */
- if (PageWriteback(page))
- return;
-
ClearPageReadahead(page);
/*
--
2.25.0.265.gbab2e86ba0-goog
next reply other threads:[~2020-02-14 19:29 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-14 19:29 Minchan Kim [this message]
2020-02-14 19:29 ` [PATCH v2 2/2] mm: fix long time stall from mm_populate Minchan Kim
2020-02-21 17:50 ` Minchan Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200214192951.29430-1-minchan@kernel.org \
--to=minchan@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=jack@suse.cz \
--cc=josef@toxicpanda.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).