From: Minchan Kim <minchan@kernel.org>
To: Qian Cai <cai@lca.pw>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: "mm: account nr_isolated_xxx in [isolate|putback]_lru_page" breaks OOM with swap
Date: Wed, 31 Jul 2019 14:34:44 +0900
Message-ID: <20190731053444.GA155569@google.com>
In-Reply-To: <1564503928.11067.32.camel@lca.pw>
On Tue, Jul 30, 2019 at 12:25:28PM -0400, Qian Cai wrote:
> OOM workloads with swapping are unable to recover on linux-next since
> next-20190729 due to the commit "mm: account nr_isolated_xxx in
> [isolate|putback]_lru_page" [1]
>
> [1] https://lore.kernel.org/linux-mm/20190726023435.214162-4-minchan@kernel.org/T/#mdcd03bcb4746f2f23e6f508c205943726aee8355
>
> For example, the LTP oom01 test case is stuck for hours, while it finishes in a
> few minutes here after reverting the above commit. Sometimes, it prints these
> messages while hanging.
>
> [ 509.983393][ T711] INFO: task oom01:5331 blocked for more than 122 seconds.
> [ 509.983431][ T711] Not tainted 5.3.0-rc2-next-20190730 #7
> [ 509.983447][ T711] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 509.983477][ T711] oom01 D24656 5331 5157 0x00040000
> [ 509.983513][ T711] Call Trace:
> [ 509.983538][ T711] [c00020037d00f880] [0000000000000008] 0x8 (unreliable)
> [ 509.983583][ T711] [c00020037d00fa60] [c000000000023724]
> __switch_to+0x3a4/0x520
> [ 509.983615][ T711] [c00020037d00fad0] [c0000000008d17bc]
> __schedule+0x2fc/0x950
> [ 509.983647][ T711] [c00020037d00fba0] [c0000000008d1e68] schedule+0x58/0x150
> [ 509.983684][ T711] [c00020037d00fbd0] [c0000000008d7614]
> rwsem_down_read_slowpath+0x4b4/0x630
> [ 509.983727][ T711] [c00020037d00fc90] [c0000000008d7dfc]
> down_read+0x12c/0x240
> [ 509.983758][ T711] [c00020037d00fd20] [c00000000005fb28]
> __do_page_fault+0x6f8/0xee0
> [ 509.983801][ T711] [c00020037d00fe20] [c00000000000a364]
> handle_page_fault+0x18/0x38
Thanks for testing! It's no surprise that the patch introduced a bug,
because it's rather tricky.
Could you test this patch?
From b31667210dd747f4d8aeb7bdc1f5c14f1f00bff5 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Wed, 31 Jul 2019 14:18:01 +0900
Subject: [PATCH] mm: decrease NR_ISOLATED count at successful migration
If migration fails, the page should go back to the LRU list, so
putback_lru_page can handle the NR_ISOLATED count as the pair of
isolate_lru_page. However, if migration succeeds, the page will be
freed, so there is no need to add it back to the LRU list. Thus, the
NR_ISOLATED count must be decreased manually in that case.
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/migrate.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index 84b89d2d69065..96ae0c3cada8d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1166,6 +1166,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
{
int rc = MIGRATEPAGE_SUCCESS;
struct page *newpage;
+ bool is_lru = !__PageMovable(page);
if (!thp_migration_supported() && PageTransHuge(page))
return -ENOMEM;
@@ -1175,17 +1176,10 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
return -ENOMEM;
if (page_count(page) == 1) {
- bool is_lru = !__PageMovable(page);
-
/* page was freed from under us. So we are done. */
ClearPageActive(page);
ClearPageUnevictable(page);
- if (likely(is_lru))
- mod_node_page_state(page_pgdat(page),
- NR_ISOLATED_ANON +
- page_is_file_cache(page),
- -hpage_nr_pages(page));
- else {
+ if (unlikely(!is_lru)) {
lock_page(page);
if (!PageMovable(page))
__ClearPageIsolated(page);
@@ -1229,6 +1223,12 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
if (set_hwpoison_free_buddy_page(page))
num_poisoned_pages_inc();
}
+
+ if (likely(is_lru))
+ mod_node_page_state(page_pgdat(page),
+ NR_ISOLATED_ANON +
+ page_is_file_cache(page),
+ -hpage_nr_pages(page));
} else {
if (rc != -EAGAIN) {
if (likely(!__PageMovable(page))) {
--
2.22.0.709.g102302147b-goog
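
The accounting rule described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `nr_isolated`, `struct toy_page`, `toy_isolate()`, `toy_putback()` and `toy_finish_migration()` are hypothetical names that stand in for the per-node NR_ISOLATED_* counter, isolate_lru_page(), putback_lru_page() and the tail of unmap_and_move(). It ignores locking, the -EAGAIN retry path and the anon/file split.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-node NR_ISOLATED_* counter (hypothetical). */
static long nr_isolated;

struct toy_page {
	bool on_lru;   /* true while the page sits on the LRU list */
	long nr_pages; /* 1 for a base page, more for a huge page */
};

/* Models isolate_lru_page(): take the page off the LRU and count it. */
static void toy_isolate(struct toy_page *p)
{
	assert(p->on_lru);
	p->on_lru = false;
	nr_isolated += p->nr_pages;
}

/* Models putback_lru_page(): return the page to the LRU and uncount it. */
static void toy_putback(struct toy_page *p)
{
	assert(!p->on_lru);
	p->on_lru = true;
	nr_isolated -= p->nr_pages;
}

/*
 * Models the end of a migration attempt. On failure the page goes back
 * through toy_putback(), which fixes the counter. On success the old page
 * is freed and never put back, so the counter has to be dropped by hand.
 */
static void toy_finish_migration(struct toy_page *p, bool success)
{
	if (success)
		nr_isolated -= p->nr_pages; /* the manual decrement */
	else
		toy_putback(p);
}
```

In this model, isolating a page and then finishing its migration returns `nr_isolated` to its starting value on both the success and the failure path. Dropping the manual decrement would leave the counter permanently inflated after every successful migration.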