linux-kernel.vger.kernel.org archive mirror
From: Qian Cai <cai@lca.pw>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: "mm: account nr_isolated_xxx in [isolate|putback]_lru_page" breaks OOM with swap
Date: Thu, 1 Aug 2019 07:46:10 -0400
Message-ID: <53837B9C-7E73-47DA-9373-5E989A9EEC4F@lca.pw>
In-Reply-To: <20190801065108.GA179251@google.com>



> On Aug 1, 2019, at 2:51 AM, Minchan Kim <minchan@kernel.org> wrote:
> 
> On Wed, Jul 31, 2019 at 02:18:00PM -0400, Qian Cai wrote:
>> On Wed, 2019-07-31 at 12:09 -0400, Qian Cai wrote:
>>> On Wed, 2019-07-31 at 14:34 +0900, Minchan Kim wrote:
>>>> On Tue, Jul 30, 2019 at 12:25:28PM -0400, Qian Cai wrote:
>>>>> OOM workloads with swapping are unable to recover on linux-next since
>>>>> next-20190729 due to the commit "mm: account nr_isolated_xxx in
>>>>> [isolate|putback]_lru_page" [1].
>>>>> 
>>>>> [1] https://lore.kernel.org/linux-mm/20190726023435.214162-4-minchan@kernel.org/T/#mdcd03bcb4746f2f23e6f508c205943726aee8355
>>>>> 
>>>>> For example, the LTP oom01 test case is stuck for hours, while it finishes
>>>>> in a few minutes here after reverting the above commit. Sometimes it prints
>>>>> the following messages while hanging.
>>>>> 
>>>>> [  509.983393][  T711] INFO: task oom01:5331 blocked for more than 122 seconds.
>>>>> [  509.983431][  T711]       Not tainted 5.3.0-rc2-next-20190730 #7
>>>>> [  509.983447][  T711] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> [  509.983477][  T711] oom01           D24656  5331   5157 0x00040000
>>>>> [  509.983513][  T711] Call Trace:
>>>>> [  509.983538][  T711] [c00020037d00f880] [0000000000000008] 0x8 (unreliable)
>>>>> [  509.983583][  T711] [c00020037d00fa60] [c000000000023724] __switch_to+0x3a4/0x520
>>>>> [  509.983615][  T711] [c00020037d00fad0] [c0000000008d17bc] __schedule+0x2fc/0x950
>>>>> [  509.983647][  T711] [c00020037d00fba0] [c0000000008d1e68] schedule+0x58/0x150
>>>>> [  509.983684][  T711] [c00020037d00fbd0] [c0000000008d7614] rwsem_down_read_slowpath+0x4b4/0x630
>>>>> [  509.983727][  T711] [c00020037d00fc90] [c0000000008d7dfc] down_read+0x12c/0x240
>>>>> [  509.983758][  T711] [c00020037d00fd20] [c00000000005fb28] __do_page_fault+0x6f8/0xee0
>>>>> [  509.983801][  T711] [c00020037d00fe20] [c00000000000a364] handle_page_fault+0x18/0x38
>>>> 
>>>> Thanks for testing! It's no surprise that the patch introduces some bugs,
>>>> because it's rather tricky.
>>>> 
>>>> Could you test this patch?
>>> 
>>> It does help the situation a bit, but recovery is still far slower than
>>> with the commit "mm: account nr_isolated_xxx in [isolate|putback]_lru_page"
>>> simply reverted. For example, on this powerpc system, oom01 used to take
>>> 4 minutes to finish, while it now still takes 13 minutes.
>>> 
>>> The oom02 test (which tests NUMA mempolicy) takes even longer, and I gave up
>>> after 26 minutes with several hung tasks below.
>> 
>> Also, oom02 is stuck on an x86 machine.
> 
> Yes, my patch above had a bug: it tested the page type after the page was
> freed. However, after reviewing it, I found other bugs, though I don't think
> they are related to your problem either. Okay, then, let's revert the patch.
> 
> Andrew, could you revert the patch below?
> "mm: account nr_isolated_xxx in [isolate|putback]_lru_page"
> 
> It's just a cleanup patch and isn't related to the new madvise hint system
> call now. Thus, it shouldn't be a blocker.
> 
> Anyway, I want to fix the problem when I have time available.
> Qian, what are your kernel config and system configuration on x86?
> Is it possible to reproduce this in QEMU?
> It would be really helpful if you could tell me the reproduction steps on x86.

https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config

The config works in OpenStack; I have never tried it in QEMU, where it might
need a few modifications here and there. The x86 server that reproduces the
problem is:

HPE ProLiant DL385 Gen10
AMD EPYC 7251 8-Core Processor
Smart Storage PQI 12G SAS/PCIe 3
Memory: 32768 MB
NUMA Nodes: 8
