From: Michal Hocko <mhocko@suse.cz>
To: Hugh Dickins <hughd@google.com>
Cc: zhong jiang <zhongjiang@huawei.com>,
	akpm@linux-foundation.org, vbabka@suse.cz, rientjes@google.com,
	linux-mm@kvack.org, Xishi Qiu <qiuxishi@huawei.com>,
	Hanjun Guo <guohanjun@huawei.com>
Subject: Re: [PATCH] mm: fix oom work when memory is under pressure
Date: Sat, 17 Sep 2016 17:56:56 +0200
Message-ID: <20160917155655.GD29145@dhcp22.suse.cz>
In-Reply-To: <alpine.LSU.2.11.1609161440280.5127@eggly.anvils>

On Fri 16-09-16 15:13:56, Hugh Dickins wrote:
> On Wed, 14 Sep 2016, Michal Hocko wrote:
> > On Wed 14-09-16 10:42:19, Michal Hocko wrote:
> > > [Let's CC Hugh]
> > 
> > now for real...
> > 
> > > 
> > > On Wed 14-09-16 15:13:50, zhong jiang wrote:
> > > [...]
> > > >   hi, Michal
> > > > 
> > > > >   Recently, I hit the same issue when running an OOM test case from LTP with KSM enabled.
> > > >  
> > > > [  601.937145] Call trace:
> > > > [  601.939600] [<ffffffc000086a88>] __switch_to+0x74/0x8c
> > > > [  601.944760] [<ffffffc000a1bae0>] __schedule+0x23c/0x7bc
> > > > [  601.950007] [<ffffffc000a1c09c>] schedule+0x3c/0x94
> > > > [  601.954907] [<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350
> > > > [  601.961289] [<ffffffc000a1e32c>] down_write+0x64/0x80
> > > > [  601.966363] [<ffffffc00021f794>] __ksm_exit+0x90/0x19c
> > > > [  601.971523] [<ffffffc0000be650>] mmput+0x118/0x11c
> > > > [  601.976335] [<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74
> > > > [  601.981321] [<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4
> > > > [  601.986656] [<ffffffc0000d0f34>] get_signal+0x444/0x5e0
> > > > [  601.991904] [<ffffffc000089fcc>] do_signal+0x1d8/0x450
> > > > [  601.997065] [<ffffffc00008a35c>] do_notify_resume+0x70/0x78
> > > 
> > > So this is a hung task triggering because the exiting task cannot get
> > > the mmap sem for write because the ksmd holds it for read while
> > > allocating memory which just takes ages to complete, right?
> > > 
> > > > 
> > > > > The root cause is that ksmd holds the read lock, and the lock is not released.
> > > >  scan_get_next_rmap_item
> > > >          down_read
> > > >                    get_next_rmap_item
> > > >                              alloc_rmap_item     #ksmd will loop permanently.
> > > > 
> > > > > How do you see this kind of situation? Or should we leave the issue alone?
> > > 
> > > I am not familiar with the ksmd code so it is hard for me to judge but
> > > one thing to do would be __GFP_NORETRY which would force a bail out from
> > > the allocation rather than looping for ever. A quick look tells me that
> > > the allocation failure here is quite easy to handle. There might be
> > > others...
> 
> Yes, very good suggestion in this case: the ksmd code does exactly the
> right thing when that allocation fails, but was too stupid to use an
> allocation mode which might fail - and it can allocate rather a lot of
> slots along that path, so it will be good to let it break out there.
> 
> Thank you, Zhongjiang, please send akpm a fully signed-off patch, tagged
> for stable, with your explanation above (which was a lot more helpful
> to me than what you wrote in your other mail of Sept 13th).  But please
> make it GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN (and break that line

agreed

> before 80 cols): the allocation will sometimes fail, and we're not at
> all interested in hearing about that.
> 
> Michal, how would you feel about this or a separate patch adding
> __GFP_HIGH to the allocation in ksm's alloc_stable_node()?  That
> allocation could cause the same problem, but it is much less common
> (so less important to do anything about it), and differs from the
> rmap_item case in that if it succeeds, it will usually free a page;
> whereas if it fails, the fallback (two break_cow()s) may want to
> allocate a couple of pages.  So __GFP_HIGH makes more sense for it
> than __GFP_NORETRY: but perhaps we prefer not to add __GFP_HIGHs?

I am not familiar enough with the ksmd code to have a strong opinion
here. __GFP_HIGH should imho be used only when really necessary, but as
you point out, and as the comment in cmp_and_merge_page explains:
			/*
			 * If we fail to insert the page into the stable tree,
			 * we will have 2 virtual addresses that are pointing
			 * to a ksm page left outside the stable tree,
			 * in which case we need to break_cow on both.
			 */
this can actually save some memory if it succeeds. So I will leave the
decision to you. I have no experience with how much memory this path can
actually consume, or whether the flag makes much difference.
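
To illustrate the rmap_item part of the discussion, here is a minimal
userspace sketch of why a fail-fast allocation mode lets the scanner drop
the lock instead of holding it while the allocator retries. This is only a
model under stated assumptions, not the kernel patch: every model_* name,
the MODEL_GFP_* flag values and the failure counter are hypothetical
stand-ins for the allocator, __GFP_NORETRY/__GFP_NOWARN and mmap_sem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define MODEL_GFP_KERNEL	0x1u
#define MODEL_GFP_NORETRY	0x2u
#define MODEL_GFP_NOWARN	0x4u

struct rmap_item {
	unsigned long address;
};

/* Simulated pressure: attempts that will fail before one can succeed. */
static unsigned int model_failures_left;
static unsigned int model_warnings;
static unsigned long model_rmap_items;
static bool model_mmap_sem_held;

/* Toy allocator: retries until pressure is gone unless NORETRY is set. */
static void *model_alloc(size_t size, unsigned int flags)
{
	for (;;) {
		if (model_failures_left == 0)
			return calloc(1, size);
		model_failures_left--;
		if (flags & MODEL_GFP_NORETRY) {
			/* Fail fast; stay quiet when NOWARN is also set. */
			if (!(flags & MODEL_GFP_NOWARN))
				model_warnings++;
			return NULL;
		}
		/* Without NORETRY the caller is stuck here, lock still held. */
	}
}

/* Modelled after the allocation mode suggested in the thread. */
static struct rmap_item *model_alloc_rmap_item(void)
{
	struct rmap_item *item;

	item = model_alloc(sizeof(*item), MODEL_GFP_KERNEL |
			   MODEL_GFP_NORETRY | MODEL_GFP_NOWARN);
	if (item)
		model_rmap_items++;
	return item;
}

/* One scan step: the lock is released whether or not allocation worked. */
static struct rmap_item *model_scan_step(void)
{
	struct rmap_item *item;

	assert(!model_mmap_sem_held);
	model_mmap_sem_held = true;	/* stands in for down_read() */
	item = model_alloc_rmap_item();
	model_mmap_sem_held = false;	/* stands in for up_read() */
	return item;
}
```

In this model a failed scan step consumes a single allocation attempt and
returns NULL with the lock released, so a writer waiting on the lock (the
exiting task in the trace above) could make progress.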

Thanks!
-- 
Michal Hocko
SUSE Labs
