From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, guro@fb.com,
	vdavydov.dev@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the #PF
Date: Sat, 20 May 2017 00:22:30 +0900
Message-ID: <201705200022.BFJ12428.JFOSMLFOtFHOVQ@I-love.SAKURA.ne.jp>
In-Reply-To: <20170519132209.GG29839@dhcp22.suse.cz>

Michal Hocko wrote:
> On Fri 19-05-17 22:02:44, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > Any allocation failure during the #PF path will return with VM_FAULT_OOM
> > > which in turn results in pagefault_out_of_memory. This can happen for
> > > 2 different reasons. a) Memcg is out of memory and we rely on
> > > mem_cgroup_oom_synchronize to perform the memcg OOM handling or b)
> > > normal allocation fails.
> > > 
> > > The latter is quite problematic because allocation paths already trigger
> > > out_of_memory and the page allocator tries really hard to not fail
> > 
> > We made many memory allocation requests from page fault path (e.g. XFS)
> > __GFP_FS some time ago, didn't we? But if I recall correctly (I couldn't
> > find the message), there are some allocation requests from page fault path
> > which cannot use __GFP_FS. Then, not all allocation requests can call
> > oom_kill_process() and reaching pagefault_out_of_memory() will be
> > inevitable.
> 
> Even if such an allocation fails without the OOM killer then we simply
> retry the PF and will do that the same way how we keep retrying the
> allocation inside the page allocator. So how is this any different?

With this patch you are removing the out_of_memory() call from
pagefault_out_of_memory(). But you also want !__GFP_FS allocations to
stop retrying inside the page allocator in future kernels, don't you?
Then a thread which needs to allocate memory from the page fault path but
cannot call oom_kill_process() will spin forever (unless somebody else
calls oom_kill_process() via a __GFP_FS allocation request). I consider
introducing such a possibility to be a problem.
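
To illustrate the concern, here is a minimal user-space sketch of that
scenario. It is only a model under stated assumptions, not kernel code:
MODEL_GFP_FS, struct model_state and the model_*() helpers are hypothetical
names, the "OOM killer" simply frees 16 pages, and a bounded retry count
stands in for spinning forever.

```c
#include <stdbool.h>

/* Hypothetical model flag; NOT the kernel's gfp_t value. */
#define MODEL_GFP_FS 0x1u

struct model_state {
	unsigned long free_pages;	/* pages currently available */
	unsigned long oom_kills;	/* times the model OOM killer ran */
};

/* Model OOM killer: assume killing a victim frees 16 pages. */
static void model_oom_kill(struct model_state *s)
{
	s->oom_kills++;
	s->free_pages += 16;
}

/*
 * Model allocator under the assumption discussed above: only
 * FS-capable requests may invoke the OOM killer, and other
 * requests fail instead of retrying internally.
 */
static bool model_alloc_page(struct model_state *s, unsigned int flags)
{
	if (s->free_pages == 0 && (flags & MODEL_GFP_FS))
		model_oom_kill(s);
	if (s->free_pages == 0)
		return false;
	s->free_pages--;
	return true;
}

/*
 * Model page fault: on allocation failure just retry the fault,
 * without calling the OOM killer from the fault path. Returns the
 * number of retries needed, or -1 when max_retries is exhausted
 * (the stand-in for "spin forever").
 */
static long model_page_fault(struct model_state *s, unsigned int flags,
			     long max_retries)
{
	for (long retries = 0; retries <= max_retries; retries++) {
		if (model_alloc_page(s, flags))
			return retries;
	}
	return -1;
}
```

In this model, a fault with flags == 0 on an empty state never makes
progress however large max_retries is, while the same fault with
MODEL_GFP_FS succeeds on the first attempt after one model OOM kill.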

> 
> > > allocations. Anyway, if the OOM killer has been already invoked there
> > > is no reason to invoke it again from the #PF path. Especially when the
> > > OOM condition might be gone by that time and we have no way to find out
> > > other than allocate.
> > > 
> > > Moreover if the allocation failed and the OOM killer hasn't been
> > > invoked then we are unlikely to do the right thing from the #PF context
> > > because we have already lost the allocation context and restrictions and
> > > therefore might oom kill a task from a different NUMA domain.
> > 
> > If we carry a flag via task_struct that indicates whether it is a memory
> > allocation request from page fault and allocation failure is not acceptable,
> > we can call out_of_memory() from page allocator path.
> 
> I do not understand

We need to allocate memory from the page fault path in order to avoid
spinning forever (unless somebody else calls oom_kill_process() via a
__GFP_FS allocation request), don't we? Then, memory allocation requests
from the page fault path can pass flags like __GFP_NOFAIL | __GFP_KILLABLE,
because retrying the page fault without allocating memory is pointless.
That is what I meant by carrying a flag via task_struct.
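
A rough user-space sketch of that idea follows. Everything in it is an
assumption made for illustration: struct model_task, its in_page_fault
field, the MODEL_GFP_* flags and the model_*() helpers are hypothetical
and only mimic the suggested "__GFP_NOFAIL | __GFP_KILLABLE" behavior.
It is not how the page allocator is implemented.

```c
#include <stdbool.h>

/* Hypothetical model flags; NOT the kernel's gfp_t values. */
#define MODEL_GFP_NOFAIL	0x1u
#define MODEL_GFP_KILLABLE	0x2u

struct model_task {
	bool in_page_fault;	/* the flag carried via the task */
	bool fatal_signal;	/* models fatal_signal_pending() */
};

struct model_mem {
	unsigned long free_pages;
	unsigned long oom_kills;
};

/*
 * Model allocator: a NOFAIL request invokes the model OOM killer
 * (assumed to free 16 pages) from the allocation path and retries,
 * unless the request is KILLABLE and the task is already dying.
 */
static bool model_alloc(struct model_mem *m, const struct model_task *t,
			unsigned int flags)
{
	for (;;) {
		if (m->free_pages > 0) {
			m->free_pages--;
			return true;
		}
		if (!(flags & MODEL_GFP_NOFAIL))
			return false;
		if ((flags & MODEL_GFP_KILLABLE) && t->fatal_signal)
			return false;
		m->oom_kills++;
		m->free_pages += 16;
	}
}

/* Allocation helper that derives the flags from the task's state. */
static bool model_fault_alloc(struct model_mem *m,
			      const struct model_task *t)
{
	unsigned int flags = 0;

	if (t->in_page_fault)
		flags = MODEL_GFP_NOFAIL | MODEL_GFP_KILLABLE;
	return model_alloc(m, t, flags);
}
```

With in_page_fault set, the model allocation succeeds after one OOM kill
instead of returning failure to the fault path; a task with a pending
fatal signal still gets a failure so that it can exit.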

> > By the way, can page fault occur after reaching do_exit()? When a thread
> > reached do_exit(), fatal_signal_pending(current) becomes false, doesn't it?
> 
> yes fatal_signal_pending will be false at the time and I believe we can
> perform a page fault past that moment and go via allocation path which would
> trigger the OOM or give this task access to reserves but it is more
> likely that the oom reaper will push to kill another task by that time
> if the situation didn't get resolved. Or did I miss your concern?

How does checking fatal_signal_pending() here help? It only suppresses the
printk(). If the current thread needs to allocate memory because not all
allocation requests can call oom_kill_process(), doing printk() is not the
right thing to do. Allocating memory by some means (e.g. __GFP_NOFAIL |
__GFP_KILLABLE) would be the right thing to do.

