From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the #PF
Date: Sat, 10 Jun 2017 10:49:01 +0200
Message-ID: <20170610084901.GB12347@dhcp22.suse.cz>
In-Reply-To: <20170609144642.GH21764@dhcp22.suse.cz>

On Fri 09-06-17 16:46:42, Michal Hocko wrote:
> On Fri 09-06-17 10:08:53, Johannes Weiner wrote:
> > On Thu, Jun 08, 2017 at 04:36:07PM +0200, Michal Hocko wrote:
> > > Does anybody see any problem with the patch, or can I send it for
> > > inclusion?
> > > 
> > > On Fri 19-05-17 13:26:04, Michal Hocko wrote:
> > > > From: Michal Hocko <mhocko@suse.com>
> > > > 
> > > > Any allocation failure during the #PF path will return with VM_FAULT_OOM
> > > > which in turn results in pagefault_out_of_memory. This can happen for
> > > > 2 different reasons. a) Memcg is out of memory and we rely on
> > > > mem_cgroup_oom_synchronize to perform the memcg OOM handling or b)
> > > > normal allocation fails.
> > > > 
> > > > The latter is quite problematic because allocation paths already trigger
> > > > out_of_memory and the page allocator tries really hard to not fail
> > > > allocations. Anyway, if the OOM killer has been already invoked there
> > > > is no reason to invoke it again from the #PF path. Especially when the
> > > > OOM condition might be gone by that time and we have no way to find out
> > > > other than allocate.
> > > > 
> > > > Moreover if the allocation failed and the OOM killer hasn't been
> > > > invoked then we are unlikely to do the right thing from the #PF context
> > > > because we have already lost the allocation context and restrictions and
> > > > therefore might oom kill a task from a different NUMA domain.
> > > > 
> > > > An allocation might fail also when the current task is the oom victim
> > > > and there are no memory reserves left and we should simply bail out
> > > > from the #PF rather than invoking out_of_memory.
> > > > 
> > > > This all suggests that there is no legitimate reason to trigger
> > > > out_of_memory from pagefault_out_of_memory so drop it. Just to be sure
> > > > that no #PF path returns with VM_FAULT_OOM without an allocation, print a
> > > > warning that this is happening before we restart the #PF.
> > > > 
> > > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > 
> > I don't agree with this patch.
> > 
> > The warning you replace the oom call with indicates that we never
> > expect a VM_FAULT_OOM to leak to this point. But should there be a
> > leak, it's infinitely better to tickle the OOM killer again - even if
> > that call is then fairly inaccurate and without alloc context - than
> > infinite re-invocations of the #PF when the VM_FAULT_OOM comes from a
> > context - existing or future - that isn't allowed to trigger the OOM.
> 
> I disagree. Retrying the page fault while dropping all the locks
> on the way and still being in the killable context should be preferable
> to a system wide disruptive action like the OOM killer.

Just to clarify a bit: the OOM killer should be invoked, whenever
appropriate, from the allocation context. If we decide to fail the
allocation in the PF path, then we can safely roll back and retry the
whole PF. This has the advantage that any locks held while doing the
allocation will be released, and that alone can help to make further
progress. Moreover, we can relax the retry-forever _inside_ the allocator
semantic for the PF path and fail allocations when we cannot make
further progress even after we hit the OOM condition, or when we stall
for too long. This would have the nice side effect that the PF would be
a killable context from the page allocator POV. From the user space POV
there is no difference between retrying the PF and looping inside the
allocator, right?
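
To illustrate the control flow I have in mind, here is a rough user-space
sketch. It is a toy model under stated assumptions, not kernel code: every
name in it (toy_state, toy_handle_fault, toy_others_run, toy_page_fault) is
made up for illustration, and max_retries exists only so that the example
terminates. The fault path fails the allocation instead of looping in the
allocator, drops its lock on the way out, and the outer loop retries the
whole fault unless the task has been killed in the meantime. No OOM killer
is invoked from the fault path.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not kernel symbols. */
enum toy_fault { TOY_FAULT_DONE, TOY_FAULT_OOM };

struct toy_state {
	int free_pages;     /* pages the toy allocator can hand out */
	int blocked_pages;  /* pages others can free once our lock is dropped */
	bool lock_held;     /* models a lock taken in the fault path */
	bool fatal_signal;  /* the faulting task has been killed */
	int retries;        /* how many times the fault was restarted */
};

/* One pass through the fault path: allocate or fail, never OOM kill. */
static enum toy_fault toy_handle_fault(struct toy_state *s)
{
	s->lock_held = true;
	if (s->free_pages == 0) {
		/* Fail rather than loop in the allocator; unwind and unlock. */
		s->lock_held = false;
		return TOY_FAULT_OOM;
	}
	s->free_pages--;
	s->lock_held = false;
	return TOY_FAULT_DONE;
}

/* Models other tasks making progress once the lock has been released. */
static void toy_others_run(struct toy_state *s)
{
	if (!s->lock_held && s->blocked_pages > 0) {
		s->free_pages += s->blocked_pages;
		s->blocked_pages = 0;
	}
}

/*
 * Outer fault loop. Returns 0 on success, -1 if the task was killed
 * while retrying, -2 if the toy retry bound was exhausted.
 */
static int toy_page_fault(struct toy_state *s, int max_retries)
{
	assert(max_retries >= 0);
	for (s->retries = 0; s->retries <= max_retries; s->retries++) {
		if (toy_handle_fault(s) == TOY_FAULT_DONE)
			return 0;
		if (s->fatal_signal)
			return -1;	/* killable between retries */
		toy_others_run(s);
	}
	return -2;
}
```

With free_pages = 0 and blocked_pages = 2, the first pass fails, the lock
is dropped, the other tasks free their pages, and the first retry succeeds.
With fatal_signal set, the loop bails out after the first failed pass
instead of retrying.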

That being said, a late just-in-case OOM killer invocation is not only
suboptimal, it also prevents us from making further changes in that area.

Or am I oversimplifying or missing something here?
-- 
Michal Hocko
SUSE Labs


Thread overview: 23+ messages
2017-05-19 11:26 [PATCH 0/2] fix premature OOM killer Michal Hocko
2017-05-19 11:26 ` [PATCH 1/2] mm, oom: make sure that the oom victim uses memory reserves Michal Hocko
2017-05-19 12:12   ` Tetsuo Handa
2017-05-19 12:46     ` Michal Hocko
2017-05-22 15:06       ` Roman Gushchin
2017-05-19 11:26 ` [RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the #PF Michal Hocko
2017-05-19 13:02   ` Tetsuo Handa
2017-05-19 13:22     ` Michal Hocko
2017-05-19 15:22       ` Tetsuo Handa
2017-05-19 15:50         ` Michal Hocko
2017-05-19 23:43           ` Tetsuo Handa
2017-05-22  9:31             ` Michal Hocko
2017-06-08 14:36   ` Michal Hocko
2017-06-09 14:08     ` Johannes Weiner
2017-06-09 14:46       ` Michal Hocko
2017-06-10  8:49         ` Michal Hocko [this message]
2017-06-10 11:57           ` [RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the#PF Tetsuo Handa
2017-06-12  7:39             ` Michal Hocko
2017-06-12 10:48               ` [RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the #PF Tetsuo Handa
2017-06-12 11:06                 ` Michal Hocko
2017-06-23 12:50           ` Michal Hocko
2017-05-19 11:37 ` [PATCH 0/2] fix premature OOM killer Tetsuo Handa
2017-05-19 12:47   ` Michal Hocko
