From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@suse.cz
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, oleg@redhat.com,
rientjes@google.com, vdavydov@parallels.com, mst@redhat.com
Subject: Re: [PATCH v3 0/8] Change OOM killer to use list of mm_struct.
Date: Mon, 25 Jul 2016 20:07:11 +0900
Message-ID: <201607252007.BGI56224.SHVFLFOOFMJtOQ@I-love.SAKURA.ne.jp>
In-Reply-To: <20160725084803.GE9401@dhcp22.suse.cz>
Michal Hocko wrote:
> > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > > index 788e4f22e0bb..34446f49c2e1 100644
> > > > > --- a/mm/page_alloc.c
> > > > > +++ b/mm/page_alloc.c
> > > > > @@ -3358,7 +3358,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> > > > > alloc_flags |= ALLOC_NO_WATERMARKS;
> > > > > else if (!in_interrupt() &&
> > > > > ((current->flags & PF_MEMALLOC) ||
> > > > > - unlikely(test_thread_flag(TIF_MEMDIE))))
> > > > > + tsk_is_oom_victim(current))
> > > > > alloc_flags |= ALLOC_NO_WATERMARKS;
> > > > > }
> > > > > #ifdef CONFIG_CMA
> > > > >
> > > > > where tsk_is_oom_victim wouldn't require the given task to go via
> > > > > out_of_memory. This would solve some of the problems we have right now
> > > > > when a thread doesn't get access to memory reserves because it never
> > > > > reaches out_of_memory (e.g. recently mentioned mempool_alloc doing
> > > > > __GFP_NORETRY). It would also make the code easier to follow. If we want
> > > > > to implement that we need an easy to implement tsk_is_oom_victim
> > > > > obviously. With the signal_struct::oom_mm this is really trivial thing.
> > > > > I am not sure we can do that with the mm list though because we are
> > > > > losing the task->mm at a certain point in time.
> > > >
> > > > bool tsk_is_oom_victim(void)
> > > > {
> > > > return current->mm && test_bit(MMF_OOM_KILLED, &current->mm->flags) &&
> > > > (fatal_signal_pending(current) || (current->flags & PF_EXITING));
> > > > }
> > >
> > > which doesn't work as soon as exit_mm clears the mm which is exactly
> > > the concern I have raised above.
> >
> > Are you planning to change the scope where the OOM victims can access memory
> > reserves?
>
> Yes. Because we know that there are some post exit_mm allocations and I
> do not want to get back to PF_EXITING and other tricks...
>
> > (1) If you plan to allow the OOM victims to access memory reserves until
> > TASK_DEAD, tsk_is_oom_victim() will be as trivial as
> >
> > bool tsk_is_oom_victim(struct task_struct *task)
> > {
> > return task->signal->oom_mm;
> > }
>
> yes, exactly. That's what I've tried to say above. with the oom_mm this
> is trivial to implement while mm lists will not help us much due to
> their life time. This also means that we know about the oom victim until
> it is unhashed and become invisible to the oom killer.
Then, what is the advantage of allowing only OOM victims to access memory
reserves after they have left exit_mm()? OOM victims might be waiting for
locks (e.g. in exit_task_work()) held by non-OOM victims that are themselves
waiting for memory allocation. If you change the OOM killer to wait until
existing OOM victims are removed from the task list, we might OOM livelock,
mightn't we? I think that what we should do is make the OOM killer wait until
MMF_OOM_REAPED is set rather than wait until existing OOM victims are removed
from the task list.

Since we assume that mm_struct is the primary source of memory consumption,
we don't select threads which have already left exit_mm(). Given that same
assumption, why should we distinguish OOM victims from non-OOM victims after
they have left exit_mm()?
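To make the livelock concern concrete, here is a minimal userspace sketch
(not kernel code) of the two waiting policies. Every type and function name
in it (model_mm, model_task, must_wait_task_list, must_wait_reaped) is a
hypothetical stand-in assumed for illustration; only MMF_OOM_REAPED and
signal_struct::oom_mm come from this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's real types. */
struct model_mm {
	bool oom_reaped;          /* models MMF_OOM_REAPED being set on the mm */
};

struct model_task {
	struct model_mm *mm;      /* NULL once the task has passed exit_mm() */
	struct model_mm *oom_mm;  /* models signal_struct::oom_mm; survives exit_mm() */
	bool on_task_list;        /* false once the task has been unhashed */
};

/* Policy A: keep waiting while any OOM victim is still on the task list. */
static bool must_wait_task_list(const struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].oom_mm && tasks[i].on_task_list)
			return true;
	return false;
}

/* Policy B: keep waiting only while some victim's mm has not been reaped. */
static bool must_wait_reaped(const struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].oom_mm && !tasks[i].oom_mm->oom_reaped)
			return true;
	return false;
}
```

With a victim whose mm has already been reaped but which is blocked on a lock
after exit_mm() (mm == NULL, oom_mm set, on_task_list == true), policy A keeps
the OOM killer waiting while policy B lets it select the next victim.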
> Yes. the exit_mm is not really suitable place to cut the access to
> memory reserves. a) mmput might be not the last one and b) even if it is
> we shouldn't really rely it has cleared the memory. It will in 99% cases
> but we have seen that the code had to play PF_EXITING tricks in the past
> to cover post exit_mm allocations. I think the code flow would get
> simplified greatly if we just do not rely on tsk->mm for anything but
> the oom victim selection.
Even if exit_mm() is not a suitable place to cut off access to memory
reserves, I don't see the advantage of allowing only OOM victims to access
memory reserves after they have left exit_mm().
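For reference, the behavioural difference between the two tsk_is_oom_victim()
variants quoted above can be sketched in userspace as follows. This is only a
model under stated assumptions: sketch_mm, sketch_task, victim_by_mm,
victim_by_oom_mm and sketch_exit_mm are hypothetical names, and the "dying"
field stands in for fatal_signal_pending() || PF_EXITING.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's real types. */
struct sketch_mm {
	bool oom_killed;           /* models MMF_OOM_KILLED */
};

struct sketch_task {
	struct sketch_mm *mm;      /* cleared by exit_mm() */
	struct sketch_mm *oom_mm;  /* models signal->oom_mm; stays set afterwards */
	bool dying;                /* models fatal_signal_pending() || PF_EXITING */
};

/* mm-based check: stops matching as soon as task->mm is cleared. */
static bool victim_by_mm(const struct sketch_task *t)
{
	return t->mm && t->mm->oom_killed && t->dying;
}

/* oom_mm-based check: keeps matching after exit_mm(). */
static bool victim_by_oom_mm(const struct sketch_task *t)
{
	return t->oom_mm != NULL;
}

/* Models only the part of exit_mm() that matters here: dropping task->mm. */
static void sketch_exit_mm(struct sketch_task *t)
{
	t->mm = NULL;
}
```

After sketch_exit_mm(), victim_by_mm() returns false while victim_by_oom_mm()
still returns true, which is exactly the window of post-exit_mm() allocations
being debated.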