linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@suse.cz
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, oleg@redhat.com,
	rientjes@google.com, vdavydov@parallels.com, mst@redhat.com
Subject: Re: [PATCH v3 0/8] Change OOM killer to use list of mm_struct.
Date: Mon, 25 Jul 2016 20:07:11 +0900	[thread overview]
Message-ID: <201607252007.BGI56224.SHVFLFOOFMJtOQ@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20160725084803.GE9401@dhcp22.suse.cz>

Michal Hocko wrote:
> > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > > index 788e4f22e0bb..34446f49c2e1 100644
> > > > > --- a/mm/page_alloc.c
> > > > > +++ b/mm/page_alloc.c
> > > > > @@ -3358,7 +3358,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> > > > >  			alloc_flags |= ALLOC_NO_WATERMARKS;
> > > > >  		else if (!in_interrupt() &&
> > > > >  				((current->flags & PF_MEMALLOC) ||
> > > > > -				 unlikely(test_thread_flag(TIF_MEMDIE))))
> > > > > +				 tsk_is_oom_victim(current))
> > > > >  			alloc_flags |= ALLOC_NO_WATERMARKS;
> > > > >  	}
> > > > >  #ifdef CONFIG_CMA
> > > > > 
> > > > > where tsk_is_oom_victim wouldn't require the given task to go via
> > > > > out_of_memory. This would solve some of the problems we have right now
> > > > > when a thread doesn't get access to memory reserves because it never
> > > > > reaches out_of_memory (e.g. recently mentioned mempool_alloc doing
> > > > > __GFP_NORETRY). It would also make the code easier to follow. If we want
> > > > > to implement that we need an easy to implement tsk_is_oom_victim
> > > > > obviously. With the signal_struct::oom_mm this is really trivial thing.
> > > > > I am not sure we can do that with the mm list though because we are
> > > > > loosing the task->mm at certain point in time.
> > > > 
> > > > bool tsk_is_oom_victim(void)
> > > > {
> > > > 	return current->mm && test_bit(MMF_OOM_KILLED, &current->mm->flags) &&
> > > > 		 (fatal_signal_pending(current) || (current->flags & PF_EXITING));
> > > > }
> > > 
> > > which doesn't work as soon as exit_mm clears the mm which is exactly
> > > the concern I have raised above.
> > 
> > Are you planning to change the scope where the OOM victims can access memory
> > reserves?
> 
> Yes. Because we know that there are some post exit_mm allocations and I
> do not want to get back to PF_EXITING and other tricks...
> 
> > (1) If you plan to allow the OOM victims to access memory reserves until
> >     TASK_DEAD, tsk_is_oom_victim() will be as trivial as
> > 
> > bool tsk_is_oom_victim(struct task_struct *task)
> > {
> > 	return task->signal->oom_mm;
> > }
> 
> yes, exactly. That's what I've tried to say above. with the oom_mm this
> is trivial to implement while mm lists will not help us much due to
> their life time. This also means that we know about the oom victim until
> it is unhashed and become invisible to the oom killer.

Then, what are advantages with allowing only OOM victims access to memory
reserves after they left exit_mm()? OOM victims might be waiting for locks
at e.g. exit_task_work() held by non OOM victims waiting for memory
allocation. If you change the OOM killer wait until existing OOM victims
are removed from task_list, we might OOM livelock, don't we? I think that
what we should do is make the OOM killer wait until MMF_OOM_REAPED is set
rather than wait until existing OOM victims are removed from task_list.

Since we assume that mm_struct is the primary source of memory consumption,
we don't select threads which already left exit_mm(). Since we assume that
mm_struct is the primary source of memory consumption, why should we
distinguish OOM victims and non OOM victims after they left exit_mm()?

> Yes. the exit_mm is not really suitable place to cut the access to
> memory reserves. a) mmput might be not the last one and b) even if it is
> we shouldn't really rely it has cleared the memory. It will in 99% cases
> but we have seen that the code had to play PF_EXITING tricks in the past
> to cover post exit_mm allocations. I think the code flow would get
> simplified greatly if we just do not rely on tsk->mm for anything but
> the oom victim selection.

Even if exit_mm() is not suitable place to cut the access to memory reserves,
I don't see advantages with allowing only OOM victims access to memory
reserves after they left exit_mm().

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-07-25 11:07 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-12 13:29 [PATCH v3 0/8] Change OOM killer to use list of mm_struct Tetsuo Handa
2016-07-12 13:29 ` [PATCH 1/8] mm,oom_reaper: Reduce find_lock_task_mm() usage Tetsuo Handa
2016-07-12 13:29 ` [PATCH 2/8] mm,oom_reaper: Do not attempt to reap a task twice Tetsuo Handa
2016-07-12 14:19   ` Michal Hocko
2016-07-12 13:29 ` [PATCH 3/8] mm,oom: Use list of mm_struct used by OOM victims Tetsuo Handa
2016-07-12 14:28   ` Michal Hocko
2016-07-12 13:29 ` [PATCH 4/8] mm,oom: Close oom_has_pending_mm race Tetsuo Handa
2016-07-12 14:36   ` Michal Hocko
2016-07-12 13:29 ` [PATCH 5/8] mm,oom_reaper: Make OOM reaper use list of mm_struct Tetsuo Handa
2016-07-12 14:51   ` Michal Hocko
2016-07-12 15:42     ` Tetsuo Handa
2016-07-13  7:48       ` Michal Hocko
2016-07-12 13:29 ` [PATCH 6/8] mm,oom: Remove OOM_SCAN_ABORT case and signal_struct->oom_victims Tetsuo Handa
2016-07-12 13:29 ` [PATCH 7/8] mm,oom: Stop clearing TIF_MEMDIE on remote thread Tetsuo Handa
2016-07-12 14:53   ` Michal Hocko
2016-07-12 15:45     ` Tetsuo Handa
2016-07-13  8:13       ` Michal Hocko
2016-07-12 13:29 ` [PATCH 8/8] oom_reaper: Revert "oom_reaper: close race with exiting task" Tetsuo Handa
2016-07-12 14:56   ` Michal Hocko
2016-07-21 11:21 ` [PATCH v3 0/8] Change OOM killer to use list of mm_struct Michal Hocko
2016-07-22 11:09   ` Tetsuo Handa
2016-07-22 12:05     ` Michal Hocko
2016-07-23  2:59       ` Tetsuo Handa
2016-07-25  8:48         ` Michal Hocko
2016-07-25 11:07           ` Tetsuo Handa [this message]
2016-07-25 11:21             ` Michal Hocko
2016-07-25 11:47               ` Tetsuo Handa
2016-07-25 11:59                 ` Michal Hocko
2016-07-25 14:02                   ` Tetsuo Handa
2016-07-25 14:17                     ` Michal Hocko
2016-07-25 21:40                       ` Tetsuo Handa
2016-07-26  7:52                         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201607252007.BGI56224.SHVFLFOOFMJtOQ@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=mst@redhat.com \
    --cc=oleg@redhat.com \
    --cc=rientjes@google.com \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).