linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@suse.cz
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, oleg@redhat.com,
	rientjes@google.com, vdavydov@parallels.com, mst@redhat.com
Subject: Re: [PATCH 3/6] mm,oom: Use list of mm_struct used by OOM victims.
Date: Tue, 12 Jul 2016 15:00:41 +0900	[thread overview]
Message-ID: <201607121500.AGE04699.FFQOFHVSOtOLMJ@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20160711125051.GF1811@dhcp22.suse.cz>

Michal Hocko wrote:
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 7926993..8e469e0 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -722,6 +722,10 @@ static inline void __mmput(struct mm_struct *mm)
> >  	}
> >  	if (mm->binfmt)
> >  		module_put(mm->binfmt->module);
> > +#ifndef CONFIG_MMU
> > +	if (mm->oom_mm.victim)
> > +		exit_oom_mm(mm);
> > +#endif
> 
> This ifdef is not really needed. There is no reason we should wait for
> the oom_reaper to unlink the mm.

Oleg wanted to avoid adding OOM related hooks if possible
( http://lkml.kernel.org/r/20160705205231.GA25340@redhat.com ),
but I thought that calling exit_oom_mm() from here is better for CONFIG_MMU=n case
( http://lkml.kernel.org/r/201607062043.FEC86485.JFFVLtFOQOSHMO@I-love.SAKURA.ne.jp ).

I think that not calling exit_oom_mm() from here is better for CONFIG_MMU=y case.
Calling exit_oom_mm() from here will require !list_empty() check after holding
oom_lock at oom_reaper(). Instead, we can do

+#ifdef CONFIG_MMU
+	if (mm->oom_mm.victim)
+		set_bit(MMF_OOM_REAPED, &mm->flags);
+#else
+	if (mm->oom_mm.victim)
+		exit_oom_mm(mm);
+#endif

here and let oom_has_pending_mm() check for MMF_OOM_REAPED.

> > +bool oom_has_pending_mm(struct mem_cgroup *memcg, const nodemask_t *nodemask)
> > +{
> > +	struct mm_struct *mm;
> > +
> > +	list_for_each_entry(mm, &oom_mm_list, oom_mm.list)
> > +		if (!oom_unkillable_task(mm->oom_mm.victim, memcg, nodemask))
> > +			return true;
> 
> The condition is quite hard to read. Moreover 2 of 4 conditions are
> never true. Wouldn't it be better to do something like the following?
> 

No problem.

> > @@ -653,6 +657,9 @@ subsys_initcall(oom_init)
> >   */
> >  void mark_oom_victim(struct task_struct *tsk)
> >  {
> > +	struct mm_struct *mm = tsk->mm;
> > +	struct task_struct *old_tsk = mm->oom_mm.victim;
> > +
> >  	WARN_ON(oom_killer_disabled);
> >  	/* OOM killer might race with memcg OOM */
> >  	if (test_and_set_tsk_thread_flag(tsk, TIF_MEMDIE))
> > @@ -666,6 +673,18 @@ void mark_oom_victim(struct task_struct *tsk)
> >  	 */
> >  	__thaw_task(tsk);
> >  	atomic_inc(&oom_victims);
> > +	/*
> > +	 * Since mark_oom_victim() is called from multiple threads,
> > +	 * connect this mm to oom_mm_list only if not yet connected.
> > +	 */
> > +	get_task_struct(tsk);
> > +	mm->oom_mm.victim = tsk;
> > +	if (!old_tsk) {
> > +		atomic_inc(&mm->mm_count);
> > +		list_add_tail(&mm->oom_mm.list, &oom_mm_list);
> > +	} else {
> > +		put_task_struct(old_tsk);
> > +	}
> 
> Isn't this overcomplicated? Why do we need to replace the old task by
> the current one?

I'm not sure whether task_in_oom_domain(mm->oom_mm.victim, memcg, nodemask) in
oom_has_pending_mm() will work as expected, especially when all threads in
one thread group (which mm->oom_mm.victim belongs to) reached TASK_DEAD state.
( http://lkml.kernel.org/r/201607042150.CIB00512.FSOtMHLOOVFFQJ@I-love.SAKURA.ne.jp )

I guess that task_in_oom_domain() will return false, and that mm will be selected
by another thread group (which mm->oom_mm.victim does not belongs to). Therefore,
I think we need to replace the old task with the new task (at least when
task_in_oom_domain() returned false) at mark_oom_victim().

If task_in_oom_domain(mm->oom_mm.victim, memcg, nodemask) in oom_has_pending_mm()
does not work as expected even if we replace the old task with the new task at
mark_oom_victim(), I think we after all need to use something like

struct task_struct {
(...snipped...)
+	struct mm_struct *oom_mm; /* current->mm as of getting TIF_MEMDIE */
+	struct task_struct *oom_mm_list; /* Connected to oom_mm_list global list. */
-#ifdef CONFIG_MMU
-	struct task_struct *oom_reaper_list;
-#endif
(...snipped...)
};

or your signal_struct->oom_mm approach.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-07-12  6:00 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-07 15:58 [PATCH v2 0/6] Change OOM killer to use list of mm_struct Tetsuo Handa
2016-07-07 16:00 ` [PATCH 1/6] mm,oom_reaper: Reduce find_lock_task_mm() usage Tetsuo Handa
2016-07-11 12:02   ` Michal Hocko
2016-07-07 16:01 ` [PATCH 2/6] mm,oom_reaper: Do not attempt to reap a task twice Tetsuo Handa
2016-07-11 12:15   ` Michal Hocko
2016-07-07 16:03 ` [PATCH 3/6] mm,oom: Use list of mm_struct used by OOM victims Tetsuo Handa
2016-07-11 12:50   ` Michal Hocko
2016-07-12  6:00     ` Tetsuo Handa [this message]
2016-07-12  7:09       ` Michal Hocko
2016-07-07 16:04 ` [PATCH 4/6] mm,oom_reaper: Make OOM reaper use list of mm_struct Tetsuo Handa
2016-07-11 13:16   ` Michal Hocko
2016-07-12 13:38     ` Tetsuo Handa
2016-07-12 13:46       ` Michal Hocko
2016-07-12 13:55         ` Michal Hocko
2016-07-12 14:01           ` Tetsuo Handa
2016-07-07 16:06 ` [PATCH 5/6] mm,oom: Remove OOM_SCAN_ABORT case and signal_struct->oom_victims Tetsuo Handa
2016-07-11 13:19   ` Michal Hocko
2016-07-07 16:07 ` [PATCH 6/6] mm,oom: Stop clearing TIF_MEMDIE on remote thread Tetsuo Handa
2016-07-11 13:22   ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201607121500.AGE04699.FFQOFHVSOtOLMJ@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=mst@redhat.com \
    --cc=oleg@redhat.com \
    --cc=rientjes@google.com \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).