linux-kernel.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: David Rientjes <rientjes@google.com>
Cc: linux-mm@kvack.org,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC 2/3] oom: Do not sacrifice already OOM killed children
Date: Wed, 13 Jan 2016 10:36:02 +0100
Message-ID: <20160113093601.GB28942@dhcp22.suse.cz>
In-Reply-To: <alpine.DEB.2.10.1601121644250.28831@chino.kir.corp.google.com>

On Tue 12-01-16 16:45:35, David Rientjes wrote:
> On Tue, 12 Jan 2016, Michal Hocko wrote:
> 
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 2b9dc5129a89..8bca0b1e97f7 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -671,6 +671,63 @@ static bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
> >  }
> >  
> >  #define K(x) ((x) << (PAGE_SHIFT-10))
> > +
> > +/*
> > + * If any of victim's children has a different mm and is eligible for kill,
> > + * the one with the highest oom_badness() score is sacrificed for its
> > + * parent.  This attempts to lose the minimal amount of work done while
> > + * still freeing memory.
> > + */
> > +static struct task_struct *
> > +try_to_sacrifice_child(struct oom_control *oc, struct task_struct *victim,
> > +		       unsigned long totalpages, struct mem_cgroup *memcg)
> > +{
> > +	struct task_struct *child_victim = NULL;
> > +	unsigned int victim_points = 0;
> > +	struct task_struct *t;
> > +
> > +	read_lock(&tasklist_lock);
> > +	for_each_thread(victim, t) {
> > +		struct task_struct *child;
> > +
> > +		list_for_each_entry(child, &t->children, sibling) {
> > +			unsigned int child_points;
> > +
> > +			/*
> > +			 * Skip over already OOM killed children as this hasn't
> > +			 * helped to resolve the situation obviously.
> > +			 */
> > +			if (test_tsk_thread_flag(child, TIF_MEMDIE) ||
> > +					fatal_signal_pending(child) ||
> > +					task_will_free_mem(child))
> > +				continue;
> > +
> 
> What guarantees that child had time to exit after it has been oom killed 
> (better yet, what guarantees that it has even scheduled after it has been 
> oom killed)?  It seems like this would quickly kill many children 
> unnecessarily.

If the child hasn't released any memory after all of the allocator's attempts
to free memory, which take quite some time, then what is the advantage of
waiting even longer and possibly getting stuck? This is a heuristic: we should
have killed the selected victim, but we have chosen to reduce the impact by
selecting a child process instead. If that hasn't led to any
improvement, I believe we should move on rather than loop on a
potentially unresolvable situation _just because_ of said heuristic.
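
To make the intended behavior concrete, here is a rough userspace model of
the selection, not the kernel code itself. struct model_task, already_dying(),
pick_child() and pick_victim() are hypothetical names made up for this sketch;
the three flags only stand in for the TIF_MEMDIE, fatal_signal_pending() and
task_will_free_mem() checks in the hunk above, and the "highest score wins,
zero is never picked" rule is assumed from the comment in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few bits of task state the check reads. */
struct model_task {
	const char *name;
	unsigned int points;	/* models the oom_badness() score */
	bool memdie;		/* models TIF_MEMDIE */
	bool fatal_signal;	/* models fatal_signal_pending() */
	bool will_free_mem;	/* models task_will_free_mem() */
};

/* A child that was already killed, or is exiting, is not worth killing again. */
static bool already_dying(const struct model_task *t)
{
	return t->memdie || t->fatal_signal || t->will_free_mem;
}

/*
 * Return the highest-scoring child that is not already dying, or NULL
 * if there is none.  A score of 0 is never selected (assumption: 0
 * means "not eligible", as the initial victim_points = 0 suggests).
 */
static const struct model_task *
pick_child(const struct model_task *children, size_t n)
{
	const struct model_task *best = NULL;
	unsigned int best_points = 0;
	size_t i;

	assert(children != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct model_task *c = &children[i];

		if (already_dying(c))
			continue;
		if (c->points > best_points) {
			best = c;
			best_points = c->points;
		}
	}
	return best;
}

/* Sacrifice a child if one qualifies, otherwise move on to the parent. */
static const struct model_task *
pick_victim(const struct model_task *parent,
	    const struct model_task *children, size_t n)
{
	const struct model_task *child = pick_child(children, n);

	return child ? child : parent;
}
```

With one child already marked as killed and a second, smaller child still
alive, pick_victim() returns the second child; once every child is dying it
falls back to the originally selected parent instead of waiting on the
children.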
-- 
Michal Hocko
SUSE Labs
