From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Johannes Weiner <hannes@cmpxchg.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/2] oom, memcg: do not report racy no-eligible OOM
Date: Fri, 11 Jan 2019 21:40:52 +0900
Message-ID: <0d67b389-91e2-18ab-b596-39361b895c89@i-love.sakura.ne.jp>
In-Reply-To: <20190111113354.GD14956@dhcp22.suse.cz>

On 2019/01/11 20:33, Michal Hocko wrote:
> On Fri 11-01-19 19:25:22, Tetsuo Handa wrote:
>> On 2019/01/11 8:59, Tetsuo Handa wrote:
>>> Michal Hocko wrote:
>>>> On Wed 09-01-19 20:34:46, Tetsuo Handa wrote:
>>>>> On 2019/01/09 20:03, Michal Hocko wrote:
>>>>>> Tetsuo,
>>>>>> can you confirm that these two patches are fixing the issue you have
>>>>>> reported please?
>>>>>>
>>>>>
>>>>> My patch fixes the issue better than your "[PATCH 2/2] memcg: do not
>>>>> report racy no-eligible OOM tasks" does.
>>>>
>>>> OK, so we are stuck again. Hooray!
>>>
>>> Andrew, will you pick up "[PATCH 3/2] memcg: Facilitate termination of memcg OOM victims." ?
>>> Since mm-oom-marks-all-killed-tasks-as-oom-victims.patch does not call mark_oom_victim()
>>> when task_will_free_mem() == true, memcg-do-not-report-racy-no-eligible-oom-tasks.patch
>>> does not close the race whereas my patch closes the race better.
>>>
>>
>> I confirmed that mm-oom-marks-all-killed-tasks-as-oom-victims.patch and
>> memcg-do-not-report-racy-no-eligible-oom-tasks.patch are completely failing
>> to fix the issue I am reporting. :-(
> 
> OK, this is really interesting. This means that we are racing
> when marking all the tasks sharing the mm with the clone syscall.

Nothing interesting. This is NOT a race between clone() and the OOM killer. :-(
By the time the OOM killer is invoked, all clone() requests have already completed.

Did you notice that there is no

  "Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n"

line between

  [   71.304703][ T9694] Memory cgroup out of memory: Kill process 9692 (a.out) score 904 or sacrifice child

and

  [   71.309149][   T54] oom_reaper: reaped process 9750 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:185532kB

? Then, you will find that [ T9694] failed to reach the for_each_process(p) loop
inside __oom_kill_process() in the first round of the out_of_memory() call, because
find_lock_task_mm() returned NULL in __oom_kill_process(): Ctrl-C made that victim
complete exit_mm() before find_lock_task_mm() was called. Then, in the second round
of the out_of_memory() call, [ T9750] (for which fatal_signal_pending() == true &&
tsk_is_oom_victim() == false) hit the task_will_free_mem(current) path, called
mark_oom_victim(), and woke up the OOM reaper. Then, before the third round of the
out_of_memory() call started, the OOM reaper set MMF_OOM_SKIP. When the third round
started, [ T9748] could not hit the task_will_free_mem(current) path because
MMF_OOM_SKIP was already set, and oom_badness() ignored any mm which already had
MMF_OOM_SKIP set. As a result, [ T9748] failed to find a candidate. This step
repeats up to as many times as there are threads (213 times in this run).
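
To make the sequence above concrete, here is a toy user-space model of rounds
two and later. It is only a sketch, NOT kernel code: every toy_* name is
hypothetical, the "dying" flag is a simplification of fatal_signal_pending(),
and I assume the OOM reaper always finishes before the next caller arrives.
The first round (the victim that already completed exit_mm()) is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_THREADS 256

/* Toy user-space model, NOT kernel code. Every toy_* name is hypothetical. */
struct toy_mm {
	bool oom_skip;		/* stands in for MMF_OOM_SKIP */
};

struct toy_task {
	struct toy_mm *mm;	/* would be NULL after exit_mm() */
	bool dying;		/* simplification of fatal_signal_pending() */
	bool oom_victim;	/* stands in for tsk_is_oom_victim() */
};

enum toy_result { TOY_VICTIM_MARKED, TOY_KILLED, TOY_NO_ELIGIBLE };

/* Shortcut for a dying caller; refused once the mm has been reaped. */
static bool toy_task_will_free_mem(const struct toy_task *t)
{
	return t->mm != NULL && t->dying && !t->mm->oom_skip;
}

/* Like oom_badness(), ignore any task whose mm is gone or already reaped. */
static bool toy_has_candidate(const struct toy_task *tasks, int n)
{
	for (int i = 0; i < n; i++)
		if (tasks[i].mm != NULL && !tasks[i].mm->oom_skip)
			return true;
	return false;
}

static enum toy_result toy_out_of_memory(struct toy_task *cur,
					 const struct toy_task *tasks, int n)
{
	if (toy_task_will_free_mem(cur)) {
		cur->oom_victim = true;	/* mark_oom_victim() + wake the reaper */
		return TOY_VICTIM_MARKED;
	}
	return toy_has_candidate(tasks, n) ? TOY_KILLED : TOY_NO_ELIGIBLE;
}

/*
 * nthreads dying threads share one mm and call in one after another.
 * The reaper is assumed to finish before the next caller arrives.
 * Returns how many callers find no eligible candidate.
 */
static int toy_run(int nthreads)
{
	struct toy_mm mm = { .oom_skip = false };
	struct toy_task tasks[TOY_MAX_THREADS];
	int no_eligible = 0;

	assert(nthreads >= 0 && nthreads <= TOY_MAX_THREADS);
	for (int i = 0; i < nthreads; i++)
		tasks[i] = (struct toy_task){ .mm = &mm, .dying = true };

	for (int i = 0; i < nthreads; i++) {
		switch (toy_out_of_memory(&tasks[i], tasks, nthreads)) {
		case TOY_VICTIM_MARKED:
			mm.oom_skip = true;	/* reaper sets MMF_OOM_SKIP */
			break;
		case TOY_NO_ELIGIBLE:
			no_eligible++;
			break;
		case TOY_KILLED:
			break;
		}
	}
	return no_eligible;
}
```

In this model only the first caller can take the task_will_free_mem() shortcut.
Every later caller sharing that mm finds no candidate, so toy_run(n) returns
n - 1 for n >= 1.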

> Does fatal_signal_pending handle this better?
> 

Of course. My patch handles it perfectly. Even if we raced with clone() requests,
why do we need to care about threads doing clone() requests? Such threads are not
inside try_charge(), and therefore such threads can't contribute to this issue
by calling out_of_memory() from try_charge().
