From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	David Rientjes <rientjes@google.com>,
	linux-mm@kvack.org, Yong-Taek Lee <ytk.lee@samsung.com>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] mm, oom: Tolerate processes sharing mm with different view of oom_score_adj.
Date: Wed, 16 Jan 2019 22:32:50 +0900
Message-ID: <6118fa8a-7344-b4b2-36ce-d77d495fba69@i-love.sakura.ne.jp>
In-Reply-To: <20190116121915.GJ24149@dhcp22.suse.cz>

On 2019/01/16 21:19, Michal Hocko wrote:
> On Wed 16-01-19 20:30:25, Tetsuo Handa wrote:
>> On 2019/01/16 20:09, Michal Hocko wrote:
>>> On Wed 16-01-19 19:55:21, Tetsuo Handa wrote:
>>>> This patch reverts both commit 44a70adec910d692 ("mm, oom_adj: make sure
>>>> processes sharing mm have same view of oom_score_adj") and commit
>>>> 97fd49c2355ffded ("mm, oom: kill all tasks sharing the mm") in order to
>>>> close a race and reduce the latency at __set_oom_adj(), and reduces the
>>>> warning at __oom_kill_process() in order to minimize the latency.
>>>>
>>>> Commit 36324a990cf578b5 ("oom: clear TIF_MEMDIE after oom_reaper managed
>>>> to unmap the address space") introduced the worst case mentioned in
>>>> 44a70adec910d692. But since the OOM killer skips mm with MMF_OOM_SKIP set,
>>>> only administrators can trigger the worst case.
>>>>
>>>> Since 44a70adec910d692 did not take latency into account, we can hold RCU
>>>> for minutes and trigger RCU stall warnings by calling printk() on many
>>>> thousands of thread groups. Even without calling printk(), the latency is
>>>> mentioned by Yong-Taek Lee [1]. And I noticed that 44a70adec910d692 is
>>>> racy, and trying to fix the race will require a global lock which is too
>>>> costly for rare events.
>>>>
>>>> If the worst case in 44a70adec910d692 happens, it is an administrator's
>>>> request. Therefore, tolerate the worst case and speed up __set_oom_adj().
>>>
>>> I really do not think we care about latency. I consider the overall API
>>> sanity much more important. Besides that, the original report you are
>>> referring to was never explained/shown to represent a real-world use case.
>>> oom_score_adj is not really an interface to be tweaked in hot paths.
>>
>> I do care about the latency. Holding RCU for more than 2 minutes is insane.
> 
> Creating 8k threads could be considered insane as well. But more
> seriously, I absolutely do not insist on holding a single RCU section
> for the whole operation. But that doesn't really mean that we want to
> revert these changes. for_each_process is called from far more places
> than this path.

Unlike check_hung_uninterruptible_tasks(), where failing to resume after
breaking the RCU section is tolerable, failing to resume after breaking the
RCU section in __set_oom_adj() is not tolerable; it leaves open the
possibility that processes sharing the mm end up with different
oom_score_adj values. Unless it is unavoidable (e.g. SysRq-t), I think
that calling printk() on each thread from within an RCU section is a poor
choice.
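
The concern about a walk that cannot resume can be sketched with a small
userspace model. This is not kernel code: struct model_task, the hook type,
and both function names are hypothetical, and the "lock break" is only
modelled as a point where the task under the cursor may exit. It shows that
an aborted walk leaves the remaining entries with the old oom_score_adj.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's struct task_struct. */
struct model_task {
	bool alive;		/* models whether the task still exists */
	int oom_score_adj;	/* value the walk is trying to propagate */
};

/*
 * Hypothetical hook invoked at each modelled lock break, i.e. while no
 * protection is held. It may mark tasks dead to model concurrent exits.
 */
typedef void (*break_hook_t)(struct model_task *tasks, size_t n, size_t cursor);

/*
 * Walk tasks[0..n) and set oom_score_adj, "dropping the lock" every
 * `batch` entries. Returns the number of entries updated; a value
 * smaller than n means the walk could not resume after a break.
 */
size_t propagate_with_lock_break(struct model_task *tasks, size_t n,
				 size_t batch, int new_adj, break_hook_t hook)
{
	size_t i;

	assert(batch > 0);
	for (i = 0; i < n; i++) {
		if (i > 0 && i % batch == 0) {
			/* Lock break: the cursor may die while unprotected. */
			if (hook)
				hook(tasks, n, i);
			if (!tasks[i].alive)
				return i;	/* cannot resume from a dead cursor */
		}
		tasks[i].oom_score_adj = new_adj;
	}
	return n;
}

/* Example hook: the task under the cursor exits during the break. */
void kill_cursor(struct model_task *tasks, size_t n, size_t cursor)
{
	(void)n;
	tasks[cursor].alive = false;
}
```

For a hung-task style scan, returning early only means some entries were not
inspected this round. For a propagation like the one modelled here, returning
early means entries from the cursor onward keep the previous value.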

What if thousands of threads called __set_oom_adj() concurrently, when
each call involves printk() on thousands of threads, which can take more
than 2 minutes? How long will it take to complete?
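
As a back-of-the-envelope estimate only: assuming the calls end up fully
serialized (an assumption made for this sketch, not a measurement), the total
is simply callers times per-call time. The helper name below is hypothetical,
and the inputs in the note are illustrative, taken from the "8k threads" and
"2 minutes" figures quoted in this thread.

```c
#include <stdint.h>

/*
 * Total seconds if `callers` invocations run strictly one after another
 * and each takes `seconds_per_call`. Full serialization is an assumption
 * of this estimate; no overflow checking is done.
 */
uint64_t serialized_total_seconds(uint64_t callers, uint64_t seconds_per_call)
{
	return callers * seconds_per_call;
}
```

With 8000 callers at 120 seconds each, this gives 960000 seconds, which is
a little over 11 days.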


Thread overview: 14+ messages
2019-01-16 10:55 Tetsuo Handa
2019-01-16 11:09 ` Michal Hocko
2019-01-16 11:30   ` Tetsuo Handa
2019-01-16 12:19     ` Michal Hocko
2019-01-16 13:32       ` Tetsuo Handa [this message]
2019-01-16 13:41         ` Michal Hocko
2019-01-17 10:40           ` Tetsuo Handa
2019-01-17 15:51           ` Michal Hocko
2019-01-30 22:49             ` [PATCH v2] " Tetsuo Handa
2019-01-31  7:11               ` Michal Hocko
2019-01-31 20:59                 ` Tetsuo Handa
2019-02-01  9:14                   ` Michal Hocko
2019-02-02 11:06                     ` Tetsuo Handa
2019-02-11 15:07                       ` Michal Hocko
