All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch -mm] mm, oom: remove oom_lock from exit_mmap
Date: Mon, 16 Jul 2018 16:04:26 +0900	[thread overview]
Message-ID: <916d7e1d-66ea-00d9-c943-ef3d2e082584@i-love.sakura.ne.jp> (raw)
In-Reply-To: <20180716061317.GA17280@dhcp22.suse.cz>

On 2018/07/16 15:13, Michal Hocko wrote:
> On Sat 14-07-18 06:18:58, Tetsuo Handa wrote:
>>> @@ -3073,9 +3073,7 @@ void exit_mmap(struct mm_struct *mm)
>>>  		 * which clears VM_LOCKED, otherwise the oom reaper cannot
>>>  		 * reliably test it.
>>>  		 */
>>> -		mutex_lock(&oom_lock);
>>>  		__oom_reap_task_mm(mm);
>>> -		mutex_unlock(&oom_lock);
>>>  
>>>  		set_bit(MMF_OOM_SKIP, &mm->flags);
>>
>> David and Michal are using different version as a baseline here.
>> David is making changes using timeout based back off (in linux-next.git)
>> which is inappropriately trying to use MMF_UNSTABLE for two purposes.
>>
>> Michal is making changes using current code (in linux.git) which does not
>> address David's concern.
> 
> Yes I have based it on top of Linus tree because the point of this patch
> is to get rid of the locking which is no longer needed. I do not see
> what concern are you talking about.

I'm saying that applying your patch does not work on linux-next.git
because David's patch already did s/MMF_OOM_SKIP/MMF_UNSTABLE/ .

>>
>> My version ( https://marc.info/?l=linux-mm&m=153119509215026 ) is
>> making changes using current code which also provides oom-badness
>> based back off in order to address David's concern.
>>
>>>  		down_write(&mm->mmap_sem);
>>
>> Anyway, I suggest doing
>>
>>   mutex_lock(&oom_lock);
>>   set_bit(MMF_OOM_SKIP, &mm->flags);
>>   mutex_unlock(&oom_lock);
> 
> Why do we need it?
> 
>> like I mentioned at
>> http://lkml.kernel.org/r/201807130620.w6D6KiAJ093010@www262.sakura.ne.jp
>> even if we make changes on top of linux-next's timeout based back off.
> 
> says
> : (3) Prevent from selecting new OOM victim when there is an !MMF_OOM_SKIP mm
> :     which current thread should wait for.
> [...]
> : Regarding (A), we can reduce the range oom_lock serializes from
> : "__oom_reap_task_mm()" to "setting MMF_OOM_SKIP", for oom_lock is useful for (3).
> 
> But why there is a lock needed for this? This doesn't make much sense to
> me. If we do not have MMF_OOM_SKIP set we still should have mm_is_oom_victim
> so no new task should be selected. If we race with the oom reaper than
> ok, we would just not select a new victim and retry later.
> 

How mm_is_oom_victim() helps? mm_is_oom_victim() is used by exit_mmap() whether
current thread should call __oom_reap_task_mm().

I'm talking about below sequence (i.e. after returning from __oom_reap_task_mm()).

  CPU 0                                   CPU 1
  
  mutex_trylock(&oom_lock) in __alloc_pages_may_oom() succeeds.
  get_page_from_freelist() fails.
  Enters out_of_memory().

                                          __oom_reap_task_mm() reclaims some memory.
                                          Sets MMF_OOM_SKIP.

  select_bad_process() selects new victim because MMF_OOM_SKIP is already set.
  Kills a new OOM victim without retrying last second allocation attempt.
  Leaves out_of_memory().
  mutex_unlock(&oom_lock) in __alloc_pages_may_oom() is called.

If setting MMF_OOM_SKIP is guarded by oom_lock, we can enforce
last second allocation attempt like below.

  CPU 0                                   CPU 1
  
  mutex_trylock(&oom_lock) in __alloc_pages_may_oom() succeeds.
  get_page_from_freelist() fails.
  Enters out_of_memory().

                                          __oom_reap_task_mm() reclaims some memory.
                                          mutex_lock(&oom_lock);

  select_bad_process() does not select new victim because MMF_OOM_SKIP is not yet set.
  Leaves out_of_memory().
  mutex_unlock(&oom_lock) in __alloc_pages_may_oom() is called.

                                          Sets MMF_OOM_SKIP.
                                          mutex_unlock(&oom_lock);

  get_page_from_freelist() likely succeeds before reaching __alloc_pages_may_oom() again.
  Saved one OOM victim from being needlessly killed.

That is, guarding setting MMF_OOM_SKIP works as if synchronize_rcu(); it waits for anybody
who already acquired (or started waiting for) oom_lock to release oom_lock, in order to
prevent select_bad_process() from needlessly selecting new OOM victim.


  reply	other threads:[~2018-07-16  7:04 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-12 21:34 [patch -mm] mm, oom: remove oom_lock from exit_mmap David Rientjes
2018-07-13  6:20 ` Tetsuo Handa
2018-07-13 14:26 ` Michal Hocko
2018-07-13 21:18   ` Tetsuo Handa
2018-07-16  6:13     ` Michal Hocko
2018-07-16  7:04       ` Tetsuo Handa [this message]
2018-07-16  7:44         ` Michal Hocko
2018-07-16 10:38           ` Tetsuo Handa
2018-07-16 11:15             ` Michal Hocko
2018-07-17  4:22     ` David Rientjes
2018-07-17  4:14   ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=916d7e1d-66ea-00d9-c943-ef3d2e082584@i-love.sakura.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.