From: Suren Baghdasaryan <surenb@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>,
	akpm@linux-foundation.org, rientjes@google.com,
	willy@infradead.org, guro@fb.com, riel@surriel.com,
	minchan@kernel.org, kirill@shutemov.name, aarcange@redhat.com,
	christian@brauner.io, hch@infradead.org, oleg@redhat.com,
	david@redhat.com, jannh@google.com, shakeelb@google.com,
	luto@kernel.org, christian.brauner@ubuntu.com,
	fweimer@redhat.com, jengelh@inai.de, timmurray@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com
Subject: Re: [PATCH 4/3] mm: drop MMF_OOM_SKIP from exit_mmap
Date: Thu, 16 Dec 2021 09:23:12 -0800
Message-ID: <CAJuCfpGMTcyVikNrQR7Y1E54JAjgs5zFBry=DTDidJmD1YWpUg@mail.gmail.com>
In-Reply-To: <Ybsn2hJZXRofwuv+@cmpxchg.org>

On Thu, Dec 16, 2021 at 3:49 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Wed, Dec 15, 2021 at 06:26:11PM -0800, Suren Baghdasaryan wrote:
> > On Thu, Dec 9, 2021 at 9:06 AM Suren Baghdasaryan <surenb@google.com> wrote:
> > >
> > > On Thu, Dec 9, 2021 at 8:47 AM Michal Hocko <mhocko@suse.com> wrote:
> > > >
> > > > On Thu 09-12-21 08:24:04, Suren Baghdasaryan wrote:
> > > > > On Thu, Dec 9, 2021 at 1:12 AM Michal Hocko <mhocko@suse.com> wrote:
> > > > > >
> > > > > > Do we want this on top?
> > > > >
> > > > > As we discussed in this thread
> > > > > https://lore.kernel.org/all/YY4snVzZZZYhbigV@dhcp22.suse.cz,
> > > > > __oom_reap_task_mm in exit_mmap allows oom-reaper/process_mrelease to
> > > > > unmap pages in parallel with exit_mmap without blocking each other.
> > > > > Removal of __oom_reap_task_mm from exit_mmap prevents this parallelism
> > > > > and has a negative impact on performance. So the conclusion of that
> > > > > thread I thought was to keep that part. My understanding is that we
> > > > > also wanted to remove MMF_OOM_SKIP as a follow-up patch but
> > > > > __oom_reap_task_mm would stay.
> > > >
> > > > OK, then we were talking past each other, I am afraid. I really wanted
> > > > to get rid of this oom specific stuff from exit_mmap. It was there out
> > > > of necessity. With a proper locking we can finally get rid of the crud.
> > > > As I've said previously oom reaping has never been a hot path.
> > > >
> > > > If we really want to optimize this path then I would much rather see a
> > > > generic solution which would allow to move the write lock down after
> > > > unmap_vmas. That would require oom reaper to be able to handle mlocked
> > > > memory.
> > >
> > > Ok, let's work on that and when that's done we can get rid of the oom
> > > stuff in exit_mmap. I'll look into this over the weekend and will
> > > likely be back with questions.
> >
> > As promised, I have a question:
> > Any particular reason why munlock_vma_pages_range clears VM_LOCKED
> > before unlocking pages and not after (see:
> > https://elixir.bootlin.com/linux/latest/source/mm/mlock.c#L424)? Seems
> > to me if VM_LOCKED was reset at the end (with proper ordering) then
> > __oom_reap_task_mm would correctly skip VM_LOCKED vmas.
> > https://lore.kernel.org/lkml/20180514064824.534798031@linuxfoundation.org/
> > has this explanation:
> >
> > "Since munlock_vma_pages_range() depends on clearing VM_LOCKED from
> > vm_flags before actually doing the munlock to determine if any other
> > vmas are locking the same memory, the check for VM_LOCKED in the oom
> > reaper is racy."
> >
> > but "to determine if any other vmas are locking the same memory"
> > explanation eludes me... Any insights?
>
> A page's mlock state is determined by whether any of the vmas that map
> it are mlocked. The munlock code does:
>
> vma->vm_flags &= VM_LOCKED_CLEAR_MASK
> TestClearPageMlocked()
> isolate_lru_page()
> __munlock_isolated_page()
>   page_mlock()
>     rmap_walk() # for_each_vma()
>       page_mlock_one()
>         (vma->vm_flags & VM_LOCKED) && TestSetPageMlocked()
>
> If we didn't clear the VM_LOCKED flag first, racing threads could
> re-lock pages under us because they see that flag and think our vma
> wants those pages mlocked when we're in the process of munlocking.

Thanks for the explanation, Johannes!
So far I haven't found an easy way to let __oom_reap_task_mm() run
concurrently with unlock_range(). Will keep exploring.
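
For reference, the ordering argument quoted above can be modeled in
plain userspace C. This is only a sketch under stated assumptions, not
kernel code: every toy_* name is hypothetical, the VM_LOCKED value is a
stand-in, and the model is single-threaded, so the rmap walk itself
plays the part of any walker that still sees the flag set on the vma
being munlocked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag value for illustration; not the kernel's definition. */
#define VM_LOCKED 0x1u

/* Hypothetical stand-in for a vma: only the flags matter here. */
struct toy_vma {
	unsigned int vm_flags;
};

/* Hypothetical stand-in for a page plus its reverse mapping. */
struct toy_page {
	bool mlocked;             /* models PageMlocked */
	struct toy_vma **mappers; /* models the vmas found by the rmap walk */
	size_t nr_mappers;
};

/* Models the rmap walk: re-mark the page if any mapping vma is VM_LOCKED. */
static void toy_page_mlock(struct toy_page *page)
{
	for (size_t i = 0; i < page->nr_mappers; i++)
		if (page->mappers[i]->vm_flags & VM_LOCKED)
			page->mlocked = true;
}

/*
 * Munlock one page on behalf of vma and return the page's final
 * mlock state. clear_flag_first selects the ordering being compared.
 */
static bool toy_munlock(struct toy_vma *vma, struct toy_page *page,
			bool clear_flag_first)
{
	assert(page->nr_mappers > 0);

	if (clear_flag_first)
		vma->vm_flags &= ~VM_LOCKED;

	page->mlocked = false;  /* models TestClearPageMlocked() */
	toy_page_mlock(page);   /* models the walk over all mapping vmas */

	if (!clear_flag_first)
		vma->vm_flags &= ~VM_LOCKED;

	return page->mlocked;
}

/* One page mapped by "ours" (being munlocked) and one other vma. */
bool toy_scenario(bool other_locked, bool clear_flag_first)
{
	struct toy_vma ours = { .vm_flags = VM_LOCKED };
	struct toy_vma other = { .vm_flags = other_locked ? VM_LOCKED : 0 };
	struct toy_vma *mappers[] = { &ours, &other };
	struct toy_page page = {
		.mlocked = true,
		.mappers = mappers,
		.nr_mappers = 2,
	};

	return toy_munlock(&ours, &page, clear_flag_first);
}
```

In this model toy_scenario(false, true) returns false (the page ends up
munlocked), toy_scenario(true, true) returns true (the other vma keeps
it mlocked), and toy_scenario(false, false) returns true: with the flag
cleared last, the walk still sees VM_LOCKED on our vma and re-marks a
page that nobody wants locked any more.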
