From: Michal Hocko <mhocko@suse.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	akpm@linux-foundation.org, rientjes@google.com,
	willy@infradead.org, guro@fb.com, riel@surriel.com,
	minchan@kernel.org, kirill@shutemov.name, aarcange@redhat.com,
	christian@brauner.io, hch@infradead.org, oleg@redhat.com,
	david@redhat.com, jannh@google.com, shakeelb@google.com,
	luto@kernel.org, christian.brauner@ubuntu.com,
	fweimer@redhat.com, jengelh@inai.de, timmurray@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com
Subject: Re: [PATCH 4/3] mm: drop MMF_OOM_SKIP from exit_mmap
Date: Thu, 30 Dec 2021 09:24:04 +0100
Message-ID: <Yc1spBeXal373b4A@dhcp22.suse.cz>
In-Reply-To: <CAJuCfpEHJTqG+PkAPJknf5_41ZKFqjk8pY=gTg_VZgsfY-=9Tg@mail.gmail.com>

On Wed 29-12-21 21:59:55, Suren Baghdasaryan wrote:
[...]
> After some more digging I think there are two acceptable options:
> 
> 1. Call unlock_range() under mmap_write_lock and then downgrade it to
> read lock so that both exit_mmap() and __oom_reap_task_mm() can unmap
> vmas in parallel like this:
> 
>     if (mm->locked_vm) {
>         mmap_write_lock(mm);
>         unlock_range(mm->mmap, ULONG_MAX);
>         mmap_write_downgrade(mm);
>     } else
>         mmap_read_lock(mm);
> ...
>     unmap_vmas(&tlb, vma, 0, -1);
>     mmap_read_unlock(mm);
>     mmap_write_lock(mm);
>     free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
> ...
>     mm->mmap = NULL;
>     mmap_write_unlock(mm);
> 
> This way exit_mmap() might block __oom_reap_task_mm() but for a much
> shorter time during unlock_range() call.

IIRC unlock_range depends on the page lock at some stage, which means it
can block for a long time, or forever, when the holder of that lock
depends on a memory allocation. This was the primary reason why the oom
reaper skips over mlocked vmas.
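
To illustrate the skipping behaviour mentioned above, here is a minimal
userspace sketch. It is not kernel code: struct model_vma, the
MODEL_VM_LOCKED bit and both helper functions are hypothetical names made
up for this model, and the real reaper may filter on more than this one
flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bit for this model only; not taken from kernel headers. */
#define MODEL_VM_LOCKED (1UL << 0)

/* Hypothetical stand-in for a vma on a singly linked list. */
struct model_vma {
	unsigned long flags;
	struct model_vma *next;
};

/* A vma is a reap candidate only if it is not mlocked. */
static bool model_can_reap(const struct model_vma *vma)
{
	return !(vma->flags & MODEL_VM_LOCKED);
}

/* Walk the list and count the vmas the model reaper would touch. */
static size_t model_count_reapable(const struct model_vma *head)
{
	size_t n = 0;

	for (const struct model_vma *v = head; v; v = v->next)
		if (model_can_reap(v))
			n++;
	return n;
}
```

The point of the filter is that the reaper never has to wait on whatever
an mlocked vma's teardown would wait on; it simply leaves those vmas
alone.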

> 2. Introduce another vm_flag mask similar to VM_LOCKED which is set
> before munlock_vma_pages_range() clears VM_LOCKED so that
> __oom_reap_task_mm() can identify vmas being unlocked and skip them.
> 
> Option 1 seems cleaner to me because it keeps the locking pattern
> around unlock_range() in exit_mmap() consistent with all other places
> it is used (in mremap() and munmap()) with mmap_write_lock taken.
> WDYT?

It would be really great to make unlock_range oom reaper aware IMHO.

You do not quote your change in full, so it is not really clear whether
you are planning to drop __oom_reap_task_mm from exit_mmap as well. If
yes, then 1) could push the oom reaper to time out while unlock_range is
blocked on something, so that wouldn't be an improvement. 2) sounds like
a workaround to me, as it doesn't really address the underlying problem.
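
The timeout concern with option 1) can be sketched with a single-threaded
userspace model. Everything here is an assumption for illustration:
struct model_lock, MODEL_REAP_RETRIES and the model_* functions are
hypothetical and do not mirror the kernel's actual retry logic or
constants. The sketch only shows that a reaper relying on a non-blocking
read acquire gives up if the write side stays held for its whole retry
budget.

```c
#include <stdbool.h>

/* Hypothetical retry budget for this model; not the kernel's constant. */
#define MODEL_REAP_RETRIES 10

/* Single-threaded stand-in for mmap_lock: tracks one writer and readers. */
struct model_lock {
	bool write_held;
	int readers;
};

/* Non-blocking read acquire: fails while a writer holds the lock. */
static bool model_read_trylock(struct model_lock *l)
{
	if (l->write_held)
		return false;
	l->readers++;
	return true;
}

static void model_read_unlock(struct model_lock *l)
{
	l->readers--;
}

/*
 * Try to reap with a bounded number of attempts. Returns true if the
 * lock was taken; *attempts reports how many tries were made.
 */
static bool model_reap(struct model_lock *l, int *attempts)
{
	*attempts = 0;
	while (*attempts < MODEL_REAP_RETRIES) {
		(*attempts)++;
		if (model_read_trylock(l)) {
			/* ... unmapping would happen here ... */
			model_read_unlock(l);
			return true;
		}
		/* A real reaper would sleep between attempts; omitted. */
	}
	return false;
}
```

If the writer in this model stands for exit_mmap() stuck inside
unlock_range(), the reaper exhausts its attempts without freeing
anything.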

I have to say that I am not really a great fan of __oom_reap_task_mm in
exit_mmap, but I would rather see it stay in place than make the
surrounding code more complex/tricky.

-- 
Michal Hocko
SUSE Labs
