From: Shakeel Butt <shakeelb@google.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
David Rientjes <rientjes@google.com>,
Matthew Wilcox <willy@infradead.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Roman Gushchin <guro@fb.com>, Rik van Riel <riel@surriel.com>,
Minchan Kim <minchan@kernel.org>,
Christian Brauner <christian@brauner.io>,
Christoph Hellwig <hch@infradead.org>,
Oleg Nesterov <oleg@redhat.com>,
David Hildenbrand <david@redhat.com>,
Jann Horn <jannh@google.com>, Andy Lutomirski <luto@kernel.org>,
Christian Brauner <christian.brauner@ubuntu.com>,
Florian Weimer <fweimer@redhat.com>,
Jan Engelhardt <jengelh@inai.de>,
Tim Murray <timmurray@google.com>,
Linux API <linux-api@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
kernel-team <kernel-team@android.com>
Subject: Re: [PATCH v3 1/2] mm: introduce process_mrelease system call
Date: Fri, 23 Jul 2021 10:00:26 -0700
Message-ID: <CALvZod7Vb2MKgCcSYtsMd8F4sFb2K7jQk3AGSECYfKvd3MNqzQ@mail.gmail.com>
In-Reply-To: <CAJuCfpGmpwTv92joNuVPaEJg1PigtGQn2daywHaqF4TXjuiCWQ@mail.gmail.com>
On Fri, Jul 23, 2021 at 9:09 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Fri, Jul 23, 2021 at 6:46 AM Shakeel Butt <shakeelb@google.com> wrote:
> >
> > On Fri, Jul 23, 2021 at 1:53 AM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > [...]
> > > > However
> > > > retrying means issuing another syscall, so additional overhead...
> > > > I guess such "best effort" approach would be unusual for a syscall, so
> > > > maybe we can keep it as it is now and if such "do not block" mode is needed
> > > > we can use flags to implement it later?
> > >
> > > Yeah, an explicit opt-in via flags would be an option if that turns out
> > > to be really necessary.
> > >
> >
> > I am fine with keeping it as it is but we do need the non-blocking
> > option (via flags) to enable userspace to act more aggressively.
>
> I think you want to check memory conditions shortly after issuing
> kill/reap requests irrespective of mmap_sem contention. The reason is
> that even when memory release is not blocked, allocations from other
> processes might consume memory faster than we release it. For example,
> in Android we issue kill and start waiting on pidfd for its death
> notification. As soon as the process is dead we reassess the situation
> and possibly kill again. If the process is not dead within a
> configurable timeout we check conditions again and might issue more
> kill requests (IOW our wait for the process to die has a timeout). If
> process_mrelease() is blocked on mmap_sem, we might timeout like this.
> I imagine that a non-blocking option for process_mrelease() would not
> really change this logic.

On a containerized system, killing a job means killing multiple
processes and then calling process_mrelease() on each of them. There is
now cgroup.kill to kill all the processes in a cgroup tree, but we would
still need to process_mrelease() every process in that tree. There is a
chance that we get stuck reaping an early process. Making
process_mrelease() non-blocking would let userspace move on to the
other processes in the list.

An alternative would be to have a cgroup-specific interface for
reaping, similar to cgroup.kill.

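To make the cgroup case concrete, here is a rough userspace sketch of the
loop I have in mind. It is an illustration under assumptions, not tested
code: it assumes a future non-blocking mode would fail with EAGAIN instead
of sleeping on the mmap lock, which the syscall as posted does not offer
(flags must be 0). The names reap_fn, reap_with_mrelease(), reap_all() and
fake_reap() are made up for this example, and the fallback syscall numbers
are the x86-64/asm-generic values.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older libc headers (x86-64 and asm-generic numbering). */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_mrelease
#define SYS_process_mrelease 448
#endif

/* Reap one pid; returns 0 on success, -1 with errno set on failure. */
typedef int (*reap_fn)(pid_t pid, unsigned int flags);

/* Real reaper: open a pidfd for the task and call process_mrelease() on it.
 * flags must currently be 0; a non-blocking flag is only hypothetical. */
int reap_with_mrelease(pid_t pid, unsigned int flags)
{
    int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
    if (pidfd < 0)
        return -1;

    int ret = (int)syscall(SYS_process_mrelease, pidfd, flags);
    int saved_errno = errno;

    close(pidfd);
    errno = saved_errno;
    return ret;
}

/*
 * Walk the list, skipping any pid whose reap reports EAGAIN, and come back
 * to the skipped ones on later passes (at most max_passes passes in total).
 * done[i] is set once pids[i] was reaped or failed with any other error
 * (e.g. ESRCH because the task is already gone).
 * Returns the number of pids still pending.
 */
size_t reap_all(const pid_t *pids, char *done, size_t n,
                reap_fn reap, unsigned int flags, int max_passes)
{
    size_t pending = n;

    for (size_t i = 0; i < n; i++)
        done[i] = 0;

    for (int pass = 0; pass < max_passes && pending > 0; pass++) {
        for (size_t i = 0; i < n; i++) {
            if (done[i])
                continue;
            if (reap(pids[i], flags) == 0 || errno != EAGAIN) {
                done[i] = 1;
                pending--;
            }
        }
    }
    return pending;
}

/* Stand-in reaper for dry runs: pid 100 reports EAGAIN on the very first
 * call only, mimicking an early process whose mmap lock is contended. */
int fake_calls;

int fake_reap(pid_t pid, unsigned int flags)
{
    (void)flags;
    fake_calls++;
    if (pid == 100 && fake_calls == 1) {
        errno = EAGAIN;
        return -1;
    }
    return 0;
}
```

With the syscall as posted one would pass reap_with_mrelease and flags 0,
and the call may sleep on the mmap lock, so the retry passes only become
useful if a non-blocking mode is added later.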
> Adding such an option is trivial but I would like to make sure it's
> indeed useful. Maybe after the syscall is in place you can experiment
> with it and see if such an option would really change the way you use
> it?
SGTM.