From: Joel Fernandes <joel@joelfernandes.org>
To: Daniel Colascione <dancol@google.com>
Cc: "Steven Rostedt" <rostedt@goodmis.org>,
"Sultan Alsawaf" <sultan@kerneltoast.com>,
"Tim Murray" <timmurray@google.com>,
"Michal Hocko" <mhocko@kernel.org>,
"Suren Baghdasaryan" <surenb@google.com>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Arve Hjønnevåg" <arve@android.com>,
"Todd Kjos" <tkjos@android.com>,
"Martijn Coenen" <maco@android.com>,
"Christian Brauner" <christian@brauner.io>,
"Ingo Molnar" <mingo@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
"open list:ANDROID DRIVERS" <devel@driverdev.osuosl.org>,
linux-mm <linux-mm@kvack.org>,
kernel-team <kernel-team@android.com>
Subject: Re: [RFC] simple_lmk: Introduce Simple Low Memory Killer for Android
Date: Fri, 15 Mar 2019 09:36:10 -0400
Message-ID: <20190315133610.GC3378@google.com>
In-Reply-To: <CAKOZuetZHJzmQy3n001x4+rmWoWHEgUv2Zvow9W5+xvukxp1JQ@mail.gmail.com>
On Thu, Mar 14, 2019 at 09:36:43PM -0700, Daniel Colascione wrote:
[snip]
> > If you can solve this with an ebpf program, I
> > strongly suggest you do that instead.
>
> Regarding process death notification: I will absolutely not support
> putting eBPF and perf trace events on the critical path of core system
> memory management functionality. Tracing and monitoring facilities are
> great for learning about the system, but they were never intended to
> be load-bearing. The proposed eBPF process-monitoring approach is just
> a variant of the netlink proposal we discussed previously on the pidfd
> threads; it has all of its drawbacks. We really need a core system
> call --- really, we've needed robust process management since the
> creation of unix --- and I'm glad that we're finally getting it.
> Adding new system calls is not expensive; going to great lengths to
> avoid adding one is like calling a helicopter to avoid crossing the
> street. I don't think we should present an abuse of the debugging and
> performance monitoring infrastructure as an alternative to a robust
> and desperately-needed bit of core functionality that's neither hard
> to add nor complex to implement nor expensive to use.

The eBPF-based solution to this would be just as simple while avoiding any
kernel changes. I don't know why you think it is not load-bearing. However, I
agree the proc/pidfd approach is better and can be simpler for some users who
don't want to deal with eBPF, especially since something like this has many
use cases. I was just suggesting the eBPF solution as a better alternative to
the task_struct surgery idea from Sultan, since that sounded quite hackish to
me (that could just be my opinion).

> Regarding the proposal for a new kernel-side lmkd: when possible, the
> kernel should provide mechanism, not policy. Putting the low memory
> killer back into the kernel after we've spent significant effort
> making it possible for userspace to do that job is a step backward.
> Compared to kernel code, userspace code is more easily understood,
> more easily debuggable, more easily updated, and much safer. If we
> *can* move something out of the kernel, we should. This patch moves
> us in exactly the wrong direction. Yes, we
> need *something* that sits synchronously astride the page allocation
> path and does *something* to stop a busy beaver allocator that eats
> all the available memory before lmkd, even mlocked and realtime, can
> respond. The OOM killer is adequate for this very rare case.
>
> With respect to kill timing: Tim is right about the need for two
> levels of policy: first, a high-level process prioritization and
> memory-demand balancing scheme (which is what OOM score adjustment
> code in ActivityManager amounts to); and second, a low-level
> process-killing methodology that maximizes sustainable memory reclaim
> and minimizes unwanted side effects while killing those processes that
> should be dead. Both of these policies belong in userspace --- because
> they *can* be in userspace --- and userspace needs only a few tools,
> most of which already exist, to do a perfectly adequate job.
>
> We do want killed processes to die promptly. That's why I support
> boosting a process's priority somehow when lmkd is about to kill it.
> The precise way in which we do that --- involving not only actual
> priority, but scheduler knobs, cgroup assignment, core affinity, and
> so on --- is a complex topic best left to userspace. lmkd already has
> all the knobs it needs to implement whatever priority boosting policy
> it wants.
>
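To make the knobs mentioned above concrete, here is a rough userspace sketch
of "boost, then kill". It is not lmkd's actual code: boost_then_kill and
demo_boost are made-up names, the choice of nice -20 and "all CPUs the caller
may use" is only an assumption for illustration, and cgroup reassignment is
left out entirely.

```python
import os
import signal


def boost_then_kill(pid: int) -> bool:
    """Best-effort boost of `pid`, then SIGKILL.

    Returns True if the nice-value boost was applied. Hypothetical helper,
    not part of lmkd or any kernel interface.
    """
    boosted = True
    try:
        # Lowest nice value; lowering niceness normally needs CAP_SYS_NICE,
        # so an unprivileged caller should expect this to be refused.
        os.setpriority(os.PRIO_PROCESS, pid, -20)
    except PermissionError:
        boosted = False
    try:
        # Let the victim run on every CPU the caller itself is allowed to use.
        os.sched_setaffinity(pid, os.sched_getaffinity(0))
    except OSError:
        pass  # affinity is only a hint here; carry on with the kill
    os.kill(pid, signal.SIGKILL)
    return boosted


def demo_boost() -> int:
    """Fork a sleeping child, boost and kill it, return the fatal signal."""
    pid = os.fork()
    if pid == 0:
        signal.pause()  # child: sleep until a signal arrives
        os._exit(0)
    boost_then_kill(pid)
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else -1
```

Every step before the kill is best effort: if a boost is refused, the process
is still killed, just without the extra help in exiting quickly.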
> Hell, once we add a pidfd_wait --- which I plan to work on, assuming
> nobody beats me to it, after pidfd_send_signal lands --- you can
> imagine a general-purpose priority inheritance mechanism expediting
> process death when a high-priority process waits on a pidfd_wait
> handle for a condemned process. You know you're on the right track
> design-wise when you start seeing this kind of elegant constructive
> interference between seemingly-unrelated features. What we don't need
> is some kind of blocking SIGKILL alternative or backdoor event
> delivery system.
>
> We definitely don't want to have to wait for a process's parent to
> reap it. Instead, we want to wait for it to become a zombie. That's
> why I designed my original exithand patch to fire death notification
> upon transition to the zombie state, not upon process table removal,
> and I expect pidfd_wait (or whatever we call it) to act the same way.
Agreed. Looking forward to the patches. :)
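
A rough illustration of the zombie-versus-reaped distinction, using only what
already exists for direct children: waitid(2) with WNOWAIT returns once the
child has exited but leaves it in the process table. This is not the proposed
pidfd_wait, whose interface is not defined in this thread; wait_until_zombie
and demo_zombie are made-up names for the example.

```python
import os
import signal


def wait_until_zombie(pid: int) -> int:
    """Block until child `pid` has died, without reaping it.

    WNOWAIT leaves the child waitable, so it stays a zombie after this
    returns. For a child killed by a signal, si_status is the signal number.
    """
    info = os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    return info.si_status


def demo_zombie() -> tuple:
    """Kill a child, observe its death, then show it was still reapable."""
    pid = os.fork()
    if pid == 0:
        signal.pause()  # child: sleep until a signal arrives
        os._exit(0)
    os.kill(pid, signal.SIGKILL)
    status = wait_until_zombie(pid)      # fires on the transition to zombie
    reaped_pid, _ = os.waitpid(pid, 0)   # the real reap still succeeds afterwards
    return status, reaped_pid == pid
```

The second wait succeeding is the point: death was observed before process
table removal. The limitation is that waitid(2) only works on the caller's own
children, so a killer that is not the victim's parent cannot use it.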
thanks,
- Joel