+surenb

On Fri, Feb 24, 2017 at 10:38 AM, Tim Murray <timmurray@google.com> wrote:
Hi all, I've recently been looking at lowmemorykiller, userspace lmkd, and memory cgroups on Android.

First of all, no, an Android device will probably not function without a kernel or userspace version of lowmemorykiller. Android userspace expects that if there are many apps running in the background on a machine and the foreground app allocates additional memory, something on the system will kill background apps to free up more memory. If this doesn't happen, I expect that at the very least you'll see page cache thrashing, and you'll probably see the OOM killer run regularly, which has a tendency to cause Android userspace to restart. To the best of my knowledge, no device has shipped with a userspace lmkd.

Second, yes, the current design and implementation of lowmemorykiller are unsatisfactory. I now have some concrete evidence that the design of lowmemorykiller is directly responsible for some very negative user-visible behaviors (particularly the triggers for when to kill), so I'm currently working on an overhaul to the Android memory model that would use mem cgroups and userspace lmkd to make smarter decisions about reclaim vs killing. Yes, this means that we would move to vmpressure (which will require improvements to vmpressure). I can't give a firm ETA for this, as it's still in the prototype phase, but the initial results are promising.
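
For reference, a minimal sketch of how a userspace daemon could subscribe to vmpressure events through the cgroup v1 memory controller. It assumes the documented registration format "<event_fd> <fd of memory.pressure_level> <level>" written to cgroup.event_control. The function names and the cgroup path argument are hypothetical, and this is not how lmkd itself is written.

```python
import os

VALID_LEVELS = ("low", "medium", "critical")


def event_control_line(event_fd, pressure_fd, level):
    """Build the string written to cgroup.event_control (cgroup v1 format)."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown pressure level: {level!r}")
    return f"{event_fd} {pressure_fd} {level}"


def register_pressure_listener(cgroup_dir, level):
    """Hypothetical helper: register an eventfd for one pressure level.

    cgroup_dir is an assumed path to a mounted v1 memory cgroup.
    Needs Linux and Python 3.10+ (os.eventfd). Returns both descriptors;
    the caller keeps them open for as long as it wants notifications.
    """
    event_fd = os.eventfd(0)
    pressure_fd = os.open(
        os.path.join(cgroup_dir, "memory.pressure_level"), os.O_RDONLY
    )
    with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as ctl:
        ctl.write(event_control_line(event_fd, pressure_fd, level))
    return event_fd, pressure_fd
```

A daemon would then block in os.eventfd_read(event_fd) and, on wakeup, decide whether to let reclaim continue or to kill a background app.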

On Fri, Feb 24, 2017 at 1:34 AM, Michal Hocko <mhocko@kernel.org> wrote:
On Thu 23-02-17 21:36:00, Martijn Coenen wrote:
> On Thu, Feb 23, 2017 at 9:24 PM, John Stultz <john.stultz@linaro.org> wrote:
[...]
> > This is reportedly because while the mempressure notifiers provide a
> > signal to userspace, the work the daemon then has to do to look up
> > per process memory usage, in order to figure out who is best to kill
> > at that point was too costly and resulted in poor device performance.
>
> In particular, mempressure requires memory cgroups to function, and we
> saw performance regressions due to the accounting done in mem cgroups.
> At the time we didn't have enough time left to solve this before the
> release, and we reverted back to kernel lmkd.

I would be more than interested to hear details. We used to have some
visible charge path performance footprint but this should be gone now.

[...]
> > It would be great however to get a discussion going here on what the
> > ulmkd needs from the kernel in order to efficiently determine who best
> > to kill, and how we might best implement that.
>
> The two main issues I think we need to address are:
> 1) Getting the right granularity of events from the kernel; I once
> tried to submit a patch upstream to address this:
> https://lkml.org/lkml/2016/2/24/582

Not only that, the implementation of vmpressure needs some serious
rethinking as well. The current one can hit critical events
unexpectedly. The calculation also doesn't consider slab reclaim
sensibly.
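
To illustrate the first point, here is a rough Python model of the level calculation as I read it in mm/vmpressure.c. The 60/95 thresholds and the scanned/reclaimed ratio are taken from that file as I remember it; the function names are mine and the guard for reclaimed >= scanned is an assumption based on later kernels, so treat this as a sketch rather than the exact kernel logic.

```python
VMPRESSURE_LEVEL_MED = 60       # assumed threshold for "medium"
VMPRESSURE_LEVEL_CRITICAL = 95  # assumed threshold for "critical"


def vmpressure_percent(scanned, reclaimed):
    """Pressure in percent for one window of scanned/reclaimed pages."""
    if scanned <= 0:
        raise ValueError("scanned must be positive")
    if reclaimed >= scanned:
        return 0  # everything scanned was reclaimed: no pressure
    scale = scanned + reclaimed
    pressure = scale - (reclaimed * scale // scanned)
    return pressure * 100 // scale


def vmpressure_level(scanned, reclaimed):
    """Map the percentage onto the three levels userspace can subscribe to."""
    percent = vmpressure_percent(scanned, reclaimed)
    if percent >= VMPRESSURE_LEVEL_CRITICAL:
        return "critical"
    if percent >= VMPRESSURE_LEVEL_MED:
        return "medium"
    return "low"
```

With a 512-page window, reclaiming 256 pages gives "low", 128 gives "medium", and 20 gives "critical". The level depends only on the ratio seen in a single window, so one window of mostly unreclaimable pages is enough to report critical.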

> 2) Find out where exactly the memory cgroup overhead is coming from,
> and how to reduce it or work around it to acceptable levels for
> Android. This was also on 3.10, and maybe this has long been fixed or
> improved in more recent kernel versions.

3e32cb2e0a12 ("mm: memcontrol: lockless page counters") has improved
the situation a lot, as all charging has been lockless since then (3.19).
--
Michal Hocko
SUSE Labs