All of lore.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Shakeel Butt <shakeelb@google.com>
Cc: Alexey Gladkov <legion@kernel.org>,
	 Christian Brauner <christian.brauner@ubuntu.com>,
	 LKML <linux-kernel@vger.kernel.org>,
	 Linux Containers <containers@lists.linux.dev>,
	 Linux Containers <containers@lists.linux-foundation.org>,
	 Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	 Linux MM <linux-mm@kvack.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	 Michal Hocko <mhocko@kernel.org>,
	 Chris Down <chris@chrisdown.name>,
	 Cgroups <cgroups@vger.kernel.org>
Subject: Re: [PATCH v1] proc: Implement /proc/self/meminfo
Date: Wed, 16 Jun 2021 11:17:38 -0500	[thread overview]
Message-ID: <87zgvpg4wt.fsf@disp2133> (raw)
In-Reply-To: <CALvZod7po_fK9JpcUNVrN6PyyP9k=hdcyRfZmHjSVE5r_8Laqw@mail.gmail.com> (Shakeel Butt's message of "Tue, 15 Jun 2021 18:09:49 -0700")

Shakeel Butt <shakeelb@google.com> writes:

> On Tue, Jun 15, 2021 at 5:47 AM Alexey Gladkov <legion@kernel.org> wrote:
>>
> [...]
>>
>> I made the second version of the patch [1], but then I had a conversation
>> with Eric W. Biederman offlist. He convinced me that it is a bad idea to
>> change all the values in meminfo to accommodate cgroups. But we agreed
>> that MemAvailable in /proc/meminfo should respect cgroups limits. This
>> field was created to hide implementation details when calculating
>> available memory. You can see that it is quite widely used [2].
>> So I want to try to move in that direction.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/legion/linux.git/log/?h=patchset/meminfo/v2.0
>> [2] https://codesearch.debian.net/search?q=MemAvailable%3A
>>
>
> Please see following two links on the previous discussion on having
> per-memcg MemAvailable stat.
>
> [1] https://lore.kernel.org/linux-mm/alpine.DEB.2.22.394.2006281445210.855265@chino.kir.corp.google.com/
> [2] https://lore.kernel.org/linux-mm/alpine.DEB.2.23.453.2007142018150.2667860@chino.kir.corp.google.com/
>
> MemAvailable itself is an imprecise metric and involving memcg makes
> this metric even more weird. The difference of semantics of swap
> accounting of v1 and v2 is one source of this weirdness (I have not
> checked your patch if it is handling this weirdness). The lazyfree and
> deferred split pages are another source.
>
> So, I am not sure if complicating an already imprecise metric will
> make it more useful.

Making a good guess at how much memory can be allocated without
triggering swapping or otherwise stressing the system is something that
requires understanding our mm internals.

To be able to continue changing the mm or even mm policy without
introducing regressions in userspace we need to export values that
userspace can use.

At a first approximation that seems to look like MemAvailable.

MemAvailable seems to have a good definition.  Roughly the amount of
memory that can be allocated without triggering swapping.  Updated
to include not trigger memory cgroup based swapping and I sounds good.

I don't know if it will work in practice but I think it is worth
exploring.

I do know that hiding the implementation details and providing userspace
with information it can directly use seems like the programming model
that needs to be explored.  Most programs should not care if they are in
a memory cgroup, etc.  Programs, load management systems, and even
balloon drivers have a legitimately interest in how much additional load
can be placed on a systems memory.


A version of this that I remember working fairly well is free space
on compressed filesystems.  As I recall compressed filesystems report
the amount of uncompressed space that is available (an underestimate).
This results in the amount of space consumed going up faster than the
free space goes down.

We can't do exactly the same thing with our memory usability estimate,
but having our estimate be a reliable underestimate might be enough
to avoid problems with reporting too much memory as available to
userspace.

I know that MemAvailable already does that /2 so maybe it is already
aiming at being an underestimate.  Perhaps we need some additional
accounting to help create a useful metric for userspace as well.


I don't know the final answer.  I do know that not designing an
interface that userspace can use to deal with it's legitimate concerns
is sticking our collective heads in the sand and wishing the problem
will go away.

Eric


  reply	other threads:[~2021-06-16 16:17 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-03 10:43 [PATCH v1] proc: Implement /proc/self/meminfo legion
2021-06-03 11:33 ` Michal Hocko
2021-06-03 11:33 ` Chris Down
2021-06-09  8:16   ` Enrico Weigelt, metux IT consult
2021-06-09 19:14     ` Eric W. Biederman
2021-06-09 19:14       ` Eric W. Biederman
2021-06-09 20:31       ` Johannes Weiner
2021-06-09 20:56         ` Eric W. Biederman
2021-06-09 20:56           ` Eric W. Biederman
2021-06-10  0:36           ` Daniel Walsh
2021-06-11 10:37         ` Enrico Weigelt, metux IT consult
2021-06-15 11:32 ` Christian Brauner
2021-06-15 12:47   ` Alexey Gladkov
2021-06-16  1:09     ` Shakeel Butt
2021-06-16  1:09       ` Shakeel Butt
2021-06-16 16:17       ` Eric W. Biederman [this message]
2021-06-16 16:17         ` Eric W. Biederman
2021-06-18 17:03         ` Michal Hocko
2021-06-18 17:03           ` Michal Hocko
2021-06-18 23:38         ` Shakeel Butt
2021-06-18 23:38           ` Shakeel Butt
2021-06-21 18:20           ` Enrico Weigelt, metux IT consult

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zgvpg4wt.fsf@disp2133 \
    --to=ebiederm@xmission.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=chris@chrisdown.name \
    --cc=christian.brauner@ubuntu.com \
    --cc=containers@lists.linux-foundation.org \
    --cc=containers@lists.linux.dev \
    --cc=hannes@cmpxchg.org \
    --cc=legion@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=shakeelb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.