linux-kernel.vger.kernel.org archive mirror
From: "Enrico Weigelt, metux IT consult" <lkml@metux.net>
To: Johannes Weiner <hannes@cmpxchg.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>
Cc: Chris Down <chris@chrisdown.name>,
	legion@kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Linux Containers <containers@lists.linux.dev>,
	Linux Containers <containers@lists.linux-foundation.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH v1] proc: Implement /proc/self/meminfo
Date: Fri, 11 Jun 2021 12:37:36 +0200	[thread overview]
Message-ID: <f62b652c-3f6f-31ba-be0f-5f97b304599f@metux.net> (raw)
In-Reply-To: <YMElKcrVIhJg4GTT@cmpxchg.org>

On 09.06.21 22:31, Johannes Weiner wrote:

>> Alex has found applications that are trying to do something with
>> meminfo, and the fields that those applications care about.  I don't see
>> anyone making the case that specifically what the applications are
>> trying to do is buggy.

Actually, I do. The assumptions made by those applications are most
certainly wrong - and have been wrong from the start.

The problem is that looking at these numbers alone is only helpful if
the application is pretty much alone on the machine. You'll typically
find this where the vendor explicitly tells you to run the software on a
standalone machine, disable swap, etc. Database systems are probably the
most prominent sector here.
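To illustrate (a sketch with made-up numbers, not a real measurement): even a correct parse of the system-wide meminfo tells such an application nothing about the limit it actually runs under. The 2 GiB cgroup limit below is hypothetical.

```python
# Sketch (hypothetical values): why a raw /proc/meminfo read misleads a
# containerized application. The host reports plenty of available
# memory, but a cgroup limit set by the runtime is what the application
# actually gets.

def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key:   value kB' lines into bytes."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key] = int(parts[0]) * 1024  # values are in kB
    return fields

# Sample host-wide figures, as an application would see in /proc/meminfo:
host_meminfo = """MemTotal:       32768000 kB
MemFree:         8192000 kB
MemAvailable:   24576000 kB"""

cgroup_limit = 2 * 1024**3  # e.g. memory.max set to 2 GiB by the runtime

host_available = parse_meminfo(host_meminfo)["MemAvailable"]
# The naive system-wide reading (~23 GiB) vastly overestimates what the
# application may actually use:
usable = min(host_available, cgroup_limit)
print(host_available, cgroup_limit, usable)
```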

Under certain circumstances this kind of auto-tuning (e.g. automatically
taking X% of RAM for buffers) might, heuristically, work - but only
under certain circumstances.

Those assumptions have already been defeated by VMs (e.g. memory
overcommit).

I don't feel it's a good idea to add extra kernel code just to make some
ill-designed applications work a little less badly (they still need to
be changed anyway).

Don't get me wrong, I'm really in favour of having a clean interface for
telling applications how many resources they can take (I've actually
been considering quite the same thing), but this needs to be very well
thought out (and we should also get the orchestration folks into the
loop). This basically falls into two categories:

a) hard limits - how much can an application possibly consume?
    --> what counts as an application? a process? a process group?
        a cgroup? a namespace?
b) reasonable defaults - how much can an application take at will, w/o
    affecting others?
    --> what exactly is "reasonable"? kernel caches vs. userland buffers?
    --> how to deal w/ overcommit scenarios?
    --> who shall be in charge of controlling these values?
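For category (a), note that with cgroup v2 the effective hard limit is hierarchical: it's the minimum memory.max along the path to the root, so a purely local look at one level misses limits imposed higher up. A sketch (the hierarchy and limits below are invented; a real check would read /sys/fs/cgroup/.../memory.max):

```python
# Sketch: effective hard limit under cgroup-v2-like semantics is the
# minimum memory.max over the cgroup and all of its ancestors.
# The hierarchy here is a hypothetical in-memory stand-in.

MAX = float("inf")  # "max" in memory.max means no limit at this level

hierarchy = {
    "/": MAX,
    "/payload": 8 * 1024**3,   # 8 GiB limit on the whole slice
    "/payload/app": MAX,       # no local limit on the app's own cgroup
}

def effective_limit(path, limits):
    """Minimum limit over the given cgroup and all of its ancestors."""
    limit = limits[path]
    while path != "/":
        path = path.rsplit("/", 1)[0] or "/"  # step up to the parent
        limit = min(limit, limits[path])
    return limit

# The app's own cgroup says "unlimited", but the slice above caps it:
print(effective_limit("/payload/app", hierarchy))
```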

It's a very complex problem, not easy to solve. Much of it seems to be a
matter of policy, depending on the actual workloads.

Maybe, for now, it's better to pursue this at the orchestration level.

> Not all the information at the system level translates well to the
> container level. Things like available memory require a hierarchical
> assessment rather than just a look at the local level, since there
> could be limits higher up the tree.

By the way: what exactly *is* a container, anyway?

The mainline kernel (in contrast to the OpenVZ kernel) doesn't actually
know about containers at all - instead it provides several concepts like
namespaces, cgroups, etc., which together are used to provide some
container environment - but that composition is done in userland, and
there are several approaches w/ different semantics.

Before we can do anything container-specific in the kernel, we first
have to come to a general agreement on what a container actually is from
the kernel's perspective. No idea whether we can achieve that at all (in
the near future) w/o actually introducing the concept of a container
within the kernel.

> We should also not speculate what users intended to do with the
> meminfo data right now. There is a surprising amount of misconception
> around what these values actually mean. I'd rather have users show up
> on the mailing list directly and outline the broader usecase.

ACK.

The only practical use cases I'm aware of are:

a) safety: know how much memory I can eat until I get -ENOMEM, so
    applications can proactively take countermeasures, e.g.
    preallocation (a common practice in safety-related applications)
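The preallocation pattern from (a) can be sketched as follows (the 64 MiB budget is a made-up figure; a real C implementation would also mlock(2) the pool and touch every page, since overcommit means allocation success alone proves little):

```python
# Sketch of the preallocation pattern: grab the whole memory budget up
# front, at startup, so failure happens early and deterministically
# rather than mid-operation. BUDGET is a hypothetical figure.

BUDGET = 64 * 1024 * 1024  # 64 MiB

try:
    pool = bytearray(BUDGET)  # one up-front allocation; fails early if at all
except MemoryError:
    raise SystemExit("insufficient memory at startup - refusing to run")

# In a real safety-related system the pool would then be carved up by a
# custom allocator; here we just demonstrate that it is usable.
pool[0] = 1
pool[-1] = 1
print(len(pool))
```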

b) autotuning: how much the application should take for caches or
    buffers. This is problematic, since it can only work on heuristics,
    which in turn can only be found experimentally within a certain
    range of assumptions (certain databases like to do this). That way
    you can only find more or less reasonable parameters for the
    majority of cases (assuming you have an idea what that majority
    actually is), but ones still far from optimal. For *good* parameters
    you need to measure your actual workloads and apply good knowledge
    of what the application is actually doing (one of a DBA's primary
    jobs).
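The heuristic in (b) typically boils down to something like this sketch - a fixed fraction of some available-memory figure, clamped to bounds. The fraction and bounds below are invented defaults, exactly the kind of experimentally-guessed constants that are far from optimal without measuring the real workload:

```python
# Sketch of the heuristic-autotuning pattern: take a fixed fraction of
# an available-memory figure, clamped to sane bounds. All constants here
# are hypothetical placeholder values.

def buffer_size(mem_available, fraction=0.25,
                floor=16 * 1024**2, ceiling=4 * 1024**3):
    """Heuristic: fraction of available memory, clamped to [floor, ceiling]."""
    return max(floor, min(ceiling, int(mem_available * fraction)))

print(buffer_size(1 * 1024**3))   # 1 GiB available -> a quarter of it
print(buffer_size(8 * 1024**2))   # tiny machine -> clamped to the floor
```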


--mtx
-- 
---
Note: unencrypted e-mails can easily be intercepted and manipulated!
For confidential communication, please send your GPG/PGP key.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287


Thread overview: 16+ messages
2021-06-03 10:43 [PATCH v1] proc: Implement /proc/self/meminfo legion
2021-06-03 11:33 ` Michal Hocko
2021-06-03 11:33 ` Chris Down
2021-06-09  8:16   ` Enrico Weigelt, metux IT consult
2021-06-09 19:14     ` Eric W. Biederman
2021-06-09 20:31       ` Johannes Weiner
2021-06-09 20:56         ` Eric W. Biederman
2021-06-10  0:36           ` Daniel Walsh
2021-06-11 10:37         ` Enrico Weigelt, metux IT consult [this message]
2021-06-15 11:32 ` Christian Brauner
2021-06-15 12:47   ` Alexey Gladkov
2021-06-16  1:09     ` Shakeel Butt
2021-06-16 16:17       ` Eric W. Biederman
2021-06-18 17:03         ` Michal Hocko
2021-06-18 23:38         ` Shakeel Butt
2021-06-21 18:20           ` Enrico Weigelt, metux IT consult
