All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Chris Down <chris@chrisdown.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>, Dennis Zhou <dennis@kernel.org>,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org, kernel-team@fb.com
Subject: Re: [PATCH 2/2] mm: Consider subtrees in memory.events
Date: Fri, 25 Jan 2019 08:51:52 -0800	[thread overview]
Message-ID: <20190125165152.GK50184@devbig004.ftw2.facebook.com> (raw)
In-Reply-To: <20190125074824.GD3560@dhcp22.suse.cz>

Hello, Michal.

On Fri, Jan 25, 2019 at 09:42:13AM +0100, Michal Hocko wrote:
> > If you read my sentence again, I'm not talking about the kernel but
> > the surrounding infrastructure that consumes this data. The risk is
> > not dependent on the age of the interface age, but on its adoption.
> 
> You really have to assume the user visible interface is consumed shortly
> after it is exposed/considered stable in this case as cgroups v2 was
> explicitly called unstable for a considerable period of time. This is a
> general policy regarding user APIs in the kernel. I can see arguments a
> next release after introduction or in similar cases but this is 3 years
> ago. We already have distribution kernels based on 4.12 kernel and it is
> old comparing to 5.0.

We do change userland-visible behaviors if the existing behavior is
buggy / misleading / confusing.  For example, we recently changed how
discard bytes are accounted (no longer included in write bytes or ios)
and even how mincore(2) behaves, both of which are far older than
cgroup2.

The main considerations are the blast radius and existing use cases in
these decisions.  Age does contribute to it but mostly because they
affect how widely the behavior may be depended upon.

> > > Changing interfaces now represents a non-trivial risk and so far I
> > > haven't heard any actual usecase where the current semantic is
> > > actually wrong.  Inconsistency on its own is not a sufficient
> > > justification IMO.
> > 
> > It can be seen either way, and in isolation it wouldn't be wrong to
> > count events on the local level. But we made that decision for the
> > entire interface, and this file is the odd one out now. From that
> > comprehensive perspective, yes, the behavior is wrong.
> 
> I do see your point about consistency. But it is also important to
> consider the usability of this interface. As already mentioned, catching
> an oom event at a level where the oom doesn't happen and having hard
> time to identify that place without races is a not a straightforward API
> to use. So it might be really the case that the api is actually usable
> for its purpose.

What if a user wants to monitor any ooms in the subtree tho, which is
a valid use case?  If local event monitoring is useful and it can be,
let's add separate events which are clearly identifiable to be local.
Right now, it's confusing like hell.

> > It really
> > confuses people who are trying to use it, because they *do* expect it
> > to behave recursively.
> 
> Then we should improve the documentation. But seriously these are no
> strong reasons to change a long term semantic people might rely on.

This is broken interface.  We're mixing local and hierarchical numbers
willy nilly without obvious way of telling them apart.

> > I'm really having a hard time believing there are existing cgroup2
> > users with specific expectations for the non-recursive behavior...
> 
> I can certainly imagine monitoring tools to hook at levels where limits
> are set and report events as they happen. It would be more than
> confusing to receive events for reclaim/ooms that hasn't happened at
> that level just because a delegated memcg down the hierarchy has decided
> to set a more restrictive limits. Really this is a very unexpected
> behavior change for anybody using that interface right now on anything
> but leaf memcgs.

Sure, there's some probability this change may cause some disruptions
although I'm pretty skeptical given that inner node event monitoring
is mostly useless right now.  However, there's also a lot of on-going
and future costs everyone is paying because the interface is so
confusing.

Thanks.

-- 
tejun

  reply	other threads:[~2019-01-25 16:51 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-23 22:31 [PATCH 2/2] mm: Consider subtrees in memory.events Chris Down
2019-01-24  0:24 ` Roman Gushchin
2019-01-24  1:03   ` Chris Down
2019-01-24  8:22 ` Michal Hocko
2019-01-24 15:21   ` Tejun Heo
2019-01-24 15:51     ` Michal Hocko
2019-01-24 16:00   ` Johannes Weiner
2019-01-24 17:01     ` Michal Hocko
2019-01-24 18:23       ` Johannes Weiner
2019-01-25  8:42         ` Michal Hocko
2019-01-25 16:51           ` Tejun Heo [this message]
2019-01-25 17:37             ` Michal Hocko
2019-01-25 17:37               ` Michal Hocko
2019-01-25 18:28               ` Tejun Heo
2019-01-28 12:51                 ` Michal Hocko
2019-01-28 14:28                   ` Tejun Heo
2019-01-28 14:52                     ` Michal Hocko
2019-01-28 14:54                       ` Tejun Heo
2019-01-28 15:18                         ` Michal Hocko
2019-01-28 15:41                           ` Tejun Heo
2019-01-28 17:05                             ` Michal Hocko
2019-01-28 17:49                               ` Tejun Heo
2019-01-29 14:43                                 ` Michal Hocko
2019-01-29 14:52                                   ` Tejun Heo
2019-01-30 16:50                                     ` Michal Hocko
2019-01-30 17:06                                       ` Tejun Heo
2019-01-30 17:41                                         ` Michal Hocko
2019-01-30 17:52                                           ` Tejun Heo
2019-01-30 18:16                                             ` Michal Hocko
2019-01-30 19:11                                         ` Shakeel Butt
2019-01-30 19:11                                           ` Shakeel Butt
2019-01-30 19:27                                           ` Johannes Weiner
2019-01-30 19:30                                             ` Johannes Weiner
2019-01-30 19:37                                               ` Shakeel Butt
2019-01-30 19:37                                                 ` Shakeel Butt
2019-01-30 19:23                   ` Johannes Weiner
2019-01-30 20:05                     ` Michal Hocko
2019-01-30 21:31                       ` Johannes Weiner
2019-01-31  8:58                         ` Michal Hocko
2019-01-31 16:22                           ` Johannes Weiner
2019-02-01 10:27                             ` Michal Hocko
2019-02-01 16:34                               ` Johannes Weiner
2019-01-28 15:59                 ` Shakeel Butt
2019-01-28 15:59                   ` Shakeel Butt
2019-01-28 16:05                   ` Tejun Heo
2019-01-28 16:08                     ` Shakeel Butt
2019-01-28 16:08                       ` Shakeel Butt
2019-01-28 16:12                       ` Tejun Heo
2019-01-28 14:30 ` Tejun Heo
2019-02-08 22:43 ` [PATCH v2 1/2] mm: Rename ambiguously named memory.stat counters and functions Chris Down
2019-02-08 22:44   ` [PATCH v2 2/2] mm: Consider subtrees in memory.events Chris Down
2019-02-11 19:01     ` Johannes Weiner
2019-02-11 18:55   ` [PATCH v2 1/2] mm: Rename ambiguously named memory.stat counters and functions Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190125165152.GK50184@devbig004.ftw2.facebook.com \
    --to=tj@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=chris@chrisdown.name \
    --cc=dennis@kernel.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.