All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Down <chris@chrisdown.name>
To: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>, Tejun Heo <tj@kernel.org>,
	Dennis Zhou <dennis@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH 2/2] mm: Consider subtrees in memory.events
Date: Wed, 23 Jan 2019 20:03:06 -0500	[thread overview]
Message-ID: <20190124010306.GA9055@chrisdown.name> (raw)
In-Reply-To: <20190124002359.GB21563@castle.DHCP.thefacebook.com>

Roman Gushchin writes:
>On Wed, Jan 23, 2019 at 05:31:44PM -0500, Chris Down wrote:
>> memory.stat and other files already consider subtrees in their output,
>> and we should too in order to not present an inconsistent interface.
>>
>> The current situation is fairly confusing, because people interacting
>> with cgroups expect hierarchical behaviour in the vein of memory.stat,
>> cgroup.events, and other files. For example, this causes confusion when
>> debugging reclaim events under low, as currently these always read "0"
>> at non-leaf memcg nodes, which frequently causes people to misdiagnose
>> breach behaviour. The same confusion applies to other counters in this
>> file when debugging issues.
>>
>> Aggregation is done at write time instead of at read-time since these
>> counters aren't hot (unlike memory.stat which is per-page, so it does it
>> at read time), and it makes sense to bundle this with the file
>> notifications.
>
>I agree with the consistency argument (matching cgroup.events, ...),
>and it's definitely looks better for oom* events, but at the same time it feels
>like a API break.
>
>Just for example, let's say you have a delegated sub-tree with memory.max
>set. Earlier, getting memory.high/max event meant that the whole sub-tree
>is tight on memory, and, for example, led to shutdown of some parts of the tree.
>After your change, it might mean that some sub-cgroup has reached its limit,
>and probably doesn't matter on the top level.

Yeah, this is something I was thinking about while writing it. I think there's 
an argument to be made either way, since functionally they can both represent 
the same feature set, just in different ways.

In the subtree-propagated version you can find the level of the hierarchy that 
the event fired at by checking parent events vs. their subtrees' events, and 
this also allows trivially setting up event watches per-subtree.

In the previous, non-propagated version, it's more trivial to work out the 
level as the event only appears in that memory.events file, but it's harder to 
actually find out about the existence of such an event because you need to keep 
a watch for each individual cgroup in the subtree at all times.

So I think there's a reasonable argument to be made in favour of considering 
subtrees.

1. I'm not aware of anyone major currently relying on using the individual 
subtree level to indicate only subtree-level events.
2. Also, being able to detect the level at which an event happened can be 
achieved in both versions by comparing event counters.
3. Having memory.events work like cgroup.events and others seems to fit with 
principle of least astonishment.

That said, I agree that there's a tradeoff here, but in my experience this 
behaviour more closely resembles user intuition and better matches the overall 
semantics around hierarchical behaviour we've generally established for cgroup 
v2.

>Maybe it's still ok, but we definitely need to document it better. It feels
>bad that different versions of the kernel will handle it differently, so
>the userspace has to workaround it to actually use these events.

That's perfectly reasonable. I'll update the documentation to match.

>Also, please, make sure that it doesn't break memcg kselftests.

For sure.

>We don't have memory.events file for the root cgroup, so we can stop earlier.

Oh yeah, I missed that when changing from a for loop to do/while. I'll fix that 
up, thanks.

Thanks for your feedback!

  reply	other threads:[~2019-01-24  1:03 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-23 22:31 [PATCH 2/2] mm: Consider subtrees in memory.events Chris Down
2019-01-24  0:24 ` Roman Gushchin
2019-01-24  1:03   ` Chris Down [this message]
2019-01-24  8:22 ` Michal Hocko
2019-01-24 15:21   ` Tejun Heo
2019-01-24 15:51     ` Michal Hocko
2019-01-24 16:00   ` Johannes Weiner
2019-01-24 17:01     ` Michal Hocko
2019-01-24 18:23       ` Johannes Weiner
2019-01-25  8:42         ` Michal Hocko
2019-01-25 16:51           ` Tejun Heo
2019-01-25 17:37             ` Michal Hocko
2019-01-25 17:37               ` Michal Hocko
2019-01-25 18:28               ` Tejun Heo
2019-01-28 12:51                 ` Michal Hocko
2019-01-28 14:28                   ` Tejun Heo
2019-01-28 14:52                     ` Michal Hocko
2019-01-28 14:54                       ` Tejun Heo
2019-01-28 15:18                         ` Michal Hocko
2019-01-28 15:41                           ` Tejun Heo
2019-01-28 17:05                             ` Michal Hocko
2019-01-28 17:49                               ` Tejun Heo
2019-01-29 14:43                                 ` Michal Hocko
2019-01-29 14:52                                   ` Tejun Heo
2019-01-30 16:50                                     ` Michal Hocko
2019-01-30 17:06                                       ` Tejun Heo
2019-01-30 17:41                                         ` Michal Hocko
2019-01-30 17:52                                           ` Tejun Heo
2019-01-30 18:16                                             ` Michal Hocko
2019-01-30 19:11                                         ` Shakeel Butt
2019-01-30 19:11                                           ` Shakeel Butt
2019-01-30 19:27                                           ` Johannes Weiner
2019-01-30 19:30                                             ` Johannes Weiner
2019-01-30 19:37                                               ` Shakeel Butt
2019-01-30 19:37                                                 ` Shakeel Butt
2019-01-30 19:23                   ` Johannes Weiner
2019-01-30 20:05                     ` Michal Hocko
2019-01-30 21:31                       ` Johannes Weiner
2019-01-31  8:58                         ` Michal Hocko
2019-01-31 16:22                           ` Johannes Weiner
2019-02-01 10:27                             ` Michal Hocko
2019-02-01 16:34                               ` Johannes Weiner
2019-01-28 15:59                 ` Shakeel Butt
2019-01-28 15:59                   ` Shakeel Butt
2019-01-28 16:05                   ` Tejun Heo
2019-01-28 16:08                     ` Shakeel Butt
2019-01-28 16:08                       ` Shakeel Butt
2019-01-28 16:12                       ` Tejun Heo
2019-01-28 14:30 ` Tejun Heo
2019-02-08 22:43 ` [PATCH v2 1/2] mm: Rename ambiguously named memory.stat counters and functions Chris Down
2019-02-08 22:44   ` [PATCH v2 2/2] mm: Consider subtrees in memory.events Chris Down
2019-02-11 19:01     ` Johannes Weiner
2019-02-11 18:55   ` [PATCH v2 1/2] mm: Rename ambiguously named memory.stat counters and functions Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190124010306.GA9055@chrisdown.name \
    --to=chris@chrisdown.name \
    --cc=Kernel-team@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=dennis@kernel.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.