From: "Michal Koutný" <mkoutny@suse.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.com>, Roman Gushchin <guro@fb.com>,
Shakeel Butt <shakeelb@google.com>,
Seth Jennings <sjenning@redhat.com>,
Dan Streetman <ddstreet@ieee.org>,
Minchan Kim <minchan@kernel.org>,
linux-mm@kvack.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2 6/6] zswap: memcg accounting
Date: Mon, 16 May 2022 16:34:59 +0200
Message-ID: <20220516143459.GA17557@blackbody.suse.cz>
In-Reply-To: <Yn6QfdouzkcrygTR@cmpxchg.org>
On Fri, May 13, 2022 at 01:08:13PM -0400, Johannes Weiner <hannes@cmpxchg.org> wrote:
> Right, it's accounted as a subset rather than fully disjointed. But it
> is a limitable counter of its own, so I exported it as such, with a
> current and a max knob. This is comparable to the kmem counter in v1.
That counter and limit didn't turn out well. I liked the analogy to
writeback (and dirty limit) better.
> From an API POV it would be quite strange to have max for a counter
> that has no current. Likewise it would be strange for a major memory
> consumer to be missing from memory.stat.
My understanding would be: keep all the memory.stat entries as you
propose, add no extra .current counter, and keep the .max knob for
zswap configuration.
> It needs to be configured to the workload's access frequency curve,
> which can be done with trial-and-error (reasonable balance between
> zswpins and pswpins) or in a more targeted manner using tools such as
> page_idle, damon etc.
> [...]
> Because for load tuning, bytes make much more sense. That's how you
> measure the workingset, so a percentage is an awkward indirection. At
> the cgroup level, it makes even less sense: all memcg tunables are in
> bytes, it would be quite weird to introduce a "max" that is 0-100. Add
> the confusion of how percentages would propagate down the hierarchy...
Thanks for the explanation. I guess there's no simple transformation of
the information available in the kernel that would allow a more semantic
configuration of this value. The rather crude absolute value requires
(but also simply allows) some calibration or responsive tuning.
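
To make the calibration signal concrete, here is a minimal sketch of the
kind of measurement such tuning could watch: the share of swap-ins served
from zswap over an interval. The function names and sample values are
hypothetical; it assumes zswpin is taken from the cgroup's memory.stat and
pswpin from /proc/vmstat, both in "name value" line format. It only
measures and deliberately suggests no policy for changing the limit.

```python
def parse_counters(text):
    """Parse 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Ignore anything that is not exactly "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def zswap_swapin_share(before, after):
    """Fraction of swap-ins in the interval that were served from zswap.

    Returns None when no swap-ins happened in the interval.
    """
    zswap_delta = after["zswpin"] - before["zswpin"]
    disk_delta = after["pswpin"] - before["pswpin"]
    total = zswap_delta + disk_delta
    if total == 0:
        return None
    return zswap_delta / total


# Made-up samples taken at the start and end of an observation interval.
sample_before = parse_counters("zswpin 100\npswpin 50\n")
sample_after = parse_counters("zswpin 400\npswpin 150\n")
share = zswap_swapin_share(sample_before, sample_after)
```

Which share counts as a "reasonable balance" depends on the workload, so
the sketch stops at the measurement.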
> I don't traverse all ancestors, I bail on disabled groups and skip
> unlimited ones.
I admit I missed that.
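
For reference, my reading of the walk described above, as a toy userspace
model rather than the patch's code. The Group class, its fields, and the
"flushed" list are hypothetical; the exact comparison against the limit is
an assumption as well.

```python
from dataclasses import dataclass
from typing import Optional

UNLIMITED = None  # stands in for a zswap limit set to "max"


@dataclass
class Group:
    name: str
    zswap_max: Optional[int]   # limit in pages; 0 disables zswap, None = unlimited
    zswap_pages: int           # current zswap usage in pages
    parent: Optional["Group"] = None


def may_zswap(group):
    """Return (allowed, flushed): the decision and which groups needed a flush."""
    flushed = []
    node = group
    while node is not None:
        if node.zswap_max == 0:
            # Disabled group: bail out without flushing anything further.
            return False, flushed
        if node.zswap_max is not UNLIMITED:
            # Only limited groups need up-to-date stats, so only they are flushed.
            flushed.append(node.name)
            if node.zswap_pages >= node.zswap_max:
                return False, flushed
        node = node.parent
    return True, flushed


root = Group("root", UNLIMITED, 0)
mid = Group("mid", 100, 40, parent=root)
leaf = Group("leaf", UNLIMITED, 40, parent=mid)
```

In this model, may_zswap(leaf) consults stats only for "mid"; the
unlimited leaf and root are skipped.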
> Flushing unnecessary groups with a ratelimit doesn't sound like an
> improvement to me.
Then I'm only concerned about a situation where a single deep memcg
undergoes both workingset_refault() and zswap querying.
The latter (a bare call to cgroup_rstat_flush()) won't reset
stats_flush_threshold, so the former (or, more likely, the async flush)
would attempt a flush too. The flush work (on the leaf memcg) would be
done twice, even though the second time the accumulated error may be
within the tolerance.
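
A toy model of that interaction, only to illustrate the concern. The class
and method names are hypothetical; pending_updates stands in for
stats_flush_threshold, and the real code is per-CPU and per-cgroup, which
this ignores.

```python
class FlushModel:
    """Toy model of a stats flush guarded by an update-count threshold."""

    def __init__(self, threshold):
        self.threshold = threshold   # updates tolerated before a flush is due
        self.pending_updates = 0     # stands in for stats_flush_threshold
        self.unflushed = 0           # deltas not yet folded into the totals
        self.flush_count = 0         # how many times flush work actually ran

    def update(self, n=1):
        self.pending_updates += n
        self.unflushed += n

    def _do_flush(self):
        self.unflushed = 0
        self.flush_count += 1

    def bare_flush(self):
        # Models a direct flush: does the work but leaves the counter alone.
        self._do_flush()

    def ratelimited_flush(self):
        # Models the thresholded path: flush only if enough updates were
        # counted, and reset the counter afterwards.
        if self.pending_updates > self.threshold:
            self._do_flush()
            self.pending_updates = 0
            return True
        return False


def demo():
    model = FlushModel(threshold=4)
    model.update(10)
    model.bare_flush()                       # zswap-style query
    redundant = model.ratelimited_flush()    # refault or async flush path
    return model.flush_count, redundant
```

In the model, the thresholded path still flushes after the bare flush
because the counter was never reset, although nothing was left to fold in.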
This might require attention in the future (depending on data about how
it actually performs). I see how the current approach is justified.
Regards,
Michal