From: Yang Shi <shy828301@gmail.com>
To: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>,
	Shakeel Butt <shakeelb@google.com>, Tejun Heo <tj@kernel.org>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Christian Brauner <brauner@kernel.org>
Subject: Re: [RFC PATCH] mm: memcontrol: don't account swap failures not due to cgroup limits
Date: Fri, 3 Feb 2023 11:07:30 -0800
Message-ID: <CAHbLzkpk+6+kzsxmJ_MK+708rpCEjB2njnarLkzfzXX-MUyG7g@mail.gmail.com>
In-Reply-To: <Y91ZsDSIr2oFHu3E@P9FQF9L96D.corp.robot.car>

On Fri, Feb 3, 2023 at 11:00 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
> On Thu, Feb 02, 2023 at 10:56:26AM -0500, Johannes Weiner wrote:
> > Christian reports the following situation in a cgroup that doesn't
> > have memory.swap.max configured:
> >
> >   $ cat memory.swap.events
> >   high 0
> >   max 0
> >   fail 6218
> >
> > Upon closer examination, this is an ARM64 machine that doesn't support
> > swapping out THPs.
>
> Do we expect it to be added any time soon, or is it caused by some system
> limitations?

AFAIK, THP swap-out on arm64 has been supported since 6.0. See commit d0637c505f8a1.
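
For anyone who wants to check whether a given machine is hitting the
fallback path, here is a rough sketch (not a tested tool) that parses the
"key value" format shared by memory.swap.events and /proc/vmstat and
computes the share of THP swap-out attempts that fell back to splitting.
The helper names are made up for illustration. The swap.events sample is
the one from the report; the vmstat numbers are invented.

```python
def parse_flat_keyed(text):
    """Parse 'key value' lines, as found in memory.swap.events and /proc/vmstat."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def thp_fallback_ratio(vmstat):
    """Fraction of THP swap-out attempts that fell back to splitting the THP."""
    swapped_whole = vmstat.get("thp_swpout", 0)
    fallback = vmstat.get("thp_swpout_fallback", 0)
    total = swapped_whole + fallback
    return fallback / total if total else 0.0


# swap.events values are from the report; vmstat values are invented.
swap_events = parse_flat_keyed("high 0\nmax 0\nfail 6218\n")
vmstat = parse_flat_keyed("thp_swpout 0\nthp_swpout_fallback 6218\n")
print(swap_events["fail"], thp_fallback_ratio(vmstat))
```

On a live system the two inputs would come from /proc/vmstat and from
memory.swap.events in the cgroup's directory. A ratio close to 1 together
with a non-zero "fail" count and no swap limit configured would point at
the THP fallback rather than at the cgroup.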

>
> > In that case, the first get_swap_page() fails, and
> > the kernel falls back to splitting the THP and swapping the 4k
> > constituents one by one. /proc/vmstat confirms this with a high rate
> > of thp_swpout_fallback events.
> >
> > While the behavior can ultimately be explained, it's unexpected and
> > confusing. I see three choices for how to address this:
> >
> > a) Specifically exclude THP fallbacks from being counted, as the
> >    failure is transient and the memory is ultimately swapped.
> >
> >    Arguably, though, the user would like to know if their cgroup's
> >    swap limit is causing high rates of THP splitting during swapout.
>
> I agree, but it's probably better to reflect it in the form of a per-memcg
> THP split failure counter (e.g. in memory.stat), not as swap-out failures.
> Overall option a) looks preferable to me. Especially if in the long run
> the arm64 limitation will be fixed.
>
> >
> > b) Only count cgroup swap events when they are actually due to a
> >    cgroup's own limit. Exclude failures that are due to physical swap
> >    shortage or other system-level conditions (like !THP_SWAP). Also
> >    count them at the level where the limit is configured, which may be
> >    above the local cgroup that holds the page-to-be-swapped.
> >
> >    This is in line with how memory.swap.high, memory.high and
> >    memory.max events are counted.
> >
> >    However, it's a change in documented behavior.
>
> I'm not sure about this option: I can easily imagine a setup with a
> memcg-specific swap space, which would require setting an artificial
> memory.swap.max to get the fail counter working. On the other hand, it's not a
> deal breaker.
>
> Thanks!
>
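
To make the accounting in option b) concrete, here is a toy user-space
model. It is only a sketch of the semantics described in the quoted text,
not kernel code: the Cgroup class, its fields and both functions are
hypothetical, and the system_can_swap flag stands in for any system-level
condition such as !THP_SWAP or physical swap shortage.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cgroup:
    name: str
    parent: Optional["Cgroup"] = None
    swap_max: Optional[int] = None  # None models "no limit configured"
    swap_usage: int = 0             # hierarchical usage, in pages
    events: dict = field(default_factory=lambda: {"max": 0, "fail": 0})


def limiting_ancestor(cg, nr_pages):
    """Return the nearest cgroup (self or ancestor) whose limit blocks the charge."""
    while cg is not None:
        if cg.swap_max is not None and cg.swap_usage + nr_pages > cg.swap_max:
            return cg
        cg = cg.parent
    return None


def try_charge_swap(cg, nr_pages, system_can_swap=True):
    """Option b): count a failure only when a cgroup limit caused it."""
    if not system_can_swap:
        return False  # system-level failure: no cgroup event is counted
    limiter = limiting_ancestor(cg, nr_pages)
    if limiter is not None:
        # Count where the limit is configured, not in the local cgroup.
        limiter.events["max"] += 1
        limiter.events["fail"] += 1
        return False
    node = cg
    while node is not None:
        node.swap_usage += nr_pages
        node = node.parent
    return True


root = Cgroup("root")
parent = Cgroup("parent", parent=root, swap_max=512)
child = Cgroup("child", parent=parent)

# 512 pages, e.g. one 2 MiB THP with 4 KiB base pages.
no_thp_swap = try_charge_swap(child, 512, system_can_swap=False)
within_limit = try_charge_swap(child, 512)
over_limit = try_charge_swap(child, 1)
print(no_thp_swap, within_limit, over_limit, parent.events, child.events)
```

In this model the first attempt fails without touching any counter, and
the last one is counted in "parent", where the limit lives, while "child"
stays at zero. Roman's concern above also shows up here: with no swap_max
set anywhere in the hierarchy, the fail counter never moves.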
