From: Roman Gushchin <guro@fb.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>, Chris Down <chris@chrisdown.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	<cgroups@vger.kernel.org>, <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 4/4] memcg: synchronously enforce memory.high
Date: Thu, 10 Feb 2022 12:15:34 -0800
Message-ID: <YgVyZrDPxVgP6OLG@carbon.dhcp.thefacebook.com>
In-Reply-To: <20220210081437.1884008-5-shakeelb@google.com>

On Thu, Feb 10, 2022 at 12:14:37AM -0800, Shakeel Butt wrote:
> The high limit is used to throttle the workload without invoking the
> oom-killer. Recently we tried to use the high limit to right-size our
> internal workloads, more specifically to dynamically adjust the limits
> of a workload without letting it get oom-killed. However, due to
> limitations in the implementation of high limit enforcement, we
> observed that the mechanism fails for some real workloads.
> 
> The high limit is enforced on return-to-userspace, i.e. the kernel lets
> the usage go over the limit and, when execution returns to
> userspace, high reclaim is triggered and the process can get
> throttled as well. However, this mechanism fails for workloads which do
> large allocations in a single kernel entry, e.g. applications that
> mlock() a large chunk of memory in a single syscall. Such applications
> bypass the high limit and can trigger the oom-killer.
> 
> To make high limit enforcement more robust, this patch makes the limit
> enforcement synchronous. However, there are a couple of open questions
> about enforcing the high limit synchronously. What should be the
> behavior of a __GFP_NORETRY allocation on hitting the high limit? A
> similar question arises for allocations which do not allow blocking.
> This patch takes the approach of keeping the previous behavior, i.e.
> such allocations are not throttled synchronously but rely on the
> return-to-userspace mechanism to throttle the processes triggering them.
> 
> This patch does not remove the return-to-userspace high limit
> enforcement, for the reason mentioned in the previous paragraph. Also,
> for allocations where the memory usage is below the high limit but the
> swap usage is above swap's high limit, the allocators are throttled on
> return-to-userspace.
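
For illustration only, the two enforcement points described in the quoted
text can be compared with a toy model. This is not kernel code: the function
names, the page counts, and the assumption that reclaim always succeeds are
hypothetical and chosen just to show how far usage can overshoot the high
limit within a single kernel entry.

```python
# Toy model only: names and numbers are illustrative assumptions,
# and reclaim is assumed to always succeed.

def deferred_enforcement(usage, high, charges):
    """Charge every batch in one kernel entry; check the limit only on exit."""
    peak = usage
    for pages in charges:
        usage += pages
        peak = max(peak, usage)
    # Return-to-userspace: reclaim back down to the high limit.
    if usage > high:
        usage = high
    return usage, peak


def synchronous_enforcement(usage, high, charges):
    """Check the limit after each charge and reclaim immediately."""
    peak = usage
    for pages in charges:
        usage += pages
        peak = max(peak, usage)
        if usage > high:
            usage = high  # reclaim happens inside the kernel entry
    return usage, peak


# One mlock()-like entry that charges 1000 pages in batches of 100,
# in a group whose high limit is 300 pages.
charges = [100] * 10
print(deferred_enforcement(0, 300, charges))     # (300, 1000)
print(synchronous_enforcement(0, 300, charges))  # (300, 400)
```

In this sketch the deferred variant peaks at 1000 pages before any reclaim
runs, while the synchronous variant never exceeds the limit by more than one
batch.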

Has this approach been extensively tested in production?

Injecting sleeps at the return-to-userspace point is safe in terms of priority
inversions: a slowed-down task is unlikely to affect the rest of the system.

It is far less predictable for an arbitrary allocation in kernel mode: what if
the task is already holding a system-wide resource?

Someone might argue that this is no different from a system-wide memory
shortage, where the same allocation might go into direct reclaim anyway, but
given how memory.high is used, it will happen far more often.
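
A rough way to picture the concern, as a toy model rather than anything
measured: assume the throttled task holds a shared lock, and a few other tasks
queue behind it in FIFO order. The function name, the millisecond figures, and
the FIFO assumption are all hypothetical.

```python
# Toy model only: all names and timings are illustrative assumptions.

def waiter_delays(hold_ms, throttle_ms, n_waiters, throttle_while_holding):
    """Return how long each queued task waits to acquire the shared lock.

    hold_ms: time each task spends inside the critical section.
    throttle_ms: sleep injected into the first (throttled) task.
    throttle_while_holding: True if the sleep happens inside the kernel
        while the lock is held, False if it happens on return to userspace
        after the lock has been released.
    """
    first_release = hold_ms + (throttle_ms if throttle_while_holding else 0)
    # Waiters take the lock one after another, each holding it for hold_ms.
    return [first_release + i * hold_ms for i in range(n_waiters)]


print(waiter_delays(1, 100, 3, throttle_while_holding=False))  # [1, 2, 3]
print(waiter_delays(1, 100, 3, throttle_while_holding=True))   # [101, 102, 103]
```

With the sleep taken on return to userspace, only the throttled task pays for
it; with the sleep taken while the lock is held, every waiter inherits the
full delay.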

Thanks!
