linux-kernel.vger.kernel.org archive mirror
From: Shakeel Butt <shakeelb@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>,
	Yang Shi <yang.shi@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Tejun Heo <tj@kernel.org>,
	Roman Gushchin <guro@fb.com>, Linux MM <linux-mm@kvack.org>,
	Cgroups <cgroups@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH] mm: memcontrol: asynchronous reclaim for memory.high
Date: Wed, 26 Feb 2020 12:25:33 -0800
Message-ID: <CALvZod7fya+o8mO+qo=FXjk3WgNje=2P=sxM5StgdBoGNeXRMg@mail.gmail.com>
In-Reply-To: <20200219181219.54356-1-hannes@cmpxchg.org>

On Wed, Feb 19, 2020 at 10:12 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> We have received regression reports from users whose workloads moved
> into containers and subsequently encountered new latencies. For some
> users these were a nuisance, but for some it meant missing their SLA
> response times. We tracked those delays down to cgroup limits, which
> inject direct reclaim stalls into the workload where previously all
> reclaim was handled by kswapd.
>
> This patch adds asynchronous reclaim to the memory.high cgroup limit
> while keeping direct reclaim as a fallback. In our testing, this
> eliminated all direct reclaim from the affected workload.
>
> memory.high has a grace buffer of about 4% between when it becomes
> exceeded and when allocating threads get throttled. We can use the
> same buffer for the async reclaimer to operate in. If the worker
> cannot keep up and the grace buffer is exceeded, allocating threads
> will fall back to direct reclaim before getting throttled.
>
> For irq-context, there's already async memory.high enforcement. Re-use
> that work item for all allocating contexts, but switch it to the
> unbound workqueue so reclaim work doesn't compete with the workload.
> The work item is per cgroup, which means the workqueue infrastructure
> will create at maximum one worker thread per reclaiming cgroup.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>  mm/memcontrol.c | 60 +++++++++++++++++++++++++++++++++++++------------
>  mm/vmscan.c     | 10 +++++++--

This reminds me of the per-memcg kswapd proposal from LSFMM 2018
(https://lwn.net/Articles/753162/).

If I understand this correctly, the use case is a job that, instead of
doing direct reclaim itself (potentially in latency-sensitive tasks),
prefers to have a background, non-latency-sensitive task do the
reclaim. I am wondering if we can use the memory.high notification
along with a new memcg interface (like memory.try_to_free_pages) to
implement a user space background reclaimer. That would resolve the
cpu accounting concerns, as the user space background reclaimer can
share the cpu cost with the task.
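
To make the idea concrete, here is a rough sketch of one iteration of
such a user space reclaimer. It is only an illustration under stated
assumptions: memory.events, memory.high and memory.current are the
existing cgroup v2 files, but memory.try_to_free_pages does not exist
today and is the hypothetical interface proposed above. The helper
names and the "headroom" parameter are made up for the example, and
waiting for the notification (for instance via poll or inotify on
memory.events) is left to the caller.

```python
from pathlib import Path


def parse_memory_events(text: str) -> dict:
    """Parse the flat 'key value' lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key and value.strip().isdigit():
            events[key] = int(value)
    return events


def bytes_to_reclaim(current: int, high: int, headroom: int) -> int:
    """Bytes to free so that usage drops to (high - headroom); never negative."""
    target = max(high - headroom, 0)
    return max(current - target, 0)


def reclaim_once(cgroup: Path, last_high_events: int, headroom: int) -> int:
    """Run one reclaimer iteration and return the current 'high' event count."""
    events = parse_memory_events((cgroup / "memory.events").read_text())
    high_events = events.get("high", 0)
    if high_events > last_high_events:
        high_raw = (cgroup / "memory.high").read_text().strip()
        if high_raw != "max":  # "max" means no memory.high limit is set
            current = int((cgroup / "memory.current").read_text())
            amount = bytes_to_reclaim(current, int(high_raw), headroom)
            if amount:
                # HYPOTHETICAL file: the interface proposed in this mail,
                # not something the kernel provides today.
                (cgroup / "memory.try_to_free_pages").write_text(str(amount))
    return high_events
```

Running this loop from a task inside the job's own cgroup is what
would let the reclaim cpu time be charged to the job.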

One concern with this approach is that the memory.high notification
comes too late: by then the latency-sensitive task has already faced
the stall. We can either introduce a threshold notification or another
notification-only limit, like memory.near_high, which can be set based
on the job's rate of allocations; when the usage hits this limit, the
kernel just notifies user space.
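
As a back-of-the-envelope illustration of setting such a limit from the
allocation rate: memory.near_high is the hypothetical knob named above,
and the function name, the "reaction time" parameter and the simple
linear model are assumptions made only for this sketch.

```python
def near_high_threshold(high_bytes, alloc_rate_bytes_per_sec, reaction_time_sec):
    """Pick a notification threshold below memory.high.

    The gap is sized so that a job allocating at the given rate needs
    about reaction_time_sec to grow from the threshold up to memory.high,
    which is the time the user space reclaimer has to wake up and act.
    """
    lead_bytes = int(alloc_rate_bytes_per_sec * reaction_time_sec)
    return max(high_bytes - lead_bytes, 0)
```

For example, with memory.high at 1000 bytes, a rate of 100 bytes/s and
a 2 second reaction time, the threshold would be 800 bytes.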

Shakeel
