linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Shakeel Butt <shakeelb@google.com>
To: Tejun Heo <tj@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Linux MM <linux-mm@kvack.org>, Kernel Team <kernel-team@fb.com>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Chris Down <chris@chrisdown.name>,
	 Cgroups <cgroups@vger.kernel.org>
Subject: Re: [PATCH 0/3] memcg: Slow down swap allocation as the available space gets depleted
Date: Fri, 17 Apr 2020 10:51:10 -0700	[thread overview]
Message-ID: <CALvZod7-r0OrJ+-_uCy_p3BU3348ve2+YatiSdLvFaVqcqCs=w@mail.gmail.com> (raw)
In-Reply-To: <20200417173615.GB43469@mtj.thefacebook.com>

On Fri, Apr 17, 2020 at 10:36 AM Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Fri, Apr 17, 2020 at 10:18:15AM -0700, Shakeel Butt wrote:
> > > There currently are issues with anonymous memory management which makes them
> > > different / worse than page cache but I don't follow why swapping
> > > necessarily means that isolation is broken. Page refaults don't indicate
> > > that memory isolation is broken after all.
> >
> > Sorry, I meant the performance isolation. Direct reclaim does not
> > really differentiate who to stall and whose CPU to use.
>
> Can you please elaborate concrete scenarios? I'm having a hard time seeing
> differences from page cache.
>

Oh I was talking about the global reclaim here. In global reclaim, any
task can be throttled (throttle_direct_reclaim()). Memory freed by
using the CPU of high priority low latency jobs can be stolen by low
priority batch jobs.

> > > > memcg limit reclaim and memcg limits are overcommitted. Shouldn't
> > > > running out of swap will trigger the OOM earlier which should be
> > > > better than impacting the whole system.
> > >
> > > The primary scenario which was being considered was undercommitted
> > > protections but I don't think that makes any relevant differences.
> > >
> >
> > What is undercommitted protections? Does it mean there is still swap
> > available on the system but the memcg is hitting its swap limit?
>
> Hahaha, I assumed you were talking about memory.high/max and was saying that
> the primary scenarios that were being considered was usage of memory.low
> interacting with swap. Again, can you please give an concrete example so
> that we don't misunderstand each other?
>
> > > This is exactly similar to delay injection for memory.high. What's desired
> > > is slowing down the workload as the available resource is depleted so that
> > > the resource shortage presents as gradual degradation of performance and
> > > matching increase in resource PSI. This allows the situation to be detected
> > > and handled from userland while avoiding sudden and unpredictable behavior
> > > changes.
> > >
> >
> > Let me try to understand this with an example. Memcg 'A' has
>
> Ah, you already went there. Great.
>
> > memory.high = 100 MiB, memory.max = 150 MiB and memory.swap.max = 50
> > MiB. When A's usage goes over 100 MiB, it will reclaim the anon, file
> > and kmem. The anon will go to swap and increase its swap usage until
> > it hits the limit. Now the 'A' reclaim_high has fewer things (file &
> > kmem) to reclaim but the mem_cgroup_handle_over_high() will keep A's
> > increase in usage in check.
> >
> > So, my question is: should the slowdown by memory.high depends on the
> > reclaimable memory? If there is no reclaimable memory and the job hits
> > memory.high, should the kernel slow it down to crawl until the PSI
> > monitor comes and decides what to do. If I understand correctly, the
> > problem is the kernel slow down is not successful when reclaimable
> > memory is very low. Please correct me if I am wrong.
>
> In combination with memory.high, swap slowdown may not be necessary because
> memory.high's slow down mechanism is already there to handle "can't swap"
> scenario whether that's because swap is disabled wholesale, limited or
> depleted. However, please consider the following scenario.
>
> cgroup A has memory.low protection and no other restrictions. cgroup B has
> no protection and has access to swap. When B's memory starts bloating and
> gets the system under memory contention, it'll start consuming swap until it
> can't. When swap becomes depleted for B, there's nothing holding it back and
> B will start eating into A's protection.
>

In this example does 'B' have memory.high and memory.max set and by A
having no other restrictions, I am assuming you meant unlimited high
and max for A? Can 'A' use memory.min?

> The proposed mechanism just plugs another vector for the same condition
> where anonymous memory management breaks down because they can no longer be
> reclaimed due to swap unavailability.
>
> Thanks.
>
> --
> tejun


  reply	other threads:[~2020-04-17 17:51 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-17  1:06 Jakub Kicinski
2020-04-17  1:06 ` [PATCH 1/3] mm: prepare for swap over-high accounting and penalty calculation Jakub Kicinski
2020-04-17  1:06 ` [PATCH 2/3] mm: move penalty delay clamping out of calculate_high_delay() Jakub Kicinski
2020-04-17  1:06 ` [PATCH 3/3] mm: automatically penalize tasks with high swap use Jakub Kicinski
2020-04-17  7:37   ` Michal Hocko
2020-04-17 23:22     ` Jakub Kicinski
2020-04-17 16:11 ` [PATCH 0/3] memcg: Slow down swap allocation as the available space gets depleted Shakeel Butt
2020-04-17 16:23   ` Tejun Heo
2020-04-17 17:18     ` Shakeel Butt
2020-04-17 17:36       ` Tejun Heo
2020-04-17 17:51         ` Shakeel Butt [this message]
2020-04-17 19:35           ` Tejun Heo
2020-04-17 21:51             ` Shakeel Butt
2020-04-17 22:59               ` Tejun Heo
2020-04-20 16:12                 ` Shakeel Butt
2020-04-20 16:47                   ` Tejun Heo
2020-04-20 17:03                     ` Michal Hocko
2020-04-20 17:06                       ` Tejun Heo
2020-04-21 11:06                         ` Michal Hocko
2020-04-21 14:27                           ` Johannes Weiner
2020-04-21 16:11                             ` Michal Hocko
2020-04-21 16:56                               ` Johannes Weiner
2020-04-22 13:26                                 ` Michal Hocko
2020-04-22 14:15                                   ` Johannes Weiner
2020-04-22 15:43                                     ` Michal Hocko
2020-04-22 17:13                                       ` Johannes Weiner
2020-04-22 18:49                                         ` Michal Hocko
2020-04-23 15:00                                           ` Johannes Weiner
2020-04-24 15:05                                             ` Michal Hocko
2020-04-28 14:24                                               ` Johannes Weiner
2020-04-29  9:55                                                 ` Michal Hocko
2020-04-21 19:09                             ` Shakeel Butt
2020-04-21 21:59                               ` Johannes Weiner
2020-04-21 22:39                                 ` Shakeel Butt
2020-04-21 15:20                           ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALvZod7-r0OrJ+-_uCy_p3BU3348ve2+YatiSdLvFaVqcqCs=w@mail.gmail.com' \
    --to=shakeelb@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=chris@chrisdown.name \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=kuba@kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tj@kernel.org \
    --subject='Re: [PATCH 0/3] memcg: Slow down swap allocation as the available space gets depleted' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).