linux-mm.kvack.org archive mirror
From: Johannes Weiner <hannes@cmpxchg.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Linux MM <linux-mm@kvack.org>,
	Cgroups <cgroups@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH 00/11] mm: fix page aging across multiple cgroups
Date: Thu, 7 Nov 2019 09:45:55 -0800
Message-ID: <20191107174555.GA116752@cmpxchg.org>
In-Reply-To: <CALvZod7821vuP_KcOKZkzKu-6b_kzDPrximi3E-Ld95fd=zbMg@mail.gmail.com>

On Wed, Nov 06, 2019 at 06:50:25PM -0800, Shakeel Butt wrote:
> On Mon, Jun 3, 2019 at 2:59 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
> >
> > When applications are put into unconfigured cgroups for memory
> > accounting purposes, the cgrouping itself should not change the
> > behavior of the page reclaim code. We expect the VM to reclaim the
> > coldest pages in the system. But right now the VM can reclaim hot
> > pages in one cgroup while there is eligible cold cache in others.
> >
> > This is because one part of the reclaim algorithm isn't truly cgroup
> > hierarchy aware: the inactive/active list balancing. That is the part
> > that is supposed to protect hot cache data from one-off streaming IO.
> >
> > The recursive cgroup reclaim scheme will scan and rotate the physical
> > LRU lists of each eligible cgroup at the same rate in a round-robin
> > fashion, thereby establishing a relative order among the pages of all
> > those cgroups. However, the inactive/active balancing decisions are
> > made locally within each cgroup, so when a cgroup is running low on
> > cold pages, its hot pages will get reclaimed - even when sibling
> > cgroups have plenty of cold cache eligible in the same reclaim run.
> >
> > For example:
> >
> >    [root@ham ~]# head -n1 /proc/meminfo
> >    MemTotal:        1016336 kB
> >
> >    [root@ham ~]# ./reclaimtest2.sh
> >    Establishing 50M active files in cgroup A...
> >    Hot pages cached: 12800/12800 workingset-a
> >    Linearly scanning through 18G of file data in cgroup B:
> >    real    0m4.269s
> >    user    0m0.051s
> >    sys     0m4.182s
> >    Hot pages cached: 134/12800 workingset-a
> >
> 
> Can you share reclaimtest2.sh as well? Maybe a selftest to
> monitor/test future changes.

I wish it were more portable, but it really only does what it says in
the log output, in a pretty hacky way, with all parameters hard-coded
to my test environment:

---

#!/bin/bash

# this should protect workingset-a from workingset-b

set -e
#set -x

echo Establishing 50M active files in cgroup A...
rmdir /cgroup/workingset-a 2>/dev/null || true
mkdir /cgroup/workingset-a
echo $$ > /cgroup/workingset-a/cgroup.procs
rm -f workingset-a
dd of=workingset-a bs=1M count=0 seek=50 2>/dev/null >/dev/null
cat workingset-a > /dev/null
cat workingset-a > /dev/null
cat workingset-a > /dev/null
cat workingset-a > /dev/null
cat workingset-a > /dev/null
cat workingset-a > /dev/null
cat workingset-a > /dev/null
cat workingset-a > /dev/null
echo -n "Hot pages cached: "
./mincore workingset-a

echo -n "Linearly scanning through 2G of file data in cgroup B:"
# send rmdir's error to /dev/null when the cgroup does not exist yet
rmdir /cgroup/workingset-b 2>/dev/null || true
mkdir /cgroup/workingset-b
echo $$ > /cgroup/workingset-b/cgroup.procs
rm -f workingset-b
dd of=workingset-b bs=1M count=0 seek=2048 2>/dev/null >/dev/null
time (
  cat workingset-b > /dev/null
  cat workingset-b > /dev/null
  cat workingset-b > /dev/null
  cat workingset-b > /dev/null
  cat workingset-b > /dev/null
  cat workingset-b > /dev/null
  cat workingset-b > /dev/null
  cat workingset-b > /dev/null )
echo -n "Hot pages cached: "
./mincore workingset-a
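
The ./mincore helper called above is not included in this message. A rough stand-in, assuming util-linux fincore(1) and GNU stat are available, might look like the sketch below. The function names pages_for_bytes, hot_pages and still_protected are hypothetical, and the pass/fail threshold is an arbitrary illustration rather than a value from this thread.

```shell
#!/bin/bash
# Hypothetical stand-in for the ./mincore helper; NOT the original tool.
# Assumes util-linux fincore(1) and GNU stat are installed.

# Number of pages needed to hold a file of the given size in bytes.
pages_for_bytes() {
  local bytes=$1 page_size=$2
  echo $(( (bytes + page_size - 1) / page_size ))
}

# Print "<resident>/<total> <file>", mirroring the
# "12800/12800 workingset-a" format seen in the log output.
hot_pages() {
  local file=$1 page_size resident total
  page_size=$(getconf PAGESIZE)
  # PAGES is fincore's count of the file's pages resident in the page cache.
  resident=$(fincore --noheadings --raw --output PAGES "$file")
  total=$(pages_for_bytes "$(stat -c %s "$file")" "$page_size")
  echo "${resident}/${total} ${file}"
}

# Succeed only if at least min_percent of the pages are still resident.
# The threshold is up to the caller; nothing in the thread prescribes one.
still_protected() {
  local resident=$1 total=$2 min_percent=$3
  [ $(( resident * 100 )) -ge $(( total * min_percent )) ]
}
```

With these definitions sourced, `hot_pages workingset-a` could replace the `./mincore workingset-a` calls, and a selftest could pass the two numbers to `still_protected` to turn the before/after comparison into an exit status.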



