From: Jan Kara <jack@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@kernel.org>,
Chris Mason <clm@fb.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
"vdavydov.dev@gmail.com" <vdavydov.dev@gmail.com>
Subject: Re: [PATCH 1/2] Revert "mm: don't reclaim inodes with many attached pages"
Date: Fri, 8 Feb 2019 10:55:07 +0100
Message-ID: <20190208095507.GB6353@quack2.suse.cz>
In-Reply-To: <20190207213727.a791db810341cec2c013ba93@linux-foundation.org>
On Thu 07-02-19 21:37:27, Andrew Morton wrote:
> On Thu, 7 Feb 2019 11:27:50 +0100 Jan Kara <jack@suse.cz> wrote:
>
> > On Fri 01-02-19 09:19:04, Dave Chinner wrote:
> > > Maybe for memcgs, but that's exactly the opposite of what we want to
> > > do for global caches (e.g. filesystem metadata caches). We need to
> > > make sure that a single, heavily pressured cache doesn't evict small
> > > caches that see lower pressure but are equally important for
> > > performance.
> > >
> > > e.g. I've noticed recently a significant increase in RMW cycles in
> > > XFS inode cache writeback during various benchmarks. It hasn't
> > > affected performance because the machine has IO and CPU to burn, but
> > > on slower machines and storage, it will have a major impact.
> >
> > Just as a data point, our performance testing infrastructure has bisected
> > down to the commits discussed in this thread as the cause of about 40%
> > regression in XFS file delete performance in bonnie++ benchmark.
> >
>
> Has anyone done significant testing with Rik's maybe-fix?
I will give it a spin with bonnie++ today. We'll see what comes out.
Honza
>
>
>
> From: Rik van Riel <riel@surriel.com>
> Subject: mm, slab, vmscan: accumulate gradual pressure on small slabs
>
> There are a few issues with the way the number of slab objects to scan is
> calculated in do_shrink_slab. First, for zero-seek slabs, we could leave
> the last object around forever. That could result in pinning a dying
> cgroup into memory, instead of reclaiming it. The fix for that is
> trivial.
>
> Secondly, small slabs receive much more pressure, relative to their size,
> than larger slabs, due to "rounding up" the minimum number of scanned
> objects to batch_size.
>
> We can keep the pressure on all slabs equal relative to their size by
> accumulating the scan pressure on small slabs over time, resulting in
> sometimes scanning an object, instead of always scanning several.
>
> This results in lower system CPU use, and a lower major fault rate, as
> actively used entries from smaller caches get reclaimed less aggressively,
> and need to be reloaded/recreated less often.
>
> [akpm@linux-foundation.org: whitespace fixes, per Roman]
> [riel@surriel.com: couple of fixes]
> Link: http://lkml.kernel.org/r/20190129142831.6a373403@imladris.surriel.com
> Link: http://lkml.kernel.org/r/20190128143535.7767c397@imladris.surriel.com
> Fixes: 4b85afbdacd2 ("mm: zero-seek shrinkers")
> Fixes: 172b06c32b94 ("mm: slowly shrink slabs with a relatively small number of objects")
> Signed-off-by: Rik van Riel <riel@surriel.com>
> Tested-by: Chris Mason <clm@fb.com>
> Acked-by: Roman Gushchin <guro@fb.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Dave Chinner <dchinner@redhat.com>
> Cc: Jonathan Lemon <bsd@fb.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: <stable@vger.kernel.org>
>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>
> --- a/include/linux/shrinker.h~mmslabvmscan-accumulate-gradual-pressure-on-small-slabs
> +++ a/include/linux/shrinker.h
> @@ -65,6 +65,7 @@ struct shrinker {
>
>  	long batch;	/* reclaim batch size, 0 = default */
>  	int seeks;	/* seeks to recreate an obj */
> +	int small_scan;	/* accumulate pressure on slabs with few objects */
>  	unsigned flags;
>
>  	/* These are for internal use */
> --- a/mm/vmscan.c~mmslabvmscan-accumulate-gradual-pressure-on-small-slabs
> +++ a/mm/vmscan.c
> @@ -488,18 +488,30 @@ static unsigned long do_shrink_slab(stru
>  		 * them aggressively under memory pressure to keep
>  		 * them from causing refetches in the IO caches.
>  		 */
> -		delta = freeable / 2;
> +		delta = (freeable + 1) / 2;
>  	}
>
>  	/*
>  	 * Make sure we apply some minimal pressure on default priority
> -	 * even on small cgroups. Stale objects are not only consuming memory
> +	 * even on small cgroups, by accumulating pressure across multiple
> +	 * slab shrinker runs. Stale objects are not only consuming memory
>  	 * by themselves, but can also hold a reference to a dying cgroup,
>  	 * preventing it from being reclaimed. A dying cgroup with all
>  	 * corresponding structures like per-cpu stats and kmem caches
>  	 * can be really big, so it may lead to a significant waste of memory.
>  	 */
> -	delta = max_t(unsigned long long, delta, min(freeable, batch_size));
> +	if (!delta && shrinker->seeks) {
> +		unsigned long nr_considered;
> +
> +		shrinker->small_scan += freeable;
> +		nr_considered = shrinker->small_scan >> priority;
> +
> +		delta = 4 * nr_considered;
> +		do_div(delta, shrinker->seeks);
> +
> +		if (delta)
> +			shrinker->small_scan -= nr_considered << priority;
> +	}
>
>  	total_scan += delta;
>  	if (total_scan < 0) {
> _
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
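For readers following the arithmetic in the quoted patch, here is a rough
userspace sketch of how the scan delta might be computed before and after
the change. It is not the kernel code: struct toy_shrinker, delta_before()
and delta_after() are hypothetical names, small_scan is an unsigned long
here rather than the int in the patch, and the "seeks != 0" branch
(freeable >> priority, times 4, divided by seeks) is assumed from the
surrounding do_shrink_slab() code, since the quoted hunk only shows the
zero-seek branch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the struct shrinker fields the patch touches. */
struct toy_shrinker {
	int seeks;                /* seeks to recreate an object, 0 = zero-seek */
	unsigned long small_scan; /* pressure carried over between runs */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Base pressure; the seeks branch is an assumption, see the note above. */
static uint64_t base_delta(const struct toy_shrinker *s,
			   unsigned long freeable, int priority, int round_up)
{
	if (s->seeks)
		return (uint64_t)(freeable >> priority) * 4 / (uint64_t)s->seeks;
	/* Zero-seek objects: scan half, optionally rounding up. */
	return ((uint64_t)freeable + (round_up ? 1 : 0)) / 2;
}

/* Before the patch: small results are raised to min(freeable, batch_size). */
static uint64_t delta_before(const struct toy_shrinker *s,
			     unsigned long freeable, int priority,
			     unsigned long batch_size)
{
	uint64_t delta = base_delta(s, freeable, priority, 0);
	uint64_t floor = min_ul(freeable, batch_size);

	return delta > floor ? delta : floor;
}

/* After the patch: a zero delta accumulates in small_scan instead. */
static uint64_t delta_after(struct toy_shrinker *s,
			    unsigned long freeable, int priority)
{
	uint64_t delta = base_delta(s, freeable, priority, 1);

	if (!delta && s->seeks) {
		unsigned long nr_considered;

		s->small_scan += freeable;
		nr_considered = s->small_scan >> priority;

		delta = 4 * (uint64_t)nr_considered / (uint64_t)s->seeks;

		/* Only consume the accumulated pressure once it yields work. */
		if (delta)
			s->small_scan -= nr_considered << priority;
	}
	return delta;
}
```

With priority 12 and seeks 2 (the kernel's DEF_PRIORITY and DEFAULT_SEEKS),
a cache holding 100 freeable objects gets a delta of 100 on every call to
delta_before(), i.e. the whole cache is scanned each time. delta_after()
returns 0 for the first 40 calls and 2 on the 41st, once small_scan has
crossed 1 << 12. For a zero-seek cache with a single freeable object,
(freeable + 1) / 2 gives 1 where freeable / 2 would give 0.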
Thread overview: 20+ messages
2019-01-30 4:17 [PATCH 0/2] [REGRESSION v4.19-20] mm: shrinkers are now way too aggressive Dave Chinner
2019-01-30 4:17 ` [PATCH 1/2] Revert "mm: don't reclaim inodes with many attached pages" Dave Chinner
2019-01-30 12:21 ` Chris Mason
2019-01-31 1:34 ` Dave Chinner
2019-01-31 9:10 ` Michal Hocko
2019-01-31 18:57 ` Roman Gushchin
2019-01-31 22:19 ` Dave Chinner
2019-02-04 21:47 ` Dave Chinner
2019-02-07 10:27 ` Jan Kara
2019-02-08 5:37 ` Andrew Morton
2019-02-08 9:55 ` Jan Kara [this message]
2019-02-08 12:50 ` Jan Kara
2019-02-08 22:49 ` Andrew Morton
2019-02-09 3:42 ` Roman Gushchin
2019-02-08 21:25 ` Dave Chinner
2019-02-11 15:34 ` Wolfgang Walter
2019-01-31 15:48 ` Chris Mason
2019-02-01 23:39 ` Dave Chinner
2019-01-30 4:17 ` [PATCH 2/2] Revert "mm: slowly shrink slabs with a relatively small number of objects" Dave Chinner
2019-01-30 5:48 ` [PATCH 0/2] [REGRESSION v4.19-20] mm: shrinkers are now way too aggressive Roman Gushchin