From: Masayoshi Mizuma <msys.mizuma@gmail.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: Roman Gushchin <guro@fb.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Cgroups <cgroups@vger.kernel.org>, Linux MM <linux-mm@kvack.org>
Subject: Re: memcg: performance degradation since v5.9
Date: Fri, 9 Apr 2021 12:35:39 -0400
Message-ID: <20210409163539.5374pde3u6gkbg4a@gabell>
In-Reply-To: <CALvZod58NBQLvk2m7Mb=_0oGCApcNeisxVuE1b+qh1OKDSy0Ag@mail.gmail.com>
On Thu, Apr 08, 2021 at 02:08:13PM -0700, Shakeel Butt wrote:
> On Thu, Apr 8, 2021 at 1:54 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Thu, Apr 08, 2021 at 03:39:48PM -0400, Masayoshi Mizuma wrote:
> > > Hello,
> > >
> > > I detected a performance degradation issue for a benchmark of PostgreSQL [1],
> > > and the issue seems to be related to object level memory cgroup [2].
> > > I would appreciate it if you could give me some ideas to solve it.
> > >
> > > The benchmark reports transactions per second (tps), and the tps on v5.9
> > > and later kernels is about 10%-20% lower than on v5.8.
> > >
> > > The benchmark does sendto() and recvfrom() system calls repeatedly,
> > > and the system calls take longer than on v5.8.
> > > The result of perf trace of the benchmark is as follows:
> > >
> > > - v5.8
> > >
> > >   syscall            calls errors    total       min       avg       max stddev
> > >                                     (msec)    (msec)    (msec)    (msec)    (%)
> > >   --------------- -------- ------ -------- --------- --------- --------- ------
> > >   sendto            699574      0 2595.220     0.001     0.004     0.462  0.03%
> > >   recvfrom         1391089 694427 2163.458     0.001     0.002     0.442  0.04%
> > >
> > > - v5.9
> > >
> > >   syscall            calls errors    total       min       avg       max stddev
> > >                                     (msec)    (msec)    (msec)    (msec)    (%)
> > >   --------------- -------- ------ -------- --------- --------- --------- ------
> > >   sendto            699187      0 3316.948     0.002     0.005     0.044  0.02%
> > >   recvfrom         1397042 698828 2464.995     0.001     0.002     0.025  0.04%
> > >
> > > - v5.12-rc6
> > >
> > >   syscall            calls errors    total       min       avg       max stddev
> > >                                     (msec)    (msec)    (msec)    (msec)    (%)
> > >   --------------- -------- ------ -------- --------- --------- --------- ------
> > >   sendto            699445      0 3015.642     0.002     0.004     0.027  0.02%
> > >   recvfrom         1395929 697909 2338.783     0.001     0.002     0.024  0.03%
> > >
>
> Can you please explain how to read these numbers? Or at least put a %
> regression.
Let me summarize them here.
The total duration (the 'total' column above) of each system call is as follows,
taking v5.8 as 100%:

- sendto:
  - v5.8         100%
  - v5.9         128%
  - v5.12-rc6    116%

- recvfrom:
  - v5.8         100%
  - v5.9         114%
  - v5.12-rc6    108%
>
> > > I bisected the kernel patches, then I found the patch series, which add
> > > object level memory cgroup support, causes the degradation.
> > >
> > > I confirmed the delay with a kernel module which just runs
> > > kmem_cache_alloc/kmem_cache_free as follows. The duration is about
> > > 2-3 times that of v5.8.
> > >
> > >   dummy_cache = KMEM_CACHE(dummy, SLAB_ACCOUNT);
> > >   for (i = 0; i < 100000000; i++)
> > >   {
> > >           p = kmem_cache_alloc(dummy_cache, GFP_KERNEL);
> > >           kmem_cache_free(dummy_cache, p);
> > >   }
> > >
> > > It seems that the object accounting work in slab_pre_alloc_hook() and
> > > slab_post_alloc_hook() is the overhead.
> > >
> > > cgroup.nokmem kernel parameter doesn't work for my case because it disables
> > > all of kmem accounting.
>
> The patch is somewhat doing that i.e. disabling memcg accounting for slab.
>
> > >
> > > The degradation is gone when I apply a patch (at the bottom of this email)
> > > that adds a kernel parameter that expects to fallback to the page level
> > > accounting, however, I'm not sure it's a good approach though...
> >
> > Hello Masayoshi!
> >
> > Thank you for the report!
> >
> > It's not a secret that per-object accounting is more expensive than a per-page
> > allocation. I had micro-benchmark results similar to yours: accounted
> > allocations are about 2x slower. But in general it tends to not affect real
> > workloads, because the cost of allocations is still low and tends to be only
> > a small fraction of the whole cpu load. And because it brings up significant
> > benefits: 40%+ slab memory savings, less fragmentation, more stable workingset,
> > etc, real workloads tend to perform on par or better.
> >
> > So my first question is if you see the regression in any real workload
> > or it's only about the benchmark?
> >
> > Second, I'll try to take a look into the benchmark to figure out why it's
> > affected so badly, but I'm not sure we can easily fix it. If you have any
> > ideas what kind of objects the benchmark is allocating in big numbers,
> > please let me know.
> >
>
> One idea would be to increase MEMCG_CHARGE_BATCH.
Thank you for the idea! It's hard-coded as 32 now, so I'm wondering whether it would be
a good idea to make MEMCG_CHARGE_BATCH tunable via a kernel parameter or something similar.
Thanks!
Masa