linux-kernel.vger.kernel.org archive mirror
From: Pratik Sampat <psampat@linux.ibm.com>
To: Roman Gushchin <guro@fb.com>
Cc: Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	pratik.r.sampat@gmail.com
Subject: Re: [PATCH v3 0/6] percpu: partial chunk depopulation
Date: Fri, 16 Apr 2021 18:26:15 +0530
Message-ID: <25c78660-9f4c-34b3-3a05-68c313661a46@linux.ibm.com>
In-Reply-To: <20210408035736.883861-1-guro@fb.com>

Hello Roman,

I've tried the v3 patch series on a POWER9 and an x86 KVM setup.

My results of the percpu_test are as follows:
Intel KVM 4CPU:4G
Vanilla 5.12-rc6
# ./percpu_test.sh
Percpu:             1952 kB
Percpu:           219648 kB
Percpu:           219648 kB

5.12-rc6 with patchset applied
# ./percpu_test.sh
Percpu:             2080 kB
Percpu:           219712 kB
Percpu:            72672 kB

I see an improvement comparable to the one you're seeing.

However, on POWER9 I'm unable to reproduce this improvement with the patchset
in the same configuration:

POWER9 KVM 4CPU:4G
Vanilla 5.12-rc6
# ./percpu_test.sh
Percpu:             5888 kB
Percpu:           118272 kB
Percpu:           118272 kB

5.12-rc6 with patchset applied
# ./percpu_test.sh
Percpu:             6144 kB
Percpu:           119040 kB
Percpu:           119040 kB
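
To put the two patched runs side by side, here is a minimal sketch that
computes the ratio of the peak Percpu reading to the final one. The
reduction_factor helper is hypothetical (it is not part of percpu_test.sh);
the inputs are the second and third Percpu lines from the patched runs above.

```shell
#!/bin/bash
# Hypothetical helper, not part of percpu_test.sh: print the ratio of the
# peak Percpu reading to the final reading, rounded to two decimals.
reduction_factor() {
    local peak_kb=$1 final_kb=$2
    awk -v p="$peak_kb" -v f="$final_kb" 'BEGIN { printf "%.2f\n", p / f }'
}

# Inputs: second and third Percpu lines of each patched run above.
reduction_factor 219648 72672     # x86 KVM, patched:    prints 3.02
reduction_factor 119040 119040    # POWER9 KVM, patched: prints 1.00
```

A factor of 1.00 means nothing was returned to the system after the cgroups
were removed, which is what the POWER9 run shows.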

I'm wondering if there's any architecture-specific code that needs plumbing
here?

I will also look through the code to find the reason why POWER isn't
depopulating pages.

Thank you,
Pratik

On 08/04/21 9:27 am, Roman Gushchin wrote:
> In our production experience the percpu memory allocator sometimes struggles
> to return memory to the system. A typical example is the creation of
> several thousand memory cgroups (each has several chunks of percpu data
> used for vmstats, vmevents, ref counters, etc.). Deleting and completely
> releasing these cgroups doesn't always shrink the percpu memory,
> so sometimes several GBs of memory are wasted.
>
> The underlying problem is fragmentation: to release an underlying chunk,
> all percpu allocations in it must be released first. The percpu allocator
> tends to top up chunks to improve utilization. This means new small-ish
> allocations (e.g. percpu ref counters) are placed onto almost-full old-ish
> chunks, effectively pinning them in memory.
>
> This patchset solves this problem by implementing a partial depopulation
> of percpu chunks: chunks with many empty pages are being asynchronously
> depopulated and the pages are returned to the system.
>
> To illustrate the problem the following script can be used:
>
> --
> #!/bin/bash
>
> cd /sys/fs/cgroup
>
> mkdir percpu_test
> echo "+memory" > percpu_test/cgroup.subtree_control
>
> cat /proc/meminfo | grep Percpu
>
> for i in `seq 1 1000`; do
>     mkdir percpu_test/cg_"${i}"
>     for j in `seq 1 10`; do
>         mkdir percpu_test/cg_"${i}"_"${j}"
>     done
> done
>
> cat /proc/meminfo | grep Percpu
>
> for i in `seq 1 1000`; do
>     for j in `seq 1 10`; do
>         rmdir percpu_test/cg_"${i}"_"${j}"
>     done
> done
>
> sleep 10
>
> cat /proc/meminfo | grep Percpu
>
> for i in `seq 1 1000`; do
>     rmdir percpu_test/cg_"${i}"
> done
>
> rmdir percpu_test
> --
>
> It creates 11000 memory cgroups and removes 10 out of every 11.
> It prints the initial size of the percpu memory, the size after
> creating all cgroups, and the size after deleting most of them.
>
> Results:
>    vanilla:
>      ./percpu_test.sh
>      Percpu:             7488 kB
>      Percpu:           481152 kB
>      Percpu:           481152 kB
>
>    with this patchset applied:
>      ./percpu_test.sh
>      Percpu:             7488 kB
>      Percpu:           481408 kB
>      Percpu:           135552 kB
>
> So the total size of the percpu memory was reduced by more than 3.5 times.
>
> v3:
>    - introduced pcpu_check_chunk_hint()
>    - fixed a bug related to the hint check
>    - minor cosmetic changes
>    - s/pretends/fixes (cc Vlastimil)
>
> v2:
>    - depopulated chunks are sidelined
>    - depopulation happens in the reverse order
>    - depopulate list made per-chunk type
>    - better results due to better heuristics
>
> v1:
>    - depopulation heuristics changed and optimized
>    - chunks are put into a separate list, depopulation scans this list
>    - chunk->isolated is introduced, chunk->depopulate is dropped
>    - rearranged patches a bit
>    - fixed a panic discovered by krobot
>    - made pcpu_nr_empty_pop_pages per chunk type
>    - minor fixes
>
> rfc:
>    https://lwn.net/Articles/850508/
>
>
> Roman Gushchin (6):
>    percpu: fix a comment about the chunks ordering
>    percpu: split __pcpu_balance_workfn()
>    percpu: make pcpu_nr_empty_pop_pages per chunk type
>    percpu: generalize pcpu_balance_populated()
>    percpu: factor out pcpu_check_chunk_hint()
>    percpu: implement partial chunk depopulation
>
>   mm/percpu-internal.h |   4 +-
>   mm/percpu-stats.c    |   9 +-
>   mm/percpu.c          | 306 +++++++++++++++++++++++++++++++++++--------
>   3 files changed, 261 insertions(+), 58 deletions(-)
>

