From: Wei Yang <richard.weiyang@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Wei Yang <richard.weiyang@gmail.com>,
cl@linux.com, penberg@kernel.org, mhocko@kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH v2] mm/slub: improve performance by skipping checked node in get_any_partial()
Date: Thu, 22 Nov 2018 09:13:29 +0000 [thread overview]
Message-ID: <20181122091329.gjz4muzjwp55dnjo@master> (raw)
In-Reply-To: <20181121190555.c010ac50e7eaa141549a63e5@linux-foundation.org>
On Wed, Nov 21, 2018 at 07:05:55PM -0800, Andrew Morton wrote:
>On Tue, 20 Nov 2018 11:31:19 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>
>> 1. Background
>>
>> Current slub has three layers:
>>
>> * cpu_slab
>> * percpu_partial
>> * per node partial list
>>
>> Slub allocator tries to get an object from top to bottom. When it can't
>> get an object from the upper two layers, it will search the per node
>> partial list. The is done in get_partial().
>>
>> The abstraction of get_partial() may looks like this:
>>
>> get_partial()
>> get_partial_node()
>> get_any_partial()
>> for_each_zone_zonelist()
>>
>> The idea behind this is: it first try a local node, then try other nodes
>> if caller doesn't specify a node.
>>
>> 2. Room for Improvement
>>
>> When we look one step deeper in get_any_partial(), it tries to get a
>> proper node by for_each_zone_zonelist(), which iterates on the
>> node_zonelists.
>>
>> This behavior would introduce some redundant check on the same node.
>> Because:
>>
>> * the local node is already checked in get_partial_node()
>> * one node may have several zones on node_zonelists
>>
>> 3. Solution Proposed in Patch
>>
>> We could reduce these redundant check by record the last unsuccessful
>> node and then skip it.
>>
>> 4. Tests & Result
>>
>> After some tests, the result shows this may improve the system a little,
>> especially on a machine with only one node.
>>
>> 4.1 Test Description
>>
>> There are two cases for two system configurations.
>>
>> Test Cases:
>>
>> 1. counter comparison
>> 2. kernel build test
>>
>> System Configuration:
>>
>> 1. One node machine with 4G
>> 2. Four node machine with 8G
>>
>> 4.2 Result for Test 1
>>
>> Test 1: counter comparison
>>
>> This is a test with hacked kernel to record times function
>> get_any_partial() is invoked and times the inner loop iterates. By
>> comparing the ratio of two counters, we get to know how many inner
>> loops we skipped.
>>
>> Here is a snip of the test patch.
>>
>> ---
>> static void *get_any_partial() {
>>
>> get_partial_count++;
>>
>> do {
>> for_each_zone_zonelist() {
>> get_partial_try_count++;
>> }
>> } while();
>>
>> return NULL;
>> }
>> ---
>>
>> The result of (get_partial_count / get_partial_try_count):
>>
>> +----------+----------------+------------+-------------+
>> | | Base | Patched | Improvement|
>> +----------+----------------+------------+-------------+
>> |One Node | 1:3 | 1:0 | - 100% |
>> +----------+----------------+------------+-------------+
>> |Four Nodes| 1:5.8 | 1:2.5 | - 56% |
>> +----------+----------------+------------+-------------+
>>
>> 4.3 Result for Test 2
>>
>> Test 2: kernel build
>>
>> Command used:
>>
>> > time make -j8 bzImage
>>
>> Each version/system configuration combination has four round kernel
>> build tests. Take the average result of real to compare.
>>
>> +----------+----------------+------------+-------------+
>> | | Base | Patched | Improvement|
>> +----------+----------------+------------+-------------+
>> |One Node | 4m41s | 4m32s | - 4.47% |
>> +----------+----------------+------------+-------------+
>> |Four Nodes| 4m45s | 4m39s | - 2.92% |
>> +----------+----------------+------------+-------------+
>>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>
>
>Looks good to me, but I'll await input from the slab maintainers before
>proceeding any further.
>
>I didn't like the variable name much, and the comment could be
>improved. Please review:
>
Looks much better, thanks :-)
>
>--- a/mm/slub.c~mm-slub-improve-performance-by-skipping-checked-node-in-get_any_partial-fix
>+++ a/mm/slub.c
>@@ -1873,7 +1873,7 @@ static void *get_partial_node(struct kme
> * Get a page from somewhere. Search in increasing NUMA distances.
> */
> static void *get_any_partial(struct kmem_cache *s, gfp_t flags,
>- struct kmem_cache_cpu *c, int except)
>+ struct kmem_cache_cpu *c, int exclude_nid)
> {
> #ifdef CONFIG_NUMA
> struct zonelist *zonelist;
>@@ -1911,7 +1911,7 @@ static void *get_any_partial(struct kmem
> for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
> struct kmem_cache_node *n;
>
>- if (except == zone_to_nid(zone))
>+ if (exclude_nid == zone_to_nid(zone))
> continue;
>
> n = get_node(s, zone_to_nid(zone));
>@@ -1931,12 +1931,13 @@ static void *get_any_partial(struct kmem
> }
> }
> /*
>- * Fail to get object from this node, either because
>- * 1. Fails in if check
>- * 2. NULl object returns from get_partial_node()
>- * Skip it next time.
>+ * Failed to get an object from this node, either
>+ * because
>+ * 1. Failure in the above if check
>+ * 2. NULL return from get_partial_node()
>+ * So skip this node next time.
> */
>- except = zone_to_nid(zone);
>+ exclude_nid = zone_to_nid(zone);
> }
> } while (read_mems_allowed_retry(cpuset_mems_cookie));
> #endif
>_
--
Wei Yang
Help you, Help me
next prev parent reply other threads:[~2018-11-22 9:13 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-08 1:12 [PATCH] mm/slub: skip node in case there is no slab to acquire Wei Yang
2018-11-09 20:48 ` Andrew Morton
2018-11-09 23:47 ` Wei Yang
2018-11-13 9:12 ` [PATCH v2] " Wei Yang
2018-11-13 13:17 ` [PATCH] " Michal Hocko
2018-11-13 13:26 ` Wei Yang
2018-11-13 13:34 ` Michal Hocko
2018-11-20 3:31 ` [PATCH v2] mm/slub: improve performance by skipping checked node in get_any_partial() Wei Yang
2018-11-22 3:05 ` Andrew Morton
2018-11-22 9:13 ` Wei Yang [this message]
2018-11-22 23:41 ` Wei Yang
2018-11-23 13:39 ` Michal Hocko
2018-11-23 13:49 ` Michal Hocko
2018-11-23 15:27 ` Wei Yang
2018-12-20 22:41 ` Andrew Morton
2018-12-21 0:25 ` Alexander Duyck
2018-12-21 3:29 ` Wei Yang
2018-12-21 1:37 ` Christopher Lameter
2018-12-21 1:37 ` Christopher Lameter
2018-12-21 3:33 ` Wei Yang
2018-12-24 22:03 ` Wei Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181122091329.gjz4muzjwp55dnjo@master \
--to=richard.weiyang@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=penberg@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).