From: Chengming Zhou <chengming.zhou@linux.dev>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: cl@linux.com, penberg@kernel.org, rientjes@google.com,
	iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
	vbabka@suse.cz, roman.gushchin@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Chengming Zhou <zhouchengming@bytedance.com>
Subject: Re: [RFC PATCH 0/5] slub: Delay freezing of CPU partial slabs
Date: Wed, 18 Oct 2023 15:44:29 +0800
Message-ID: <8cff8994-28a3-4a7e-8a6e-217c4da49ca1@linux.dev>
In-Reply-To: <CAB=+i9Sw1YSdUKrjygA5cOsVjQMVmS8-KJ+ku4AG9Fw_2guENQ@mail.gmail.com>

On 2023/10/18 14:34, Hyeonggon Yoo wrote:
> On Wed, Oct 18, 2023 at 12:45 AM <chengming.zhou@linux.dev> wrote:
>> 4. Testing
>> ==========
>> We just did some simple testing on a server with 128 CPUs (2 nodes) to
>> compare performance for now.
>>
>>  - perf bench sched messaging -g 5 -t -l 100000
>>    baseline     RFC
>>    7.042s       6.966s
>>    7.022s       7.045s
>>    7.054s       6.985s
>>
>>  - stress-ng --rawpkt 128 --rawpkt-ops 100000000
>>    baseline     RFC
>>    2.42s        2.15s
>>    2.45s        2.16s
>>    2.44s        2.17s
>>
>> The results above show about a 10% improvement in the stress-ng rawpkt
>> testcase, though not much improvement in the perf sched bench testcase.
>>
>> Thanks for any comments and code review!
> 
> Hi Chengming, this is the kerneltesting.org test report for your patch series.
> 
> I applied this series on my slab-experimental tree [1] for testing,
> and I observed several kernel panics [2] [3] [4] on kernels without
> CONFIG_SLUB_CPU_PARTIAL.
> 
> To verify that this series caused the kernel panics, I tested Vlastimil's
> slab/for-next before and after applying it, and yeah, this series was the
> cause.
> 
> The system deadlocks on memory, and the OOM killer says there is a huge
> amount of slab memory. So maybe there is a memory leak, or something makes
> slab memory grow unboundedly?

Thanks for the testing!

I can reproduce the OOM locally without CONFIG_SLUB_CPU_PARTIAL.
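
(If it helps anyone else reproduce this: CONFIG_SLUB_CPU_PARTIAL defaults to
y on SMP configs, so assuming the in-tree scripts/config helper fits your
workflow, it can be disabled with "scripts/config -d SLUB_CPU_PARTIAL"
followed by "make olddefconfig" before rebuilding.)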

I made a quick fix below (a better fix is still needed). The root cause is
in patch 4, which wrongly puts some partial slabs onto the CPU partial list
even when CONFIG_SLUB_CPU_PARTIAL is disabled, so those partial slabs are
leaked.

diff --git a/mm/slub.c b/mm/slub.c
index d58eaf8447fd..b7ba6c008122 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2339,12 +2339,12 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
                        }
                }

+#ifdef CONFIG_SLUB_CPU_PARTIAL
                remove_partial(n, slab);
                put_cpu_partial(s, slab, 0);
                stat(s, CPU_PARTIAL_NODE);
                partial_slabs++;

-#ifdef CONFIG_SLUB_CPU_PARTIAL
                if (!kmem_cache_has_cpu_partial(s)
                        || partial_slabs > s->cpu_partial_slabs / 2)
                        break;
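
To spell out the leak (a sketch of the relevant pieces, not a verbatim copy
of mm/slub.c): without CONFIG_SLUB_CPU_PARTIAL, put_cpu_partial() compiles
to an empty stub, so the unguarded code took the slab off the node partial
list and then handed it to a no-op, leaving it reachable from nowhere:

	/* without CONFIG_SLUB_CPU_PARTIAL, slub.c only provides a stub */
	static inline void put_cpu_partial(struct kmem_cache *s,
					   struct slab *slab, int drain) { }

	/* so in get_partial_node(), before the quick fix above: */
	remove_partial(n, slab);      /* off the node partial list */
	put_cpu_partial(s, slab, 0);  /* no-op: the slab is leaked */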


> 
> [1] https://git.kerneltesting.org/slab-experimental/
> [2] https://lava.kerneltesting.org/scheduler/job/127#bottom
> [3] https://lava.kerneltesting.org/scheduler/job/131#bottom
> [4] https://lava.kerneltesting.org/scheduler/job/134#bottom
> 
>>
>> Chengming Zhou (5):
>>   slub: Introduce on_partial()
>>   slub: Don't manipulate slab list when used by cpu
>>   slub: Optimize deactivate_slab()
>>   slub: Don't freeze slabs for cpu partial
>>   slub: Introduce get_cpu_partial()
>>
>>  mm/slab.h |   2 +-
>>  mm/slub.c | 257 +++++++++++++++++++++++++++++++-----------------------
>>  2 files changed, 150 insertions(+), 109 deletions(-)
>>
>> --
>> 2.40.1
>>

Thread overview: 12+ messages
2023-10-17 15:44 [RFC PATCH 0/5] slub: Delay freezing of CPU partial slabs chengming.zhou
2023-10-17 15:44 ` [RFC PATCH 1/5] slub: Introduce on_partial() chengming.zhou
2023-10-17 15:54   ` Matthew Wilcox
2023-10-18  7:37     ` Chengming Zhou
2023-10-27  5:26   ` kernel test robot
2023-10-27  9:43     ` Chengming Zhou
2023-10-17 15:44 ` [RFC PATCH 2/5] slub: Don't manipulate slab list when used by cpu chengming.zhou
2023-10-17 15:44 ` [RFC PATCH 3/5] slub: Optimize deactivate_slab() chengming.zhou
2023-10-17 15:44 ` [RFC PATCH 4/5] slub: Don't freeze slabs for cpu partial chengming.zhou
2023-10-17 15:44 ` [RFC PATCH 5/5] slub: Introduce get_cpu_partial() chengming.zhou
2023-10-18  6:34 ` [RFC PATCH 0/5] slub: Delay freezing of CPU partial slabs Hyeonggon Yoo
2023-10-18  7:44   ` Chengming Zhou [this message]
