All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
To: Vlastimil Babka <vbabka@suse.cz>,
	netdev@vger.kernel.org, linux-mm@kvack.org
Cc: brouer@redhat.com, Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	penberg@kernel.org, Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	edumazet@google.com, pabeni@redhat.com,
	David Rientjes <rientjes@google.com>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Alexander Potapenko <glider@google.com>,
	Marco Elver <elver@google.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH RFC] mm+net: allow to set kmem_cache create flag for SLAB_NEVER_MERGE
Date: Mon, 23 Jan 2023 17:14:20 +0100	[thread overview]
Message-ID: <93665604-5420-be5d-2104-17850288b955@redhat.com> (raw)
In-Reply-To: <bfe4ff8f-0244-739d-3dfa-60101c8bf6b8@suse.cz>



On 18/01/2023 08.36, Vlastimil Babka wrote:
> On 1/17/23 14:40, Jesper Dangaard Brouer wrote:
>> Allow API users of kmem_cache_create to specify that they don't want
>> any slab merge or aliasing (with similar sized objects). Use this in
>> network stack and kfence_test.
>>
>> The SKB (sk_buff) kmem_cache slab is critical for network performance.
>> Network stack uses kmem_cache_{alloc,free}_bulk APIs to gain
>> performance by amortising the alloc/free cost.
>>
>> For the bulk API to perform efficiently the slub fragmentation need to
>> be low. Especially for the SLUB allocator, the efficiency of bulk free
>> API depend on objects belonging to the same slab (page).
> 
> Incidentally, would you know if anyone still uses SLAB instead of SLUB
> because it would perform better for networking? IIRC in the past discussions
> networking was one of the reasons for SLAB to stay. We are looking again
> into the possibility of removing it, so it would be good to know if there
> are benchmarks where SLUB does worse so it can be looked into.
> 

I don't know of any users using SLAB for network performance reasons.
I've only been benchmarking with SLUB for a long time.
Anyone else on netdev?

Both SLUB and SLAB got the kmem_cache bulk API implemented.  This is
used today in network stack to squeeze extra performance for networking
for our SKB (sk_buff) metadata structure (that point to packet data).
Details: Networking cache upto 64 of these SKBs for RX-path NAPI-softirq
processing per CPU, which is repopulated with kmem_cache bulking API
(bulk alloc 16 and bulk free 32).

>> When running different network performance microbenchmarks, I started
>> to notice that performance was reduced (slightly) when machines had
>> longer uptimes. I believe the cause was 'skbuff_head_cache' got
>> aliased/merged into the general slub for 256 bytes sized objects (with
>> my kernel config, without CONFIG_HARDENED_USERCOPY).
> 
> So did things improve with SLAB_NEVER_MERGE?

Yes, but only the stability of the results.

The performance tests were microbenchmarks and as Christoph points out
there might be gains from more partial slabs when there are more
fragmentation.  The "overload" microbench will always do maximum
bulking, while more real workloads might be satisfied from the partial
slabs.  I would need to do a broader range of benchmarks before I can
conclude if this is always a win.

>> For SKB kmem_cache network stack have reasons for not merging, but it
>> varies depending on kernel config (e.g. CONFIG_HARDENED_USERCOPY).
>> We want to explicitly set SLAB_NEVER_MERGE for this kmem_cache.
>>

In most distro kernels configs SKB kmem_cache will already not get
merged / aliased.  I was just trying to make this consistent.

>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>> ---
>>   include/linux/slab.h    |    2 ++
>>   mm/kfence/kfence_test.c |    7 +++----
>>   mm/slab.h               |    5 +++--
>>   mm/slab_common.c        |    8 ++++----
>>   net/core/skbuff.c       |   13 ++++++++++++-
>>   5 files changed, 24 insertions(+), 11 deletions(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 45af70315a94..83a89ba7c4be 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -138,6 +138,8 @@
>>   #define SLAB_SKIP_KFENCE	0
>>   #endif
>>   
>> +#define SLAB_NEVER_MERGE	((slab_flags_t __force)0x40000000U)
> 
> I think there should be an explanation what this does and when to consider
> it. We should discourage blind use / cargo cult / copy paste from elsewhere
> resulting in excessive proliferation of the flag.

I agree.

> - very specialized internal things like kfence? ok
> - prevent a bad user of another cache corrupt my cache due to merging? no,
> use slub_debug to find and fix the root cause

Agree, and the comment could point to the slub_debug trick.

> - performance concerns? only after proper evaluation, not prematurely
>

Yes, and I would need to do more perf eval myself ;-)
I don't have time atm, thus I'll not pursue this RFC patch anytime soon.

--Jesper


  reply	other threads:[~2023-01-23 16:15 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-17 13:40 [PATCH RFC] mm+net: allow to set kmem_cache create flag for SLAB_NEVER_MERGE Jesper Dangaard Brouer
2023-01-17 14:54 ` Christoph Lameter
2023-01-18  5:17   ` Matthew Wilcox
2023-01-19 18:08     ` Jesper Dangaard Brouer
2023-01-24 16:06   ` Hyeonggon Yoo
2023-01-18  7:36 ` Vlastimil Babka
2023-01-23 16:14   ` Jesper Dangaard Brouer [this message]
2023-05-31 12:03 ` Vlastimil Babka
2023-05-31 13:59   ` Jesper Dangaard Brouer
2023-05-31 20:08   ` Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=93665604-5420-be5d-2104-17850288b955@redhat.com \
    --to=jbrouer@redhat.com \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=brouer@redhat.com \
    --cc=cl@linux.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=kuba@kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.