From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Guenter Roeck <linux@roeck-us.net>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 10/17] mm/slab: kmalloc: pass requests larger than order-1 page to page allocator
Date: Sun, 16 Oct 2022 18:10:15 +0900
Message-ID: <Y0vKd8/9lrI8T+Wk@hyeyoo>
In-Reply-To: <7ad9081b-9082-2cbb-5732-f87366dca801@suse.cz>

On Sat, Oct 15, 2022 at 09:39:08PM +0200, Vlastimil Babka wrote:
> On 10/15/22 01:48, Hyeonggon Yoo wrote:
> > On Fri, Oct 14, 2022 at 01:58:18PM -0700, Guenter Roeck wrote:
> >> Hi,
> >> 
> >> On Wed, Aug 17, 2022 at 07:18:19PM +0900, Hyeonggon Yoo wrote:
> >> > There is not much benefit for serving large objects in kmalloc().
> >> > Let's pass large requests to page allocator like SLUB for better
> >> > maintenance of common code.
> >> > 
> >> > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> >> > Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> >> > ---
> >> 
> >> This patch results in a WARNING backtrace in all mips and sparc64
> >> emulations.
> >> 
> >> ------------[ cut here ]------------
> >> WARNING: CPU: 0 PID: 0 at mm/slab_common.c:729 kmalloc_slab+0xc0/0xdc
> >> Modules linked in:
> >> CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-11990-g9c9155a3509a #1
> >> Stack : ffffffff 801b2a18 80dd0000 00000004 00000000 00000000 81023cd4 00000000
> >>         81040000 811a9930 81040000 8104a628 81101833 00000001 81023c78 00000000
> >>         00000000 00000000 80f5d858 81023b98 00000001 00000023 00000000 ffffffff
> >>         00000000 00000064 00000002 81040000 81040000 00000001 80f5d858 000002d9
> >>         00000000 00000000 80000000 80002000 00000000 00000000 00000000 00000000
> >>         ...
> >> Call Trace:
> >> [<8010a2bc>] show_stack+0x38/0x118
> >> [<80cf5f7c>] dump_stack_lvl+0xac/0x104
> >> [<80130d7c>] __warn+0xe0/0x224
> >> [<80cdba5c>] warn_slowpath_fmt+0x64/0xb8
> >> [<8028c058>] kmalloc_slab+0xc0/0xdc
> >> 
> >> irq event stamp: 0
> >> hardirqs last  enabled at (0): [<00000000>] 0x0
> >> hardirqs last disabled at (0): [<00000000>] 0x0
> >> softirqs last  enabled at (0): [<00000000>] 0x0
> >> softirqs last disabled at (0): [<00000000>] 0x0
> >> ---[ end trace 0000000000000000 ]---
> >> 
> >> Guenter
> > 
> > Hi.
> > 
> > Thank you so much for this report!
> > 
> > Hmm so SLAB tries to find kmalloc cache for freelist index array using
> > kmalloc_slab() directly, and it becomes problematic when size of the
> > array is larger than PAGE_SIZE * 2.
> 
> Hmm interesting, did you find out how exactly that can happen in practice,
> or what's special about mips and sparc64 here?

IIUC, if the page size is large, the number of objects per slab is quite large, so
the possibility of failing to use an objfreelist slab is higher, and then it
tries to use an off-slab freelist.
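
To make the page-size effect concrete, here is a rough user-space sketch (not
kernel code). It assumes the objfreelist scheme only works when the whole index
array fits inside a single object, and it ignores alignment, colouring and other
metadata. The model_* names and the parameters are hypothetical, chosen only for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Number of objects that fit in a slab of (page_size << order) bytes.
 * Simplification: alignment, colouring and on-slab metadata are ignored.
 */
static size_t model_objs_per_slab(size_t page_size, unsigned int order,
				  size_t obj_size)
{
	return (page_size << order) / obj_size;
}

/*
 * Assumed objfreelist condition: the index array, one idx_size-byte
 * entry per object, must fit inside a single object of the slab.
 */
static bool model_objfreelist_fits(size_t page_size, unsigned int order,
				   size_t obj_size, size_t idx_size)
{
	size_t num = model_objs_per_slab(page_size, order, obj_size);

	return num * idx_size <= obj_size;
}
```

With 256-byte objects and 2-byte indices, an order-0 slab on 4 KiB pages holds
16 objects and needs a 32-byte index array, which fits in one object. On 64 KiB
pages the same slab holds 256 objects and needs 512 bytes, which does not fit.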

> Because normally
> calculate_slab_order() will only go up to slab_max_order, which AFAICS can
> only go up to SLAB_MAX_ORDER_HI, thus 1, unless there's a boot command line
> override.

AFAICS, with the mips default configuration and without setting slab_max_order,
SLAB does not actually seem to use an overly large freelist index array.

But it hits the warning because of some tricky logic.

For example, if the condition is true on

>	if (freelist_cache->size > cachep->size / 2)
>		continue;

or on (before kmalloc is up, in case of kmem_cache)

>	freelist_cache = kmalloc_slab(freelist_size, 0u);
>	if (!freelist_cache)
>		continue;

then it keeps increasing gfporder until 'num' becomes larger than SLAB_OBJ_MAX_NUM
(regardless of slab_max_order), because the slab_max_order check comes after these
'continue' statements and is never reached.
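
The effect can be sketched with a small user-space model of the loop (not the
kernel code). It is deliberately simplified: the fragmentation check is left
out, the freelist cache size is approximated by rounding up to a power of two,
and an oversized lookup is only recorded in a flag rather than warning. All
model_* names and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_params {
	size_t page_size;	/* bytes per page */
	size_t obj_size;	/* bytes per object */
	size_t idx_size;	/* bytes per freelist index */
	size_t max_objs;	/* limit on objects per slab */
	size_t max_cache_size;	/* largest size the cache lookup accepts */
	int loop_max_order;	/* upper bound of the order loop */
	int slab_max_order;	/* intended cap on the slab order */
};

struct model_result {
	int last_order;		/* last page order the loop examined */
	bool oversized_lookup;	/* a lookup exceeded max_cache_size */
};

/* Stand-in for a kmalloc cache size: next power of two, at least 8. */
static size_t model_round_up_pow2(size_t n)
{
	size_t s = 8;

	while (s < n)
		s <<= 1;
	return s;
}

static struct model_result model_slab_order(const struct model_params *p,
					    bool stop_on_reject)
{
	struct model_result r = { .last_order = -1, .oversized_lookup = false };
	int order;

	for (order = 0; order <= p->loop_max_order; order++) {
		size_t num = (p->page_size << order) / p->obj_size;
		size_t freelist_size;

		r.last_order = order;
		if (num == 0)
			continue;
		if (num > p->max_objs)
			break;

		freelist_size = num * p->idx_size;
		if (freelist_size > p->max_cache_size) {
			/* models the lookup that finds no cache */
			r.oversized_lookup = true;
			if (stop_on_reject)
				break;
			continue;
		}

		/* off slab must have enough benefit */
		if (model_round_up_pow2(freelist_size) > p->obj_size / 2) {
			if (stop_on_reject)
				break;
			continue;
		}

		/* accepted: the max-order cap is only checked from here on */
		if (order >= p->slab_max_order)
			break;
	}
	return r;
}
```

With illustrative values (16 KiB pages, 128-byte objects, 2-byte indices, a
65535-object limit, a 32 KiB lookup limit, slab_max_order of 1; none of them
taken from a real configuration), calling this with stop_on_reject == false
walks up to order 9 and performs an oversized lookup on the way, while
stop_on_reject == true stops at order 0.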

I think the change below would be more robust.

diff --git a/mm/slab.c b/mm/slab.c
index d1f6e2c64c2e..1321aca1887c 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1679,7 +1679,7 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
 			} else {
 				freelist_cache = kmalloc_slab(freelist_size, 0u);
 				if (!freelist_cache)
-					continue;
+					break;
 				freelist_cache_size = freelist_cache->size;
 
 				/*
@@ -1692,7 +1692,7 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
 
 			/* check if off slab has enough benefit */
 			if (freelist_cache_size > cachep->size / 2)
-				continue;
+				break;
 		}
 
 		/* Found something acceptable - save it away */


> And if we have two pages for objects, surely even with small objects they
> can't be smaller than freelist_idx_t, so if the number of objects fits into
> two pages (order 1), then the freelist array should also fit in two pages?

That's right, but under certain conditions it seems to go beyond slab_max_order
(from code inspection).

> 
> Thanks,
> Vlastimil
> 
> > Will send a fix soon.
> > 

-- 
Thanks,
Hyeonggon
