* [PATCH] mm/slub: embed __slab_alloc into its caller
From: Abel Wu @ 2021-02-02  8:05 UTC
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Vlastimil Babka
  Cc: hewenliang4, wuyun.wu, Abel Wu, linux-mm, linux-kernel

Since slab_alloc_node() is the only caller of __slab_alloc(), embed
__slab_alloc() into its caller to save the function call overhead.
This will also expand the caller's code size a bit, but hackbench
tests on both host and guest showed no difference with or without
this patch.

Also rename ___slab_alloc() to __slab_alloc().

Signed-off-by: Abel Wu <abel.w@icloud.com>
---
 mm/slub.c | 46 ++++++++++++++++------------------------------
 1 file changed, 16 insertions(+), 30 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 7ecbbbe5bc0c..0f69d2d0471a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2654,10 +2654,9 @@ static inline void *get_freelist(struct kmem_cache *s, struct page *page)
  * we need to allocate a new slab. This is the slowest path since it involves
  * a call to the page allocator and the setup of a new slab.
  *
- * Version of __slab_alloc to use when we know that interrupts are
- * already disabled (which is the case for bulk allocation).
+ * Must be called with interrupts disabled.
  */
-static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
+static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 			  unsigned long addr, struct kmem_cache_cpu *c)
 {
 	void *freelist;
@@ -2758,31 +2757,6 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 	return freelist;
 }
 
-/*
- * Another one that disabled interrupt and compensates for possible
- * cpu changes by refetching the per cpu area pointer.
- */
-static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
-			  unsigned long addr, struct kmem_cache_cpu *c)
-{
-	void *p;
-	unsigned long flags;
-
-	local_irq_save(flags);
-#ifdef CONFIG_PREEMPTION
-	/*
-	 * We may have been preempted and rescheduled on a different
-	 * cpu before disabling interrupts. Need to reload cpu area
-	 * pointer.
-	 */
-	c = this_cpu_ptr(s->cpu_slab);
-#endif
-
-	p = ___slab_alloc(s, gfpflags, node, addr, c);
-	local_irq_restore(flags);
-	return p;
-}
-
 /*
  * If the object has been wiped upon free, make sure it's fully initialized by
  * zeroing out freelist pointer.
@@ -2854,7 +2828,19 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
 	object = c->freelist;
 	page = c->page;
 	if (unlikely(!object || !page || !node_match(page, node))) {
+		unsigned long flags;
+
+		local_irq_save(flags);
+#ifdef CONFIG_PREEMPTION
+		/*
+		 * We may have been preempted and rescheduled on a different
+		 * cpu before disabling interrupts. Need to reload cpu area
+		 * pointer.
+		 */
+		c = this_cpu_ptr(s->cpu_slab);
+#endif
 		object = __slab_alloc(s, gfpflags, node, addr, c);
+		local_irq_restore(flags);
 	} else {
 		void *next_object = get_freepointer_safe(s, object);
 
@@ -3299,7 +3285,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 			 * We may have removed an object from c->freelist using
 			 * the fastpath in the previous iteration; in that case,
 			 * c->tid has not been bumped yet.
-			 * Since ___slab_alloc() may reenable interrupts while
+			 * Since __slab_alloc() may reenable interrupts while
 			 * allocating memory, we should bump c->tid now.
 			 */
 			c->tid = next_tid(c->tid);
@@ -3308,7 +3294,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 			 * Invoking slow path likely have side-effect
 			 * of re-populating per CPU c->freelist
 			 */
-			p[i] = ___slab_alloc(s, flags, NUMA_NO_NODE,
+			p[i] = __slab_alloc(s, flags, NUMA_NO_NODE,
 					    _RET_IP_, c);
 			if (unlikely(!p[i]))
 				goto error;
-- 
2.27.0




* Re: [PATCH] mm/slub: embed __slab_alloc into its caller
From: Christoph Lameter @ 2021-02-02 10:11 UTC
  To: Abel Wu
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Vlastimil Babka, hewenliang4, wuyun.wu, linux-mm, linux-kernel

On Tue, 2 Feb 2021, Abel Wu wrote:

> Since slab_alloc_node() is the only caller of __slab_alloc(), embed
> __slab_alloc() into its caller to save the function call overhead.
> This will also expand the caller's code size a bit, but hackbench
> tests on both host and guest showed no difference with or without
> this patch.

slab_alloc_node is an always_inline function. It is intentional that only
the fast path was inlined and not the slow path.
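
A minimal sketch of that split (alloc_fast(), alloc_slow() and
this_cpu_try_freelist() are hypothetical names for illustration, not
the actual slub functions):

/* Cold slow path: kept out of line so its body is emitted only once. */
static noinline void *alloc_slow(struct kmem_cache *s)
{
	/* refill from a new slab, take locks, etc. (elided) */
	return NULL;
}

/*
 * Hot fast path: __always_inline copies this small body into every
 * entry point (kmem_cache_alloc(), __kmalloc(), ...).  Inlining the
 * slow path as well would duplicate its much larger body into each
 * of those entry points.
 */
static __always_inline void *alloc_fast(struct kmem_cache *s)
{
	void *object = this_cpu_try_freelist(s);	/* hypothetical helper */

	if (likely(object))
		return object;		/* common case: no function call */
	return alloc_slow(s);		/* rare case: a single call */
}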



* Re: [PATCH] mm/slub: embed __slab_alloc into its caller
From: Abel Wu @ 2021-02-03  1:41 UTC
  To: Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Vlastimil Babka, hewenliang4, wuyun.wu, linux-mm, linux-kernel

> On Feb 2, 2021, at 6:11 PM, Christoph Lameter <cl@linux.com> wrote:
> 
> On Tue, 2 Feb 2021, Abel Wu wrote:
> 
>> Since slab_alloc_node() is the only caller of __slab_alloc(), embed
>> __slab_alloc() into its caller to save the function call overhead.
>> This will also expand the caller's code size a bit, but hackbench
>> tests on both host and guest showed no difference with or without
>> this patch.
> 
> slab_alloc_node is an always_inline function. It is intentional that only
> the fast path was inlined and not the slow path.

Oh I got it. Thanks for your excellent explanation.

Best Regards,
	Abel



* Re: [PATCH] mm/slub: embed __slab_alloc into its caller
From: Vlastimil Babka @ 2021-02-05 13:03 UTC
  To: Abel Wu, Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	hewenliang4, wuyun.wu, linux-mm, linux-kernel

On 2/3/21 2:41 AM, Abel Wu wrote:
>> On Feb 2, 2021, at 6:11 PM, Christoph Lameter <cl@linux.com> wrote:
>> 
>> On Tue, 2 Feb 2021, Abel Wu wrote:
>> 
>>> Since slab_alloc_node() is the only caller of __slab_alloc(), embed
>>> __slab_alloc() into its caller to save the function call overhead.
>>> This will also expand the caller's code size a bit, but hackbench
>>> tests on both host and guest showed no difference with or without
>>> this patch.
>> 
>> slab_alloc_node is an always_inline function. It is intentional that only
>> the fast path was inlined and not the slow path.
> 
> Oh I got it. Thanks for your excellent explanation.

BTW, there's a script in the Linux source tree that nicely shows the effect
of such changes:

./scripts/bloat-o-meter slub.o.before mm/slub.o
add/remove: 0/1 grow/shrink: 9/0 up/down: 1660/-1130 (530)
Function                                     old     new   delta
__slab_alloc                                 127    1130   +1003
__kmalloc_track_caller                       877     965     +88
__kmalloc                                    878     966     +88
kmem_cache_alloc                             778     862     +84
__kmalloc_node_track_caller                  996    1080     +84
kmem_cache_alloc_node_trace                  813     896     +83
kmem_cache_alloc_node                        800     881     +81
kmem_cache_alloc_trace                       786     862     +76
__kmalloc_node                               998    1071     +73
___slab_alloc                               1130       -   -1130
Total: Before=57782, After=58312, chg +0.92%
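
(Here slub.o.before is the mm/slub.o saved from a build without the
patch; after applying the patch and rebuilding mm/slub.o, the script
compares the old object file against the new one.)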

And yeah, bloating all the entry points wouldn't be nice.
Thanks,
Vlastimil

