From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-kernel@vger.kernel.org, Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Matthew Wilcox <willy@infradead.org>,
	Dave Taht <dave.taht@gmail.com>
Subject: Re: [RFC PATCH] mm, slob: Rewrite SLOB using segregated free list
Date: Thu, 21 Oct 2021 11:41:12 +0000
Message-ID: <20211021114112.GA4004@kvm.asia-northeast3-a.c.our-ratio-313919.internal>
In-Reply-To: <cf8ef7b4-ca18-064f-9c5d-01047e40446b@suse.cz>

Hmm.. I think I need to clarify my intention.

I'm not saying this should be merged, or that we should put effort into
turning SLOB into a lightweight SLUB. I rewrote it just for fun.
I wanted to know how small a segregated free list allocator can be.

And while rewriting it, I wondered who the users of SLOB are
and where SLOB should be used.

I think SLOB was useful when there was only SLAB and there was no SLUB,
but I wonder where SLOB should be used now.

When I compared SLOB and SLUB without cpu partials,
the difference in Slab memory was 300kB.

So is SLOB used where a 300kB difference is that important?
But I think we need at least 16MB of RAM to run Linux.

So I'm not saying we need to turn SLOB into a lightweight SLUB;
I wanted to discuss these questions:

> > But after rewriting, I thought I need to discuss what SLOB is for.
> > According to Matthew, SLOB is for small machines whose
> > memory is 1~16 MB.
> > 
> > I wonder whether adding 48kB to SLOB's memory usage for speed/lower
> > latency is worthwhile or harmful.
> > 
> > So.. questions in my head now:
> >     - Who are the users of SLOB?
> >     - Is it harmful to add some kilobytes of memory to SLOB?
> >     - Is it really possible to run Linux with under 10MB of RAM?
> >     	(I failed with tinyconfig.)
> >     - Where is the boundary for deciding between SLOB and SLUB?

On Thu, Oct 21, 2021 at 10:46:46AM +0200, Vlastimil Babka wrote:
> On 10/20/21 15:55, Hyeonggon Yoo wrote:
> > Hello linux-mm, I rewrote SLOB using a segregated free list,
> > to understand SLOB and SLUB better. It uses more memory
> > (48kB on 32bit tinyconfig) and became 9~10x faster.
> > 
> > But after rewriting, I thought I need to discuss what SLOB is for.
> > According to Matthew, SLOB is for small machines whose
> > memory is 1~16 MB.
> > 
> > I wonder whether adding 48kB to SLOB's memory usage for speed/lower
> > latency is worthwhile or harmful.
> > 
> > So.. questions in my head now:
> >     - Who are the users of SLOB?
> >     - Is it harmful to add some kilobytes of memory to SLOB?
> >     - Is it really possible to run Linux with under 10MB of RAM?
> >     	(I failed with tinyconfig.)
> >     - Where is the boundary for deciding between SLOB and SLUB?
> > 
> > Anyway, below is my work.
> > Any comments/opinions will be appreciated!
> > 
> > SLOB uses the sequential fit method. The advantage of this method
> > is that it is simple and does not need complex metadata.
> > 
> > But the big downsides of the sequential fit method are its high latency
> > in allocation/deallocation and its fast fragmentation.
> > 
> > The high latency comes from iterating over pages and then over the
> > objects in each page to find a suitable free object. And fragmentation
> > happens easily because objects of different sizes are allocated in the
> > same page.
> > 
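To make the latency point concrete, here is a minimal userspace sketch of a
sequential (first) fit search. It is not the kernel's SLOB code: the
struct free_block layout, the ff_last_steps counter and first_fit_demo() are
all made up for illustration. It only shows that the search cost grows with
the number of free blocks that have to be skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-block header; NOT the kernel's slob_t layout. */
struct free_block {
	size_t size;              /* usable bytes in this free block */
	struct free_block *next;  /* next free block in list order */
};

/* Blocks visited by the most recent search (illustration only). */
static size_t ff_last_steps;

/* First fit: walk the list and unlink the first block that is big enough. */
static struct free_block *first_fit(struct free_block **head, size_t want)
{
	ff_last_steps = 0;
	for (struct free_block **pp = head; *pp; pp = &(*pp)->next) {
		ff_last_steps++;
		if ((*pp)->size >= want) {
			struct free_block *hit = *pp;

			*pp = hit->next;   /* unlink from the free list */
			hit->next = NULL;
			return hit;
		}
	}
	return NULL;  /* nothing large enough on the list */
}

/* Build a list of mixed sizes and count the steps to find 256 bytes. */
static size_t first_fit_demo(void)
{
	static struct free_block blocks[4];
	const size_t sizes[4] = { 16, 32, 64, 256 };
	struct free_block *head = NULL;

	/* Push in reverse so the list order is 16 -> 32 -> 64 -> 256. */
	for (int i = 3; i >= 0; i--) {
		blocks[i].size = sizes[i];
		blocks[i].next = head;
		head = &blocks[i];
	}

	struct free_block *b = first_fit(&head, 256);

	assert(b != NULL && b->size == 256);
	return ff_last_steps;
}
```

In this toy list the 256-byte request has to walk past every smaller block
first, and the small and large blocks share one list, which is the mixing of
sizes described above.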
> > This patch tries to minimize both latency and fragmentation by
> > re-implementing SLOB using the segregated free list method and adding
> > support for slab merging. It looks like a lightweight SLUB but is more
> > compact than SLUB.
> 
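For comparison, a segregated free list can be sketched in userspace like this.
Again this is not the patch itself: the power-of-two size classes, MIN_SHIFT,
NUM_CLASSES and the seg_* names are assumptions made for the example. The
point is that alloc and free become a single list pop or push in the class
that matches the size, with no searching.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_SHIFT   3   /* assumed smallest class: 8 bytes */
#define NUM_CLASSES 8   /* assumed classes: 8, 16, ..., 1024 bytes */

/* A free object only needs a link to the next free object of its class. */
struct free_obj {
	struct free_obj *next;
};

/* One free list per size class. */
static struct free_obj *free_lists[NUM_CLASSES];

/* Index of the smallest class that fits size, or -1 if it is too large. */
static int size_class(size_t size)
{
	size_t cls_size = (size_t)1 << MIN_SHIFT;

	for (int i = 0; i < NUM_CLASSES; i++, cls_size <<= 1)
		if (size <= cls_size)
			return i;
	return -1;
}

/* Free: push the object onto the list of its size class. */
static void seg_free(void *p, size_t size)
{
	int i = size_class(size);
	struct free_obj *o = p;

	assert(i >= 0);
	o->next = free_lists[i];
	free_lists[i] = o;
}

/* Alloc: pop from the matching class; no walk over other sizes. */
static void *seg_alloc(size_t size)
{
	int i = size_class(size);
	struct free_obj *o;

	if (i < 0 || !free_lists[i])
		return NULL;  /* a real allocator would grab a fresh page here */
	o = free_lists[i];
	free_lists[i] = o->next;
	return o;
}

/* Two 64-byte-class objects are freed, then handed back in LIFO order. */
static int seg_demo(void)
{
	static union { struct free_obj o; unsigned char bytes[64]; } a, b;

	seg_free(&a, 40);  /* rounds up to the 64-byte class */
	seg_free(&b, 64);
	return seg_alloc(50) == (void *)&b &&
	       seg_alloc(33) == (void *)&a &&
	       seg_alloc(64) == NULL;
}
```

The trade-off is visible too: each class keeps its own list (and, in a real
allocator, its own pages), which is where the extra kilobytes come from.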
> My immediate reaction is that we probably don't want to turn SLOB into
> lightweight SLUB. SLOB chooses the tradeoff of low memory usage over speed
> and shifting it towards more speed kinda defeats this purpose. Also it's a
> major rewrite, so without a very clear motivation there will be resistance
> to that.
>

Yes, I agree that SLOB is for memory efficiency, not performance.
That's why I said:
> > I wonder whether adding 48kB to SLOB's memory usage for speed/lower
> > latency is worthwhile or harmful.

But conversely, I wonder when SLOB is more useful than SLUB.
Is it for really tiny Linux systems that have under 1MB of RAM?
But can Linux be that small?

> SLUB itself could be probably tuned to less memory overhead if needed. Most
> of the debug options effectively disable percpu slabs, we could add a mode
> that disables them without the rest of the debugging overhead. Allocation
> order can be lowered (although some object sizes might benefit from less
> fragmentation with a higher order).

Yes, that's what I was curious about. As SLUB is not that big,
I wonder where SLOB is useful.

> > One notable difference is after this patch SLOB uses kmalloc_caches
> > like SL[AU]B.
> > 
> > Below are the performance impacts of this patch.
> > 
> > Memory usage was measured on 32 bit + tinyconfig + slab merging.
> > 
> > Before:
> >     MemTotal:          29668 kB
> >     MemFree:           19364 kB
> >     MemAvailable:      18396 kB
> >     Slab:                668 kB
> > 
> > After:
> >     MemTotal:          29668 kB
> >     MemFree:           19420 kB
> >     MemAvailable:      18452 kB
> >     Slab:                716 kB
> > 
> > This patch adds about 48 kB after boot.
> > 
> > hackbench was measured on a typical 64 bit buildroot configuration.
> > After this patch it's 9~10x faster than before.
> > 
> > Before:
> >     memory usage:
> >         after boot:
> >             Slab:               7908 kB
> >         after hackbench:
> >             Slab:               8544 kB
> > 
> >     Time: 189.947
> >     Performance counter stats for 'hackbench -g 4 -l 10000':
> >          379413.20 msec cpu-clock                 #    1.997 CPUs utilized
> >            8818226      context-switches          #   23.242 K/sec
> >             375186      cpu-migrations            #  988.859 /sec
> >               3954      page-faults               #   10.421 /sec
> >       269923095290      cycles                    #    0.711 GHz
> >       212341582012      instructions              #    0.79  insn per cycle
> >         2361087153      branch-misses
> >        58222839688      cache-references          #  153.455 M/sec
> >         6786521959      cache-misses              #   11.656 % of all cache refs
> > 
> >      190.002062273 seconds time elapsed
> > 
> >        3.486150000 seconds user
> >      375.599495000 seconds sys
> > 
> > After:
> >     memory usage:
> >        after boot:
> >            Slab:               7560 kB
> >         after hackbench:
> >            Slab:               7836 kB
> 
> Interesting that the memory usage in this test is actually lower with your
> patch.

I didn't mention that because if we have enough memory,
I think we have no reason to use SLOB. (Why not use SLUB?)
I thought memory usage on small machines was what matters.

> 
> > hackbench:
> >     Time: 20.780
> >     Performance counter stats for 'hackbench -g 4 -l 10000':
> >           41509.79 msec cpu-clock                 #    1.996 CPUs utilized
> >             630032      context-switches          #   15.178 K/sec
> >               8287      cpu-migrations            #  199.640 /sec
> >               4036      page-faults               #   97.230 /sec
> >        57477161020      cycles                    #    1.385 GHz
> >        62775453932      instructions              #    1.09  insn per cycle
> >          164902523      branch-misses
> >        22559952993      cache-references          #  543.485 M/sec
> >          832404011      cache-misses              #    3.690 % of all cache refs
> > 
> >       20.791893590 seconds time elapsed
> > 
> >        1.423282000 seconds user
> >       40.072449000 seconds sys
> 
> That's significant, but also hackbench is kind of worst case test, so in
> practice the benefit won't be that prominent.
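For reference, the ratios behind the "9~10x" claim can be recomputed from the
numbers quoted above. The sketch below just hard-codes the figures from the
two hackbench runs; the struct and function names are made up for this
example, and nothing here is a new measurement.

```c
#include <assert.h>

/* Figures copied from the quoted 'hackbench -g 4 -l 10000' runs. */
struct hackbench_run {
	double seconds;        /* "Time:" reported by hackbench */
	double cycles;
	double instructions;
	double cache_refs;
	double cache_misses;
	int slab_boot_kb;      /* Slab: after boot */
	int slab_bench_kb;     /* Slab: after hackbench */
};

static const struct hackbench_run before = {
	189.947, 269923095290.0, 212341582012.0,
	58222839688.0, 6786521959.0, 7908, 8544
};

static const struct hackbench_run after = {
	20.780, 57477161020.0, 62775453932.0,
	22559952993.0, 832404011.0, 7560, 7836
};

/* How many times faster the patched run finished. */
static double speedup(void)
{
	return before.seconds / after.seconds;
}

/* Instructions per cycle. */
static double ipc(const struct hackbench_run *r)
{
	return r->instructions / r->cycles;
}

/* Cache misses as a percentage of all cache references. */
static double miss_pct(const struct hackbench_run *r)
{
	return 100.0 * r->cache_misses / r->cache_refs;
}

/* Slab growth in kB between boot and the end of the benchmark. */
static int slab_growth_kb(const struct hackbench_run *r)
{
	return r->slab_bench_kb - r->slab_boot_kb;
}
```

With these inputs speedup() comes out a little above 9, and Slab grows by
636 kB during the benchmark before the patch versus 276 kB after it.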

> 
> > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > ---
