From: Vitaly Wool <vitaly.wool@konsulko.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: ananda <a.badmaev@clicknet.pro>, linux-mm@kvack.org
Subject: Re: [PATCH v6] mm: add zblock - new allocator for use via zpool API
Date: Tue, 29 Nov 2022 08:48:27 +0100	[thread overview]
Message-ID: <CAM4kBBLeJHxUq4e8cA-oHZXSPQS_G0T=Kga=2x-OW7H2u_J0kA@mail.gmail.com> (raw)
In-Reply-To: <Y4UTgz7MNcVxlSnR@cmpxchg.org>

On Mon, Nov 28, 2022 at 9:01 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Fri, Nov 04, 2022 at 11:58:56AM +0300, ananda wrote:
> > From: Ananda <a.badmaev@clicknet.pro>
> >
> >     Zblock stores integer number of compressed objects per zblock block.
>
> What does that mean?

It's explained later in the patch, but here's an example anyway: take
a block of 4 adjacent pages, 16384 bytes in total. It can be divided
into 43 subblocks of 381 bytes each, leaving one byte unused. The
subblocks are then treated as an array.

> > These blocks consist of several physical pages (1/2/4/8) and are arranged
> > in linked lists.
> >     The range from 0 to PAGE_SIZE is divided into the number of intervals
> > corresponding to the number of lists and each list only operates objects
> > of size from its interval. Thus the block lists are isolated from each
> > other, which makes it possible to simultaneously perform actions with
> > several objects from different lists.
>
> This was benchmarked not long ago in the context of zsmalloc, and it
> didn't seem to matter too much in real world applications:
>
> https://lore.kernel.org/linux-mm/20221107213114.916231-1-nphamcs@gmail.com/

We basically reproduced this test and also ran it with zblock, and
zblock performs better by 3.5% on an 8G ZRAM disk with btrfs; the
difference grows as the disk size grows. I'm pretty sure the
difference will also grow over time, because zsmalloc will run
compaction more and more.

> Do you have situations where this matters?
>
> >     Blocks make it possible to densely arrange objects of various sizes
> > resulting in low internal fragmentation. Also this allocator tries to fill
> > incomplete blocks instead of adding new ones thus in many cases providing
> > a compression ratio substantially higher than z3fold and zbud.
>
> How does it compare to zsmalloc?

That depends on the type of data being compressed, but typically
zsmalloc is better by 5-10%.

> >     Zblock does not require MMU and also is superior to zsmalloc with
> > regard to the worst execution times, thus allowing for better response time
> > and real-time characteristics of the whole system.
>
> zsmalloc depends on MMU, but which parts actually require it? It
> has its own handle indirection and can migrate objects around and
> replace backing pages without any virtual memory tricks. There is the
> kmap stuff of course, because it supports highmem backing pages, but
> that isn't relevant on NOMMU either.
>
> Also can you please elaborate on the worst execution time?

I don't have the numbers at hand, but zsmalloc (and z3fold, for that
matter) do show high latency spikes when compaction kicks in, not to
mention longer periods of disabled preemption.

> My first impression is that this looks awfully close to zsmalloc, with
> a couple fewer features and somewhat more static design choices. It's
> in that sense reminiscent of the slob allocator, which we're in the
> process of removing, because 3 slab allocators is a pain to
> maintain. This would be the 4th zswap allocator, and it's not clear
> that it's drastically outperforming or doing something that isn't
> possible in one of the existing ones.

I don't think this comparison is on point, if only because zblock's
code is at least 4x smaller than zsmalloc's, and its execution
overhead is lower too. For lower-performance devices zblock is a real
enabler, and there's a class of high-performance devices where it can
be the best fit too.

I get your point about 4 zswap allocators though, and have no problem
obsoleting z3fold as soon as we get zblock in.

Thanks,
Vitaly


Thread overview: 7+ messages
2022-11-04  8:58 [PATCH v6] mm: add zblock - new allocator for use via zpool API ananda
2022-11-08  7:11 ` Vitaly Wool
2022-11-28 20:01 ` Johannes Weiner
2022-11-29  5:14   ` Ananda Badmaev
2022-11-29  7:48   ` Vitaly Wool [this message]
2022-11-29 16:35     ` Johannes Weiner
  -- strict thread matches above, loose matches on Subject: below --
2022-10-05  5:33 ananda
