linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages
@ 2024-01-03  0:30 Barry Song
  2024-01-06  1:30 ` Sergey Senozhatsky
  0 siblings, 1 reply; 7+ messages in thread
From: Barry Song @ 2024-01-03  0:30 UTC (permalink / raw)
  To: minchan, senozhatsky, axboe; +Cc: linux-kernel, linux-block, Barry Song

There is no need to keep zcomp_strm's buffers contiguous physically.
And rarely, 1-order allocation can fail while buddy is seriously
fragmented.

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 drivers/block/zram/zcomp.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 55af4efd7983..8237b08c49d8 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -11,6 +11,7 @@
 #include <linux/sched.h>
 #include <linux/cpu.h>
 #include <linux/crypto.h>
+#include <linux/vmalloc.h>
 
 #include "zcomp.h"
 
@@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm)
 {
 	if (!IS_ERR_OR_NULL(zstrm->tfm))
 		crypto_free_comp(zstrm->tfm);
-	free_pages((unsigned long)zstrm->buffer, 1);
+	vfree(zstrm->buffer);
 	zstrm->tfm = NULL;
 	zstrm->buffer = NULL;
 }
@@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
 	 * allocate 2 pages. 1 for compressed data, plus 1 extra for the
 	 * case when compressed size is larger than the original one
 	 */
-	zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
+	zstrm->buffer = vzalloc(2 * PAGE_SIZE);
 	if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) {
 		zcomp_strm_free(zstrm);
 		return -ENOMEM;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages
  2024-01-03  0:30 [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages Barry Song
@ 2024-01-06  1:30 ` Sergey Senozhatsky
  2024-01-06  7:38   ` Barry Song
  0 siblings, 1 reply; 7+ messages in thread
From: Sergey Senozhatsky @ 2024-01-06  1:30 UTC (permalink / raw)
  To: Barry Song
  Cc: minchan, senozhatsky, axboe, linux-kernel, linux-block,
	Barry Song, Andrew Morton

On (24/01/03 13:30), Barry Song wrote:
> There is no need to keep zcomp_strm's buffers contiguous physically.
> And rarely, 1-order allocation can fail while buddy is seriously
> fragmented.

Dunno. Some of these don't sound like convincing reasons, I'm afraid.
We don't allocate compression streams all the time, we do it once
per-CPU. And if the system is under such a terrible memory pressure
then one probably should not use zram at all, because zsmalloc needs
pages for its pool.

I also wonder whether Android uses HW compression, in which case we
may need to have physically contig pages. Not to mention TLB shootdowns
that virt contig pages add to the picture.

[..]
> @@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm)
>  {
>  	if (!IS_ERR_OR_NULL(zstrm->tfm))
>  		crypto_free_comp(zstrm->tfm);
> -	free_pages((unsigned long)zstrm->buffer, 1);
> +	vfree(zstrm->buffer);
>  	zstrm->tfm = NULL;
>  	zstrm->buffer = NULL;
>  }
> @@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
>  	 * allocate 2 pages. 1 for compressed data, plus 1 extra for the
>  	 * case when compressed size is larger than the original one
>  	 */
> -	zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
> +	zstrm->buffer = vzalloc(2 * PAGE_SIZE);
>  	if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) {
>  		zcomp_strm_free(zstrm);
>  		return -ENOMEM;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages
  2024-01-06  1:30 ` Sergey Senozhatsky
@ 2024-01-06  7:38   ` Barry Song
  2024-01-15  2:34     ` Sergey Senozhatsky
  0 siblings, 1 reply; 7+ messages in thread
From: Barry Song @ 2024-01-06  7:38 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: minchan, axboe, linux-kernel, linux-block, Barry Song, Andrew Morton

On Sat, Jan 6, 2024 at 9:30 AM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/01/03 13:30), Barry Song wrote:
> > There is no need to keep zcomp_strm's buffers contiguous physically.
> > And rarely, 1-order allocation can fail while buddy is seriously
> > fragmented.
>
> Dunno. Some of these don't sound like convincing reasons, I'm afraid.
> We don't allocate compression streams all the time, we do it once
> per-CPU. And if the system is under such a terrible memory pressure

We actually do it many times actually because we free it while unplugging and
re-allocate it during hotplugging. this can happen quite often for systems like
Android using hotplug for power management.

> then one probably should not use zram at all, because zsmalloc needs
> pages for its pool.

In my humble opinion, 1-order allocation and 0-order allocation are different
things, 1-order is still more difficult though it is easier than
2-order which was
a big pain causing allocation latency for tasks' kernel stacks and negatively
affecting user experience. it has now been replaced by vmalloc and makes
life easier :-)

>
> I also wonder whether Android uses HW compression, in which case we
> may need to have physically contig pages. Not to mention TLB shootdowns
> that virt contig pages add to the picture.

I don't understand how HW compression and TLB shootdown are related as zRAM
is using a traditional comp API.
We are always passing a virtual address, traditional HW drivers use their own
buffers to do DMA.

int crypto_comp_compress(struct crypto_comp *comp,
const u8 *src, unsigned int slen,
u8 *dst, unsigned int *dlen);
int crypto_comp_decompress(struct crypto_comp *comp,
  const u8 *src, unsigned int slen,
  u8 *dst, unsigned int *dlen);

In new acomp API, we are passing a sg - users' buffers to drivers directly,
sg_init_one(&input, src, entry->length);
sg_init_table(&output, 1);
sg_set_page(&output, page, PAGE_SIZE, 0);
acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req),
&acomp_ctx->wait);

but i agree one-nents sg might have some advantage in scompress case
after we move
to new acomp APIs if we have this patch I sent recently [patch 3/3],
https://lore.kernel.org/linux-mm/20240103095006.608744-1-21cnbao@gmail.com/

For the current zRAM code, I guess HW compression/TLB is not a concern.

>
> [..]
> > @@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm)
> >  {
> >       if (!IS_ERR_OR_NULL(zstrm->tfm))
> >               crypto_free_comp(zstrm->tfm);
> > -     free_pages((unsigned long)zstrm->buffer, 1);
> > +     vfree(zstrm->buffer);
> >       zstrm->tfm = NULL;
> >       zstrm->buffer = NULL;
> >  }
> > @@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
> >        * allocate 2 pages. 1 for compressed data, plus 1 extra for the
> >        * case when compressed size is larger than the original one
> >        */
> > -     zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
> > +     zstrm->buffer = vzalloc(2 * PAGE_SIZE);
> >       if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) {
> >               zcomp_strm_free(zstrm);
> >               return -ENOMEM;

Thanks
Barry

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages
  2024-01-06  7:38   ` Barry Song
@ 2024-01-15  2:34     ` Sergey Senozhatsky
  2024-01-29  2:46       ` Barry Song
  0 siblings, 1 reply; 7+ messages in thread
From: Sergey Senozhatsky @ 2024-01-15  2:34 UTC (permalink / raw)
  To: Barry Song
  Cc: Sergey Senozhatsky, minchan, axboe, linux-kernel, linux-block,
	Barry Song, Andrew Morton

On (24/01/06 15:38), Barry Song wrote:
> On Sat, Jan 6, 2024 at 9:30 AM Sergey Senozhatsky
> <senozhatsky@chromium.org> wrote:
> >
> > On (24/01/03 13:30), Barry Song wrote:
> > > There is no need to keep zcomp_strm's buffers contiguous physically.
> > > And rarely, 1-order allocation can fail while buddy is seriously
> > > fragmented.
> >
> > Dunno. Some of these don't sound like convincing reasons, I'm afraid.
> > We don't allocate compression streams all the time, we do it once
> > per-CPU. And if the system is under such a terrible memory pressure
> 
> We actually do it many times actually because we free it while unplugging and
> re-allocate it during hotplugging. this can happen quite often for systems like
> Android using hotplug for power management.

Okay, makes sense.
Do you see these problems in real life? I don't recall any reports.

> > then one probably should not use zram at all, because zsmalloc needs
> > pages for its pool.
> 
> In my humble opinion, 1-order allocation and 0-order allocation are different
> things, 1-order is still more difficult though it is easier than
> 2-order which was
> a big pain causing allocation latency for tasks' kernel stacks and negatively
> affecting user experience. it has now been replaced by vmalloc and makes
> life easier :-)

Sure.

> > I also wonder whether Android uses HW compression, in which case we
> > may need to have physically contig pages. Not to mention TLB shootdowns
> > that virt contig pages add to the picture.
> 
> I don't understand how HW compression and TLB shootdown are related as zRAM
> is using a traditional comp API.

Oh, those are not related. TLB shootdowns are what now will be added to
all compressions/decompressions, so it's sort of extra cost.
HW compression (which android may be doing?) is another story.

Did you run any perf tests on zram w/ and w/o the patch?

> We are always passing a virtual address, traditional HW drivers use their own
> buffers to do DMA.
> 
> int crypto_comp_compress(struct crypto_comp *comp,
> const u8 *src, unsigned int slen,
> u8 *dst, unsigned int *dlen);
> int crypto_comp_decompress(struct crypto_comp *comp,
>   const u8 *src, unsigned int slen,
>   u8 *dst, unsigned int *dlen);
> 
> In new acomp API, we are passing a sg - users' buffers to drivers directly,
> sg_init_one(&input, src, entry->length);
> sg_init_table(&output, 1);
> sg_set_page(&output, page, PAGE_SIZE, 0);
> acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
> ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req),
> &acomp_ctx->wait);
> 
> but i agree one-nents sg might have some advantage in scompress case

Right.

> after we move
> to new acomp APIs if we have this patch I sent recently [patch 3/3],
> https://lore.kernel.org/linux-mm/20240103095006.608744-1-21cnbao@gmail.com/

Nice.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages
  2024-01-15  2:34     ` Sergey Senozhatsky
@ 2024-01-29  2:46       ` Barry Song
  2024-02-06  1:55         ` Sergey Senozhatsky
  0 siblings, 1 reply; 7+ messages in thread
From: Barry Song @ 2024-01-29  2:46 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: minchan, axboe, linux-kernel, linux-block, Barry Song, Andrew Morton

On Mon, Jan 15, 2024 at 10:35 AM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/01/06 15:38), Barry Song wrote:
> > On Sat, Jan 6, 2024 at 9:30 AM Sergey Senozhatsky
> > <senozhatsky@chromium.org> wrote:
> > >
> > > On (24/01/03 13:30), Barry Song wrote:
> > > > There is no need to keep zcomp_strm's buffers contiguous physically.
> > > > And rarely, 1-order allocation can fail while buddy is seriously
> > > > fragmented.
> > >
> > > Dunno. Some of these don't sound like convincing reasons, I'm afraid.
> > > We don't allocate compression streams all the time, we do it once
> > > per-CPU. And if the system is under such a terrible memory pressure
> >
> > We actually do it many times actually because we free it while unplugging and
> > re-allocate it during hotplugging. this can happen quite often for systems like
> > Android using hotplug for power management.
>
> Okay, makes sense.
> Do you see these problems in real life? I don't recall any reports.

i don't have problems with the current zram which supports normal pages only.

but  in our out-of-tree code, we have enhanced zram/zsmalloc to support large
folios compression/decompression, which will make zram work much better
with large anon folios/mTHP things on which Ryan Roberts is working on.

I mean, a large folio with for example 16 normal pages can be saved as
one object in zram.
In millions of phones, we have deployed this approach and seen huge improvement
on compression ratio and cpu consumption decrease. in that case, we
need a larger
per-cpu buffer, and have seen frequent failure on allocation. that
inspired me to send
this patch in advance.

>
> > > then one probably should not use zram at all, because zsmalloc needs
> > > pages for its pool.
> >
> > In my humble opinion, 1-order allocation and 0-order allocation are different
> > things, 1-order is still more difficult though it is easier than
> > 2-order which was
> > a big pain causing allocation latency for tasks' kernel stacks and negatively
> > affecting user experience. it has now been replaced by vmalloc and makes
> > life easier :-)
>
> Sure.
>
> > > I also wonder whether Android uses HW compression, in which case we
> > > may need to have physically contig pages. Not to mention TLB shootdowns
> > > that virt contig pages add to the picture.
> >
> > I don't understand how HW compression and TLB shootdown are related as zRAM
> > is using a traditional comp API.
>
> Oh, those are not related. TLB shootdowns are what now will be added to
> all compressions/decompressions, so it's sort of extra cost.

i am sorry i still don't understand where the tlb shootdowns come
from. we don't unmap
this per-cpu buffers during compression and decompression, do we ?

am i missing something?

> HW compression (which android may be doing?) is another story.
>
> Did you run any perf tests on zram w/ and w/o the patch?
>
> > We are always passing a virtual address, traditional HW drivers use their own
> > buffers to do DMA.
> >
> > int crypto_comp_compress(struct crypto_comp *comp,
> > const u8 *src, unsigned int slen,
> > u8 *dst, unsigned int *dlen);
> > int crypto_comp_decompress(struct crypto_comp *comp,
> >   const u8 *src, unsigned int slen,
> >   u8 *dst, unsigned int *dlen);
> >
> > In new acomp API, we are passing a sg - users' buffers to drivers directly,
> > sg_init_one(&input, src, entry->length);
> > sg_init_table(&output, 1);
> > sg_set_page(&output, page, PAGE_SIZE, 0);
> > acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
> > ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req),
> > &acomp_ctx->wait);
> >
> > but i agree one-nents sg might have some advantage in scompress case
>
> Right.
>
> > after we move
> > to new acomp APIs if we have this patch I sent recently [patch 3/3],
> > https://lore.kernel.org/linux-mm/20240103095006.608744-1-21cnbao@gmail.com/
>
> Nice.

Thanks
Barry

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages
  2024-01-29  2:46       ` Barry Song
@ 2024-02-06  1:55         ` Sergey Senozhatsky
  2024-02-06  9:30           ` Barry Song
  0 siblings, 1 reply; 7+ messages in thread
From: Sergey Senozhatsky @ 2024-02-06  1:55 UTC (permalink / raw)
  To: Barry Song
  Cc: Sergey Senozhatsky, minchan, axboe, linux-kernel, linux-block,
	Barry Song, Andrew Morton

On (24/01/29 10:46), Barry Song wrote:
> On Mon, Jan 15, 2024 at 10:35 AM Sergey Senozhatsky
> <senozhatsky@chromium.org> wrote:
> >
> > On (24/01/06 15:38), Barry Song wrote:
> > > On Sat, Jan 6, 2024 at 9:30 AM Sergey Senozhatsky
> > > <senozhatsky@chromium.org> wrote:
> > > >
> > > > On (24/01/03 13:30), Barry Song wrote:
> > > > > There is no need to keep zcomp_strm's buffers contiguous physically.
> > > > > And rarely, 1-order allocation can fail while buddy is seriously
> > > > > fragmented.
> > > >
[..]
> > Okay, makes sense.
> > Do you see these problems in real life? I don't recall any reports.
> 
> i don't have problems with the current zram which supports normal pages only.
> 
> but  in our out-of-tree code, we have enhanced zram/zsmalloc to support large
> folios compression/decompression, which will make zram work much better
> with large anon folios/mTHP things on which Ryan Roberts is working on.
> 
> I mean, a large folio with for example 16 normal pages can be saved as
> one object in zram.
> In millions of phones, we have deployed this approach and seen huge improvement
> on compression ratio and cpu consumption decrease. in that case, we
> need a larger
> per-cpu buffer, and have seen frequent failure on allocation. that
> inspired me to send
> this patch in advance.

May I please ask you to resend this patch with updated commit mesasge?
E.g. mention cpu offlinig/onlining, etc.

[..]
> > > > I also wonder whether Android uses HW compression, in which case we
> > > > may need to have physically contig pages. Not to mention TLB shootdowns
> > > > that virt contig pages add to the picture.
> > >
> > > I don't understand how HW compression and TLB shootdown are related as zRAM
> > > is using a traditional comp API.
> >
> > Oh, those are not related. TLB shootdowns are what now will be added to
> > all compressions/decompressions, so it's sort of extra cost.
> 
> i am sorry i still don't understand where the tlb shootdowns come
> from. we don't unmap
> this per-cpu buffers during compression and decompression, do we ?
> 
> am i missing something?

No, I guess you are right.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages
  2024-02-06  1:55         ` Sergey Senozhatsky
@ 2024-02-06  9:30           ` Barry Song
  0 siblings, 0 replies; 7+ messages in thread
From: Barry Song @ 2024-02-06  9:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: minchan, axboe, linux-kernel, linux-block, Barry Song, Andrew Morton

On Tue, Feb 6, 2024 at 9:55 AM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/01/29 10:46), Barry Song wrote:
> > On Mon, Jan 15, 2024 at 10:35 AM Sergey Senozhatsky
> > <senozhatsky@chromium.org> wrote:
> > >
> > > On (24/01/06 15:38), Barry Song wrote:
> > > > On Sat, Jan 6, 2024 at 9:30 AM Sergey Senozhatsky
> > > > <senozhatsky@chromium.org> wrote:
> > > > >
> > > > > On (24/01/03 13:30), Barry Song wrote:
> > > > > > There is no need to keep zcomp_strm's buffers contiguous physically.
> > > > > > And rarely, 1-order allocation can fail while buddy is seriously
> > > > > > fragmented.
> > > > >
> [..]
> > > Okay, makes sense.
> > > Do you see these problems in real life? I don't recall any reports.
> >
> > i don't have problems with the current zram which supports normal pages only.
> >
> > but  in our out-of-tree code, we have enhanced zram/zsmalloc to support large
> > folios compression/decompression, which will make zram work much better
> > with large anon folios/mTHP things on which Ryan Roberts is working on.
> >
> > I mean, a large folio with for example 16 normal pages can be saved as
> > one object in zram.
> > In millions of phones, we have deployed this approach and seen huge improvement
> > on compression ratio and cpu consumption decrease. in that case, we
> > need a larger
> > per-cpu buffer, and have seen frequent failure on allocation. that
> > inspired me to send
> > this patch in advance.
>
> May I please ask you to resend this patch with updated commit mesasge?
> E.g. mention cpu offlinig/onlining, etc.

I will send v2 to add this information in the commit message. thanks!

>
> [..]
> > > > > I also wonder whether Android uses HW compression, in which case we
> > > > > may need to have physically contig pages. Not to mention TLB shootdowns
> > > > > that virt contig pages add to the picture.
> > > >
> > > > I don't understand how HW compression and TLB shootdown are related as zRAM
> > > > is using a traditional comp API.
> > >
> > > Oh, those are not related. TLB shootdowns are what now will be added to
> > > all compressions/decompressions, so it's sort of extra cost.
> >
> > i am sorry i still don't understand where the tlb shootdowns come
> > from. we don't unmap
> > this per-cpu buffers during compression and decompression, do we ?
> >
> > am i missing something?
>
> No, I guess you are right.

Best regards
Barry

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-02-06  9:30 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-03  0:30 [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages Barry Song
2024-01-06  1:30 ` Sergey Senozhatsky
2024-01-06  7:38   ` Barry Song
2024-01-15  2:34     ` Sergey Senozhatsky
2024-01-29  2:46       ` Barry Song
2024-02-06  1:55         ` Sergey Senozhatsky
2024-02-06  9:30           ` Barry Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).