linux-crypto.vger.kernel.org archive mirror
* [PATCH v2 0/3] mm/zswap & crypto/acompress: remove a couple of memcpy
@ 2024-02-16  4:08 Barry Song
  2024-02-16  4:08 ` [PATCH v2 1/3] crypto: introduce acomp_is_sleepable to expose if a acomp has a scomp backend Barry Song
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Barry Song @ 2024-02-16  4:08 UTC (permalink / raw)
  To: akpm, davem, hannes, herbert, linux-crypto, linux-mm, nphamcs,
	yosryahmed, zhouchengming
  Cc: chriscli, chrisl, ddstreet, linux-kernel, sjenning, vitaly.wool,
	Barry Song

From: Barry Song <v-songbaohua@oppo.com>

This patchset removes a couple of memcpy() calls in zswap and crypto
to improve zswap's performance.

Thanks to Chengming Zhou for the testing and perf data.
Quoting Chengming:
 I just tested these three patches on my server, found improvement in the
 kernel build testcase on a tmpfs with zswap (lz4 + zsmalloc) enabled.
 
         mm-stable 501a06fe8e4c  patched
 real    1m38.028s               1m32.317s
 user    19m11.482s              18m39.439s
 sys     19m26.445s              17m5.646s

As zswap is the direct user of this patchset and benefits from this
series, it is probably better for the patchset to go through Andrew's
mm tree rather than Herbert's crypto tree, if there is no objection
from Herbert.

-v2:
  * add flush_dcache_page() in scomp_acomp_comp_decomp() according to
    Herbert's suggestion, thanks!
  * collect Reviewed-by of Nhat, thanks!
  * rename is_async to is_sleepable according to Yosry's suggestion,
    thanks!

Barry Song (3):
  crypto: introduce acomp_is_sleepable to expose if a acomp has a scomp
    backend
  mm/zswap: remove the memcpy if acomp is not sleepable
  crypto: scompress: remove memcpy if sg_nents is 1

 crypto/acompress.c         |  8 ++++++++
 crypto/scompress.c         | 36 +++++++++++++++++++++++++++++-------
 include/crypto/acompress.h |  9 +++++++++
 mm/zswap.c                 |  6 ++++--
 4 files changed, 50 insertions(+), 9 deletions(-)

-- 
2.34.1



* [PATCH v2 1/3] crypto: introduce acomp_is_sleepable to expose if a acomp has a scomp backend
  2024-02-16  4:08 [PATCH v2 0/3] mm/zswap & crypto/acompress: remove a couple of memcpy Barry Song
@ 2024-02-16  4:08 ` Barry Song
  2024-02-16  4:08 ` [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable Barry Song
  2024-02-16  4:08 ` [PATCH v2 3/3] crypto: scompress: remove memcpy if sg_nents is 1 Barry Song
  2 siblings, 0 replies; 9+ messages in thread
From: Barry Song @ 2024-02-16  4:08 UTC (permalink / raw)
  To: akpm, davem, hannes, herbert, linux-crypto, linux-mm, nphamcs,
	yosryahmed, zhouchengming
  Cc: chriscli, chrisl, ddstreet, linux-kernel, sjenning, vitaly.wool,
	Barry Song

From: Barry Song <v-songbaohua@oppo.com>

Almost all CPU-based compressors/decompressors are actually synchronous
even though they support the acomp APIs. Some hardware, such as
hisilicon and intel/qat/, has accelerators that offload the CPU's work,
and their drivers work in async mode.
Letting acomp users know whether an acomp is really async helps them
know whether the compression and decompression procedure can sleep.
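
As a rough illustration of the check this patch adds, here is a
userspace sketch, not kernel code. All model_* names are hypothetical
stand-ins; the only idea taken from the patch is that an algorithm is
treated as sleepable exactly when its type descriptor is the native
acomp type rather than the scomp one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's crypto type descriptors. */
struct model_alg_type { const char *name; };
struct model_alg { const struct model_alg_type *type; };
struct model_tfm { const struct model_alg *alg; };

/* One descriptor per backend kind; identity is compared by address. */
const struct model_alg_type model_acomp_type = { "acomp" };
const struct model_alg_type model_scomp_type = { "scomp" };

/* Example algorithms: an async hardware driver and a CPU-based lz4. */
const struct model_alg model_hw_alg = { &model_acomp_type };
const struct model_alg model_lz4_alg = { &model_scomp_type };

const struct model_tfm model_hw_tfm = { &model_hw_alg };
const struct model_tfm model_lz4_tfm = { &model_lz4_alg };

/* Sleepable only if the backend is a native (async) acomp. */
bool model_is_sleepable(const struct model_tfm *tfm)
{
	return tfm->alg->type == &model_acomp_type;
}
```

In this model, model_is_sleepable(&model_lz4_tfm) is false, so a caller
could keep a non-sleepable mapping held across the call.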

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 crypto/acompress.c         | 8 ++++++++
 include/crypto/acompress.h | 9 +++++++++
 2 files changed, 17 insertions(+)

diff --git a/crypto/acompress.c b/crypto/acompress.c
index 1c682810a484..fa15df394a4c 100644
--- a/crypto/acompress.c
+++ b/crypto/acompress.c
@@ -152,6 +152,14 @@ struct crypto_acomp *crypto_alloc_acomp_node(const char *alg_name, u32 type,
 }
 EXPORT_SYMBOL_GPL(crypto_alloc_acomp_node);
 
+bool acomp_is_sleepable(struct crypto_acomp *acomp)
+{
+	struct crypto_tfm *tfm = crypto_acomp_tfm(acomp);
+
+	return tfm->__crt_alg->cra_type == &crypto_acomp_type;
+}
+EXPORT_SYMBOL_GPL(acomp_is_sleepable);
+
 struct acomp_req *acomp_request_alloc(struct crypto_acomp *acomp)
 {
 	struct crypto_tfm *tfm = crypto_acomp_tfm(acomp);
diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
index 574cffc90730..88ca33532313 100644
--- a/include/crypto/acompress.h
+++ b/include/crypto/acompress.h
@@ -204,6 +204,15 @@ struct acomp_req *acomp_request_alloc(struct crypto_acomp *tfm);
  */
 void acomp_request_free(struct acomp_req *req);
 
+/**
+ * acomp_is_sleepable() -- check if an acomp is sleepable
+ *
+ * @tfm:	ACOMPRESS tfm handle allocated with crypto_alloc_acomp()
+ *
+ * Return:	true if the acomp is sleepable, otherwise, false
+ */
+bool acomp_is_sleepable(struct crypto_acomp *tfm);
+
 /**
  * acomp_request_set_callback() -- Sets an asynchronous callback
  *
-- 
2.34.1



* [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable
  2024-02-16  4:08 [PATCH v2 0/3] mm/zswap & crypto/acompress: remove a couple of memcpy Barry Song
  2024-02-16  4:08 ` [PATCH v2 1/3] crypto: introduce acomp_is_sleepable to expose if a acomp has a scomp backend Barry Song
@ 2024-02-16  4:08 ` Barry Song
  2024-02-16  8:30   ` Yosry Ahmed
  2024-02-16 12:38   ` Chengming Zhou
  2024-02-16  4:08 ` [PATCH v2 3/3] crypto: scompress: remove memcpy if sg_nents is 1 Barry Song
  2 siblings, 2 replies; 9+ messages in thread
From: Barry Song @ 2024-02-16  4:08 UTC (permalink / raw)
  To: akpm, davem, hannes, herbert, linux-crypto, linux-mm, nphamcs,
	yosryahmed, zhouchengming
  Cc: chriscli, chrisl, ddstreet, linux-kernel, sjenning, vitaly.wool,
	Barry Song

From: Barry Song <v-songbaohua@oppo.com>

Most compressors are actually CPU-based and won't sleep during
compression and decompression. We should remove the redundant
memcpy for them.
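
The two conditions changed in __zswap_load() can be read as a small
truth table. The userspace sketch below is only an illustration; the
function names are made up for this example and are not part of zswap.
It shows that the bounce copy happens only when the acomp may sleep and
the zpool cannot stay mapped across a sleep, and that the late unmap is
exactly the complementary case.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the compressed data must be copied to a buffer first. */
bool load_needs_bounce_copy(bool acomp_sleepable, bool zpool_can_sleep_mapped)
{
	return acomp_sleepable && !zpool_can_sleep_mapped;
}

/* True when the handle is still mapped after decompression. */
bool load_unmaps_after_decompress(bool acomp_sleepable,
				  bool zpool_can_sleep_mapped)
{
	return !acomp_sleepable || zpool_can_sleep_mapped;
}
```

With a CPU-based compressor (acomp_sleepable == false) the copy is
skipped regardless of the zpool type, which is the case this patch
targets.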

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
---
 mm/zswap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 350dd2fc8159..6319d2281020 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
 	struct crypto_wait wait;
 	u8 *buffer;
 	struct mutex mutex;
+	bool is_sleepable;
 };
 
 /*
@@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 		goto acomp_fail;
 	}
 	acomp_ctx->acomp = acomp;
+	acomp_ctx->is_sleepable = acomp_is_sleepable(acomp);
 
 	req = acomp_request_alloc(acomp_ctx->acomp);
 	if (!req) {
@@ -1368,7 +1370,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
 	mutex_lock(&acomp_ctx->mutex);
 
 	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
-	if (!zpool_can_sleep_mapped(zpool)) {
+	if (acomp_ctx->is_sleepable && !zpool_can_sleep_mapped(zpool)) {
 		memcpy(acomp_ctx->buffer, src, entry->length);
 		src = acomp_ctx->buffer;
 		zpool_unmap_handle(zpool, entry->handle);
@@ -1382,7 +1384,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
 	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
 	mutex_unlock(&acomp_ctx->mutex);
 
-	if (zpool_can_sleep_mapped(zpool))
+	if (!acomp_ctx->is_sleepable || zpool_can_sleep_mapped(zpool))
 		zpool_unmap_handle(zpool, entry->handle);
 }
 
-- 
2.34.1



* [PATCH v2 3/3] crypto: scompress: remove memcpy if sg_nents is 1
  2024-02-16  4:08 [PATCH v2 0/3] mm/zswap & crypto/acompress: remove a couple of memcpy Barry Song
  2024-02-16  4:08 ` [PATCH v2 1/3] crypto: introduce acomp_is_sleepable to expose if a acomp has a scomp backend Barry Song
  2024-02-16  4:08 ` [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable Barry Song
@ 2024-02-16  4:08 ` Barry Song
  2 siblings, 0 replies; 9+ messages in thread
From: Barry Song @ 2024-02-16  4:08 UTC (permalink / raw)
  To: akpm, davem, hannes, herbert, linux-crypto, linux-mm, nphamcs,
	yosryahmed, zhouchengming
  Cc: chriscli, chrisl, ddstreet, linux-kernel, sjenning, vitaly.wool,
	Barry Song

From: Barry Song <v-songbaohua@oppo.com>

When sg_nents is 1, which is always true for the current kernel since
the only user, zswap, works this way, we should remove two big
memcpy calls.
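
The idea can be sketched in userspace as follows. This is a simplified
model, not the scatterlist API: struct model_seg and model_pick_src()
are hypothetical names, and page mapping is left out entirely. A single
segment is used in place, while several segments are first linearized
into a scratch buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a scatterlist entry. */
struct model_seg {
	unsigned char *buf;
	size_t len;
};

/*
 * Return a pointer to contiguous input. With one segment the data is
 * used in place; otherwise the segments are copied into scratch.
 * Returns NULL if the scratch buffer is too small.
 */
const unsigned char *model_pick_src(const struct model_seg *segs, size_t nsegs,
				    unsigned char *scratch, size_t scratch_len)
{
	size_t off = 0;

	if (nsegs == 1)
		return segs[0].buf;	/* fast path: no copy */

	for (size_t i = 0; i < nsegs; i++) {
		if (segs[i].len > scratch_len - off)
			return NULL;
		memcpy(scratch + off, segs[i].buf, segs[i].len);
		off += segs[i].len;
	}
	return scratch;
}

/* Small self-check: returns 1 if both paths behave as described. */
int model_pick_src_demo(void)
{
	unsigned char a[] = "abc", b[] = "def", scratch[8];
	struct model_seg one[] = { { a, 3 } };
	struct model_seg two[] = { { a, 3 }, { b, 3 } };

	if (model_pick_src(one, 1, scratch, sizeof(scratch)) != a)
		return 0;
	if (model_pick_src(two, 2, scratch, sizeof(scratch)) != scratch)
		return 0;
	return memcmp(scratch, "abcdef", 6) == 0;
}
```

The output side follows the same pattern: a directly usable
destination avoids the copy back from the scratch buffer.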

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 crypto/scompress.c | 36 +++++++++++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/crypto/scompress.c b/crypto/scompress.c
index b108a30a7600..50a487eac792 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -117,6 +117,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	struct crypto_scomp *scomp = *tfm_ctx;
 	void **ctx = acomp_request_ctx(req);
 	struct scomp_scratch *scratch;
+	void *src, *dst;
 	unsigned int dlen;
 	int ret;
 
@@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	scratch = raw_cpu_ptr(&scomp_scratch);
 	spin_lock(&scratch->lock);
 
-	scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0);
+	if (sg_nents(req->src) == 1) {
+		src = kmap_local_page(sg_page(req->src)) + req->src->offset;
+	} else {
+		scatterwalk_map_and_copy(scratch->src, req->src, 0,
+					 req->slen, 0);
+		src = scratch->src;
+	}
+
+	if (req->dst && sg_nents(req->dst) == 1)
+		dst = kmap_local_page(sg_page(req->dst)) + req->dst->offset;
+	else
+		dst = scratch->dst;
+
 	if (dir)
-		ret = crypto_scomp_compress(scomp, scratch->src, req->slen,
-					    scratch->dst, &req->dlen, *ctx);
+		ret = crypto_scomp_compress(scomp, src, req->slen,
+					    dst, &req->dlen, *ctx);
 	else
-		ret = crypto_scomp_decompress(scomp, scratch->src, req->slen,
-					      scratch->dst, &req->dlen, *ctx);
+		ret = crypto_scomp_decompress(scomp, src, req->slen,
+					      dst, &req->dlen, *ctx);
 	if (!ret) {
 		if (!req->dst) {
 			req->dst = sgl_alloc(req->dlen, GFP_ATOMIC, NULL);
@@ -152,10 +165,19 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 			ret = -ENOSPC;
 			goto out;
 		}
-		scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
-					 1);
+		if (dst == scratch->dst) {
+			scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
+						 req->dlen, 1);
+		} else {
+			flush_dcache_page(sg_page(req->dst));
+		}
 	}
 out:
+	if (src != scratch->src)
+		kunmap_local(src);
+	if (dst != scratch->dst)
+		kunmap_local(dst);
+
 	spin_unlock(&scratch->lock);
 	return ret;
 }
-- 
2.34.1



* Re: [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable
  2024-02-16  4:08 ` [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable Barry Song
@ 2024-02-16  8:30   ` Yosry Ahmed
  2024-02-16 10:10     ` Barry Song
  2024-02-16 12:38   ` Chengming Zhou
  1 sibling, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2024-02-16  8:30 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, davem, hannes, herbert, linux-crypto, linux-mm, nphamcs,
	zhouchengming, chriscli, chrisl, ddstreet, linux-kernel,
	sjenning, vitaly.wool, Barry Song

On Fri, Feb 16, 2024 at 05:08:14PM +1300, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Most compressors are actually CPU-based and won't sleep during
> compression and decompression. We should remove the redundant
> memcpy for them.
> 
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> Reviewed-by: Nhat Pham <nphamcs@gmail.com>
> ---
>  mm/zswap.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 350dd2fc8159..6319d2281020 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
>  	struct crypto_wait wait;
>  	u8 *buffer;
>  	struct mutex mutex;
> +	bool is_sleepable;
>  };
>  
>  /*
> @@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
>  		goto acomp_fail;
>  	}
>  	acomp_ctx->acomp = acomp;
> +	acomp_ctx->is_sleepable = acomp_is_sleepable(acomp);

Just one question here. In patch 1, sleepable seems to mean "not async".
IIUC, even a synchronous algorithm may sleep (e.g. if there is a
cond_resched or waiting for a mutex). Does sleepable in acomp terms mean
the same as "atomic" in scheduling/preemption terms?

Also, was this tested with debug options to catch any possible sleeps in
atomic context?

If the answer to both questions is yes, the change otherwise LGTM. Feel
free to add:
Acked-by: Yosry Ahmed <yosryahmed@google.com>

Thanks!

>  
>  	req = acomp_request_alloc(acomp_ctx->acomp);
>  	if (!req) {
> @@ -1368,7 +1370,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
>  	mutex_lock(&acomp_ctx->mutex);
>  
>  	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> -	if (!zpool_can_sleep_mapped(zpool)) {
> +	if (acomp_ctx->is_sleepable && !zpool_can_sleep_mapped(zpool)) {
>  		memcpy(acomp_ctx->buffer, src, entry->length);
>  		src = acomp_ctx->buffer;
>  		zpool_unmap_handle(zpool, entry->handle);
> @@ -1382,7 +1384,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
>  	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
>  	mutex_unlock(&acomp_ctx->mutex);
>  
> -	if (zpool_can_sleep_mapped(zpool))
> +	if (!acomp_ctx->is_sleepable || zpool_can_sleep_mapped(zpool))
>  		zpool_unmap_handle(zpool, entry->handle);
>  }
>  
> -- 
> 2.34.1
> 


* Re: [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable
  2024-02-16  8:30   ` Yosry Ahmed
@ 2024-02-16 10:10     ` Barry Song
  2024-02-16 19:36       ` Yosry Ahmed
  0 siblings, 1 reply; 9+ messages in thread
From: Barry Song @ 2024-02-16 10:10 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: akpm, davem, hannes, herbert, linux-crypto, linux-mm, nphamcs,
	zhouchengming, chriscli, chrisl, ddstreet, linux-kernel,
	sjenning, vitaly.wool, Barry Song

On Fri, Feb 16, 2024 at 9:30 PM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> On Fri, Feb 16, 2024 at 05:08:14PM +1300, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > Most compressors are actually CPU-based and won't sleep during
> > compression and decompression. We should remove the redundant
> > memcpy for them.
> >
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> > Reviewed-by: Nhat Pham <nphamcs@gmail.com>
> > ---
> >  mm/zswap.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index 350dd2fc8159..6319d2281020 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
> >       struct crypto_wait wait;
> >       u8 *buffer;
> >       struct mutex mutex;
> > +     bool is_sleepable;
> >  };
> >
> >  /*
> > @@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
> >               goto acomp_fail;
> >       }
> >       acomp_ctx->acomp = acomp;
> > +     acomp_ctx->is_sleepable = acomp_is_sleepable(acomp);
>
> Just one question here. In patch 1, sleepable seems to mean "not async".
> IIUC, even a synchronous algorithm may sleep (e.g. if there is a
> cond_resched or waiting for a mutex). Does sleepable in acomp terms mean
> the same as "atomic" in scheduling/preemption terms?

I think the answer is yes. Async and sleepable are slightly different
semantically in general, but for comp cases they are equal.

We have two backends for compression/decompression: scomp and acomp. If
comp is using the scomp backend, we can safely assume it is not
sleepable, based on at least the three facts below.

1. In zRAM, we use scomp APIs only, crypto_comp_decompress() and
crypto_comp_compress(), which are definitely scomp, and we have never
considered sleeping problems in the zram driver:
static int zram_read_from_zspool(struct zram *zram, struct page *page,
                                 u32 index)
{
        struct zcomp_strm *zstrm;
        unsigned long handle;
        unsigned int size;
        void *src, *dst;
        u32 prio;
        int ret;

        handle = zram_get_handle(zram, index);
        ...
        src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
        if (size == PAGE_SIZE) {
                dst = kmap_local_page(page);
                memcpy(dst, src, PAGE_SIZE);
                kunmap_local(dst);
                ret = 0;
        } else {
                dst = kmap_local_page(page);
                ret = zcomp_decompress(zstrm, src, size, dst);
                kunmap_local(dst);
                zcomp_stream_put(zram->comps[prio]);
        }
        zs_unmap_object(zram->mem_pool, handle);
        return ret;
}

2. zswap used to support only scomp, before we moved to the
crypto_acomp_compress() and crypto_acomp_decompress() APIs, whose
backends can be either scomp or acomp, so that new hardware-based
compression drivers can be used in zswap.

Before we moved to these new APIs in commit 1ec3b5fe6eec782 ("mm/zswap:
move to use crypto_acomp API for hardware acceleration"), zswap had
never considered sleeping problems, just like zRAM.

3. There is no sleeping in drivers using scomp backend.

$ git grep crypto_register_scomp
crypto/842.c:   ret = crypto_register_scomp(&scomp);
crypto/deflate.c:       ret = crypto_register_scomp(&scomp);
crypto/lz4.c:   ret = crypto_register_scomp(&scomp);
crypto/lz4hc.c: ret = crypto_register_scomp(&scomp);
crypto/lzo-rle.c:       ret = crypto_register_scomp(&scomp);
crypto/lzo.c:   ret = crypto_register_scomp(&scomp);
crypto/zstd.c:  ret = crypto_register_scomp(&scomp);
drivers/crypto/cavium/zip/zip_main.c:   ret = crypto_register_scomp(&zip_scomp_deflate);
drivers/crypto/cavium/zip/zip_main.c:   ret = crypto_register_scomp(&zip_scomp_lzs);

which are the most common cases.

>
> Also, was this tested with debug options to catch any possible sleeps in
> atomic context?

Yes, I have enabled CONFIG_DEBUG_ATOMIC_SLEEP=y.
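
For readers unfamiliar with that option: with it enabled, might_sleep()
annotations warn when they are reached in atomic context. A toy
userspace model of that bookkeeping is sketched below. All model_*
names are hypothetical and the real kernel implementation differs; the
sketch only shows the principle of tracking atomic sections and
flagging sleep points inside them.

```c
#include <assert.h>

static int model_atomic_depth;	/* > 0 while "in atomic context" */
static int model_violations;	/* sleep points hit while atomic */

void model_atomic_enter(void) { model_atomic_depth++; }
void model_atomic_exit(void)  { model_atomic_depth--; }

/* Called at any point that could sleep; records a violation if atomic. */
void model_might_sleep(void)
{
	if (model_atomic_depth > 0)
		model_violations++;
}

/* Runs three sleep points, one of them inside an atomic section. */
int model_debug_demo(void)
{
	model_atomic_depth = 0;
	model_violations = 0;

	model_might_sleep();		/* not atomic: fine */
	model_atomic_enter();		/* e.g. a spinlock is taken */
	model_might_sleep();		/* atomic: violation recorded */
	model_atomic_exit();
	model_might_sleep();		/* not atomic again: fine */

	return model_violations;
}
```

A clean run with the debug option enabled therefore suggests, but does
not prove, that the tested paths do not sleep while atomic.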

>
> If the answer to both questions is yes, the change otherwise LGTM. Feel
> free to add:
> Acked-by: Yosry Ahmed <yosryahmed@google.com>

Thanks!

>
> Thanks!
>
> >
> >       req = acomp_request_alloc(acomp_ctx->acomp);
> >       if (!req) {
> > @@ -1368,7 +1370,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
> >       mutex_lock(&acomp_ctx->mutex);
> >
> >       src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> > -     if (!zpool_can_sleep_mapped(zpool)) {
> > +     if (acomp_ctx->is_sleepable && !zpool_can_sleep_mapped(zpool)) {
> >               memcpy(acomp_ctx->buffer, src, entry->length);
> >               src = acomp_ctx->buffer;
> >               zpool_unmap_handle(zpool, entry->handle);
> > @@ -1382,7 +1384,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
> >       BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
> >       mutex_unlock(&acomp_ctx->mutex);
> >
> > -     if (zpool_can_sleep_mapped(zpool))
> > +     if (!acomp_ctx->is_sleepable || zpool_can_sleep_mapped(zpool))
> >               zpool_unmap_handle(zpool, entry->handle);
> >  }
> >
> > --
> > 2.34.1
> >

Thanks
Barry


* Re: [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable
  2024-02-16  4:08 ` [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable Barry Song
  2024-02-16  8:30   ` Yosry Ahmed
@ 2024-02-16 12:38   ` Chengming Zhou
  1 sibling, 0 replies; 9+ messages in thread
From: Chengming Zhou @ 2024-02-16 12:38 UTC (permalink / raw)
  To: Barry Song, akpm, davem, hannes, herbert, linux-crypto, linux-mm,
	nphamcs, yosryahmed
  Cc: chriscli, chrisl, ddstreet, linux-kernel, sjenning, vitaly.wool,
	Barry Song

On 2024/2/16 12:08, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Most compressors are actually CPU-based and won't sleep during
> compression and decompression. We should remove the redundant
> memcpy for them.
> 
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> Reviewed-by: Nhat Pham <nphamcs@gmail.com>

LGTM, thanks!

Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>

> ---
>  mm/zswap.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 350dd2fc8159..6319d2281020 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
>  	struct crypto_wait wait;
>  	u8 *buffer;
>  	struct mutex mutex;
> +	bool is_sleepable;
>  };
>  
>  /*
> @@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
>  		goto acomp_fail;
>  	}
>  	acomp_ctx->acomp = acomp;
> +	acomp_ctx->is_sleepable = acomp_is_sleepable(acomp);
>  
>  	req = acomp_request_alloc(acomp_ctx->acomp);
>  	if (!req) {
> @@ -1368,7 +1370,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
>  	mutex_lock(&acomp_ctx->mutex);
>  
>  	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> -	if (!zpool_can_sleep_mapped(zpool)) {
> +	if (acomp_ctx->is_sleepable && !zpool_can_sleep_mapped(zpool)) {
>  		memcpy(acomp_ctx->buffer, src, entry->length);
>  		src = acomp_ctx->buffer;
>  		zpool_unmap_handle(zpool, entry->handle);
> @@ -1382,7 +1384,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
>  	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
>  	mutex_unlock(&acomp_ctx->mutex);
>  
> -	if (zpool_can_sleep_mapped(zpool))
> +	if (!acomp_ctx->is_sleepable || zpool_can_sleep_mapped(zpool))
>  		zpool_unmap_handle(zpool, entry->handle);
>  }
>  


* Re: [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable
  2024-02-16 10:10     ` Barry Song
@ 2024-02-16 19:36       ` Yosry Ahmed
  2024-02-17  4:38         ` Barry Song
  0 siblings, 1 reply; 9+ messages in thread
From: Yosry Ahmed @ 2024-02-16 19:36 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, davem, hannes, herbert, linux-crypto, linux-mm, nphamcs,
	zhouchengming, chriscli, chrisl, ddstreet, linux-kernel,
	sjenning, vitaly.wool, Barry Song

On Fri, Feb 16, 2024 at 11:10:04PM +1300, Barry Song wrote:
> On Fri, Feb 16, 2024 at 9:30 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> > On Fri, Feb 16, 2024 at 05:08:14PM +1300, Barry Song wrote:
> > > From: Barry Song <v-songbaohua@oppo.com>
> > >
> > > Most compressors are actually CPU-based and won't sleep during
> > > compression and decompression. We should remove the redundant
> > > memcpy for them.
> > >
> > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > > Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> > > Reviewed-by: Nhat Pham <nphamcs@gmail.com>
> > > ---
> > >  mm/zswap.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/zswap.c b/mm/zswap.c
> > > index 350dd2fc8159..6319d2281020 100644
> > > --- a/mm/zswap.c
> > > +++ b/mm/zswap.c
> > > @@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
> > >       struct crypto_wait wait;
> > >       u8 *buffer;
> > >       struct mutex mutex;
> > > +     bool is_sleepable;
> > >  };
> > >
> > >  /*
> > > @@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
> > >               goto acomp_fail;
> > >       }
> > >       acomp_ctx->acomp = acomp;
> > > +     acomp_ctx->is_sleepable = acomp_is_sleepable(acomp);
> >
> > Just one question here. In patch 1, sleepable seems to mean "not async".
> > IIUC, even a synchronous algorithm may sleep (e.g. if there is a
> > cond_resched or waiting for a mutex). Does sleepable in acomp terms mean
> > the same as "atomic" in scheduling/preemption terms?
> 
> I think the answer is yes though async and sleepable are slightly
> different semantically
> generally speaking. but for comp cases, they are equal.
> 
> We have two backends for compression/ decompression - scomp and acomp. if comp
> is using scomp backend, we can safely think they are not sleepable at
> least from the
> below three facts.
> 
> 1. in zRAM, we are using scomp APIs only - crypto_comp_decompress()/
> crypto_comp_compress(),  which are definitely scomp, we have never considered
> sleeping problem in zram drivers:
> static int zram_read_from_zspool(struct zram *zram, struct page *page,
>                                  u32 index)
> {
>         struct zcomp_strm *zstrm;
>         unsigned long handle;
>         unsigned int size;
>         void *src, *dst;
>         u32 prio;
>         int ret;
> 
>         handle = zram_get_handle(zram, index);
>         ...
>         src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
>         if (size == PAGE_SIZE) {
>                 dst = kmap_local_page(page);
>                 memcpy(dst, src, PAGE_SIZE);
>                 kunmap_local(dst);
>                 ret = 0;
>         } else {
>                 dst = kmap_local_page(page);
>                 ret = zcomp_decompress(zstrm, src, size, dst);
>                 kunmap_local(dst);
>                 zcomp_stream_put(zram->comps[prio]);
>         }
>         zs_unmap_object(zram->mem_pool, handle);
>         return ret;
> }
> 
> 2. zswap used to only support scomp before we moved to use
> crypto_acomp_compress()
> and crypto_acomp_decompress() APIs whose backends can be either scomp
> or acomp, thus new hardware-based compression drivers can be used in zswap.
> 
> But before we moved to these new APIs in commit  1ec3b5fe6eec782 ("mm/zswap:
> move to use crypto_acomp API for hardware acceleration") , zswap had
> never considered
> sleeping problems just like zRAM.
> 
> 3. There is no sleeping in drivers using scomp backend.
> 
> $ git grep crypto_register_scomp
> crypto/842.c:   ret = crypto_register_scomp(&scomp);
> crypto/deflate.c:       ret = crypto_register_scomp(&scomp);
> crypto/lz4.c:   ret = crypto_register_scomp(&scomp);
> crypto/lz4hc.c: ret = crypto_register_scomp(&scomp);
> crypto/lzo-rle.c:       ret = crypto_register_scomp(&scomp);
> crypto/lzo.c:   ret = crypto_register_scomp(&scomp);
> crypto/zstd.c:  ret = crypto_register_scomp(&scomp);
> drivers/crypto/cavium/zip/zip_main.c:   ret =
> crypto_register_scomp(&zip_scomp_deflate);
> drivers/crypto/cavium/zip/zip_main.c:   ret =
> crypto_register_scomp(&zip_scomp_lzs);
> 
> which are the most common cases.

Thanks for explaining. Ideally we should be able to catch any violations
with proper debug options as you mentioned. Please include more info in
the commit message about sleepability, a summarized version of what you
described above.


* Re: [PATCH v2 2/3] mm/zswap: remove the memcpy if acomp is not sleepable
  2024-02-16 19:36       ` Yosry Ahmed
@ 2024-02-17  4:38         ` Barry Song
  0 siblings, 0 replies; 9+ messages in thread
From: Barry Song @ 2024-02-17  4:38 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: akpm, davem, hannes, herbert, linux-crypto, linux-mm, nphamcs,
	zhouchengming, chriscli, chrisl, ddstreet, linux-kernel,
	sjenning, vitaly.wool, Barry Song

On Sat, Feb 17, 2024 at 8:36 AM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> On Fri, Feb 16, 2024 at 11:10:04PM +1300, Barry Song wrote:
> > On Fri, Feb 16, 2024 at 9:30 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> > >
> > > On Fri, Feb 16, 2024 at 05:08:14PM +1300, Barry Song wrote:
> > > > From: Barry Song <v-songbaohua@oppo.com>
> > > >
> > > > Most compressors are actually CPU-based and won't sleep during
> > > > compression and decompression. We should remove the redundant
> > > > memcpy for them.
> > > >
> > > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > > > Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> > > > Reviewed-by: Nhat Pham <nphamcs@gmail.com>
> > > > ---
> > > >  mm/zswap.c | 6 ++++--
> > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/mm/zswap.c b/mm/zswap.c
> > > > index 350dd2fc8159..6319d2281020 100644
> > > > --- a/mm/zswap.c
> > > > +++ b/mm/zswap.c
> > > > @@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
> > > >       struct crypto_wait wait;
> > > >       u8 *buffer;
> > > >       struct mutex mutex;
> > > > +     bool is_sleepable;
> > > >  };
> > > >
> > > >  /*
> > > > @@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
> > > >               goto acomp_fail;
> > > >       }
> > > >       acomp_ctx->acomp = acomp;
> > > > +     acomp_ctx->is_sleepable = acomp_is_sleepable(acomp);
> > >
> > > Just one question here. In patch 1, sleepable seems to mean "not async".
> > > IIUC, even a synchronous algorithm may sleep (e.g. if there is a
> > > cond_resched or waiting for a mutex). Does sleepable in acomp terms mean
> > > the same as "atomic" in scheduling/preemption terms?
> >
> > I think the answer is yes though async and sleepable are slightly
> > different semantically
> > generally speaking. but for comp cases, they are equal.
> >
> > We have two backends for compression/ decompression - scomp and acomp. if comp
> > is using scomp backend, we can safely think they are not sleepable at
> > least from the
> > below three facts.
> >
> > 1. in zRAM, we are using scomp APIs only - crypto_comp_decompress()/
> > crypto_comp_compress(),  which are definitely scomp, we have never considered
> > sleeping problem in zram drivers:
> > static int zram_read_from_zspool(struct zram *zram, struct page *page,
> >                                  u32 index)
> > {
> >         struct zcomp_strm *zstrm;
> >         unsigned long handle;
> >         unsigned int size;
> >         void *src, *dst;
> >         u32 prio;
> >         int ret;
> >
> >         handle = zram_get_handle(zram, index);
> >         ...
> >         src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
> >         if (size == PAGE_SIZE) {
> >                 dst = kmap_local_page(page);
> >                 memcpy(dst, src, PAGE_SIZE);
> >                 kunmap_local(dst);
> >                 ret = 0;
> >         } else {
> >                 dst = kmap_local_page(page);
> >                 ret = zcomp_decompress(zstrm, src, size, dst);
> >                 kunmap_local(dst);
> >                 zcomp_stream_put(zram->comps[prio]);
> >         }
> >         zs_unmap_object(zram->mem_pool, handle);
> >         return ret;
> > }
> >
> > 2. zswap used to only support scomp before we moved to use
> > crypto_acomp_compress()
> > and crypto_acomp_decompress() APIs whose backends can be either scomp
> > or acomp, thus new hardware-based compression drivers can be used in zswap.
> >
> > But before we moved to these new APIs in commit  1ec3b5fe6eec782 ("mm/zswap:
> > move to use crypto_acomp API for hardware acceleration") , zswap had
> > never considered
> > sleeping problems just like zRAM.
> >
> > 3. There is no sleeping in drivers using scomp backend.
> >
> > $ git grep crypto_register_scomp
> > crypto/842.c:   ret = crypto_register_scomp(&scomp);
> > crypto/deflate.c:       ret = crypto_register_scomp(&scomp);
> > crypto/lz4.c:   ret = crypto_register_scomp(&scomp);
> > crypto/lz4hc.c: ret = crypto_register_scomp(&scomp);
> > crypto/lzo-rle.c:       ret = crypto_register_scomp(&scomp);
> > crypto/lzo.c:   ret = crypto_register_scomp(&scomp);
> > crypto/zstd.c:  ret = crypto_register_scomp(&scomp);
> > drivers/crypto/cavium/zip/zip_main.c:   ret =
> > crypto_register_scomp(&zip_scomp_deflate);
> > drivers/crypto/cavium/zip/zip_main.c:   ret =
> > crypto_register_scomp(&zip_scomp_lzs);
> >
> > which are the most common cases.
>
> Thanks for explaining. Ideally we should be able to catch any violations
> with proper debug options as you mentioned. Please include more info in
> the commit message about sleepability, a summarized version of what you
> described above.

OK. I will enhance the commit message of patch 1/3 with the summary.

Thanks
Barry
