linux-crypto.vger.kernel.org archive mirror
* [PATCH 0/3] mm/zswap & crypto/acompress: remove a couple of memcpy
@ 2024-01-03  9:50 Barry Song
  2024-01-03  9:50 ` [PATCH 1/3] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend Barry Song
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Barry Song @ 2024-01-03  9:50 UTC (permalink / raw)
  To: herbert, davem, akpm, ddstreet, sjenning, vitaly.wool, linux-crypto
  Cc: chriscli, chrisl, hannes, linux-kernel, linux-mm, nphamcs,
	yosryahmed, zhouchengming, Barry Song

From: Barry Song <v-songbaohua@oppo.com>

This patchset removes a couple of memcpy() calls in zswap and the
crypto scompress code to improve zswap's performance.

Thanks to Chengming Zhou for the testing and perf data.
Quoting Chengming:
 I just tested these three patches on my server, found improvement in the
 kernel build testcase on a tmpfs with zswap (lz4 + zsmalloc) enabled.
 
         mm-stable 501a06fe8e4c  patched
 real    1m38.028s               1m32.317s
 user    19m11.482s              18m39.439s
 sys     19m26.445s              17m5.646s

The patchset is based on mm-stable.

Barry Song (3):
  crypto: introduce acomp_is_async to expose if a acomp has a scomp
    backend
  mm/zswap: remove the memcpy if acomp is not asynchronous
  crypto: scompress: remove memcpy if sg_nents is 1

 crypto/acompress.c         |  8 ++++++++
 crypto/scompress.c         | 35 ++++++++++++++++++++++++++++-------
 include/crypto/acompress.h |  9 +++++++++
 mm/zswap.c                 |  6 ++++--
 4 files changed, 49 insertions(+), 9 deletions(-)

-- 
2.34.1
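
To put the timings quoted in the cover letter in relative terms, here is a
minimal sketch that converts the `time` readings to seconds and computes the
percentage saved. The helper names (to_seconds, percent_saved) are made up
for this illustration and are not part of the series.

```c
#include <assert.h>

/* Convert a `time`-style "XmY.ZZZs" reading to seconds. */
static double to_seconds(int minutes, double seconds)
{
	assert(minutes >= 0 && seconds >= 0.0);
	return minutes * 60.0 + seconds;
}

/* Percentage by which `after` is lower than `before`. */
static double percent_saved(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Figures quoted in the cover letter: mm-stable vs. patched. */
static double real_saved(void)
{
	return percent_saved(to_seconds(1, 38.028), to_seconds(1, 32.317));
}

static double user_saved(void)
{
	return percent_saved(to_seconds(19, 11.482), to_seconds(18, 39.439));
}

static double sys_saved(void)
{
	return percent_saved(to_seconds(19, 26.445), to_seconds(17, 5.646));
}
```

With the quoted figures this works out to roughly 5.8% less wall-clock time,
2.8% less user time and 12.1% less system time for the kernel build.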



* [PATCH 1/3] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend
  2024-01-03  9:50 [PATCH 0/3] mm/zswap & crypto/acompress: remove a couple of memcpy Barry Song
@ 2024-01-03  9:50 ` Barry Song
  2024-01-08 22:35   ` Yosry Ahmed
  2024-01-03  9:50 ` [PATCH 2/3] mm/zswap: remove the memcpy if acomp is not asynchronous Barry Song
  2024-01-03  9:50 ` [PATCH 3/3] crypto: scompress: remove memcpy if sg_nents is 1 Barry Song
  2 siblings, 1 reply; 11+ messages in thread
From: Barry Song @ 2024-01-03  9:50 UTC (permalink / raw)
  To: herbert, davem, akpm, ddstreet, sjenning, vitaly.wool, linux-crypto
  Cc: chriscli, chrisl, hannes, linux-kernel, linux-mm, nphamcs,
	yosryahmed, zhouchengming, Barry Song

From: Barry Song <v-songbaohua@oppo.com>

Almost all CPU-based compressors/decompressors are actually synchronous
even though they support the acomp API. Some hardware, on the other
hand, has accelerators that offload the CPU's work, such as hisilicon
and intel/qat/, and the drivers for those work in async mode.
Letting acomp users know whether an acomp is really async helps them
know whether the compression and decompression procedure can sleep.

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 crypto/acompress.c         | 8 ++++++++
 include/crypto/acompress.h | 9 +++++++++
 2 files changed, 17 insertions(+)

diff --git a/crypto/acompress.c b/crypto/acompress.c
index 1c682810a484..99118e879a4a 100644
--- a/crypto/acompress.c
+++ b/crypto/acompress.c
@@ -152,6 +152,14 @@ struct crypto_acomp *crypto_alloc_acomp_node(const char *alg_name, u32 type,
 }
 EXPORT_SYMBOL_GPL(crypto_alloc_acomp_node);
 
+bool acomp_is_async(struct crypto_acomp *acomp)
+{
+	struct crypto_tfm *tfm = crypto_acomp_tfm(acomp);
+
+	return tfm->__crt_alg->cra_type == &crypto_acomp_type;
+}
+EXPORT_SYMBOL_GPL(acomp_is_async);
+
 struct acomp_req *acomp_request_alloc(struct crypto_acomp *acomp)
 {
 	struct crypto_tfm *tfm = crypto_acomp_tfm(acomp);
diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
index 574cffc90730..d91830c2d442 100644
--- a/include/crypto/acompress.h
+++ b/include/crypto/acompress.h
@@ -204,6 +204,15 @@ struct acomp_req *acomp_request_alloc(struct crypto_acomp *tfm);
  */
 void acomp_request_free(struct acomp_req *req);
 
+/**
+ * acomp_is_async() -- check if an acomp is asynchronous (can sleep)
+ *
+ * @tfm:	ACOMPRESS tfm handle allocated with crypto_alloc_acomp()
+ *
+ * Return:	true if the acomp is asynchronous, otherwise, false
+ */
+bool acomp_is_async(struct crypto_acomp *tfm);
+
 /**
  * acomp_request_set_callback() -- Sets an asynchronous callback
  *
-- 
2.34.1
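
The check in patch 1 compares the algorithm's type descriptor against the
native acomp type. A minimal userspace sketch of that idea follows; all toy_*
names are hypothetical stand-ins and are not the kernel's structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a crypto type descriptor. */
struct toy_type {
	const char *name;
};

static const struct toy_type toy_acomp_type = { "acomp" };	/* native acomp driver */
static const struct toy_type toy_scomp_type = { "scomp" };	/* synchronous backend */

/* Hypothetical stand-in for an algorithm, tagged with its type. */
struct toy_alg {
	const struct toy_type *type;
};

/*
 * Same idea as the patch: an algorithm registered with the native acomp
 * type is treated as "really async"; one wrapped around an scomp backend
 * is not.
 */
static bool toy_is_async(const struct toy_alg *alg)
{
	assert(alg != NULL && alg->type != NULL);
	return alg->type == &toy_acomp_type;
}
```

The comparison is by descriptor address, not by name, so it is a single
pointer compare on the caller's side.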



* [PATCH 2/3] mm/zswap: remove the memcpy if acomp is not asynchronous
  2024-01-03  9:50 [PATCH 0/3] mm/zswap & crypto/acompress: remove a couple of memcpy Barry Song
  2024-01-03  9:50 ` [PATCH 1/3] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend Barry Song
@ 2024-01-03  9:50 ` Barry Song
  2024-01-04  0:38   ` Nhat Pham
  2024-01-08 22:36   ` Yosry Ahmed
  2024-01-03  9:50 ` [PATCH 3/3] crypto: scompress: remove memcpy if sg_nents is 1 Barry Song
  2 siblings, 2 replies; 11+ messages in thread
From: Barry Song @ 2024-01-03  9:50 UTC (permalink / raw)
  To: herbert, davem, akpm, ddstreet, sjenning, vitaly.wool, linux-crypto
  Cc: chriscli, chrisl, hannes, linux-kernel, linux-mm, nphamcs,
	yosryahmed, zhouchengming, Barry Song

From: Barry Song <v-songbaohua@oppo.com>

Most compressors are actually CPU-based and won't sleep during
compression and decompression. We should remove the redundant
memcpy for them.

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 mm/zswap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index ca25b676048e..36898614ebcc 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
 	struct crypto_wait wait;
 	u8 *buffer;
 	struct mutex mutex;
+	bool is_async; /* if acomp can sleep */
 };
 
 /*
@@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 		goto acomp_fail;
 	}
 	acomp_ctx->acomp = acomp;
+	acomp_ctx->is_async = acomp_is_async(acomp);
 
 	req = acomp_request_alloc(acomp_ctx->acomp);
 	if (!req) {
@@ -1370,7 +1372,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
 	mutex_lock(&acomp_ctx->mutex);
 
 	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
-	if (!zpool_can_sleep_mapped(zpool)) {
+	if (acomp_ctx->is_async && !zpool_can_sleep_mapped(zpool)) {
 		memcpy(acomp_ctx->buffer, src, entry->length);
 		src = acomp_ctx->buffer;
 		zpool_unmap_handle(zpool, entry->handle);
@@ -1384,7 +1386,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
 	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
 	mutex_unlock(&acomp_ctx->mutex);
 
-	if (zpool_can_sleep_mapped(zpool))
+	if (!acomp_ctx->is_async || zpool_can_sleep_mapped(zpool))
 		zpool_unmap_handle(zpool, entry->handle);
 }
 
-- 
2.34.1
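
The two conditions changed in __zswap_load() in patch 2 have to stay
complementary: the handle is unmapped either right after the bounce copy or
after decompression, never both and never neither. A small sketch that checks
this for every combination; the function names are invented for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the compressed data must first be copied to a bounce buffer. */
static bool needs_bounce_copy(bool is_async, bool can_sleep_mapped)
{
	return is_async && !can_sleep_mapped;
}

/* True when the handle is still mapped after decompression. */
static bool unmap_after_decompress(bool is_async, bool can_sleep_mapped)
{
	return !is_async || can_sleep_mapped;
}

/* Exactly one of the two unmap sites should fire for each combination. */
static bool conditions_are_complementary(void)
{
	for (int a = 0; a <= 1; a++)
		for (int z = 0; z <= 1; z++)
			if (needs_bounce_copy(a, z) == unmap_after_decompress(a, z))
				return false;
	return true;
}
```

By De Morgan's law the second condition is the negation of the first, which
is why the patch touches both sites together.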



* [PATCH 3/3] crypto: scompress: remove memcpy if sg_nents is 1
  2024-01-03  9:50 [PATCH 0/3] mm/zswap & crypto/acompress: remove a couple of memcpy Barry Song
  2024-01-03  9:50 ` [PATCH 1/3] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend Barry Song
  2024-01-03  9:50 ` [PATCH 2/3] mm/zswap: remove the memcpy if acomp is not asynchronous Barry Song
@ 2024-01-03  9:50 ` Barry Song
  2024-01-25  9:58   ` Herbert Xu
  2 siblings, 1 reply; 11+ messages in thread
From: Barry Song @ 2024-01-03  9:50 UTC (permalink / raw)
  To: herbert, davem, akpm, ddstreet, sjenning, vitaly.wool, linux-crypto
  Cc: chriscli, chrisl, hannes, linux-kernel, linux-mm, nphamcs,
	yosryahmed, zhouchengming, Barry Song

From: Barry Song <v-songbaohua@oppo.com>

When sg_nents is 1, which is always true in the current kernel since
that is what the only user, zswap, does, we can skip two big memcpy
calls.

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 crypto/scompress.c | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/crypto/scompress.c b/crypto/scompress.c
index 442a82c9de7d..d1bb40ef83a2 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -117,6 +117,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	struct crypto_scomp *scomp = *tfm_ctx;
 	void **ctx = acomp_request_ctx(req);
 	struct scomp_scratch *scratch;
+	void *src, *dst;
 	int ret;
 
 	if (!req->src || !req->slen || req->slen > SCOMP_SCRATCH_SIZE)
@@ -131,13 +132,26 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	scratch = raw_cpu_ptr(&scomp_scratch);
 	spin_lock(&scratch->lock);
 
-	scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0);
+	if (sg_nents(req->src) == 1) {
+		src = kmap_local_page(sg_page(req->src)) + req->src->offset;
+	} else {
+		scatterwalk_map_and_copy(scratch->src, req->src, 0,
+					 req->slen, 0);
+		src = scratch->src;
+	}
+
+	if (req->dst && sg_nents(req->dst) == 1) {
+		dst = kmap_local_page(sg_page(req->dst)) + req->dst->offset;
+	} else {
+		dst = scratch->dst;
+	}
+
 	if (dir)
-		ret = crypto_scomp_compress(scomp, scratch->src, req->slen,
-					    scratch->dst, &req->dlen, *ctx);
+		ret = crypto_scomp_compress(scomp, src, req->slen,
+					    dst, &req->dlen, *ctx);
 	else
-		ret = crypto_scomp_decompress(scomp, scratch->src, req->slen,
-					      scratch->dst, &req->dlen, *ctx);
+		ret = crypto_scomp_decompress(scomp, src, req->slen,
+					      dst, &req->dlen, *ctx);
 	if (!ret) {
 		if (!req->dst) {
 			req->dst = sgl_alloc(req->dlen, GFP_ATOMIC, NULL);
@@ -146,10 +160,17 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 				goto out;
 			}
 		}
-		scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
-					 1);
+		if (dst == scratch->dst) {
+			scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
+						 req->dlen, 1);
+		}
 	}
 out:
+	if (src != scratch->src)
+		kunmap_local(src);
+	if (dst != scratch->dst)
+		kunmap_local(dst);
+
 	spin_unlock(&scratch->lock);
 	return ret;
 }
-- 
2.34.1
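
The fast path in patch 3 can be modelled in userspace: with a single segment
the data is used in place, otherwise the segments are gathered into a scratch
buffer. This is only a sketch under that assumption; toy_seg, toy_linearize
and TOY_SCRATCH_SIZE are hypothetical and do not model page mapping or
locking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SCRATCH_SIZE 64

/* Hypothetical stand-in for one scatterlist entry. */
struct toy_seg {
	const unsigned char *data;
	size_t len;
};

/*
 * Return a pointer to contiguous input. With one segment the data is used
 * in place and nothing is copied; otherwise the segments are gathered into
 * `scratch`. Returns NULL if the gathered input does not fit.
 */
static const unsigned char *toy_linearize(const struct toy_seg *segs,
					  size_t nsegs,
					  unsigned char *scratch,
					  size_t *copied)
{
	size_t off = 0;

	assert(segs != NULL && nsegs > 0 && scratch != NULL && copied != NULL);
	*copied = 0;
	if (nsegs == 1)
		return segs[0].data;	/* fast path: no memcpy */

	for (size_t i = 0; i < nsegs; i++) {
		if (segs[i].len > TOY_SCRATCH_SIZE - off)
			return NULL;
		memcpy(scratch + off, segs[i].data, segs[i].len);
		off += segs[i].len;
	}
	*copied = off;
	return scratch;
}
```

The caller can tell which path was taken by comparing the returned pointer
with the scratch buffer, which is the same test the patch uses to decide
whether to unmap.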



* Re: [PATCH 2/3] mm/zswap: remove the memcpy if acomp is not asynchronous
  2024-01-03  9:50 ` [PATCH 2/3] mm/zswap: remove the memcpy if acomp is not asynchronous Barry Song
@ 2024-01-04  0:38   ` Nhat Pham
  2024-02-16  3:55     ` Barry Song
  2024-01-08 22:36   ` Yosry Ahmed
  1 sibling, 1 reply; 11+ messages in thread
From: Nhat Pham @ 2024-01-04  0:38 UTC (permalink / raw)
  To: Barry Song
  Cc: herbert, davem, akpm, ddstreet, sjenning, vitaly.wool,
	linux-crypto, chriscli, chrisl, hannes, linux-kernel, linux-mm,
	yosryahmed, zhouchengming, Barry Song

On Wed, Jan 3, 2024 at 1:50 AM Barry Song <21cnbao@gmail.com> wrote:
>
> From: Barry Song <v-songbaohua@oppo.com>
>
> Most compressors are actually CPU-based and won't sleep during
> compression and decompression. We should remove the redundant
> memcpy for them.
>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Tested-by: Chengming Zhou <zhouchengming@bytedance.com>

nit: it might help to include the test numbers in the changelog in
this patch here too. Save a couple of clicks to dig out the original
patch cover for the numbers :)

> ---
>  mm/zswap.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index ca25b676048e..36898614ebcc 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
>         struct crypto_wait wait;
>         u8 *buffer;
>         struct mutex mutex;
> +       bool is_async; /* if acomp can sleep */

nit: seems like this comment isn't necessary. is_async is pretty
self-explanatory to me. But definitely not a show stopper tho :)

>  };
>
>  /*
> @@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
>                 goto acomp_fail;
>         }
>         acomp_ctx->acomp = acomp;
> +       acomp_ctx->is_async = acomp_is_async(acomp);
>
>         req = acomp_request_alloc(acomp_ctx->acomp);
>         if (!req) {
> @@ -1370,7 +1372,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
>         mutex_lock(&acomp_ctx->mutex);
>
>         src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> -       if (!zpool_can_sleep_mapped(zpool)) {
> +       if (acomp_ctx->is_async && !zpool_can_sleep_mapped(zpool)) {
>                 memcpy(acomp_ctx->buffer, src, entry->length);
>                 src = acomp_ctx->buffer;
>                 zpool_unmap_handle(zpool, entry->handle);
> @@ -1384,7 +1386,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
>         BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
>         mutex_unlock(&acomp_ctx->mutex);
>
> -       if (zpool_can_sleep_mapped(zpool))
> +       if (!acomp_ctx->is_async || zpool_can_sleep_mapped(zpool))
>                 zpool_unmap_handle(zpool, entry->handle);
>  }
>
> --
> 2.34.1
>

The zswap side looks good to me. I don't have expertise/authority to
ack the crypto API change (but FWIW it LGTM too based on a cursory
code read).
Reviewed-by: Nhat Pham <nphamcs@gmail.com>


* Re: [PATCH 1/3] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend
  2024-01-03  9:50 ` [PATCH 1/3] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend Barry Song
@ 2024-01-08 22:35   ` Yosry Ahmed
  2024-01-09  3:38     ` Barry Song
  0 siblings, 1 reply; 11+ messages in thread
From: Yosry Ahmed @ 2024-01-08 22:35 UTC (permalink / raw)
  To: Barry Song
  Cc: herbert, davem, akpm, ddstreet, sjenning, vitaly.wool,
	linux-crypto, chriscli, chrisl, hannes, linux-kernel, linux-mm,
	nphamcs, zhouchengming, Barry Song

On Wed, Jan 3, 2024 at 1:50 AM Barry Song <21cnbao@gmail.com> wrote:
>
> From: Barry Song <v-songbaohua@oppo.com>
>
> Almost all CPU-based compressors/decompressors are actually synchronous
> though they support acomp APIs. While some hardware has hardware-based
> accelerators to offload CPU's work such as hisilicon and intel/qat/,
> their drivers are working in async mode.
> Letting acomp's users know exactly if the acomp is really async will
> help users know if the compression and decompression procedure can
> sleep.
>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> ---
>  crypto/acompress.c         | 8 ++++++++
>  include/crypto/acompress.h | 9 +++++++++
>  2 files changed, 17 insertions(+)
>
> diff --git a/crypto/acompress.c b/crypto/acompress.c
> index 1c682810a484..99118e879a4a 100644
> --- a/crypto/acompress.c
> +++ b/crypto/acompress.c
> @@ -152,6 +152,14 @@ struct crypto_acomp *crypto_alloc_acomp_node(const char *alg_name, u32 type,
>  }
>  EXPORT_SYMBOL_GPL(crypto_alloc_acomp_node);
>
> +bool acomp_is_async(struct crypto_acomp *acomp)

Is synchronous semantically the same as sleepable? IIUC synchronous
code may still sleep, at least generally. The purpose of this change
is to know whether we will sleep or not in the zswap code, so I
suggest the code should be explicit about sleep-ability instead (e.g.
acomp_is_sleepable or acomp_may_sleep).


* Re: [PATCH 2/3] mm/zswap: remove the memcpy if acomp is not asynchronous
  2024-01-03  9:50 ` [PATCH 2/3] mm/zswap: remove the memcpy if acomp is not asynchronous Barry Song
  2024-01-04  0:38   ` Nhat Pham
@ 2024-01-08 22:36   ` Yosry Ahmed
  1 sibling, 0 replies; 11+ messages in thread
From: Yosry Ahmed @ 2024-01-08 22:36 UTC (permalink / raw)
  To: Barry Song
  Cc: herbert, davem, akpm, ddstreet, sjenning, vitaly.wool,
	linux-crypto, chriscli, chrisl, hannes, linux-kernel, linux-mm,
	nphamcs, zhouchengming, Barry Song

On Wed, Jan 3, 2024 at 1:50 AM Barry Song <21cnbao@gmail.com> wrote:
>
> From: Barry Song <v-songbaohua@oppo.com>
>
> Most compressors are actually CPU-based and won't sleep during
> compression and decompression. We should remove the redundant
> memcpy for them.
>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> ---
>  mm/zswap.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index ca25b676048e..36898614ebcc 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
>         struct crypto_wait wait;
>         u8 *buffer;
>         struct mutex mutex;
> +       bool is_async; /* if acomp can sleep */

As pointed out in patch 1, I think we should name this explicitly to
be about sleep-ability (e.g. sleepable or can_sleep).


* Re: [PATCH 1/3] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend
  2024-01-08 22:35   ` Yosry Ahmed
@ 2024-01-09  3:38     ` Barry Song
  0 siblings, 0 replies; 11+ messages in thread
From: Barry Song @ 2024-01-09  3:38 UTC (permalink / raw)
  To: Yosry Ahmed, herbert
  Cc: davem, akpm, ddstreet, sjenning, vitaly.wool, linux-crypto,
	chriscli, chrisl, hannes, linux-kernel, linux-mm, nphamcs,
	zhouchengming, Barry Song

On Tue, Jan 9, 2024 at 6:36 AM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> On Wed, Jan 3, 2024 at 1:50 AM Barry Song <21cnbao@gmail.com> wrote:
> >
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > Almost all CPU-based compressors/decompressors are actually synchronous
> > though they support acomp APIs. While some hardware has hardware-based
> > accelerators to offload CPU's work such as hisilicon and intel/qat/,
> > their drivers are working in async mode.
> > Letting acomp's users know exactly if the acomp is really async will
> > help users know if the compression and decompression procedure can
> > sleep.
> >
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> > ---
> >  crypto/acompress.c         | 8 ++++++++
> >  include/crypto/acompress.h | 9 +++++++++
> >  2 files changed, 17 insertions(+)
> >
> > diff --git a/crypto/acompress.c b/crypto/acompress.c
> > index 1c682810a484..99118e879a4a 100644
> > --- a/crypto/acompress.c
> > +++ b/crypto/acompress.c
> > @@ -152,6 +152,14 @@ struct crypto_acomp *crypto_alloc_acomp_node(const char *alg_name, u32 type,
> >  }
> >  EXPORT_SYMBOL_GPL(crypto_alloc_acomp_node);
> >
> > +bool acomp_is_async(struct crypto_acomp *acomp)
>
> Is synchronous semantically the same as sleepable? IIUC synchronous
> code may still sleep, at least generally. The purpose of this change
> is to know whether we will sleep or not in the zswap code, so I
> suggest the code should be explicit about sleep-ability instead (e.g.
> acomp_is_sleepable or acomp_may_sleep).

Thanks, Yosry. Sounds reasonable.

I'd like to ask for Herbert's comment: do we have a better way to know
whether an acomp can sleep than the check below?

return tfm->__crt_alg->cra_type == &crypto_acomp_type;

Thanks
Barry


* Re: [PATCH 3/3] crypto: scompress: remove memcpy if sg_nents is 1
  2024-01-03  9:50 ` [PATCH 3/3] crypto: scompress: remove memcpy if sg_nents is 1 Barry Song
@ 2024-01-25  9:58   ` Herbert Xu
  2024-02-16  3:49     ` Barry Song
  0 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2024-01-25  9:58 UTC (permalink / raw)
  To: Barry Song
  Cc: davem, akpm, ddstreet, sjenning, vitaly.wool, linux-crypto,
	chriscli, chrisl, hannes, linux-kernel, linux-mm, nphamcs,
	yosryahmed, zhouchengming, Barry Song

On Wed, Jan 03, 2024 at 10:50:06PM +1300, Barry Song wrote:
>
> +	if (dst != scratch->dst)
> +		kunmap_local(dst);

This is missing a flush_dcache_page.

It may not matter for zswap, but this is API code and needs to
work for every single case.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH 3/3] crypto: scompress: remove memcpy if sg_nents is 1
  2024-01-25  9:58   ` Herbert Xu
@ 2024-02-16  3:49     ` Barry Song
  0 siblings, 0 replies; 11+ messages in thread
From: Barry Song @ 2024-02-16  3:49 UTC (permalink / raw)
  To: Herbert Xu
  Cc: davem, akpm, ddstreet, sjenning, vitaly.wool, linux-crypto,
	chriscli, chrisl, hannes, linux-kernel, linux-mm, nphamcs,
	yosryahmed, zhouchengming, Barry Song

On Thu, Jan 25, 2024 at 10:58 PM Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Wed, Jan 03, 2024 at 10:50:06PM +1300, Barry Song wrote:
> >
> > +     if (dst != scratch->dst)
> > +             kunmap_local(dst);
>
> This is missing a flush_dcache_page.

Thanks, Herbert!  I'd rather add flush_dcache_page()
at the place below, so that we avoid a redundant
flush in the ENOSPC/ENOMEM cases:

        if (!ret) {
                if (!req->dst) {
                        req->dst = sgl_alloc(req->dlen, GFP_ATOMIC, NULL);
                        if (!req->dst) {
                                ret = -ENOMEM;
                                goto out;
                        }
                } else if (req->dlen > dlen) {
                        ret = -ENOSPC;
                        goto out;
                }
                if (dst == scratch->dst) {
                        scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
                                                 req->dlen, 1);
                } else {
+                        flush_dcache_page(sg_page(req->dst));
                }
        }
out:
        if (src != scratch->src)
                kunmap_local(src);
        if (dst != scratch->dst)
                kunmap_local(dst);

>
> It may not matter for zswap, but this is API code and needs to
> work for every single case.
>
> Thanks,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
>

Thanks
Barry


* Re: [PATCH 2/3] mm/zswap: remove the memcpy if acomp is not asynchronous
  2024-01-04  0:38   ` Nhat Pham
@ 2024-02-16  3:55     ` Barry Song
  0 siblings, 0 replies; 11+ messages in thread
From: Barry Song @ 2024-02-16  3:55 UTC (permalink / raw)
  To: Nhat Pham
  Cc: herbert, davem, akpm, ddstreet, sjenning, vitaly.wool,
	linux-crypto, chriscli, chrisl, hannes, linux-kernel, linux-mm,
	yosryahmed, zhouchengming, Barry Song

On Thu, Jan 4, 2024 at 1:38 PM Nhat Pham <nphamcs@gmail.com> wrote:
>
> On Wed, Jan 3, 2024 at 1:50 AM Barry Song <21cnbao@gmail.com> wrote:
> >
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > Most compressors are actually CPU-based and won't sleep during
> > compression and decompression. We should remove the redundant
> > memcpy for them.
> >
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
>

Hi Nhat,
Thanks for reviewing!

> nit: it might help to include the test numbers in the changelog in
> this patch here too. Save a couple of clicks to dig out the original
> patch cover for the numbers :)

Chengming's test data is for the whole series, so I can't find a
proper commit to put the data in. But it seems Andrew has a good habit
of collecting important cover-letter info into commits, so in v2 I'd
keep the commit message as is.

>
> > ---
> >  mm/zswap.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index ca25b676048e..36898614ebcc 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
> >         struct crypto_wait wait;
> >         u8 *buffer;
> >         struct mutex mutex;
> > +       bool is_async; /* if acomp can sleep */
>
> nit: seems like this comment isn't necessary. is_async is pretty
> self-explanatory to me. But definitely not a show stopper tho :)

Thanks. I am changing the name to is_sleepable according to
Yosry's suggestion. As a result, the comment is removed as well.

>
> >  };
> >
> >  /*
> > @@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
> >                 goto acomp_fail;
> >         }
> >         acomp_ctx->acomp = acomp;
> > +       acomp_ctx->is_async = acomp_is_async(acomp);
> >
> >         req = acomp_request_alloc(acomp_ctx->acomp);
> >         if (!req) {
> > @@ -1370,7 +1372,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
> >         mutex_lock(&acomp_ctx->mutex);
> >
> >         src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> > -       if (!zpool_can_sleep_mapped(zpool)) {
> > +       if (acomp_ctx->is_async && !zpool_can_sleep_mapped(zpool)) {
> >                 memcpy(acomp_ctx->buffer, src, entry->length);
> >                 src = acomp_ctx->buffer;
> >                 zpool_unmap_handle(zpool, entry->handle);
> > @@ -1384,7 +1386,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
> >         BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
> >         mutex_unlock(&acomp_ctx->mutex);
> >
> > -       if (zpool_can_sleep_mapped(zpool))
> > +       if (!acomp_ctx->is_async || zpool_can_sleep_mapped(zpool))
> >                 zpool_unmap_handle(zpool, entry->handle);
> >  }
> >
> > --
> > 2.34.1
> >
>
> The zswap side looks good to me. I don't have expertise/authority to
> ack the crypto API change (but FWIW it LGTM too based on a cursory
> code read).
> Reviewed-by: Nhat Pham <nphamcs@gmail.com>

Thanks!
Barry


Thread overview: 11+ messages
2024-01-03  9:50 [PATCH 0/3] mm/zswap & crypto/acompress: remove a couple of memcpy Barry Song
2024-01-03  9:50 ` [PATCH 1/3] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend Barry Song
2024-01-08 22:35   ` Yosry Ahmed
2024-01-09  3:38     ` Barry Song
2024-01-03  9:50 ` [PATCH 2/3] mm/zswap: remove the memcpy if acomp is not asynchronous Barry Song
2024-01-04  0:38   ` Nhat Pham
2024-02-16  3:55     ` Barry Song
2024-01-08 22:36   ` Yosry Ahmed
2024-01-03  9:50 ` [PATCH 3/3] crypto: scompress: remove memcpy if sg_nents is 1 Barry Song
2024-01-25  9:58   ` Herbert Xu
2024-02-16  3:49     ` Barry Song
