linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 1/2] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend
       [not found] <ZY1EnEefZsRTGYnP@gondor.apana.org.au>
@ 2024-01-03  2:57 ` Barry Song
  2024-01-03  2:57   ` [PATCH RFC 2/2] mm/zswap: remove the memcpy if acomp is not asynchronous Barry Song
  2024-01-03  2:57   ` [PATCH v4 2/6] mm/zswap: reuse dstmem when decompress Barry Song
  0 siblings, 2 replies; 5+ messages in thread
From: Barry Song @ 2024-01-03  2:57 UTC (permalink / raw)
  To: herbert, davem, akpm, chriscli, chrisl, ddstreet, hannes,
	linux-mm, nphamcs, sjenning, vitaly.wool, yosryahmed,
	zhouchengming
  Cc: linux-kernel, linux-crypto, Barry Song

From: Barry Song <v-songbaohua@oppo.com>

Almost all CPU-based compressors/decompressors are actually synchronous
though they support acomp APIs. While some chips have hardware-based
accelerators to offload CPU's work such as hisilicon and intel/qat/,
their drivers are working in async mode.
Letting acomp's users know exactly if the acomp is really async will
help users know if the compression and decompression procedure can
sleep and make their decisions accordingly.

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 crypto/acompress.c         | 8 ++++++++
 include/crypto/acompress.h | 9 +++++++++
 2 files changed, 17 insertions(+)

diff --git a/crypto/acompress.c b/crypto/acompress.c
index 1c682810a484..99118e879a4a 100644
--- a/crypto/acompress.c
+++ b/crypto/acompress.c
@@ -152,6 +152,14 @@ struct crypto_acomp *crypto_alloc_acomp_node(const char *alg_name, u32 type,
 }
 EXPORT_SYMBOL_GPL(crypto_alloc_acomp_node);
 
+bool acomp_is_async(struct crypto_acomp *acomp)
+{
+	struct crypto_tfm *tfm = crypto_acomp_tfm(acomp);
+
+	return tfm->__crt_alg->cra_type == &crypto_acomp_type;
+}
+EXPORT_SYMBOL_GPL(acomp_is_async);
+
 struct acomp_req *acomp_request_alloc(struct crypto_acomp *acomp)
 {
 	struct crypto_tfm *tfm = crypto_acomp_tfm(acomp);
diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
index 574cffc90730..5831080479e9 100644
--- a/include/crypto/acompress.h
+++ b/include/crypto/acompress.h
@@ -195,6 +195,15 @@ static inline int crypto_has_acomp(const char *alg_name, u32 type, u32 mask)
  */
 struct acomp_req *acomp_request_alloc(struct crypto_acomp *tfm);
 
+/**
+ * acomp_is_async() -- check if an acomp is asynchronous(can sleep)
+ *
+ * @tfm:	ACOMPRESS tfm handle allocated with crypto_alloc_acomp()
+ *
+ * Return:	true if the acomp is asynchronous, otherwise, false
+ */
+bool acomp_is_async(struct crypto_acomp *tfm);
+
 /**
  * acomp_request_free() -- zeroize and free asynchronous (de)compression
  *			   request as well as the output buffer if allocated
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH RFC 2/2] mm/zswap: remove the memcpy if acomp is not asynchronous
  2024-01-03  2:57 ` [PATCH RFC 1/2] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend Barry Song
@ 2024-01-03  2:57   ` Barry Song
  2024-01-03  2:57   ` [PATCH v4 2/6] mm/zswap: reuse dstmem when decompress Barry Song
  1 sibling, 0 replies; 5+ messages in thread
From: Barry Song @ 2024-01-03  2:57 UTC (permalink / raw)
  To: herbert, davem, akpm, chriscli, chrisl, ddstreet, hannes,
	linux-mm, nphamcs, sjenning, vitaly.wool, yosryahmed,
	zhouchengming
  Cc: linux-kernel, linux-crypto, Barry Song

From: Barry Song <v-songbaohua@oppo.com>

Most compressors are actually CPU-based, they won't sleep during
decompression. we should be able to remove the redundant memcpy
for them.

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 mm/zswap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index ca25b676048e..36898614ebcc 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -168,6 +168,7 @@ struct crypto_acomp_ctx {
 	struct crypto_wait wait;
 	u8 *buffer;
 	struct mutex mutex;
+	bool is_async; /* if acomp can sleep */
 };
 
 /*
@@ -716,6 +717,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 		goto acomp_fail;
 	}
 	acomp_ctx->acomp = acomp;
+	acomp_ctx->is_async = acomp_is_async(acomp);
 
 	req = acomp_request_alloc(acomp_ctx->acomp);
 	if (!req) {
@@ -1370,7 +1372,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
 	mutex_lock(&acomp_ctx->mutex);
 
 	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
-	if (!zpool_can_sleep_mapped(zpool)) {
+	if (acomp_ctx->is_async && !zpool_can_sleep_mapped(zpool)) {
 		memcpy(acomp_ctx->buffer, src, entry->length);
 		src = acomp_ctx->buffer;
 		zpool_unmap_handle(zpool, entry->handle);
@@ -1384,7 +1386,7 @@ static void __zswap_load(struct zswap_entry *entry, struct page *page)
 	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
 	mutex_unlock(&acomp_ctx->mutex);
 
-	if (zpool_can_sleep_mapped(zpool))
+	if (!acomp_ctx->is_async || zpool_can_sleep_mapped(zpool))
 		zpool_unmap_handle(zpool, entry->handle);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 2/6] mm/zswap: reuse dstmem when decompress
  2024-01-03  2:57 ` [PATCH RFC 1/2] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend Barry Song
  2024-01-03  2:57   ` [PATCH RFC 2/2] mm/zswap: remove the memcpy if acomp is not asynchronous Barry Song
@ 2024-01-03  2:57   ` Barry Song
  2024-01-25  9:41     ` Herbert Xu
  1 sibling, 1 reply; 5+ messages in thread
From: Barry Song @ 2024-01-03  2:57 UTC (permalink / raw)
  To: herbert, davem, akpm, chriscli, chrisl, ddstreet, hannes,
	linux-mm, nphamcs, sjenning, vitaly.wool, yosryahmed,
	zhouchengming
  Cc: linux-kernel, linux-crypto

>>
>> for CPU-based alg, we have completed the compr/decompr within
>> crypto_acomp_decompress()
>> synchronously. they won't return EINPROGRESS, EBUSY.
>>
>> The problem is that crypto_acomp won't expose this information to its
>> users. if it does,
>> we can use this info, we will totally avoid the code of copying
>> zsmalloc's data to a tmp
>> buffer for the most majority users of zswap.
>>
>> But I am not sure if we can find a way to convince Herbert(+To)  :-)

> What would you like to expose? The async status of the underlying
> algorithm?

Right. followed by a rfc patchset, please help take a look.

> 
> We could certainly do that.  But I wonder if it might actually be
> better for you to allocate a second sync-only algorithm for such
> cases.  I'd like to see some real numbers.

some hardware might want to use an accelerator to help offload CPU's
work. their drivers are working in async mode, for example, hisilicon
and intel.

I don't have the exact number we can save by removing the redundant
memcpy, nor do i have a proper hardware to test and get the number.
As Chengming is actually working in zswap, i wonder if you can test
my patches and post some data?

> 
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>

Thanks
Barry


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 2/6] mm/zswap: reuse dstmem when decompress
  2024-01-03  2:57   ` [PATCH v4 2/6] mm/zswap: reuse dstmem when decompress Barry Song
@ 2024-01-25  9:41     ` Herbert Xu
  2024-01-27 14:41       ` Barry Song
  0 siblings, 1 reply; 5+ messages in thread
From: Herbert Xu @ 2024-01-25  9:41 UTC (permalink / raw)
  To: Barry Song
  Cc: davem, akpm, chriscli, chrisl, ddstreet, hannes, linux-mm,
	nphamcs, sjenning, vitaly.wool, yosryahmed, zhouchengming,
	linux-kernel, linux-crypto

On Wed, Jan 03, 2024 at 03:57:59PM +1300, Barry Song wrote:
>
> > We could certainly do that.  But I wonder if it might actually be
> > better for you to allocate a second sync-only algorithm for such
> > cases.  I'd like to see some real numbers.
> 
> some hardware might want to use an accelerator to help offload CPU's
> work. their drivers are working in async mode, for example, hisilicon
> and intel.
> 
> I don't have the exact number we can save by removing the redundant
> memcpy, nor do i have a proper hardware to test and get the number.
> As Chengming is actually working in zswap, i wonder if you can test
> my patches and post some data?

I don't have the hardware to test this.  Since you're proposing
the change, please test it to ensure that we're not adding cruft
to the API that's actually detrimental to performance.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 2/6] mm/zswap: reuse dstmem when decompress
  2024-01-25  9:41     ` Herbert Xu
@ 2024-01-27 14:41       ` Barry Song
  0 siblings, 0 replies; 5+ messages in thread
From: Barry Song @ 2024-01-27 14:41 UTC (permalink / raw)
  To: Herbert Xu
  Cc: davem, akpm, chriscli, chrisl, ddstreet, hannes, linux-mm,
	nphamcs, sjenning, vitaly.wool, yosryahmed, zhouchengming,
	linux-kernel, linux-crypto

On Thu, Jan 25, 2024 at 5:41 PM Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Wed, Jan 03, 2024 at 03:57:59PM +1300, Barry Song wrote:
> >
> > > We could certainly do that.  But I wonder if it might actually be
> > > better for you to allocate a second sync-only algorithm for such
> > > cases.  I'd like to see some real numbers.
> >
> > some hardware might want to use an accelerator to help offload CPU's
> > work. their drivers are working in async mode, for example, hisilicon
> > and intel.
> >
> > I don't have the exact number we can save by removing the redundant
> > memcpy, nor do i have a proper hardware to test and get the number.
> > As Chengming is actually working in zswap, i wonder if you can test
> > my patches and post some data?
>
> I don't have the hardware to test this.  Since you're proposing
> the change, please test it to ensure that we're not adding cruft
> to the API that's actually detrimental to performance.

Understood.
sorry for the grammatical errors, i was actually asking for
chengming's help for testing as
he had hardware and was actively working on optimizing  zswap.
and he has already tested and sent the performance data which I quoted
in the formal
patchset.

>
> Thanks,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Thanks
Barry

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-01-27 14:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <ZY1EnEefZsRTGYnP@gondor.apana.org.au>
2024-01-03  2:57 ` [PATCH RFC 1/2] crypto: introduce acomp_is_async to expose if a acomp has a scomp backend Barry Song
2024-01-03  2:57   ` [PATCH RFC 2/2] mm/zswap: remove the memcpy if acomp is not asynchronous Barry Song
2024-01-03  2:57   ` [PATCH v4 2/6] mm/zswap: reuse dstmem when decompress Barry Song
2024-01-25  9:41     ` Herbert Xu
2024-01-27 14:41       ` Barry Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).