linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem
@ 2024-03-01 19:27 Barry Song
  2024-03-08 11:26 ` Herbert Xu
  2024-03-17 23:13 ` Guenter Roeck
  0 siblings, 2 replies; 7+ messages in thread
From: Barry Song @ 2024-03-01 19:27 UTC (permalink / raw)
  To: herbert, davem, linux-crypto
  Cc: akpm, chrisl, sjenning, vitaly.wool, linux-kernel, Barry Song,
	Johannes Weiner, Nhat Pham, Yosry Ahmed, Chengming Zhou

From: Barry Song <v-songbaohua@oppo.com>

while sg_nents is 1, which is always true for the current kernel
as the only user - zswap is this case, we might have a chance to
remove memcpy, thus improve the performance.
Though sg_nents is 1, its buffer might cross two pages. If those
pages are highmem, we have no cheap way to map them to contiguous
virtual address because kmap doesn't support more than one page
(kmap single higmem page could be still expensive for tlb) and
vmap is expensive.
So we also test and enure page is not highmem in order to safely
use page_to_virt before removing the memcpy. The good news is
that in the most majority of cases, we are lowmem, and we are
always lowmem in those modern and popular hardware.

Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 -v7:
 * fix the problem pointed out by Herbert - flush all pages if dst
   is longer than one page.

 crypto/scompress.c | 36 +++++++++++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/crypto/scompress.c b/crypto/scompress.c
index b108a30a7600..60bbb7ea4060 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -117,6 +117,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	struct crypto_scomp *scomp = *tfm_ctx;
 	void **ctx = acomp_request_ctx(req);
 	struct scomp_scratch *scratch;
+	void *src, *dst;
 	unsigned int dlen;
 	int ret;
 
@@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	scratch = raw_cpu_ptr(&scomp_scratch);
 	spin_lock(&scratch->lock);
 
-	scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0);
+	if (sg_nents(req->src) == 1 && !PageHighMem(sg_page(req->src))) {
+		src = page_to_virt(sg_page(req->src)) + req->src->offset;
+	} else {
+		scatterwalk_map_and_copy(scratch->src, req->src, 0,
+					 req->slen, 0);
+		src = scratch->src;
+	}
+
+	if (req->dst && sg_nents(req->dst) == 1 && !PageHighMem(sg_page(req->dst)))
+		dst = page_to_virt(sg_page(req->dst)) + req->dst->offset;
+	else
+		dst = scratch->dst;
+
 	if (dir)
-		ret = crypto_scomp_compress(scomp, scratch->src, req->slen,
-					    scratch->dst, &req->dlen, *ctx);
+		ret = crypto_scomp_compress(scomp, src, req->slen,
+					    dst, &req->dlen, *ctx);
 	else
-		ret = crypto_scomp_decompress(scomp, scratch->src, req->slen,
-					      scratch->dst, &req->dlen, *ctx);
+		ret = crypto_scomp_decompress(scomp, src, req->slen,
+					      dst, &req->dlen, *ctx);
 	if (!ret) {
 		if (!req->dst) {
 			req->dst = sgl_alloc(req->dlen, GFP_ATOMIC, NULL);
@@ -152,8 +165,17 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 			ret = -ENOSPC;
 			goto out;
 		}
-		scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
-					 1);
+		if (dst == scratch->dst) {
+			scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
+						 req->dlen, 1);
+		} else {
+			int nr_pages = DIV_ROUND_UP(req->dst->offset + req->dlen, PAGE_SIZE);
+			int i;
+			struct page *dst_page = sg_page(req->dst);
+
+			for (i = 0; i < nr_pages; i++)
+				flush_dcache_page(dst_page + i);
+		}
 	}
 out:
 	spin_unlock(&scratch->lock);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem
  2024-03-01 19:27 [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem Barry Song
@ 2024-03-08 11:26 ` Herbert Xu
  2024-03-17 23:13 ` Guenter Roeck
  1 sibling, 0 replies; 7+ messages in thread
From: Herbert Xu @ 2024-03-08 11:26 UTC (permalink / raw)
  To: Barry Song
  Cc: davem, linux-crypto, akpm, chrisl, sjenning, vitaly.wool,
	linux-kernel, Barry Song, Johannes Weiner, Nhat Pham,
	Yosry Ahmed, Chengming Zhou

On Sat, Mar 02, 2024 at 08:27:45AM +1300, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> while sg_nents is 1, which is always true for the current kernel
> as the only user - zswap is this case, we might have a chance to
> remove memcpy, thus improve the performance.
> Though sg_nents is 1, its buffer might cross two pages. If those
> pages are highmem, we have no cheap way to map them to contiguous
> virtual address because kmap doesn't support more than one page
> (kmap single higmem page could be still expensive for tlb) and
> vmap is expensive.
> So we also test and enure page is not highmem in order to safely
> use page_to_virt before removing the memcpy. The good news is
> that in the most majority of cases, we are lowmem, and we are
> always lowmem in those modern and popular hardware.
> 
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Nhat Pham <nphamcs@gmail.com>
> Cc: Yosry Ahmed <yosryahmed@google.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> ---
>  -v7:
>  * fix the problem pointed out by Herbert - flush all pages if dst
>    is longer than one page.
> 
>  crypto/scompress.c | 36 +++++++++++++++++++++++++++++-------
>  1 file changed, 29 insertions(+), 7 deletions(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem
  2024-03-01 19:27 [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem Barry Song
  2024-03-08 11:26 ` Herbert Xu
@ 2024-03-17 23:13 ` Guenter Roeck
  2024-03-17 23:48   ` Barry Song
  1 sibling, 1 reply; 7+ messages in thread
From: Guenter Roeck @ 2024-03-17 23:13 UTC (permalink / raw)
  To: Barry Song
  Cc: herbert, davem, linux-crypto, akpm, chrisl, sjenning,
	vitaly.wool, linux-kernel, Barry Song, Johannes Weiner,
	Nhat Pham, Yosry Ahmed, Chengming Zhou

Hi,

On Sat, Mar 02, 2024 at 08:27:45AM +1300, Barry Song wrote:
[ ... ]
> @@ -152,8 +165,17 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
>  			ret = -ENOSPC;
>  			goto out;
>  		}
> -		scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
> -					 1);
> +		if (dst == scratch->dst) {
> +			scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
> +						 req->dlen, 1);
> +		} else {
> +			int nr_pages = DIV_ROUND_UP(req->dst->offset + req->dlen, PAGE_SIZE);
> +			int i;
> +			struct page *dst_page = sg_page(req->dst);
> +
> +			for (i = 0; i < nr_pages; i++)
> +				flush_dcache_page(dst_page + i);

flush_dcache_page() is an empty macro on some architectures
such as xtensa. This results in

Building xtensa:allmodconfig ... failed
--------------
Error log:
crypto/scompress.c: In function 'scomp_acomp_comp_decomp':
crypto/scompress.c:174:38: error: unused variable 'dst_page'

on the affected architectures.

Guenter

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem
  2024-03-17 23:13 ` Guenter Roeck
@ 2024-03-17 23:48   ` Barry Song
  2024-03-18  0:13     ` Guenter Roeck
  0 siblings, 1 reply; 7+ messages in thread
From: Barry Song @ 2024-03-17 23:48 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: herbert, davem, linux-crypto, akpm, chrisl, sjenning,
	vitaly.wool, linux-kernel, Barry Song, Johannes Weiner,
	Nhat Pham, Yosry Ahmed, Chengming Zhou

On Mon, Mar 18, 2024 at 7:13 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Hi,
>
> On Sat, Mar 02, 2024 at 08:27:45AM +1300, Barry Song wrote:
> [ ... ]
> > @@ -152,8 +165,17 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
> >                       ret = -ENOSPC;
> >                       goto out;
> >               }
> > -             scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
> > -                                      1);
> > +             if (dst == scratch->dst) {
> > +                     scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
> > +                                              req->dlen, 1);
> > +             } else {
> > +                     int nr_pages = DIV_ROUND_UP(req->dst->offset + req->dlen, PAGE_SIZE);
> > +                     int i;
> > +                     struct page *dst_page = sg_page(req->dst);
> > +
> > +                     for (i = 0; i < nr_pages; i++)
> > +                             flush_dcache_page(dst_page + i);
>
> flush_dcache_page() is an empty macro on some architectures
> such as xtensa. This results in

Hi Guenter,

this is a bug of xtensa, could you test the patch:
https://lore.kernel.org/all/20240313045036.51065-1-21cnbao@gmail.com/


>
> Building xtensa:allmodconfig ... failed
> --------------
> Error log:
> crypto/scompress.c: In function 'scomp_acomp_comp_decomp':
> crypto/scompress.c:174:38: error: unused variable 'dst_page'
>
> on the affected architectures.
>
> Guenter

Thanks
Barry

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem
  2024-03-17 23:48   ` Barry Song
@ 2024-03-18  0:13     ` Guenter Roeck
  2024-03-18  0:21       ` Barry Song
  0 siblings, 1 reply; 7+ messages in thread
From: Guenter Roeck @ 2024-03-18  0:13 UTC (permalink / raw)
  To: Barry Song
  Cc: herbert, davem, linux-crypto, akpm, chrisl, sjenning,
	vitaly.wool, linux-kernel, Barry Song, Johannes Weiner,
	Nhat Pham, Yosry Ahmed, Chengming Zhou

On 3/17/24 16:48, Barry Song wrote:
> On Mon, Mar 18, 2024 at 7:13 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> Hi,
>>
>> On Sat, Mar 02, 2024 at 08:27:45AM +1300, Barry Song wrote:
>> [ ... ]
>>> @@ -152,8 +165,17 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
>>>                        ret = -ENOSPC;
>>>                        goto out;
>>>                }
>>> -             scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
>>> -                                      1);
>>> +             if (dst == scratch->dst) {
>>> +                     scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
>>> +                                              req->dlen, 1);
>>> +             } else {
>>> +                     int nr_pages = DIV_ROUND_UP(req->dst->offset + req->dlen, PAGE_SIZE);
>>> +                     int i;
>>> +                     struct page *dst_page = sg_page(req->dst);
>>> +
>>> +                     for (i = 0; i < nr_pages; i++)
>>> +                             flush_dcache_page(dst_page + i);
>>
>> flush_dcache_page() is an empty macro on some architectures
>> such as xtensa. This results in
> 
> Hi Guenter,
> 
> this is a bug of xtensa, could you test the patch:

Thanks for the update.

> https://lore.kernel.org/all/20240313045036.51065-1-21cnbao@gmail.com/
> 

That doesn't build for me.

   CC      arch/xtensa/kernel/asm-offsets.s
In file included from ./include/linux/highmem.h:8,
                  from ./include/linux/bvec.h:10,
                  from ./include/linux/blk_types.h:10,
                  from ./include/linux/writeback.h:13,
                  from ./include/linux/memcontrol.h:23,
                  from ./include/linux/swap.h:9,
                  from ./include/linux/suspend.h:5,
                  from arch/xtensa/kernel/asm-offsets.c:24:
./include/linux/cacheflush.h:9:5: error: "ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE" is not defined, evaluates to 0 [-Werror=undef]
     9 | #if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE

A similar works for loongarch, so I don't know what is wrong.
Maybe some context patch is missing.

Guenter


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem
  2024-03-18  0:13     ` Guenter Roeck
@ 2024-03-18  0:21       ` Barry Song
  2024-03-18  0:33         ` Guenter Roeck
  0 siblings, 1 reply; 7+ messages in thread
From: Barry Song @ 2024-03-18  0:21 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: herbert, davem, linux-crypto, akpm, chrisl, sjenning,
	vitaly.wool, linux-kernel, Barry Song, Johannes Weiner,
	Nhat Pham, Yosry Ahmed, Chengming Zhou

On Mon, Mar 18, 2024 at 8:14 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 3/17/24 16:48, Barry Song wrote:
> > On Mon, Mar 18, 2024 at 7:13 AM Guenter Roeck <linux@roeck-us.net> wrote:
> >>
> >> Hi,
> >>
> >> On Sat, Mar 02, 2024 at 08:27:45AM +1300, Barry Song wrote:
> >> [ ... ]
> >>> @@ -152,8 +165,17 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
> >>>                        ret = -ENOSPC;
> >>>                        goto out;
> >>>                }
> >>> -             scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
> >>> -                                      1);
> >>> +             if (dst == scratch->dst) {
> >>> +                     scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
> >>> +                                              req->dlen, 1);
> >>> +             } else {
> >>> +                     int nr_pages = DIV_ROUND_UP(req->dst->offset + req->dlen, PAGE_SIZE);
> >>> +                     int i;
> >>> +                     struct page *dst_page = sg_page(req->dst);
> >>> +
> >>> +                     for (i = 0; i < nr_pages; i++)
> >>> +                             flush_dcache_page(dst_page + i);
> >>
> >> flush_dcache_page() is an empty macro on some architectures
> >> such as xtensa. This results in
> >
> > Hi Guenter,
> >
> > this is a bug of xtensa, could you test the patch:
>
> Thanks for the update.
>
> > https://lore.kernel.org/all/20240313045036.51065-1-21cnbao@gmail.com/
> >
>
> That doesn't build for me.
>
>    CC      arch/xtensa/kernel/asm-offsets.s
> In file included from ./include/linux/highmem.h:8,
>                   from ./include/linux/bvec.h:10,
>                   from ./include/linux/blk_types.h:10,
>                   from ./include/linux/writeback.h:13,
>                   from ./include/linux/memcontrol.h:23,
>                   from ./include/linux/swap.h:9,
>                   from ./include/linux/suspend.h:5,
>                   from arch/xtensa/kernel/asm-offsets.c:24:
> ./include/linux/cacheflush.h:9:5: error: "ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE" is not defined, evaluates to 0 [-Werror=undef]
>      9 | #if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
>
> A similar works for loongarch, so I don't know what is wrong.
> Maybe some context patch is missing.

this is weird as include/asm-generic/cacheflush.h will define it to 0

 #ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
 static inline void flush_dcache_page(struct page *page)
 {
 }

 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
 #endif

Maybe somewhere else also needs to be fixed.
Can you tell me your toolchain version and toolchain name? and defconfig name?

>
> Guenter

Thanks
Barry

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem
  2024-03-18  0:21       ` Barry Song
@ 2024-03-18  0:33         ` Guenter Roeck
  0 siblings, 0 replies; 7+ messages in thread
From: Guenter Roeck @ 2024-03-18  0:33 UTC (permalink / raw)
  To: Barry Song
  Cc: herbert, davem, linux-crypto, akpm, chrisl, sjenning,
	vitaly.wool, linux-kernel, Barry Song, Johannes Weiner,
	Nhat Pham, Yosry Ahmed, Chengming Zhou

On 3/17/24 17:21, Barry Song wrote:
> On Mon, Mar 18, 2024 at 8:14 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 3/17/24 16:48, Barry Song wrote:
>>> On Mon, Mar 18, 2024 at 7:13 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On Sat, Mar 02, 2024 at 08:27:45AM +1300, Barry Song wrote:
>>>> [ ... ]
>>>>> @@ -152,8 +165,17 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
>>>>>                         ret = -ENOSPC;
>>>>>                         goto out;
>>>>>                 }
>>>>> -             scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
>>>>> -                                      1);
>>>>> +             if (dst == scratch->dst) {
>>>>> +                     scatterwalk_map_and_copy(scratch->dst, req->dst, 0,
>>>>> +                                              req->dlen, 1);
>>>>> +             } else {
>>>>> +                     int nr_pages = DIV_ROUND_UP(req->dst->offset + req->dlen, PAGE_SIZE);
>>>>> +                     int i;
>>>>> +                     struct page *dst_page = sg_page(req->dst);
>>>>> +
>>>>> +                     for (i = 0; i < nr_pages; i++)
>>>>> +                             flush_dcache_page(dst_page + i);
>>>>
>>>> flush_dcache_page() is an empty macro on some architectures
>>>> such as xtensa. This results in
>>>
>>> Hi Guenter,
>>>
>>> this is a bug of xtensa, could you test the patch:
>>
>> Thanks for the update.
>>
>>> https://lore.kernel.org/all/20240313045036.51065-1-21cnbao@gmail.com/
>>>
>>
>> That doesn't build for me.
>>
>>     CC      arch/xtensa/kernel/asm-offsets.s
>> In file included from ./include/linux/highmem.h:8,
>>                    from ./include/linux/bvec.h:10,
>>                    from ./include/linux/blk_types.h:10,
>>                    from ./include/linux/writeback.h:13,
>>                    from ./include/linux/memcontrol.h:23,
>>                    from ./include/linux/swap.h:9,
>>                    from ./include/linux/suspend.h:5,
>>                    from arch/xtensa/kernel/asm-offsets.c:24:
>> ./include/linux/cacheflush.h:9:5: error: "ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE" is not defined, evaluates to 0 [-Werror=undef]
>>       9 | #if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
>>
>> A similar works for loongarch, so I don't know what is wrong.
>> Maybe some context patch is missing.
> 
> this is weird as include/asm-generic/cacheflush.h will define it to 0
> 
>   #ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
>   static inline void flush_dcache_page(struct page *page)
>   {
>   }
> 
>   #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
>   #endif
> 
> Maybe somewhere else also needs to be fixed.
> Can you tell me your toolchain version and toolchain name? and defconfig name?
> 

config is allmodconfig - GCC_PLUGINS.

Toolchains are self-built from gcc source using the buildall system. I tried
gcc 11.4, 12.3, and 13.2. I don't think that matters, though, since
"asm-generic/cacheflush.h" isn't included from arch/xtensa. The diff below
seems to fix the problem, but I have not fully tested it.

Guenter

---
diff --git a/arch/xtensa/include/asm/cacheflush.h b/arch/xtensa/include/asm/cacheflush.h
index 11efdc8eb262..62662919cbbc 100644
--- a/arch/xtensa/include/asm/cacheflush.h
+++ b/arch/xtensa/include/asm/cacheflush.h
@@ -183,4 +183,6 @@ extern void copy_from_user_page(struct vm_area_struct*, struct page*,

  #endif

+#include <asm-generic/cacheflush.h>
+
  #endif /* _XTENSA_CACHEFLUSH_H */


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-03-18  0:33 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-01 19:27 [PATCH v7] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem Barry Song
2024-03-08 11:26 ` Herbert Xu
2024-03-17 23:13 ` Guenter Roeck
2024-03-17 23:48   ` Barry Song
2024-03-18  0:13     ` Guenter Roeck
2024-03-18  0:21       ` Barry Song
2024-03-18  0:33         ` Guenter Roeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).