All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] fs/isofs: Replace kmap() with kmap_local_page()
@ 2022-07-25 16:23 Fabio M. De Francesco
  2022-07-26 14:50 ` Jan Kara
  0 siblings, 1 reply; 3+ messages in thread
From: Fabio M. De Francesco @ 2022-07-25 16:23 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle),
	Jan Kara, Roman Gushchin, Pali Rohár, Andrew Morton,
	Muchun Song, linux-kernel
  Cc: Fabio M. De Francesco, Ira Weiny

The use of kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
Tasks can be preempted and, when scheduled to run again, the kernel
virtual addresses are restored and still valid. It is faster than kmap()
in kernels with HIGHMEM enabled.

Since kmap_local_page() can be safely used in compress.c, it should be
called everywhere instead of kmap().

Therefore, replace kmap() with kmap_local_page() in compress.c. Where it
is needed, use memzero_page() instead of open coding kmap_local_page()
plus memset() to fill the pages with zeros. Delete the redundant
flush_dcache_page() in the two call sites of memzero_page().

This is an RFC because these changes have not been tested (tests are
welcome!), therefore I'm not entirely sure whether these conversions work
properly. I'd like to hear comments from more experienced developers
before sending a real patch. Suggestions about how to run tests would
also be much appreciated.

Suggested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
---
 fs/isofs/compress.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/fs/isofs/compress.c b/fs/isofs/compress.c
index 95a19f25d61c..a1562124bb91 100644
--- a/fs/isofs/compress.c
+++ b/fs/isofs/compress.c
@@ -67,8 +67,7 @@ static loff_t zisofs_uncompress_block(struct inode *inode, loff_t block_start,
 		for ( i = 0 ; i < pcount ; i++ ) {
 			if (!pages[i])
 				continue;
-			memset(page_address(pages[i]), 0, PAGE_SIZE);
-			flush_dcache_page(pages[i]);
+			memzero_page(pages[i], 0, PAGE_SIZE);
 			SetPageUptodate(pages[i]);
 		}
 		return ((loff_t)pcount) << PAGE_SHIFT;
@@ -120,8 +119,7 @@ static loff_t zisofs_uncompress_block(struct inode *inode, loff_t block_start,
 	       zerr != Z_STREAM_END) {
 		if (!stream.avail_out) {
 			if (pages[curpage]) {
-				stream.next_out = page_address(pages[curpage])
-						+ poffset;
+				stream.next_out = kmap_local_page(pages[curpage] + poffset);
 				stream.avail_out = PAGE_SIZE - poffset;
 				poffset = 0;
 			} else {
@@ -170,6 +168,12 @@ static loff_t zisofs_uncompress_block(struct inode *inode, loff_t block_start,
 			}
 		}
 
+		if (stream.next_out)
+			if (stream.next_out != (char *)zisofs_sink_page) {
+				kunmap_local(stream.next_out);
+				stream.next_out = NULL;
+			}
+
 		if (!stream.avail_out) {
 			/* This page completed */
 			if (pages[curpage]) {
@@ -183,6 +187,9 @@ static loff_t zisofs_uncompress_block(struct inode *inode, loff_t block_start,
 	}
 inflate_out:
 	zlib_inflateEnd(&stream);
+	if (stream.next_out)
+		if (stream.next_out != (char *)zisofs_sink_page)
+			kunmap_local(stream.next_out);
 
 z_eio:
 	mutex_unlock(&zisofs_zlib_lock);
@@ -283,9 +290,7 @@ static int zisofs_fill_pages(struct inode *inode, int full_page, int pcount,
 	}
 
 	if (poffset && *pages) {
-		memset(page_address(*pages) + poffset, 0,
-		       PAGE_SIZE - poffset);
-		flush_dcache_page(*pages);
+		memzero_page(*pages, poffset, PAGE_SIZE - poffset);
 		SetPageUptodate(*pages);
 	}
 	return 0;
@@ -343,10 +348,8 @@ static int zisofs_read_folio(struct file *file, struct folio *folio)
 	for (i = 0; i < pcount; i++, index++) {
 		if (i != full_page)
 			pages[i] = grab_cache_page_nowait(mapping, index);
-		if (pages[i]) {
+		if (pages[i])
 			ClearPageError(pages[i]);
-			kmap(pages[i]);
-		}
 	}
 
 	err = zisofs_fill_pages(inode, full_page, pcount, pages);
@@ -357,7 +360,6 @@ static int zisofs_read_folio(struct file *file, struct folio *folio)
 			flush_dcache_page(pages[i]);
 			if (i == full_page && err)
 				SetPageError(pages[i]);
-			kunmap(pages[i]);
 			unlock_page(pages[i]);
 			if (i != full_page)
 				put_page(pages[i]);
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [RFC PATCH] fs/isofs: Replace kmap() with kmap_local_page()
  2022-07-25 16:23 [RFC PATCH] fs/isofs: Replace kmap() with kmap_local_page() Fabio M. De Francesco
@ 2022-07-26 14:50 ` Jan Kara
  2022-07-26 16:21   ` Fabio M. De Francesco
  0 siblings, 1 reply; 3+ messages in thread
From: Jan Kara @ 2022-07-26 14:50 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: Matthew Wilcox (Oracle),
	Jan Kara, Roman Gushchin, Pali Rohár, Andrew Morton,
	Muchun Song, linux-kernel, Ira Weiny

On Mon 25-07-22 18:23:31, Fabio M. De Francesco wrote:
> The use of kmap() is being deprecated in favor of kmap_local_page().
> 
> There are two main problems with kmap(): (1) It comes with an overhead as
> mapping space is restricted and protected by a global lock for
> synchronization and (2) it also requires global TLB invalidation when the
> kmap’s pool wraps and it might block when the mapping space is fully
> utilized until a slot becomes available.
> 
> With kmap_local_page() the mappings are per thread, CPU local, can take
> page faults, and can be called from any context (including interrupts).
> Tasks can be preempted and, when scheduled to run again, the kernel
> virtual addresses are restored and still valid. It is faster than kmap()
> in kernels with HIGHMEM enabled.
> 
> Since kmap_local_page() can be safely used in compress.c, it should be
> called everywhere instead of kmap().
> 
> Therefore, replace kmap() with kmap_local_page() in compress.c. Where it
> is needed, use memzero_page() instead of open coding kmap_local_page()
> plus memset() to fill the pages with zeros. Delete the redundant
> flush_dcache_page() in the two call sites of memzero_page().
> 
> This is an RFC because these changes have not been tested (tests are
> welcome!), therefore I'm not entirely sure whether these conversions work
> properly. I'd like to hear comments from more experienced developers
> before sending a real patch. Suggestions about how to run tests would
> also be much appreciated.
> 
> Suggested-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>

What you propose makes sense to me. But the lack of testing is troublesome.
You can always at least test your patch without highmem on isofs image with
compression (mkisofs seems to support creating such filesystems). Even that
would detect a bug in your patch ;) - see below.

> diff --git a/fs/isofs/compress.c b/fs/isofs/compress.c
> index 95a19f25d61c..a1562124bb91 100644
> --- a/fs/isofs/compress.c
> +++ b/fs/isofs/compress.c
> @@ -120,8 +119,7 @@ static loff_t zisofs_uncompress_block(struct inode *inode, loff_t block_start,
>  	       zerr != Z_STREAM_END) {
>  		if (!stream.avail_out) {
>  			if (pages[curpage]) {
> -				stream.next_out = page_address(pages[curpage])
> -						+ poffset;
> +				stream.next_out = kmap_local_page(pages[curpage] + poffset);

This is wrong. Most importantly because you need to add 'poffset' to the
final address provided by kmap_local_page(), not to the struct page pointer.

Secondly please wrap the line to fit into 80 chars.

>  				stream.avail_out = PAGE_SIZE - poffset;
>  				poffset = 0;
>  			} else {
> @@ -170,6 +168,12 @@ static loff_t zisofs_uncompress_block(struct inode *inode, loff_t block_start,
>  			}
>  		}
>  
> +		if (stream.next_out)
> +			if (stream.next_out != (char *)zisofs_sink_page) {
> +				kunmap_local(stream.next_out);
> +				stream.next_out = NULL;
> +			}
> +

This looks buggy as well. If we mapped page above, we'll unmap it here even
if stream.avail_out > 0 and we want to still write to it. I think you
should unmap the page here only if stream.avail_out == 0 and we are going
to switch to the next page...

>  		if (!stream.avail_out) {
>  			/* This page completed */
>  			if (pages[curpage]) {
> @@ -183,6 +187,9 @@ static loff_t zisofs_uncompress_block(struct inode *inode, loff_t block_start,
>  	}
>  inflate_out:
>  	zlib_inflateEnd(&stream);
> +	if (stream.next_out)
> +		if (stream.next_out != (char *)zisofs_sink_page)
> +			kunmap_local(stream.next_out);

This is correct but I'd simplify it to:

	if (stream.next_out && stream.next_out != (char *)zisofs_sink_page)
		kunmap_local(stream.next_out);

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFC PATCH] fs/isofs: Replace kmap() with kmap_local_page()
  2022-07-26 14:50 ` Jan Kara
@ 2022-07-26 16:21   ` Fabio M. De Francesco
  0 siblings, 0 replies; 3+ messages in thread
From: Fabio M. De Francesco @ 2022-07-26 16:21 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Wilcox (Oracle),
	Jan Kara, Roman Gushchin, Pali Rohár, Andrew Morton,
	Muchun Song, linux-kernel, Ira Weiny

On martedì 26 luglio 2022 16:50:24 CEST Jan Kara wrote:
> On Mon 25-07-22 18:23:31, Fabio M. De Francesco wrote:
> > The use of kmap() is being deprecated in favor of kmap_local_page().
> > 
> > There are two main problems with kmap(): (1) It comes with an overhead 
as
> > mapping space is restricted and protected by a global lock for
> > synchronization and (2) it also requires global TLB invalidation when 
the
> > kmap’s pool wraps and it might block when the mapping space is fully
> > utilized until a slot becomes available.
> > 
> > With kmap_local_page() the mappings are per thread, CPU local, can take
> > page faults, and can be called from any context (including interrupts).
> > Tasks can be preempted and, when scheduled to run again, the kernel
> > virtual addresses are restored and still valid. It is faster than 
kmap()
> > in kernels with HIGHMEM enabled.
> > 
> > Since kmap_local_page() can be safely used in compress.c, it should be
> > called everywhere instead of kmap().
> > 
> > Therefore, replace kmap() with kmap_local_page() in compress.c. Where 
it
> > is needed, use memzero_page() instead of open coding kmap_local_page()
> > plus memset() to fill the pages with zeros. Delete the redundant
> > flush_dcache_page() in the two call sites of memzero_page().
> > 
> > This is an RFC because these changes have not been tested (tests are
> > welcome!), therefore I'm not entirely sure whether these conversions 
work
> > properly. I'd like to hear comments from more experienced developers
> > before sending a real patch. Suggestions about how to run tests would
> > also be much appreciated.
> > 
> > Suggested-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
> 
> What you propose makes sense to me. But the lack of testing is 
troublesome.
> You can always at least test your patch without highmem on isofs image 
with
> compression (mkisofs seems to support creating such filesystems).

I understand your concerns about the lack of testing. I'll find a way to do 
those tests on a QEMU/KVM x86_32 VM with a kernel with HIGHMEM64GB enabled. 
(this is what I did for other filesystem, Btrfs to start with).

I have no experience with this particular fs and saw that xfstests have no 
special tests for isofs. However, you are talking about using mkisofs and I 
suppose it will be enough for this purpose.

> Even that
> would detect a bug in your patch ;) - see below.
> 
> > diff --git a/fs/isofs/compress.c b/fs/isofs/compress.c
> > index 95a19f25d61c..a1562124bb91 100644
> > --- a/fs/isofs/compress.c
> > +++ b/fs/isofs/compress.c
> > @@ -120,8 +119,7 @@ static loff_t zisofs_uncompress_block(struct inode 
*inode, loff_t block_start,
> >  	       zerr != Z_STREAM_END) {
> >  		if (!stream.avail_out) {
> >  			if (pages[curpage]) {
> > -				stream.next_out = 
page_address(pages[curpage])
> > -						+ poffset;
> > +				stream.next_out = 
kmap_local_page(pages[curpage] + poffset);
> 
> This is wrong. Most importantly because you need to add 'poffset' to the
> final address provided by kmap_local_page(), not to the struct page 
pointer.

Sorry, this is only a typo.

I've already made dozens of conversions to kmap_local_page(), so I know 
that I should add the offset to the kernel virtual address: 

stream.next_out = kmap_local_page(pages[curpage]) + poffset;

> 
> Secondly please wrap the line to fit into 80 chars.
> 
> >  				stream.avail_out = PAGE_SIZE - 
poffset;
> >  				poffset = 0;
> >  			} else {
> > @@ -170,6 +168,12 @@ static loff_t zisofs_uncompress_block(struct inode 
*inode, loff_t block_start,
> >  			}
> >  		}
> >  
> > +		if (stream.next_out)
> > +			if (stream.next_out != (char 
*)zisofs_sink_page) {
> > +				kunmap_local(stream.next_out);
> > +				stream.next_out = NULL;
> > +			}
> > +
> 
> This looks buggy as well. If we mapped page above, we'll unmap it here 
even
> if stream.avail_out > 0 and we want to still write to it. I think you
> should unmap the page here only if stream.avail_out == 0 and we are going
> to switch to the next page...

I need to look carefully at this because I'm not 100% sure of what you are 
talking about. At this moment I cannot look at the code, however I suppose 
that when I can, I'll have no problems to figure out what I missed. 

> >  		if (!stream.avail_out) {
> >  			/* This page completed */
> >  			if (pages[curpage]) {
> > @@ -183,6 +187,9 @@ static loff_t zisofs_uncompress_block(struct inode 
*inode, loff_t block_start,
> >  	}
> >  inflate_out:
> >  	zlib_inflateEnd(&stream);
> > +	if (stream.next_out)
> > +		if (stream.next_out != (char *)zisofs_sink_page)
> > +			kunmap_local(stream.next_out);
> 
> This is correct but I'd simplify it to:
> 
> 	if (stream.next_out && stream.next_out != (char 
*)zisofs_sink_page)
> 		kunmap_local(stream.next_out);

It's shorter and more readable. I'll rewrite the code as you suggest.

Thanks,

Fabio

> 								
Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
> 





^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-07-26 16:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-25 16:23 [RFC PATCH] fs/isofs: Replace kmap() with kmap_local_page() Fabio M. De Francesco
2022-07-26 14:50 ` Jan Kara
2022-07-26 16:21   ` Fabio M. De Francesco

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.