linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tmpfs: fix regressions from wider use of ZERO_PAGE
@ 2022-04-08 20:38 Hugh Dickins
  2022-04-09  5:06 ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2022-04-08 20:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Hugh Dickins, Chuck Lever III, Mark Hemment, Patrice CHOTARD,
	Mikulas Patocka, Lukas Czerner, Christoph Hellwig,
	Darrick J. Wong, linux-mm, linux-nfs, linux-fsdevel,
	linux-kernel, linux-stm32

Chuck Lever reported fsx-based xfstests generic 075 091 112 127 failing
when 5.18-rc1 NFS server exports tmpfs: bisected to recent tmpfs change.

Whilst nfsd_splice_action() does contain some questionable handling of
repeated pages, and Chuck was able to work around there, history from
Mark Hemment makes clear that there might be similar dangers elsewhere:
it was not a good idea for me to pass ZERO_PAGE down to unknown actors.

Revert shmem_file_read_iter() to using ZERO_PAGE for holes only when
iter_is_iovec(); in other cases, use the more natural iov_iter_zero()
instead of copy_page_to_iter().  We would use iov_iter_zero() throughout,
but the x86 clear_user() is not nearly so well optimized as copy to user
(dd of 1T sparse tmpfs file takes 57 seconds rather than 44 seconds).

And now pagecache_init() does not need to SetPageUptodate(ZERO_PAGE(0)):
which had caused boot failure on arm noMMU STM32F7 and STM32H7 boards
Reported-by: Patrice CHOTARD <patrice.chotard@foss.st.com>

Reported-by: Chuck Lever III <chuck.lever@oracle.com>
Fixes: 56a8c8eb1eaf ("tmpfs: do not allocate pages on read")
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Chuck Lever III <chuck.lever@oracle.com>
---

 mm/filemap.c |    6 ------
 mm/shmem.c   |   31 ++++++++++++++++++++-----------
 2 files changed, 20 insertions(+), 17 deletions(-)

--- 5.18-rc1/mm/filemap.c
+++ linux/mm/filemap.c
@@ -1063,12 +1063,6 @@ void __init pagecache_init(void)
 		init_waitqueue_head(&folio_wait_table[i]);
 
 	page_writeback_init();
-
-	/*
-	 * tmpfs uses the ZERO_PAGE for reading holes: it is up-to-date,
-	 * and splice's page_cache_pipe_buf_confirm() needs to see that.
-	 */
-	SetPageUptodate(ZERO_PAGE(0));
 }
 
 /*
--- 5.18-rc1/mm/shmem.c
+++ linux/mm/shmem.c
@@ -2513,7 +2513,6 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		pgoff_t end_index;
 		unsigned long nr, ret;
 		loff_t i_size = i_size_read(inode);
-		bool got_page;
 
 		end_index = i_size >> PAGE_SHIFT;
 		if (index > end_index)
@@ -2570,24 +2569,34 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			 */
 			if (!offset)
 				mark_page_accessed(page);
-			got_page = true;
+			/*
+			 * Ok, we have the page, and it's up-to-date, so
+			 * now we can copy it to user space...
+			 */
+			ret = copy_page_to_iter(page, offset, nr, to);
+			put_page(page);
+
+		} else if (iter_is_iovec(to)) {
+			/*
+			 * Copy to user tends to be so well optimized, but
+			 * clear_user() not so much, that it is noticeably
+			 * faster to copy the zero page instead of clearing.
+			 */
+			ret = copy_page_to_iter(ZERO_PAGE(0), offset, nr, to);
 		} else {
-			page = ZERO_PAGE(0);
-			got_page = false;
+			/*
+			 * But submitting the same page twice in a row to
+			 * splice() - or others? - can result in confusion:
+			 * so don't attempt that optimization on pipes etc.
+			 */
+			ret = iov_iter_zero(nr, to);
 		}
 
-		/*
-		 * Ok, we have the page, and it's up-to-date, so
-		 * now we can copy it to user space...
-		 */
-		ret = copy_page_to_iter(page, offset, nr, to);
 		retval += ret;
 		offset += ret;
 		index += offset >> PAGE_SHIFT;
 		offset &= ~PAGE_MASK;
 
-		if (got_page)
-			put_page(page);
 		if (!iov_iter_count(to))
 			break;
 		if (ret < nr) {

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tmpfs: fix regressions from wider use of ZERO_PAGE
  2022-04-08 20:38 [PATCH] tmpfs: fix regressions from wider use of ZERO_PAGE Hugh Dickins
@ 2022-04-09  5:06 ` Christoph Hellwig
  2022-04-09  6:08   ` Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-04-09  5:06 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Chuck Lever III, Mark Hemment, Patrice CHOTARD,
	Mikulas Patocka, Lukas Czerner, Christoph Hellwig,
	Darrick J. Wong, linux-mm, linux-nfs, linux-fsdevel,
	linux-kernel, linux-stm32, viro

On Fri, Apr 08, 2022 at 01:38:41PM -0700, Hugh Dickins wrote:
> +		} else if (iter_is_iovec(to)) {
> +			/*
> +			 * Copy to user tends to be so well optimized, but
> +			 * clear_user() not so much, that it is noticeably
> +			 * faster to copy the zero page instead of clearing.
> +			 */
> +			ret = copy_page_to_iter(ZERO_PAGE(0), offset, nr, to);

Is the offset and length guaranteed to be less than PAGE_SIZE here?

Either way I'd rather do this optimization in iov_iter_zero rather
than hiding it in tmpfs.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tmpfs: fix regressions from wider use of ZERO_PAGE
  2022-04-09  5:06 ` Christoph Hellwig
@ 2022-04-09  6:08   ` Hugh Dickins
  2022-04-12  4:57     ` making x86 clear_user not suck, was " Christoph Hellwig
  2022-04-12 23:22     ` Andrew Morton
  0 siblings, 2 replies; 7+ messages in thread
From: Hugh Dickins @ 2022-04-09  6:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Hugh Dickins, Andrew Morton, Chuck Lever III, Mark Hemment,
	Patrice CHOTARD, Mikulas Patocka, Lukas Czerner, Darrick J. Wong,
	Jason A. Donenfeld, Borislav Petkov, linux-mm, linux-nfs,
	linux-fsdevel, linux-kernel, linux-stm32, viro

On Sat, 9 Apr 2022, Christoph Hellwig wrote:
> On Fri, Apr 08, 2022 at 01:38:41PM -0700, Hugh Dickins wrote:
> > +		} else if (iter_is_iovec(to)) {
> > +			/*
> > +			 * Copy to user tends to be so well optimized, but
> > +			 * clear_user() not so much, that it is noticeably
> > +			 * faster to copy the zero page instead of clearing.
> > +			 */
> > +			ret = copy_page_to_iter(ZERO_PAGE(0), offset, nr, to);
> 
> Is the offset and length guaranteed to be less than PAGE_SIZE here?

Almost :) The offset is guaranteed to be less than PAGE_SIZE here, and
the length is guaranteed to be less than or equal to PAGE_SIZE - offset.

> 
> Either way I'd rather do this optimization in iov_iter_zero rather
> than hiding it in tmpfs.

Let's see what others say.  I think we would all prefer clear_user() to be
enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero().
But that careful work won't get done by magic, nor by me.

And iov_iter_zero() has to deal with a wider range of possibilities,
when pulling in cache lines of ZERO_PAGE(0) will be less advantageous,
than in tmpfs doing a large dd - the case I'm aiming not to regress here
(tmpfs has been copying ZERO_PAGE(0) like this for years).

Hugh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* making x86 clear_user not suck, was Re: [PATCH] tmpfs: fix regressions from wider use of ZERO_PAGE
  2022-04-09  6:08   ` Hugh Dickins
@ 2022-04-12  4:57     ` Christoph Hellwig
  2022-04-13 18:06       ` Matthew Wilcox
  2022-04-12 23:22     ` Andrew Morton
  1 sibling, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-04-12  4:57 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Christoph Hellwig, Andrew Morton, Chuck Lever III, Mark Hemment,
	Patrice CHOTARD, Mikulas Patocka, Lukas Czerner, Darrick J. Wong,
	Jason A. Donenfeld, Borislav Petkov, linux-mm, linux-nfs,
	linux-fsdevel, linux-kernel, linux-stm32, viro, x86

On Fri, Apr 08, 2022 at 11:08:29PM -0700, Hugh Dickins wrote:
> > 
> > Either way I'd rather do this optimization in iov_iter_zero rather
> > than hiding it in tmpfs.
> 
> Let's see what others say.  I think we would all prefer clear_user() to be
> enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero().
> But that careful work won't get done by magic, nor by me.

I agree with that.

> And iov_iter_zero() has to deal with a wider range of possibilities,
> when pulling in cache lines of ZERO_PAGE(0) will be less advantageous,
> than in tmpfs doing a large dd - the case I'm aiming not to regress here
> (tmpfs has been copying ZERO_PAGE(0) like this for years).

Maybe.  OTOH I'd hate to have iov_iter_zero not used much because it
sucks too much.

So how can we entice someone with the right knowledge to implement a
decent clear_user for x86?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tmpfs: fix regressions from wider use of ZERO_PAGE
  2022-04-09  6:08   ` Hugh Dickins
  2022-04-12  4:57     ` making x86 clear_user not suck, was " Christoph Hellwig
@ 2022-04-12 23:22     ` Andrew Morton
  2022-04-13 16:41       ` Christoph Hellwig
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2022-04-12 23:22 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Christoph Hellwig, Chuck Lever III, Mark Hemment,
	Patrice CHOTARD, Mikulas Patocka, Lukas Czerner, Darrick J. Wong,
	Jason A. Donenfeld, Borislav Petkov, linux-mm, linux-nfs,
	linux-fsdevel, linux-kernel, linux-stm32, viro

On Fri, 8 Apr 2022 23:08:29 -0700 (PDT) Hugh Dickins <hughd@google.com> wrote:

> > 
> > Either way I'd rather do this optimization in iov_iter_zero rather
> > than hiding it in tmpfs.
> 
> Let's see what others say.  I think we would all prefer clear_user() to be
> enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero().
> But that careful work won't get done by magic, nor by me.
> 
> And iov_iter_zero() has to deal with a wider range of possibilities,
> when pulling in cache lines of ZERO_PAGE(0) will be less advantageous,
> than in tmpfs doing a large dd - the case I'm aiming not to regress here
> (tmpfs has been copying ZERO_PAGE(0) like this for years).

We do need something to get 5.18 fixed.  Christoph, do you think we
should proceed with this patch for 5.18?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tmpfs: fix regressions from wider use of ZERO_PAGE
  2022-04-12 23:22     ` Andrew Morton
@ 2022-04-13 16:41       ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2022-04-13 16:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Hugh Dickins, Christoph Hellwig, Chuck Lever III, Mark Hemment,
	Patrice CHOTARD, Mikulas Patocka, Lukas Czerner, Darrick J. Wong,
	Jason A. Donenfeld, Borislav Petkov, linux-mm, linux-nfs,
	linux-fsdevel, linux-kernel, linux-stm32, viro

On Tue, Apr 12, 2022 at 04:22:21PM -0700, Andrew Morton wrote:
> On Fri, 8 Apr 2022 23:08:29 -0700 (PDT) Hugh Dickins <hughd@google.com> wrote:
> 
> > > 
> > > Either way I'd rather do this optimization in iov_iter_zero rather
> > > than hiding it in tmpfs.
> > 
> > Let's see what others say.  I think we would all prefer clear_user() to be
> > enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero().
> > But that careful work won't get done by magic, nor by me.
> > 
> > And iov_iter_zero() has to deal with a wider range of possibilities,
> > when pulling in cache lines of ZERO_PAGE(0) will be less advantageous,
> > than in tmpfs doing a large dd - the case I'm aiming not to regress here
> > (tmpfs has been copying ZERO_PAGE(0) like this for years).
> 
> We do need something to get 5.18 fixed.  Christoph, do you think we
> should proceed with this patch for 5.18?

Well, let's queue it up then.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: making x86 clear_user not suck, was Re: [PATCH] tmpfs: fix regressions from wider use of ZERO_PAGE
  2022-04-12  4:57     ` making x86 clear_user not suck, was " Christoph Hellwig
@ 2022-04-13 18:06       ` Matthew Wilcox
  0 siblings, 0 replies; 7+ messages in thread
From: Matthew Wilcox @ 2022-04-13 18:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Hugh Dickins, Andrew Morton, Chuck Lever III, Mark Hemment,
	Patrice CHOTARD, Mikulas Patocka, Lukas Czerner, Darrick J. Wong,
	Jason A. Donenfeld, Borislav Petkov, linux-mm, linux-nfs,
	linux-fsdevel, linux-kernel, linux-stm32, viro, x86

On Tue, Apr 12, 2022 at 06:57:57AM +0200, Christoph Hellwig wrote:
> On Fri, Apr 08, 2022 at 11:08:29PM -0700, Hugh Dickins wrote:
> > > 
> > > Either way I'd rather do this optimization in iov_iter_zero rather
> > > than hiding it in tmpfs.
> > 
> > Let's see what others say.  I think we would all prefer clear_user() to be
> > enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero().
> > But that careful work won't get done by magic, nor by me.
> 
> I agree with that.
> 
> > And iov_iter_zero() has to deal with a wider range of possibilities,
> > when pulling in cache lines of ZERO_PAGE(0) will be less advantageous,
> > than in tmpfs doing a large dd - the case I'm aiming not to regress here
> > (tmpfs has been copying ZERO_PAGE(0) like this for years).
> 
> Maybe.  OTOH I'd hate to have iov_iter_zero not used much because it
> sucks too much.
> 
> So how can we entice someone with the right knowledge to implement a
> decent clear_user for x86?

Apparently that already happened, but it needs finishing up:
https://lore.kernel.org/lkml/Yk9yBcj78mpXOOLL@zx2c4.com/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-04-13 18:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-08 20:38 [PATCH] tmpfs: fix regressions from wider use of ZERO_PAGE Hugh Dickins
2022-04-09  5:06 ` Christoph Hellwig
2022-04-09  6:08   ` Hugh Dickins
2022-04-12  4:57     ` making x86 clear_user not suck, was " Christoph Hellwig
2022-04-13 18:06       ` Matthew Wilcox
2022-04-12 23:22     ` Andrew Morton
2022-04-13 16:41       ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).