linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 0/2] iomap: Fix leakage of pipe pages while splicing
@ 2019-11-21 16:15 Jan Kara
  2019-11-21 16:15 ` [PATCH 1/2] iomap: Fix pipe page leakage during splicing Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jan Kara @ 2019-11-21 16:15 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-fsdevel, Christoph Hellwig, Matthew Bobrowski,
	Eric Biggers, Jan Kara

Hello,

here is a fix and a cleanup for iomap code. The first patch fixes a leakage
of pipe pages when iomap_dio_rw() splices to a pipe, the second patch is
a cleanup that removes strange copying of iter in iomap_dio_bio_actor(). Patches
have passed fstests for ext4 and xfs and fix the syzkaller reproducer for
me.

								Honza


* [PATCH 1/2] iomap: Fix pipe page leakage during splicing
  2019-11-21 16:15 [PATCH 0/2] iomap: Fix leakage of pipe pages while splicing Jan Kara
@ 2019-11-21 16:15 ` Jan Kara
  2019-11-21 23:55   ` Darrick J. Wong
  2019-11-22 13:17   ` Christoph Hellwig
  2019-11-21 16:15 ` [PATCH 2/2] iomap: Do not create fake iter in iomap_dio_bio_actor() Jan Kara
  2019-11-21 16:58 ` [PATCH 0/2] iomap: Fix leakage of pipe pages while splicing Darrick J. Wong
  2 siblings, 2 replies; 13+ messages in thread
From: Jan Kara @ 2019-11-21 16:15 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-fsdevel, Christoph Hellwig, Matthew Bobrowski,
	Eric Biggers, Jan Kara, stable

When splicing using iomap_dio_rw() to a pipe, we may leak pipe pages
because bio_iov_iter_get_pages() records that the pipe will have full
extent worth of data however if file size is not block size aligned
iomap_dio_rw() returns less than what bio_iov_iter_get_pages() set up
and splice code gets confused leaking a pipe page with the file tail.

Handle the situation similarly to the old direct IO implementation and
revert iter to actually returned read amount which makes iter consistent
with value returned from iomap_dio_rw() and thus the splice code is
happy.

Fixes: ff6a9292e6f6 ("iomap: implement direct I/O")
CC: stable@vger.kernel.org
Reported-by: syzbot+991400e8eba7e00a26e1@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/iomap/direct-io.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 1fc28c2da279..30189652c560 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -497,8 +497,15 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		}
 		pos += ret;
 
-		if (iov_iter_rw(iter) == READ && pos >= dio->i_size)
+		if (iov_iter_rw(iter) == READ && pos >= dio->i_size) {
+			/*
+			 * We will report we've read data only upto i_size.
+			 * Revert iter to a state corresponding to that as
+			 * some callers (such as splice code) rely on it.
+			 */
+			iov_iter_revert(iter, pos - dio->i_size);
 			break;
+		}
 	} while ((count = iov_iter_count(iter)) > 0);
 	blk_finish_plug(&plug);
 
-- 
2.16.4
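
For illustration, the arithmetic behind the revert can be sketched in user
space. This is a toy model, not kernel code: struct toy_read and
toy_dio_read() are hypothetical names, and the numbers in the usage note are
made up. It only shows that, without the revert, the iterator has consumed
more bytes than the read reports once the request crosses i_size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct toy_read {
	size_t consumed;	/* bytes the iterator was advanced by */
	size_t reported;	/* bytes the read reports to its caller */
};

/*
 * Model a direct read of `count` bytes at `pos` from a file of `i_size`
 * bytes. The iterator is advanced by the whole request, but the caller is
 * only told about the bytes that lie below i_size.
 */
struct toy_read toy_dio_read(size_t pos, size_t count, size_t i_size,
			     int revert)
{
	struct toy_read r = { .consumed = count, .reported = count };
	size_t end = pos + count;	/* plays the role of the advanced pos */

	assert(pos <= i_size);
	if (end > i_size) {
		r.reported = i_size - pos;
		/* Same amount as iov_iter_revert(iter, pos - dio->i_size). */
		if (revert)
			r.consumed -= end - i_size;
	}
	return r;
}
```

With a 5000-byte file and a 4096-byte request at offset 4096, the model
reports 904 bytes in both cases, but the iterator has consumed 4096 bytes
without the revert and 904 bytes with it.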



* [PATCH 2/2] iomap: Do not create fake iter in iomap_dio_bio_actor()
  2019-11-21 16:15 [PATCH 0/2] iomap: Fix leakage of pipe pages while splicing Jan Kara
  2019-11-21 16:15 ` [PATCH 1/2] iomap: Fix pipe page leakage during splicing Jan Kara
@ 2019-11-21 16:15 ` Jan Kara
  2019-11-22  0:02   ` Darrick J. Wong
  2019-11-22 13:26   ` Christoph Hellwig
  2019-11-21 16:58 ` [PATCH 0/2] iomap: Fix leakage of pipe pages while splicing Darrick J. Wong
  2 siblings, 2 replies; 13+ messages in thread
From: Jan Kara @ 2019-11-21 16:15 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-fsdevel, Christoph Hellwig, Matthew Bobrowski,
	Eric Biggers, Jan Kara

iomap_dio_bio_actor() copies iter to a local variable and then limits it
to a file extent we have mapped. When IO is submitted,
iomap_dio_bio_actor() advances the original iter while the copied iter
is advanced inside bio_iov_iter_get_pages(). This logic is non-obvious
especially because both iters still point to same shared structures
(such as pipe info) so if iov_iter_advance() changes anything in the
shared structure, this scheme breaks. Let's just truncate and reexpand
the original iter as needed instead of playing games with copying iters
and keeping them in sync.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/iomap/direct-io.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 30189652c560..01a4264bce37 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -201,12 +201,12 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 	unsigned int blkbits = blksize_bits(bdev_logical_block_size(iomap->bdev));
 	unsigned int fs_block_size = i_blocksize(inode), pad;
 	unsigned int align = iov_iter_alignment(dio->submit.iter);
-	struct iov_iter iter;
 	struct bio *bio;
 	bool need_zeroout = false;
 	bool use_fua = false;
 	int nr_pages, ret = 0;
 	size_t copied = 0;
+	size_t orig_count = iov_iter_count(dio->submit.iter);
 
 	if ((pos | length | align) & ((1 << blkbits) - 1))
 		return -EINVAL;
@@ -235,16 +235,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 			use_fua = true;
 	}
 
-	/*
-	 * Operate on a partial iter trimmed to the extent we were called for.
-	 * We'll update the iter in the dio once we're done with this extent.
-	 */
-	iter = *dio->submit.iter;
-	iov_iter_truncate(&iter, length);
+	/* Operate on a partial iter trimmed to the extent we were called for */
+	iov_iter_truncate(dio->submit.iter, length);
 
-	nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
-	if (nr_pages <= 0)
+	nr_pages = iov_iter_npages(dio->submit.iter, BIO_MAX_PAGES);
+	if (nr_pages <= 0) {
+		iov_iter_reexpand(dio->submit.iter, orig_count);
 		return nr_pages;
+	}
 
 	if (need_zeroout) {
 		/* zero out from the start of the block to the write offset */
@@ -257,6 +255,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		size_t n;
 		if (dio->error) {
 			iov_iter_revert(dio->submit.iter, copied);
+			iov_iter_reexpand(dio->submit.iter, orig_count);
 			return 0;
 		}
 
@@ -268,7 +267,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		bio->bi_private = dio;
 		bio->bi_end_io = iomap_dio_bio_end_io;
 
-		ret = bio_iov_iter_get_pages(bio, &iter);
+		ret = bio_iov_iter_get_pages(bio, dio->submit.iter);
 		if (unlikely(ret)) {
 			/*
 			 * We have to stop part way through an IO. We must fall
@@ -294,13 +293,11 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 				bio_set_pages_dirty(bio);
 		}
 
-		iov_iter_advance(dio->submit.iter, n);
-
 		dio->size += n;
 		pos += n;
 		copied += n;
 
-		nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
+		nr_pages = iov_iter_npages(dio->submit.iter, BIO_MAX_PAGES);
 		iomap_dio_submit_bio(dio, iomap, bio);
 	} while (nr_pages);
 
@@ -318,6 +315,8 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		if (pad)
 			iomap_dio_zero(dio, iomap, pos, fs_block_size - pad);
 	}
+	/* Undo iter limitation to current extent */
+	iov_iter_reexpand(dio->submit.iter, orig_count - copied);
 	return copied ? copied : ret;
 }
 
-- 
2.16.4



* Re: [PATCH 0/2] iomap: Fix leakage of pipe pages while splicing
  2019-11-21 16:15 [PATCH 0/2] iomap: Fix leakage of pipe pages while splicing Jan Kara
  2019-11-21 16:15 ` [PATCH 1/2] iomap: Fix pipe page leakage during splicing Jan Kara
  2019-11-21 16:15 ` [PATCH 2/2] iomap: Do not create fake iter in iomap_dio_bio_actor() Jan Kara
@ 2019-11-21 16:58 ` Darrick J. Wong
  2019-11-21 17:15   ` Jan Kara
  2 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2019-11-21 16:58 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Christoph Hellwig, Matthew Bobrowski, Eric Biggers

On Thu, Nov 21, 2019 at 05:15:33PM +0100, Jan Kara wrote:
> Hello,
> 
> here is a fix and a cleanup for iomap code. The first patch fixes a leakage
> of pipe pages when iomap_dio_rw() splices to a pipe, the second patch is
> a cleanup that removes strange copying of iter in iomap_dio_bio_actor(). Patches
> have passed fstests for ext4 and xfs and fix the syzkaller reproducer for
> me.

Will have a look, but in the meantime -- do you have a quick reproducer
that can be packaged for fstests?  Or is it just the syzbot reproducer?

--D

> 
> 								Honza


* Re: [PATCH 0/2] iomap: Fix leakage of pipe pages while splicing
  2019-11-21 16:58 ` [PATCH 0/2] iomap: Fix leakage of pipe pages while splicing Darrick J. Wong
@ 2019-11-21 17:15   ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2019-11-21 17:15 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Jan Kara, linux-fsdevel, Christoph Hellwig, Matthew Bobrowski,
	Eric Biggers

On Thu 21-11-19 08:58:29, Darrick J. Wong wrote:
> On Thu, Nov 21, 2019 at 05:15:33PM +0100, Jan Kara wrote:
> > Hello,
> > 
> > here is a fix and a cleanup for iomap code. The first patch fixes a leakage
> > of pipe pages when iomap_dio_rw() splices to a pipe, the second patch is
> > a cleanup that removes strange copying of iter in iomap_dio_bio_actor(). Patches
> > have passed fstests for ext4 and xfs and fix the syzkaller reproducer for
> > me.
> 
> Will have a look, but in the meantime -- do you have a quick reproducer
> that can be packaged for fstests?  Or is it just the syzbot reproducer?

I have just the syzkaller reproducer but now that I understand the problem 
I might be able to write something more readable... I'll try.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 1/2] iomap: Fix pipe page leakage during splicing
  2019-11-21 16:15 ` [PATCH 1/2] iomap: Fix pipe page leakage during splicing Jan Kara
@ 2019-11-21 23:55   ` Darrick J. Wong
  2019-11-22  6:04     ` Matthew Bobrowski
  2019-11-22 10:47     ` Jan Kara
  2019-11-22 13:17   ` Christoph Hellwig
  1 sibling, 2 replies; 13+ messages in thread
From: Darrick J. Wong @ 2019-11-21 23:55 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Christoph Hellwig, Matthew Bobrowski,
	Eric Biggers, stable

On Thu, Nov 21, 2019 at 05:15:34PM +0100, Jan Kara wrote:
> When splicing using iomap_dio_rw() to a pipe, we may leak pipe pages
> because bio_iov_iter_get_pages() records that the pipe will have full
> extent worth of data however if file size is not block size aligned
> iomap_dio_rw() returns less than what bio_iov_iter_get_pages() set up
> and splice code gets confused leaking a pipe page with the file tail.
> 
> Handle the situation similarly to the old direct IO implementation and
> revert iter to actually returned read amount which makes iter consistent
> with value returned from iomap_dio_rw() and thus the splice code is
> happy.
> 
> Fixes: ff6a9292e6f6 ("iomap: implement direct I/O")
> CC: stable@vger.kernel.org
> Reported-by: syzbot+991400e8eba7e00a26e1@syzkaller.appspotmail.com
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/iomap/direct-io.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 1fc28c2da279..30189652c560 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -497,8 +497,15 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  		}
>  		pos += ret;
>  
> -		if (iov_iter_rw(iter) == READ && pos >= dio->i_size)
> +		if (iov_iter_rw(iter) == READ && pos >= dio->i_size) {
> +			/*
> +			 * We will report we've read data only upto i_size.

Nit: "up to"; will fix that on the way in.

> +			 * Revert iter to a state corresponding to that as
> +			 * some callers (such as splice code) rely on it.
> +			 */
> +			iov_iter_revert(iter, pos - dio->i_size);

Just to make sure I'm getting this right, iov_iter_revert walks the
iterator variables backwards through pipe buffers/bvec/iovec, which has
the effect of undoing whatever iterator walking we've just done.

In contrast, iov_iter_reexpand undoes a previous subtraction to
iov->count which was (presumably) done via iov_iter_truncate.

Or to put it another way, _revert walks the iteration pointer backwards,
whereas _truncate/_reexpand modify where the iteration ends.  Right?

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

>  			break;
> +		}
>  	} while ((count = iov_iter_count(iter)) > 0);
>  	blk_finish_plug(&plug);
>  
> -- 
> 2.16.4
> 


* Re: [PATCH 2/2] iomap: Do not create fake iter in iomap_dio_bio_actor()
  2019-11-21 16:15 ` [PATCH 2/2] iomap: Do not create fake iter in iomap_dio_bio_actor() Jan Kara
@ 2019-11-22  0:02   ` Darrick J. Wong
  2019-11-22 12:11     ` Jan Kara
  2019-11-22 13:26   ` Christoph Hellwig
  1 sibling, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2019-11-22  0:02 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Christoph Hellwig, Matthew Bobrowski, Eric Biggers

On Thu, Nov 21, 2019 at 05:15:35PM +0100, Jan Kara wrote:
> iomap_dio_bio_actor() copies iter to a local variable and then limits it
> to a file extent we have mapped. When IO is submitted,
> iomap_dio_bio_actor() advances the original iter while the copied iter
> is advanced inside bio_iov_iter_get_pages(). This logic is non-obvious
> especially because both iters still point to same shared structures
> (such as pipe info) so if iov_iter_advance() changes anything in the
> shared structure, this scheme breaks. Let's just truncate and reexpand
> the original iter as needed instead of playing games with copying iters
> and keeping them in sync.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/iomap/direct-io.c | 25 ++++++++++++-------------
>  1 file changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 30189652c560..01a4264bce37 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -201,12 +201,12 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  	unsigned int blkbits = blksize_bits(bdev_logical_block_size(iomap->bdev));
>  	unsigned int fs_block_size = i_blocksize(inode), pad;
>  	unsigned int align = iov_iter_alignment(dio->submit.iter);
> -	struct iov_iter iter;
>  	struct bio *bio;
>  	bool need_zeroout = false;
>  	bool use_fua = false;
>  	int nr_pages, ret = 0;
>  	size_t copied = 0;
> +	size_t orig_count = iov_iter_count(dio->submit.iter);
>  
>  	if ((pos | length | align) & ((1 << blkbits) - 1))
>  		return -EINVAL;
> @@ -235,16 +235,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  			use_fua = true;
>  	}
>  
> -	/*
> -	 * Operate on a partial iter trimmed to the extent we were called for.
> -	 * We'll update the iter in the dio once we're done with this extent.
> -	 */
> -	iter = *dio->submit.iter;
> -	iov_iter_truncate(&iter, length);
> +	/* Operate on a partial iter trimmed to the extent we were called for */
> +	iov_iter_truncate(dio->submit.iter, length);

Ok... so here we shorten the dio iterator to fit the mapping we got...

>  
> -	nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
> -	if (nr_pages <= 0)
> +	nr_pages = iov_iter_npages(dio->submit.iter, BIO_MAX_PAGES);
> +	if (nr_pages <= 0) {
> +		iov_iter_reexpand(dio->submit.iter, orig_count);
>  		return nr_pages;

...and if there aren't any pages, we revert the truncation and bail...

> +	}
>  
>  	if (need_zeroout) {
>  		/* zero out from the start of the block to the write offset */
> @@ -257,6 +255,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  		size_t n;
>  		if (dio->error) {
>  			iov_iter_revert(dio->submit.iter, copied);
> +			iov_iter_reexpand(dio->submit.iter, orig_count);

...if the bio failed, we walk the dio iterator backward the entire
amount that it had advanced, undo the length truncation and bail...

>  			return 0;
>  		}
>  
> @@ -268,7 +267,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  		bio->bi_private = dio;
>  		bio->bi_end_io = iomap_dio_bio_end_io;
>  
> -		ret = bio_iov_iter_get_pages(bio, &iter);
> +		ret = bio_iov_iter_get_pages(bio, dio->submit.iter);

...here's where we walk the dio iter forward as part of attaching pages
to the bio...

>  		if (unlikely(ret)) {
>  			/*
>  			 * We have to stop part way through an IO. We must fall
> @@ -294,13 +293,11 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  				bio_set_pages_dirty(bio);
>  		}
>  
> -		iov_iter_advance(dio->submit.iter, n);
> -
>  		dio->size += n;
>  		pos += n;
>  		copied += n;
>  
> -		nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
> +		nr_pages = iov_iter_npages(dio->submit.iter, BIO_MAX_PAGES);
>  		iomap_dio_submit_bio(dio, iomap, bio);
>  	} while (nr_pages);
>  
> @@ -318,6 +315,8 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  		if (pad)
>  			iomap_dio_zero(dio, iomap, pos, fs_block_size - pad);
>  	}
> +	/* Undo iter limitation to current extent */
> +	iov_iter_reexpand(dio->submit.iter, orig_count - copied);

...and here we undo the length truncation, same as all the other exit
points.  Assuming my understanding of the bookkeeping is correct,

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

(Would still like to see a proper regression test for fstests though...)

--D


>  	return copied ? copied : ret;
>  }
>  
> -- 
> 2.16.4
> 


* Re: [PATCH 1/2] iomap: Fix pipe page leakage during splicing
  2019-11-21 23:55   ` Darrick J. Wong
@ 2019-11-22  6:04     ` Matthew Bobrowski
  2019-11-22 10:47     ` Jan Kara
  1 sibling, 0 replies; 13+ messages in thread
From: Matthew Bobrowski @ 2019-11-22  6:04 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Jan Kara, linux-fsdevel, Christoph Hellwig, Eric Biggers, stable

On Thu, Nov 21, 2019 at 03:55:28PM -0800, Darrick J. Wong wrote:
> On Thu, Nov 21, 2019 at 05:15:34PM +0100, Jan Kara wrote:
> > @@ -497,8 +497,15 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> >  		}
> >  		pos += ret;
> >  
> > -		if (iov_iter_rw(iter) == READ && pos >= dio->i_size)
> > +		if (iov_iter_rw(iter) == READ && pos >= dio->i_size) {
> > +			/*
> > +			 * We will report we've read data only upto i_size.
> 
> Nit: "up to"; will fix that on the way in.

A nit of a nit: "We will report that we've read..."; I think it reads
better, so might as well update it if you're already fixing the other
nit up as you're pulling this in. :P

/M


* Re: [PATCH 1/2] iomap: Fix pipe page leakage during splicing
  2019-11-21 23:55   ` Darrick J. Wong
  2019-11-22  6:04     ` Matthew Bobrowski
@ 2019-11-22 10:47     ` Jan Kara
  1 sibling, 0 replies; 13+ messages in thread
From: Jan Kara @ 2019-11-22 10:47 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Jan Kara, linux-fsdevel, Christoph Hellwig, Matthew Bobrowski,
	Eric Biggers, stable

On Thu 21-11-19 15:55:28, Darrick J. Wong wrote:
> On Thu, Nov 21, 2019 at 05:15:34PM +0100, Jan Kara wrote:
> > When splicing using iomap_dio_rw() to a pipe, we may leak pipe pages
> > because bio_iov_iter_get_pages() records that the pipe will have full
> > extent worth of data however if file size is not block size aligned
> > iomap_dio_rw() returns less than what bio_iov_iter_get_pages() set up
> > and splice code gets confused leaking a pipe page with the file tail.
> > 
> > Handle the situation similarly to the old direct IO implementation and
> > revert iter to actually returned read amount which makes iter consistent
> > with value returned from iomap_dio_rw() and thus the splice code is
> > happy.
> > 
> > Fixes: ff6a9292e6f6 ("iomap: implement direct I/O")
> > CC: stable@vger.kernel.org
> > Reported-by: syzbot+991400e8eba7e00a26e1@syzkaller.appspotmail.com
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/iomap/direct-io.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> > index 1fc28c2da279..30189652c560 100644
> > --- a/fs/iomap/direct-io.c
> > +++ b/fs/iomap/direct-io.c
> > @@ -497,8 +497,15 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> >  		}
> >  		pos += ret;
> >  
> > -		if (iov_iter_rw(iter) == READ && pos >= dio->i_size)
> > +		if (iov_iter_rw(iter) == READ && pos >= dio->i_size) {
> > +			/*
> > +			 * We will report we've read data only upto i_size.
> 
> Nit: "up to"; will fix that on the way in.
> 
> > +			 * Revert iter to a state corresponding to that as
> > +			 * some callers (such as splice code) rely on it.
> > +			 */
> > +			iov_iter_revert(iter, pos - dio->i_size);
> 
> Just to make sure I'm getting this right, iov_iter_revert walks the
> iterator variables backwards through pipe buffers/bvec/iovec, which has
> the effect of undoing whatever iterator walking we've just done.
> 
> In contrast, iov_iter_reexpand undoes a previous subtraction to
> iov->count which was (presumably) done via iov_iter_truncate.
> 
> Or to put it another way, _revert walks the iteration pointer backwards,
> whereas _truncate/_reexpand modify where the iteration ends.  Right?

Correct.
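
The distinction can be sketched with a toy user-space iterator. Everything
named toy_* below is hypothetical and only models the bookkeeping discussed
above (a position plus a remaining count); the real struct iov_iter carries
considerably more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an iterator: a position and a remaining count. */
struct toy_iter {
	size_t off;	/* how far the iteration has walked */
	size_t count;	/* bytes left before the iteration ends */
};

/* Walk forward: the position moves and the remaining count shrinks. */
void toy_advance(struct toy_iter *i, size_t n)
{
	assert(n <= i->count);
	i->off += n;
	i->count -= n;
}

/* Walk backward: undoes a previous advance of n bytes. */
void toy_revert(struct toy_iter *i, size_t n)
{
	assert(n <= i->off);
	i->off -= n;
	i->count += n;
}

/* Move the end of the iteration closer; the position is untouched. */
void toy_truncate(struct toy_iter *i, size_t max)
{
	if (i->count > max)
		i->count = max;
}

/* Restore a previously saved end; the position is untouched. */
void toy_reexpand(struct toy_iter *i, size_t count)
{
	i->count = count;
}
```

In this model, truncating a 100-byte iterator to 40, advancing by 30 and
re-expanding to the saved count minus 30 leaves the position at 30 with 70
bytes remaining; a revert of 30 then returns it to its initial state.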

> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 2/2] iomap: Do not create fake iter in iomap_dio_bio_actor()
  2019-11-22  0:02   ` Darrick J. Wong
@ 2019-11-22 12:11     ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2019-11-22 12:11 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Jan Kara, linux-fsdevel, Christoph Hellwig, Matthew Bobrowski,
	Eric Biggers

On Thu 21-11-19 16:02:28, Darrick J. Wong wrote:
> On Thu, Nov 21, 2019 at 05:15:35PM +0100, Jan Kara wrote:
> > iomap_dio_bio_actor() copies iter to a local variable and then limits it
> > to a file extent we have mapped. When IO is submitted,
> > iomap_dio_bio_actor() advances the original iter while the copied iter
> > is advanced inside bio_iov_iter_get_pages(). This logic is non-obvious
> > especially because both iters still point to same shared structures
> > (such as pipe info) so if iov_iter_advance() changes anything in the
> > shared structure, this scheme breaks. Let's just truncate and reexpand
> > the original iter as needed instead of playing games with copying iters
> > and keeping them in sync.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/iomap/direct-io.c | 25 ++++++++++++-------------
> >  1 file changed, 12 insertions(+), 13 deletions(-)
> > 
> > diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> > index 30189652c560..01a4264bce37 100644
> > --- a/fs/iomap/direct-io.c
> > +++ b/fs/iomap/direct-io.c
> > @@ -201,12 +201,12 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
> >  	unsigned int blkbits = blksize_bits(bdev_logical_block_size(iomap->bdev));
> >  	unsigned int fs_block_size = i_blocksize(inode), pad;
> >  	unsigned int align = iov_iter_alignment(dio->submit.iter);
> > -	struct iov_iter iter;
> >  	struct bio *bio;
> >  	bool need_zeroout = false;
> >  	bool use_fua = false;
> >  	int nr_pages, ret = 0;
> >  	size_t copied = 0;
> > +	size_t orig_count = iov_iter_count(dio->submit.iter);
> >  
> >  	if ((pos | length | align) & ((1 << blkbits) - 1))
> >  		return -EINVAL;
> > @@ -235,16 +235,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
> >  			use_fua = true;
> >  	}
> >  
> > -	/*
> > -	 * Operate on a partial iter trimmed to the extent we were called for.
> > -	 * We'll update the iter in the dio once we're done with this extent.
> > -	 */
> > -	iter = *dio->submit.iter;
> > -	iov_iter_truncate(&iter, length);
> > +	/* Operate on a partial iter trimmed to the extent we were called for */
> > +	iov_iter_truncate(dio->submit.iter, length);
> 
> Ok... so here we shorten the dio iterator to fit the mapping we got...
> 
> >  
> > -	nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
> > -	if (nr_pages <= 0)
> > +	nr_pages = iov_iter_npages(dio->submit.iter, BIO_MAX_PAGES);
> > +	if (nr_pages <= 0) {
> > +		iov_iter_reexpand(dio->submit.iter, orig_count);
> >  		return nr_pages;
> 
> ...and if there aren't any pages, we revert the truncation and bail...
> 
> > +	}
> >  
> >  	if (need_zeroout) {
> >  		/* zero out from the start of the block to the write offset */
> > @@ -257,6 +255,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
> >  		size_t n;
> >  		if (dio->error) {
> >  			iov_iter_revert(dio->submit.iter, copied);
> > +			iov_iter_reexpand(dio->submit.iter, orig_count);
> 
> ...if the bio failed, we walk the dio iterator backward the entire
> amount that it had advanced, undo the length truncation and bail...
> 
> >  			return 0;
> >  		}
> >  
> > @@ -268,7 +267,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
> >  		bio->bi_private = dio;
> >  		bio->bi_end_io = iomap_dio_bio_end_io;
> >  
> > -		ret = bio_iov_iter_get_pages(bio, &iter);
> > +		ret = bio_iov_iter_get_pages(bio, dio->submit.iter);
> 
> ...here's where we walk the dio iter forward as part of attaching pages
> to the bio...
> 
> >  		if (unlikely(ret)) {
> >  			/*
> >  			 * We have to stop part way through an IO. We must fall
> > @@ -294,13 +293,11 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
> >  				bio_set_pages_dirty(bio);
> >  		}
> >  
> > -		iov_iter_advance(dio->submit.iter, n);
> > -
> >  		dio->size += n;
> >  		pos += n;
> >  		copied += n;
> >  
> > -		nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
> > +		nr_pages = iov_iter_npages(dio->submit.iter, BIO_MAX_PAGES);
> >  		iomap_dio_submit_bio(dio, iomap, bio);
> >  	} while (nr_pages);
> >  
> > @@ -318,6 +315,8 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
> >  		if (pad)
> >  			iomap_dio_zero(dio, iomap, pos, fs_block_size - pad);
> >  	}
> > +	/* Undo iter limitation to current extent */
> > +	iov_iter_reexpand(dio->submit.iter, orig_count - copied);
> 
> ...and here we undo the length truncation, same as all the other exit
> points.  Assuming my understanding of the bookkeeping is correct,

Yes, it is correct (or at least the same as my understanding :).

> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> (Would still like to see a proper regression test for fstests though...)

So this patch does not fix any bug as such, it is just a cleanup. After
more digging in the iter code and what iov_iter_advance() does to pipe
iters I've convinced myself that the original code copying the iter is
actually correct. But to me it seems a lot safer to do the truncate /
reexpand of the original iter rather than rely on very fine details of the
implementation of individual iters (and then debug the breakage if one iter
type changes these details).
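
The fragility can be illustrated with a toy user-space model. It is only a
sketch of the general hazard of copying a structure that points at shared
state; struct toy_pipe, struct toy_iter and toy_advance() are hypothetical,
and the model makes no claim about what the kernel's pipe iterators do (the
original kernel code was found to be correct, as noted above).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; these are not kernel types. */
struct toy_pipe {
	size_t bytes_queued;	/* state shared by every iterator on the pipe */
};

struct toy_iter {
	struct toy_pipe *pipe;	/* copied by value, but the target is shared */
	size_t count;		/* private to each copy */
};

/* Advance an iterator and, as a side effect, update the shared pipe state. */
void toy_advance(struct toy_iter *i, size_t n)
{
	assert(n <= i->count);
	i->count -= n;
	i->pipe->bytes_queued += n;
}
```

After `struct toy_iter copy = orig;` and `toy_advance(&copy, 30)`, the
original still has its full count, yet the pipe it points at already shows 30
bytes queued. Keeping the two views consistent then depends on exactly which
fields an advance touches.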

WRT regression test for the first patch, I'll work on that.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 1/2] iomap: Fix pipe page leakage during splicing
  2019-11-21 16:15 ` [PATCH 1/2] iomap: Fix pipe page leakage during splicing Jan Kara
  2019-11-21 23:55   ` Darrick J. Wong
@ 2019-11-22 13:17   ` Christoph Hellwig
  1 sibling, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2019-11-22 13:17 UTC (permalink / raw)
  To: Jan Kara
  Cc: Darrick J. Wong, linux-fsdevel, Christoph Hellwig,
	Matthew Bobrowski, Eric Biggers, stable

Looks good modulo the spelling critique:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 2/2] iomap: Do not create fake iter in iomap_dio_bio_actor()
  2019-11-21 16:15 ` [PATCH 2/2] iomap: Do not create fake iter in iomap_dio_bio_actor() Jan Kara
  2019-11-22  0:02   ` Darrick J. Wong
@ 2019-11-22 13:26   ` Christoph Hellwig
  2019-11-25  8:18     ` Jan Kara
  1 sibling, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2019-11-22 13:26 UTC (permalink / raw)
  To: Jan Kara
  Cc: Darrick J. Wong, linux-fsdevel, Christoph Hellwig,
	Matthew Bobrowski, Eric Biggers

> -	/*
> -	 * Operate on a partial iter trimmed to the extent we were called for.
> -	 * We'll update the iter in the dio once we're done with this extent.
> -	 */
> -	iter = *dio->submit.iter;
> -	iov_iter_truncate(&iter, length);
> +	/* Operate on a partial iter trimmed to the extent we were called for */
> +	iov_iter_truncate(dio->submit.iter, length);

I think the comment could be kept a little more verbose given that the
scheme isn't exactly obvious.  Also I'd move the initialization of
orig_count here to keep it all together.  E.g.

	/*
	 * Save the original count and trim the iter to just the extent we
	 * are operating on right now.  The iter will be re-expanded once
	 * we are done.
	 */
	orig_count = iov_iter_count(dio->submit.iter);
	iov_iter_truncate(dio->submit.iter, length);

>  
> -	nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
> -	if (nr_pages <= 0)
> +	nr_pages = iov_iter_npages(dio->submit.iter, BIO_MAX_PAGES);
> +	if (nr_pages <= 0) {
> +		iov_iter_reexpand(dio->submit.iter, orig_count);
>  		return nr_pages;
> +	}

Can we stick to a single iov_iter_reexpand call?  E.g. turn this into

	if (nr_pages <= 0) {
		ret = nr_pages;
		goto out;
	}

and then have the out label at the very end call iov_iter_reexpand.

>  			iomap_dio_zero(dio, iomap, pos, fs_block_size - pad);
>  	}
> +	/* Undo iter limitation to current extent */
> +	iov_iter_reexpand(dio->submit.iter, orig_count - copied);
>  	return copied ? copied : ret;

In iomap-for-next this is:

	if (copied)
		return copied;
	return ret;

so please rebase to iomap-for-next for the next spin.
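
The single-exit shape suggested above can be sketched in user space as
follows. This is a toy model under stated assumptions: toy_actor() and struct
toy_iter are hypothetical, the `fail_early` flag stands in for the
"nr_pages <= 0" case, and `chunk` stands in for the amount one bio covers.
It is not the respun patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an iterator: a position and a remaining count. */
struct toy_iter {
	size_t off;
	size_t count;
};

/* Process at most `length` bytes of the iterator in pieces of `chunk`. */
long toy_actor(struct toy_iter *iter, size_t length, size_t chunk,
	       int fail_early)
{
	size_t orig_count, copied = 0;
	long ret = 0;

	assert(chunk > 0);

	/* Save the original count and trim the iter to this extent. */
	orig_count = iter->count;
	if (iter->count > length)
		iter->count = length;

	if (fail_early) {
		ret = -1;
		goto out;
	}

	while (iter->count > 0) {
		size_t n = iter->count < chunk ? iter->count : chunk;

		iter->off += n;
		iter->count -= n;
		copied += n;
	}
out:
	/* The only place where the trimming is undone. */
	iter->count = orig_count - copied;
	if (copied)
		return (long)copied;
	return ret;
}
```

Because every exit path goes through the `out` label, the count is restored
exactly once, whether or not any bytes were processed.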


* Re: [PATCH 2/2] iomap: Do not create fake iter in iomap_dio_bio_actor()
  2019-11-22 13:26   ` Christoph Hellwig
@ 2019-11-25  8:18     ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2019-11-25  8:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Darrick J. Wong, linux-fsdevel, Matthew Bobrowski,
	Eric Biggers

On Fri 22-11-19 05:26:58, Christoph Hellwig wrote:
> > -	/*
> > -	 * Operate on a partial iter trimmed to the extent we were called for.
> > -	 * We'll update the iter in the dio once we're done with this extent.
> > -	 */
> > -	iter = *dio->submit.iter;
> > -	iov_iter_truncate(&iter, length);
> > +	/* Operate on a partial iter trimmed to the extent we were called for */
> > +	iov_iter_truncate(dio->submit.iter, length);
> 
> I think the comment could be kept a little more verbose given that the
> scheme isn't exactly obvious.  Also I'd move the initialization of
> orig_count here to keep it all together.  E.g.
> 
> 	/*
> 	 * Save the original count and trim the iter to just the extent we
> 	 * are operating on right now.  The iter will be re-expanded once
> 	 * we are done.
> 	 */
> 	orig_count = iov_iter_count(dio->submit.iter);
> 	iov_iter_truncate(dio->submit.iter, length);
> 
> >  
> > -	nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
> > -	if (nr_pages <= 0)
> > +	nr_pages = iov_iter_npages(dio->submit.iter, BIO_MAX_PAGES);
> > +	if (nr_pages <= 0) {
> > +		iov_iter_reexpand(dio->submit.iter, orig_count);
> >  		return nr_pages;
> > +	}
> 
> Can we stick to a single iov_iter_reexpand call?  E.g. turn this into
> 
> 	if (nr_pages <= 0) {
> 		ret = nr_pages;
> 		goto out;
> 	}
> 
> and then have the out label at the very end call iov_iter_reexpand.
> 
> >  			iomap_dio_zero(dio, iomap, pos, fs_block_size - pad);
> >  	}
> > +	/* Undo iter limitation to current extent */
> > +	iov_iter_reexpand(dio->submit.iter, orig_count - copied);
> >  	return copied ? copied : ret;
> 
> In iomap-for-next this is:
> 
> 	if (copied)
> 		return copied;
> 	return ret;
> 
> so please rebase to iomap-for-next for the next spin.

OK, I can see Darrick has already picked up the first patch, so I'll just
respin this second one with the updates you've asked for. Thanks for the
review!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

