linux-block.vger.kernel.org archive mirror
* [PATCH V2] block: hold ->invalidate_lock in blkdev_fallocate
@ 2021-09-23  2:37 Ming Lei
  2021-09-23 13:31 ` Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Ming Lei @ 2021-09-23  2:37 UTC
  To: Jens Axboe; +Cc: linux-block, Ming Lei, Jan Kara

When running ->fallocate(), blkdev_fallocate() should hold
mapping->invalidate_lock to prevent the page cache from being accessed;
otherwise stale data may be read from the page cache.

Without this patch, blktests block/009 fails intermittently. With this
patch, block/009 passes consistently.

Also, as Jan pointed out, no pages can be created in the discarded area
while the invalidate_lock is held, so remove the 2nd
truncate_bdev_range().

Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
V2:
	- include <linux/fs.h> to avoid an implicit declaration of
	  filemap_invalidate_lock()
	- remove the 2nd truncate_bdev_range(), as suggested by Jan

 block/fops.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index ffce6f6c68dd..1e970c247e0e 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -14,6 +14,7 @@
 #include <linux/task_io_accounting_ops.h>
 #include <linux/falloc.h>
 #include <linux/suspend.h>
+#include <linux/fs.h>
 #include "blk.h"
 
 static struct inode *bdev_file_inode(struct file *file)
@@ -553,7 +554,8 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
 static long blkdev_fallocate(struct file *file, int mode, loff_t start,
 			     loff_t len)
 {
-	struct block_device *bdev = I_BDEV(bdev_file_inode(file));
+	struct inode *inode = bdev_file_inode(file);
+	struct block_device *bdev = I_BDEV(inode);
 	loff_t end = start + len - 1;
 	loff_t isize;
 	int error;
@@ -580,10 +582,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
 	if ((start | len) & (bdev_logical_block_size(bdev) - 1))
 		return -EINVAL;
 
+	filemap_invalidate_lock(inode->i_mapping);
+
 	/* Invalidate the page cache, including dirty pages. */
 	error = truncate_bdev_range(bdev, file->f_mode, start, end);
 	if (error)
-		return error;
+		goto fail;
 
 	switch (mode) {
 	case FALLOC_FL_ZERO_RANGE:
@@ -600,17 +604,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
 					     GFP_KERNEL, 0);
 		break;
 	default:
-		return -EOPNOTSUPP;
+		error = -EOPNOTSUPP;
 	}
-	if (error)
-		return error;
 
-	/*
-	 * Invalidate the page cache again; if someone wandered in and dirtied
-	 * a page, we just discard it - userspace has no way of knowing whether
-	 * the write happened before or after discard completing...
-	 */
-	return truncate_bdev_range(bdev, file->f_mode, start, end);
+ fail:
+	filemap_invalidate_unlock(inode->i_mapping);
+	return error;
 }
 
 const struct file_operations def_blk_fops = {
-- 
2.31.1



* Re: [PATCH V2] block: hold ->invalidate_lock in blkdev_fallocate
  2021-09-23  2:37 [PATCH V2] block: hold ->invalidate_lock in blkdev_fallocate Ming Lei
@ 2021-09-23 13:31 ` Jan Kara
  2021-09-24 17:07 ` Jens Axboe
  2021-11-09 10:46 ` Shinichiro Kawasaki
  2 siblings, 0 replies; 4+ messages in thread
From: Jan Kara @ 2021-09-23 13:31 UTC
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Jan Kara

On Thu 23-09-21 10:37:51, Ming Lei wrote:
> When running ->fallocate(), blkdev_fallocate() should hold
> mapping->invalidate_lock to prevent the page cache from being accessed;
> otherwise stale data may be read from the page cache.
> 
> Without this patch, blktests block/009 fails intermittently. With this
> patch, block/009 passes consistently.
> 
> Also, as Jan pointed out, no pages can be created in the discarded area
> while the invalidate_lock is held, so remove the 2nd
> truncate_bdev_range().
> 
> Cc: Jan Kara <jack@suse.cz>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
> V2:
> 	- include <linux/fs.h> to avoid an implicit declaration of
> 	  filemap_invalidate_lock()
> 	- remove the 2nd truncate_bdev_range(), as suggested by Jan
> 
>  block/fops.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/block/fops.c b/block/fops.c
> index ffce6f6c68dd..1e970c247e0e 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -14,6 +14,7 @@
>  #include <linux/task_io_accounting_ops.h>
>  #include <linux/falloc.h>
>  #include <linux/suspend.h>
> +#include <linux/fs.h>
>  #include "blk.h"
>  
>  static struct inode *bdev_file_inode(struct file *file)
> @@ -553,7 +554,8 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
>  static long blkdev_fallocate(struct file *file, int mode, loff_t start,
>  			     loff_t len)
>  {
> -	struct block_device *bdev = I_BDEV(bdev_file_inode(file));
> +	struct inode *inode = bdev_file_inode(file);
> +	struct block_device *bdev = I_BDEV(inode);
>  	loff_t end = start + len - 1;
>  	loff_t isize;
>  	int error;
> @@ -580,10 +582,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
>  	if ((start | len) & (bdev_logical_block_size(bdev) - 1))
>  		return -EINVAL;
>  
> +	filemap_invalidate_lock(inode->i_mapping);
> +
>  	/* Invalidate the page cache, including dirty pages. */
>  	error = truncate_bdev_range(bdev, file->f_mode, start, end);
>  	if (error)
> -		return error;
> +		goto fail;
>  
>  	switch (mode) {
>  	case FALLOC_FL_ZERO_RANGE:
> @@ -600,17 +604,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
>  					     GFP_KERNEL, 0);
>  		break;
>  	default:
> -		return -EOPNOTSUPP;
> +		error = -EOPNOTSUPP;
>  	}
> -	if (error)
> -		return error;
>  
> -	/*
> -	 * Invalidate the page cache again; if someone wandered in and dirtied
> -	 * a page, we just discard it - userspace has no way of knowing whether
> -	 * the write happened before or after discard completing...
> -	 */
> -	return truncate_bdev_range(bdev, file->f_mode, start, end);
> + fail:
> +	filemap_invalidate_unlock(inode->i_mapping);
> +	return error;
>  }
>  
>  const struct file_operations def_blk_fops = {
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH V2] block: hold ->invalidate_lock in blkdev_fallocate
  2021-09-23  2:37 [PATCH V2] block: hold ->invalidate_lock in blkdev_fallocate Ming Lei
  2021-09-23 13:31 ` Jan Kara
@ 2021-09-24 17:07 ` Jens Axboe
  2021-11-09 10:46 ` Shinichiro Kawasaki
  2 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2021-09-24 17:07 UTC
  To: Ming Lei; +Cc: linux-block, Jan Kara

On 9/22/21 8:37 PM, Ming Lei wrote:
> When running ->fallocate(), blkdev_fallocate() should hold
> mapping->invalidate_lock to prevent the page cache from being accessed;
> otherwise stale data may be read from the page cache.
> 
> Without this patch, blktests block/009 fails intermittently. With this
> patch, block/009 passes consistently.
> 
> Also, as Jan pointed out, no pages can be created in the discarded area
> while the invalidate_lock is held, so remove the 2nd
> truncate_bdev_range().

Applied, thanks.

-- 
Jens Axboe



* Re: [PATCH V2] block: hold ->invalidate_lock in blkdev_fallocate
  2021-09-23  2:37 [PATCH V2] block: hold ->invalidate_lock in blkdev_fallocate Ming Lei
  2021-09-23 13:31 ` Jan Kara
  2021-09-24 17:07 ` Jens Axboe
@ 2021-11-09 10:46 ` Shinichiro Kawasaki
  2 siblings, 0 replies; 4+ messages in thread
From: Shinichiro Kawasaki @ 2021-11-09 10:46 UTC
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Jan Kara, Damien Le Moal

On Sep 23, 2021 / 10:37, Ming Lei wrote:
> When running ->fallocate(), blkdev_fallocate() should hold
> mapping->invalidate_lock to prevent the page cache from being accessed;
> otherwise stale data may be read from the page cache.
> 
> Without this patch, blktests block/009 fails intermittently. With this
> patch, block/009 passes consistently.
> 
> Also, as Jan pointed out, no pages can be created in the discarded area
> while the invalidate_lock is held, so remove the 2nd
> truncate_bdev_range().

Hello Ming, Jan, thanks for the fix.

Unfortunately, I still observe the block/009 failure on kernel version 5.15.0,
which includes this fix. I found that the BLKDISCARD ioctl has the same
problem: after I modified blk_ioctl_discard() in the same manner, the block/009
failure went away. I also found that BLKZEROOUT has the same issue. I will
post two patches for these ioctls. Your reviews will be appreciated.

-- 
Best Regards,
Shin'ichiro Kawasaki

