* [PATCH] block: hold ->invalidate_lock in blkdev_fallocate
@ 2021-09-15 12:35 Ming Lei
2021-09-15 14:41 ` kernel test robot
2021-09-16 14:16 ` Jan Kara
0 siblings, 2 replies; 4+ messages in thread
From: Ming Lei @ 2021-09-15 12:35 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, Ming Lei, Jan Kara
When running ->fallocate(), blkdev_fallocate() should hold
mapping->invalidate_lock to prevent page cache from being
accessed, otherwise stale data may be read in page cache.
Without this patch, blktests block/009 fails sometimes. With
this patch, block/009 can pass always.
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
fs/block_dev.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 45df6cbccf12..f55e14ae89a0 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1516,7 +1516,8 @@ static const struct address_space_operations def_blk_aops = {
static long blkdev_fallocate(struct file *file, int mode, loff_t start,
loff_t len)
{
- struct block_device *bdev = I_BDEV(bdev_file_inode(file));
+ struct inode *inode = bdev_file_inode(file);
+ struct block_device *bdev = I_BDEV(inode);
loff_t end = start + len - 1;
loff_t isize;
int error;
@@ -1543,10 +1544,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
if ((start | len) & (bdev_logical_block_size(bdev) - 1))
return -EINVAL;
+ filemap_invalidate_lock(inode->i_mapping);
+
/* Invalidate the page cache, including dirty pages. */
error = truncate_bdev_range(bdev, file->f_mode, start, end);
if (error)
- return error;
+ goto fail;
switch (mode) {
case FALLOC_FL_ZERO_RANGE:
@@ -1563,17 +1566,18 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
GFP_KERNEL, 0);
break;
default:
- return -EOPNOTSUPP;
+ error = -EOPNOTSUPP;
}
- if (error)
- return error;
-
/*
* Invalidate the page cache again; if someone wandered in and dirtied
* a page, we just discard it - userspace has no way of knowing whether
* the write happened before or after discard completing...
*/
- return truncate_bdev_range(bdev, file->f_mode, start, end);
+ if (!error)
+ error = truncate_bdev_range(bdev, file->f_mode, start, end);
+ fail:
+ filemap_invalidate_unlock(inode->i_mapping);
+ return error;
}
const struct file_operations def_blk_fops = {
--
2.31.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] block: hold ->invalidate_lock in blkdev_fallocate
2021-09-15 12:35 [PATCH] block: hold ->invalidate_lock in blkdev_fallocate Ming Lei
@ 2021-09-15 14:41 ` kernel test robot
2021-09-16 14:16 ` Jan Kara
1 sibling, 0 replies; 4+ messages in thread
From: kernel test robot @ 2021-09-15 14:41 UTC (permalink / raw)
To: Ming Lei, Jens Axboe; +Cc: kbuild-all, linux-block, Ming Lei, Jan Kara
[-- Attachment #1: Type: text/plain, Size: 4665 bytes --]
Hi Ming,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on v5.14-rc7]
[cannot apply to v5.15-rc1 next-20210915]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Ming-Lei/block-hold-invalidate_lock-in-blkdev_fallocate/20210915-203759
base: e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93
config: nds32-defconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/802c451a54927267d87a707d2a5fe3e47bc4ca1c
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Ming-Lei/block-hold-invalidate_lock-in-blkdev_fallocate/20210915-203759
git checkout 802c451a54927267d87a707d2a5fe3e47bc4ca1c
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=nds32
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
fs/block_dev.c: In function 'blkdev_fallocate':
>> fs/block_dev.c:1730:9: error: implicit declaration of function 'filemap_invalidate_lock' [-Werror=implicit-function-declaration]
1730 | filemap_invalidate_lock(inode->i_mapping);
| ^~~~~~~~~~~~~~~~~~~~~~~
>> fs/block_dev.c:1762:9: error: implicit declaration of function 'filemap_invalidate_unlock' [-Werror=implicit-function-declaration]
1762 | filemap_invalidate_unlock(inode->i_mapping);
| ^~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +/filemap_invalidate_lock +1730 fs/block_dev.c
1694
1695 #define BLKDEV_FALLOC_FL_SUPPORTED \
1696 (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE | \
1697 FALLOC_FL_ZERO_RANGE | FALLOC_FL_NO_HIDE_STALE)
1698
1699 static long blkdev_fallocate(struct file *file, int mode, loff_t start,
1700 loff_t len)
1701 {
1702 struct inode *inode = bdev_file_inode(file);
1703 struct block_device *bdev = I_BDEV(inode);
1704 loff_t end = start + len - 1;
1705 loff_t isize;
1706 int error;
1707
1708 /* Fail if we don't recognize the flags. */
1709 if (mode & ~BLKDEV_FALLOC_FL_SUPPORTED)
1710 return -EOPNOTSUPP;
1711
1712 /* Don't go off the end of the device. */
1713 isize = i_size_read(bdev->bd_inode);
1714 if (start >= isize)
1715 return -EINVAL;
1716 if (end >= isize) {
1717 if (mode & FALLOC_FL_KEEP_SIZE) {
1718 len = isize - start;
1719 end = start + len - 1;
1720 } else
1721 return -EINVAL;
1722 }
1723
1724 /*
1725 * Don't allow IO that isn't aligned to logical block size.
1726 */
1727 if ((start | len) & (bdev_logical_block_size(bdev) - 1))
1728 return -EINVAL;
1729
> 1730 filemap_invalidate_lock(inode->i_mapping);
1731
1732 /* Invalidate the page cache, including dirty pages. */
1733 error = truncate_bdev_range(bdev, file->f_mode, start, end);
1734 if (error)
1735 goto fail;
1736
1737 switch (mode) {
1738 case FALLOC_FL_ZERO_RANGE:
1739 case FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE:
1740 error = blkdev_issue_zeroout(bdev, start >> 9, len >> 9,
1741 GFP_KERNEL, BLKDEV_ZERO_NOUNMAP);
1742 break;
1743 case FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE:
1744 error = blkdev_issue_zeroout(bdev, start >> 9, len >> 9,
1745 GFP_KERNEL, BLKDEV_ZERO_NOFALLBACK);
1746 break;
1747 case FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE | FALLOC_FL_NO_HIDE_STALE:
1748 error = blkdev_issue_discard(bdev, start >> 9, len >> 9,
1749 GFP_KERNEL, 0);
1750 break;
1751 default:
1752 error = -EOPNOTSUPP;
1753 }
1754 /*
1755 * Invalidate the page cache again; if someone wandered in and dirtied
1756 * a page, we just discard it - userspace has no way of knowing whether
1757 * the write happened before or after discard completing...
1758 */
1759 if (!error)
1760 error = truncate_bdev_range(bdev, file->f_mode, start, end);
1761 fail:
> 1762 filemap_invalidate_unlock(inode->i_mapping);
1763 return error;
1764 }
1765
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 10911 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] block: hold ->invalidate_lock in blkdev_fallocate
2021-09-15 12:35 [PATCH] block: hold ->invalidate_lock in blkdev_fallocate Ming Lei
2021-09-15 14:41 ` kernel test robot
@ 2021-09-16 14:16 ` Jan Kara
2021-09-23 1:51 ` Ming Lei
1 sibling, 1 reply; 4+ messages in thread
From: Jan Kara @ 2021-09-16 14:16 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, Jan Kara
On Wed 15-09-21 20:35:45, Ming Lei wrote:
> When running ->fallocate(), blkdev_fallocate() should hold
> mapping->invalidate_lock to prevent page cache from being
> accessed, otherwise stale data may be read in page cache.
>
> Without this patch, blktests block/009 fails sometimes. With
> this patch, block/009 can pass always.
>
> Cc: Jan Kara <jack@suse.cz>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
Interesting. Looking at block/009 testcase I'm somewhat struggling to see
how it could trigger a situation where stale data would be seen. Do you
have some independent read accesses to the block device while the testcase
is running? That would explain it and yes, this patch should fix that.
BTW, you'd need to rebase this against current Linus' tree - Christoph has
moved this code to block/...
Also with the invalidate_lock held, there's no need for the second
truncate_bdev_range() call anymore. No pages can be created in the
discarded area while you are holding the invalidate_lock.
Honza
> ---
> fs/block_dev.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 45df6cbccf12..f55e14ae89a0 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -1516,7 +1516,8 @@ static const struct address_space_operations def_blk_aops = {
> static long blkdev_fallocate(struct file *file, int mode, loff_t start,
> loff_t len)
> {
> - struct block_device *bdev = I_BDEV(bdev_file_inode(file));
> + struct inode *inode = bdev_file_inode(file);
> + struct block_device *bdev = I_BDEV(inode);
> loff_t end = start + len - 1;
> loff_t isize;
> int error;
> @@ -1543,10 +1544,12 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
> if ((start | len) & (bdev_logical_block_size(bdev) - 1))
> return -EINVAL;
>
> + filemap_invalidate_lock(inode->i_mapping);
> +
> /* Invalidate the page cache, including dirty pages. */
> error = truncate_bdev_range(bdev, file->f_mode, start, end);
> if (error)
> - return error;
> + goto fail;
>
> switch (mode) {
> case FALLOC_FL_ZERO_RANGE:
> @@ -1563,17 +1566,18 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
> GFP_KERNEL, 0);
> break;
> default:
> - return -EOPNOTSUPP;
> + error = -EOPNOTSUPP;
> }
> - if (error)
> - return error;
> -
> /*
> * Invalidate the page cache again; if someone wandered in and dirtied
> * a page, we just discard it - userspace has no way of knowing whether
> * the write happened before or after discard completing...
> */
> - return truncate_bdev_range(bdev, file->f_mode, start, end);
> + if (!error)
> + error = truncate_bdev_range(bdev, file->f_mode, start, end);
> + fail:
> + filemap_invalidate_unlock(inode->i_mapping);
> + return error;
> }
>
> const struct file_operations def_blk_fops = {
> --
> 2.31.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] block: hold ->invalidate_lock in blkdev_fallocate
2021-09-16 14:16 ` Jan Kara
@ 2021-09-23 1:51 ` Ming Lei
0 siblings, 0 replies; 4+ messages in thread
From: Ming Lei @ 2021-09-23 1:51 UTC (permalink / raw)
To: Jan Kara; +Cc: Jens Axboe, linux-block
On Thu, Sep 16, 2021 at 04:16:55PM +0200, Jan Kara wrote:
> On Wed 15-09-21 20:35:45, Ming Lei wrote:
> > When running ->fallocate(), blkdev_fallocate() should hold
> > mapping->invalidate_lock to prevent page cache from being
> > accessed, otherwise stale data may be read in page cache.
> >
> > Without this patch, blktests block/009 fails sometimes. With
> > this patch, block/009 can pass always.
> >
> > Cc: Jan Kara <jack@suse.cz>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
>
> Interesting. Looking at block/009 testcase I'm somewhat struggling to see
> how it could trigger a situation where stale data would be seen. Do you
> have some independent read accesses to the block device while the testcase
> is running? That would explain it and yes, this patch should fix that.
Yeah, the issue can be reproduced quite easily if independent read is
run. However, sometimes it can be triggered without this read, since
systemd-udev or reading partition table may do that too.
>
> BTW, you'd need to rebase this against current Linus' tree - Christoph has
> moved this code to block/...
OK.
>
> Also with the invalidate_lock held, there's no need for the second
> truncate_bdev_range() call anymore. No pages can be created in the
> discarded area while you are holding the invalidate_lock.
Good catch, will remove the 2nd truncate_bdev_range() in V2.
Thanks,
Ming
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2021-09-23 1:51 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-15 12:35 [PATCH] block: hold ->invalidate_lock in blkdev_fallocate Ming Lei
2021-09-15 14:41 ` kernel test robot
2021-09-16 14:16 ` Jan Kara
2021-09-23 1:51 ` Ming Lei
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).