[PATCH 1/2] Btrfs: fix unexpected file hole after disk errors

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors
@ 2017-03-01  1:04 Liu Bo
  2017-03-01  1:04 ` [PATCH 2/2] Btrfs: remove start_pos Liu Bo
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Liu Bo @ 2017-03-01  1:04 UTC (permalink / raw)
  To: linux-btrfs; +Cc: David Sterba

Btrfs creates hole extents to cover any unwritten section right before
doing buffer writes after commit 3ac0d7b96a26 ("btrfs: Change the expanding
write sequence to fix snapshot related bug.").

However, that takes the start position of the buffered write to compare
against the current EOF, hole extents would be created only if (EOF <
start).

If the EOF is at the middle of the buffered write, no hole extents will be
created and a file hole without a hole extent is left in this file.

This bug was revealed by generic/019 in fstests.  'fsstress' in this test
may create the above situation and the test then fails all requests
including writes, so the buffer write which is supposed to cover the
hole (without the hole extent) couldn't make it on disk.  Running fsck
against such btrfs ends up with detecting file extent holes.

Things could be more serious, some stale data would be exposed to
userspace if files with this kind of hole are truncated to a position of
the hole, because the on-disk inode size is beyond the last extent in the
file.

This fixes the bug by comparing the end position against the EOF.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/file.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index b5c5da2..0be837b 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1861,11 +1861,10 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	pos = iocb->ki_pos;
 	count = iov_iter_count(from);
 	start_pos = round_down(pos, fs_info->sectorsize);
+	end_pos = round_up(pos + count, fs_info->sectorsize);
 	oldsize = i_size_read(inode);
-	if (start_pos > oldsize) {
+	if (end_pos > oldsize) {
 		/* Expand hole size to cover write data, preventing empty gap */
-		end_pos = round_up(pos + count,
-				   fs_info->sectorsize);
 		err = btrfs_cont_expand(inode, oldsize, end_pos);
 		if (err) {
 			inode_unlock(inode);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] Btrfs: remove start_pos
  2017-03-01  1:04 [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors Liu Bo
@ 2017-03-01  1:04 ` Liu Bo
  2017-03-01  8:48   ` Qu Wenruo
  2017-03-01  2:44 ` [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Liu Bo @ 2017-03-01  1:04 UTC (permalink / raw)
  To: linux-btrfs; +Cc: David Sterba

@pos, not aligned @start_pos, should be used to check whether the eof page
needs to be marked as readonly, thus @start_pos can be removed.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/file.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0be837b..ef88e6d 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1814,7 +1814,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	struct inode *inode = file_inode(file);
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
-	u64 start_pos;
 	u64 end_pos;
 	ssize_t num_written = 0;
 	bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
@@ -1822,7 +1821,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	loff_t pos;
 	size_t count;
 	loff_t oldsize;
-	int clean_page = 0;
 
 	inode_lock(inode);
 	err = generic_write_checks(iocb, from);
@@ -1860,7 +1858,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 
 	pos = iocb->ki_pos;
 	count = iov_iter_count(from);
-	start_pos = round_down(pos, fs_info->sectorsize);
 	end_pos = round_up(pos + count, fs_info->sectorsize);
 	oldsize = i_size_read(inode);
 	if (end_pos > oldsize) {
@@ -1870,8 +1867,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 			inode_unlock(inode);
 			goto out;
 		}
-		if (start_pos > round_up(oldsize, fs_info->sectorsize))
-			clean_page = 1;
 	}
 
 	if (sync)
@@ -1883,7 +1878,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 		num_written = __btrfs_buffered_write(file, from, pos);
 		if (num_written > 0)
 			iocb->ki_pos = pos + num_written;
-		if (clean_page)
+		if (oldsize < pos)
 			pagecache_isize_extended(inode, oldsize,
 						i_size_read(inode));
 	}
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors
  2017-03-01  1:04 [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors Liu Bo
  2017-03-01  1:04 ` [PATCH 2/2] Btrfs: remove start_pos Liu Bo
@ 2017-03-01  2:44 ` Qu Wenruo
  2017-03-06 19:27   ` Liu Bo
  2017-03-01  5:22 ` Qu Wenruo
  2017-03-06 20:23 ` [PATCH v2] " Liu Bo
  3 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2017-03-01  2:44 UTC (permalink / raw)
  To: Liu Bo, linux-btrfs; +Cc: David Sterba



At 03/01/2017 09:04 AM, Liu Bo wrote:
> Btrfs creates hole extents to cover any unwritten section right before
> doing buffer writes after commit 3ac0d7b96a26 ("btrfs: Change the expanding
> write sequence to fix snapshot related bug.").
>
> However, that takes the start position of the buffered write to compare
> against the current EOF, hole extents would be created only if (EOF <
> start).
>
> If the EOF is at the middle of the buffered write, no hole extents will be
> created and a file hole without a hole extent is left in this file.
>
> This bug was revealed by generic/019 in fstests.  'fsstress' in this test
> may create the above situation and the test then fails all requests
> including writes, so the buffer write which is supposed to cover the
> hole (without the hole extent) couldn't make it on disk.  Running fsck
> against such btrfs ends up with detecting file extent holes.
>
> Things could be more serious, some stale data would be exposed to
> userspace if files with this kind of hole are truncated to a position of
> the hole, because the on-disk inode size is beyond the last extent in the
> file.
>
> This fixes the bug by comparing the end position against the EOF.
>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Patch looks good to me.
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

> ---
>  fs/btrfs/file.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index b5c5da2..0be837b 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1861,11 +1861,10 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
>  	pos = iocb->ki_pos;
>  	count = iov_iter_count(from);
>  	start_pos = round_down(pos, fs_info->sectorsize);
> +	end_pos = round_up(pos + count, fs_info->sectorsize);
>  	oldsize = i_size_read(inode);
> -	if (start_pos > oldsize) {
> +	if (end_pos > oldsize) {
>  		/* Expand hole size to cover write data, preventing empty gap */

The comment still makes sense here, but it could be better to explain 
why to insert the hole to cover the whole write range (in case write fails)

Thanks,
Qu

> -		end_pos = round_up(pos + count,
> -				   fs_info->sectorsize);
>  		err = btrfs_cont_expand(inode, oldsize, end_pos);
>  		if (err) {
>  			inode_unlock(inode);
>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors
  2017-03-01  1:04 [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors Liu Bo
  2017-03-01  1:04 ` [PATCH 2/2] Btrfs: remove start_pos Liu Bo
  2017-03-01  2:44 ` [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors Qu Wenruo
@ 2017-03-01  5:22 ` Qu Wenruo
  2017-03-06 20:23 ` [PATCH v2] " Liu Bo
  3 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2017-03-01  5:22 UTC (permalink / raw)
  To: Liu Bo, linux-btrfs; +Cc: David Sterba

It seems that my previous mail doesn't reach mail list.
So send again.

At 03/01/2017 09:04 AM, Liu Bo wrote:
> Btrfs creates hole extents to cover any unwritten section right before
> doing buffer writes after commit 3ac0d7b96a26 ("btrfs: Change the expanding
> write sequence to fix snapshot related bug.").
>
> However, that takes the start position of the buffered write to compare
> against the current EOF, hole extents would be created only if (EOF <
> start).
>
> If the EOF is at the middle of the buffered write, no hole extents will be
> created and a file hole without a hole extent is left in this file.
>
> This bug was revealed by generic/019 in fstests.  'fsstress' in this test
> may create the above situation and the test then fails all requests
> including writes, so the buffer write which is supposed to cover the
> hole (without the hole extent) couldn't make it on disk.  Running fsck
> against such btrfs ends up with detecting file extent holes.
>
> Things could be more serious, some stale data would be exposed to
> userspace if files with this kind of hole are truncated to a position of
> the hole, because the on-disk inode size is beyond the last extent in the
> file.
>
> This fixes the bug by comparing the end position against the EOF.
>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Patch looks good to me.
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

> ---
>  fs/btrfs/file.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index b5c5da2..0be837b 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1861,11 +1861,10 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
>  	pos = iocb->ki_pos;
>  	count = iov_iter_count(from);
>  	start_pos = round_down(pos, fs_info->sectorsize);
> +	end_pos = round_up(pos + count, fs_info->sectorsize);
>  	oldsize = i_size_read(inode);
> -	if (start_pos > oldsize) {
> +	if (end_pos > oldsize) {
>  		/* Expand hole size to cover write data, preventing empty gap */

The comment still makes sense here, but it could be better to explain 
why to insert the hole to cover the whole write range (in case write fails)

Thanks,
Qu

> -		end_pos = round_up(pos + count,
> -				   fs_info->sectorsize);
>  		err = btrfs_cont_expand(inode, oldsize, end_pos);
>  		if (err) {
>  			inode_unlock(inode);
>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Btrfs: remove start_pos
  2017-03-01  1:04 ` [PATCH 2/2] Btrfs: remove start_pos Liu Bo
@ 2017-03-01  8:48   ` Qu Wenruo
  2017-03-06 21:09     ` Liu Bo
  0 siblings, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2017-03-01  8:48 UTC (permalink / raw)
  To: Liu Bo, linux-btrfs; +Cc: David Sterba



At 03/01/2017 09:04 AM, Liu Bo wrote:
> @pos, not aligned @start_pos, should be used to check whether the eof page
> needs to be marked as readonly, thus @start_pos can be removed.
>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  fs/btrfs/file.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 0be837b..ef88e6d 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1814,7 +1814,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
>  	struct inode *inode = file_inode(file);
>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>  	struct btrfs_root *root = BTRFS_I(inode)->root;
> -	u64 start_pos;
>  	u64 end_pos;
>  	ssize_t num_written = 0;
>  	bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
> @@ -1822,7 +1821,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
>  	loff_t pos;
>  	size_t count;
>  	loff_t oldsize;
> -	int clean_page = 0;
>
>  	inode_lock(inode);
>  	err = generic_write_checks(iocb, from);
> @@ -1860,7 +1858,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
>
>  	pos = iocb->ki_pos;
>  	count = iov_iter_count(from);
> -	start_pos = round_down(pos, fs_info->sectorsize);
>  	end_pos = round_up(pos + count, fs_info->sectorsize);
>  	oldsize = i_size_read(inode);
>  	if (end_pos > oldsize) {
> @@ -1870,8 +1867,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
>  			inode_unlock(inode);
>  			goto out;
>  		}
> -		if (start_pos > round_up(oldsize, fs_info->sectorsize))
> -			clean_page = 1;
>  	}
>
>  	if (sync)
> @@ -1883,7 +1878,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
>  		num_written = __btrfs_buffered_write(file, from, pos);
>  		if (num_written > 0)
>  			iocb->ki_pos = pos + num_written;
> -		if (clean_page)
> +		if (oldsize < pos)
>  			pagecache_isize_extended(inode, oldsize,
>  						i_size_read(inode));

Not familiar with page cache, so I can be totally wrong here.

But what will happen if @oldsize and @pos are in the same page?

For example:
Page start                                    Page start + 4K
|             |                |              |
               old size         pos

Do we still need to call pagecache_iszie_extented() since we will dirty 
that page anyway?

Thanks,
Qu
>  	}
>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors
  2017-03-01  2:44 ` [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors Qu Wenruo
@ 2017-03-06 19:27   ` Liu Bo
  0 siblings, 0 replies; 11+ messages in thread
From: Liu Bo @ 2017-03-06 19:27 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, David Sterba

On Wed, Mar 01, 2017 at 10:44:53AM +0800, Qu Wenruo wrote:
> 
> 
> At 03/01/2017 09:04 AM, Liu Bo wrote:
> > Btrfs creates hole extents to cover any unwritten section right before
> > doing buffer writes after commit 3ac0d7b96a26 ("btrfs: Change the expanding
> > write sequence to fix snapshot related bug.").
> > 
> > However, that takes the start position of the buffered write to compare
> > against the current EOF, hole extents would be created only if (EOF <
> > start).
> > 
> > If the EOF is at the middle of the buffered write, no hole extents will be
> > created and a file hole without a hole extent is left in this file.
> > 
> > This bug was revealed by generic/019 in fstests.  'fsstress' in this test
> > may create the above situation and the test then fails all requests
> > including writes, so the buffer write which is supposed to cover the
> > hole (without the hole extent) couldn't make it on disk.  Running fsck
> > against such btrfs ends up with detecting file extent holes.
> > 
> > Things could be more serious, some stale data would be exposed to
> > userspace if files with this kind of hole are truncated to a position of
> > the hole, because the on-disk inode size is beyond the last extent in the
> > file.
> > 
> > This fixes the bug by comparing the end position against the EOF.
> > 
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> 
> Patch looks good to me.
> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> 
> > ---
> >  fs/btrfs/file.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index b5c5da2..0be837b 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -1861,11 +1861,10 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
> >  	pos = iocb->ki_pos;
> >  	count = iov_iter_count(from);
> >  	start_pos = round_down(pos, fs_info->sectorsize);
> > +	end_pos = round_up(pos + count, fs_info->sectorsize);
> >  	oldsize = i_size_read(inode);
> > -	if (start_pos > oldsize) {
> > +	if (end_pos > oldsize) {
> >  		/* Expand hole size to cover write data, preventing empty gap */
> 
> The comment still makes sense here, but it could be better to explain why to
> insert the hole to cover the whole write range (in case write fails)
>

Sounds good, I'll update.

Thanks,

-liubo

> Thanks,
> Qu
> 
> > -		end_pos = round_up(pos + count,
> > -				   fs_info->sectorsize);
> >  		err = btrfs_cont_expand(inode, oldsize, end_pos);
> >  		if (err) {
> >  			inode_unlock(inode);
> > 
> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2] Btrfs: fix unexpected file hole after disk errors
  2017-03-01  1:04 [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors Liu Bo
                   ` (2 preceding siblings ...)
  2017-03-01  5:22 ` Qu Wenruo
@ 2017-03-06 20:23 ` Liu Bo
  2017-03-07  0:28   ` Qu Wenruo
  2017-03-28 12:50   ` David Sterba
  3 siblings, 2 replies; 11+ messages in thread
From: Liu Bo @ 2017-03-06 20:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: David Sterba, Qu Wenruo

Btrfs creates hole extents to cover any unwritten section right before
doing buffer writes after commit 3ac0d7b96a26 ("btrfs: Change the expanding
write sequence to fix snapshot related bug.").

However, that takes the start position of the buffered write to compare
against the current EOF, hole extents would be created only if (EOF <
start).

If the EOF is at the middle of the buffered write, no hole extents will be
created and a file hole without a hole extent is left in this file.

This bug was revealed by generic/019 in fstests.  'fsstress' in this test
may create the above situation and the test then fails all requests
including writes, so the buffer write which is supposed to cover the
hole (without the hole extent) couldn't make it on disk.  Running fsck
against such btrfs ends up with detecting file extent holes.

Things could be more serious, some stale data would be exposed to
userspace if files with this kind of hole are truncated to a position of
the hole, because the on-disk inode size is beyond the last extent in the
file.

This fixes the bug by comparing the end position against the EOF.

Cc: David Sterba <dsterba@suse.cz>
Cc: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2: update comments to be precise.

 fs/btrfs/file.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 520cb72..dcf0286 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1865,11 +1865,13 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	pos = iocb->ki_pos;
 	count = iov_iter_count(from);
 	start_pos = round_down(pos, fs_info->sectorsize);
+	end_pos = round_up(pos + count, fs_info->sectorsize);
 	oldsize = i_size_read(inode);
-	if (start_pos > oldsize) {
-		/* Expand hole size to cover write data, preventing empty gap */
-		end_pos = round_up(pos + count,
-				   fs_info->sectorsize);
+	if (end_pos > oldsize) {
+		/*
+		 * Expand hole size to cover write data in order to prevent an
+		 * empty gap in case of a write failure.
+		 */
 		err = btrfs_cont_expand(inode, oldsize, end_pos);
 		if (err) {
 			inode_unlock(inode);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Btrfs: remove start_pos
  2017-03-01  8:48   ` Qu Wenruo
@ 2017-03-06 21:09     ` Liu Bo
  0 siblings, 0 replies; 11+ messages in thread
From: Liu Bo @ 2017-03-06 21:09 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, David Sterba

On Wed, Mar 01, 2017 at 04:48:20PM +0800, Qu Wenruo wrote:
> 
> 
> At 03/01/2017 09:04 AM, Liu Bo wrote:
> > @pos, not aligned @start_pos, should be used to check whether the eof page
> > needs to be marked as readonly, thus @start_pos can be removed.
> > 
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> >  fs/btrfs/file.c | 7 +------
> >  1 file changed, 1 insertion(+), 6 deletions(-)
> > 
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 0be837b..ef88e6d 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -1814,7 +1814,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
> >  	struct inode *inode = file_inode(file);
> >  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> >  	struct btrfs_root *root = BTRFS_I(inode)->root;
> > -	u64 start_pos;
> >  	u64 end_pos;
> >  	ssize_t num_written = 0;
> >  	bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
> > @@ -1822,7 +1821,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
> >  	loff_t pos;
> >  	size_t count;
> >  	loff_t oldsize;
> > -	int clean_page = 0;
> > 
> >  	inode_lock(inode);
> >  	err = generic_write_checks(iocb, from);
> > @@ -1860,7 +1858,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
> > 
> >  	pos = iocb->ki_pos;
> >  	count = iov_iter_count(from);
> > -	start_pos = round_down(pos, fs_info->sectorsize);
> >  	end_pos = round_up(pos + count, fs_info->sectorsize);
> >  	oldsize = i_size_read(inode);
> >  	if (end_pos > oldsize) {
> > @@ -1870,8 +1867,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
> >  			inode_unlock(inode);
> >  			goto out;
> >  		}
> > -		if (start_pos > round_up(oldsize, fs_info->sectorsize))
> > -			clean_page = 1;
> >  	}
> > 
> >  	if (sync)
> > @@ -1883,7 +1878,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
> >  		num_written = __btrfs_buffered_write(file, from, pos);
> >  		if (num_written > 0)
> >  			iocb->ki_pos = pos + num_written;
> > -		if (clean_page)
> > +		if (oldsize < pos)
> >  			pagecache_isize_extended(inode, oldsize,
> >  						i_size_read(inode));
> 
> Not familiar with page cache, so I can be totally wrong here.
> 
> But what will happen if @oldsize and @pos are in the same page?
> 
> For example:
> Page start                                    Page start + 4K
> |             |                |              |
>               old size         pos
> 
> Do we still need to call pagecache_iszie_extented() since we will dirty that
> page anyway?

Yes, isize has changed, if blocksize < pagesize, so it's still possible that the
next write access to the new isize doesn't own an block since no page_mkwrite()
has been called to allocate it, then a following writepage() may fail silently
from userspace's view (unless they run fsync and check its ret).

Thanks,

-liubo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] Btrfs: fix unexpected file hole after disk errors
  2017-03-06 20:23 ` [PATCH v2] " Liu Bo
@ 2017-03-07  0:28   ` Qu Wenruo
  2017-03-28 12:50   ` David Sterba
  1 sibling, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2017-03-07  0:28 UTC (permalink / raw)
  To: Liu Bo, linux-btrfs; +Cc: David Sterba



At 03/07/2017 04:23 AM, Liu Bo wrote:
> Btrfs creates hole extents to cover any unwritten section right before
> doing buffer writes after commit 3ac0d7b96a26 ("btrfs: Change the expanding
> write sequence to fix snapshot related bug.").
>
> However, that takes the start position of the buffered write to compare
> against the current EOF, hole extents would be created only if (EOF <
> start).
>
> If the EOF is at the middle of the buffered write, no hole extents will be
> created and a file hole without a hole extent is left in this file.
>
> This bug was revealed by generic/019 in fstests.  'fsstress' in this test
> may create the above situation and the test then fails all requests
> including writes, so the buffer write which is supposed to cover the
> hole (without the hole extent) couldn't make it on disk.  Running fsck
> against such btrfs ends up with detecting file extent holes.
>
> Things could be more serious, some stale data would be exposed to
> userspace if files with this kind of hole are truncated to a position of
> the hole, because the on-disk inode size is beyond the last extent in the
> file.
>
> This fixes the bug by comparing the end position against the EOF.
>
> Cc: David Sterba <dsterba@suse.cz>
> Cc: Qu Wenruo <quwenruo@cn.fujitsu.com>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Looks good.

Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

Thanks,
Qu
> ---
> v2: update comments to be precise.
>
>  fs/btrfs/file.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 520cb72..dcf0286 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1865,11 +1865,13 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
>  	pos = iocb->ki_pos;
>  	count = iov_iter_count(from);
>  	start_pos = round_down(pos, fs_info->sectorsize);
> +	end_pos = round_up(pos + count, fs_info->sectorsize);
>  	oldsize = i_size_read(inode);
> -	if (start_pos > oldsize) {
> -		/* Expand hole size to cover write data, preventing empty gap */
> -		end_pos = round_up(pos + count,
> -				   fs_info->sectorsize);
> +	if (end_pos > oldsize) {
> +		/*
> +		 * Expand hole size to cover write data in order to prevent an
> +		 * empty gap in case of a write failure.
> +		 */
>  		err = btrfs_cont_expand(inode, oldsize, end_pos);
>  		if (err) {
>  			inode_unlock(inode);
>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] Btrfs: fix unexpected file hole after disk errors
  2017-03-06 20:23 ` [PATCH v2] " Liu Bo
  2017-03-07  0:28   ` Qu Wenruo
@ 2017-03-28 12:50   ` David Sterba
  2017-03-28 18:40     ` Liu Bo
  1 sibling, 1 reply; 11+ messages in thread
From: David Sterba @ 2017-03-28 12:50 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs, David Sterba, Qu Wenruo

On Mon, Mar 06, 2017 at 12:23:30PM -0800, Liu Bo wrote:
> Btrfs creates hole extents to cover any unwritten section right before
> doing buffer writes after commit 3ac0d7b96a26 ("btrfs: Change the expanding
> write sequence to fix snapshot related bug.").
> 
> However, that takes the start position of the buffered write to compare
> against the current EOF, hole extents would be created only if (EOF <
> start).
> 
> If the EOF is at the middle of the buffered write, no hole extents will be
> created and a file hole without a hole extent is left in this file.
> 
> This bug was revealed by generic/019 in fstests.  'fsstress' in this test
> may create the above situation and the test then fails all requests
> including writes, so the buffer write which is supposed to cover the
> hole (without the hole extent) couldn't make it on disk.  Running fsck
> against such btrfs ends up with detecting file extent holes.
> 
> Things could be more serious, some stale data would be exposed to
> userspace if files with this kind of hole are truncated to a position of
> the hole, because the on-disk inode size is beyond the last extent in the
> file.
> 
> This fixes the bug by comparing the end position against the EOF.

Is the test reliable? As I read it, it should be possible to craft the
file extents and trigger the bug. And verify the fix.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] Btrfs: fix unexpected file hole after disk errors
  2017-03-28 12:50   ` David Sterba
@ 2017-03-28 18:40     ` Liu Bo
  0 siblings, 0 replies; 11+ messages in thread
From: Liu Bo @ 2017-03-28 18:40 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Qu Wenruo

On Tue, Mar 28, 2017 at 02:50:06PM +0200, David Sterba wrote:
> On Mon, Mar 06, 2017 at 12:23:30PM -0800, Liu Bo wrote:
> > Btrfs creates hole extents to cover any unwritten section right before
> > doing buffer writes after commit 3ac0d7b96a26 ("btrfs: Change the expanding
> > write sequence to fix snapshot related bug.").
> > 
> > However, that takes the start position of the buffered write to compare
> > against the current EOF, hole extents would be created only if (EOF <
> > start).
> > 
> > If the EOF is at the middle of the buffered write, no hole extents will be
> > created and a file hole without a hole extent is left in this file.
> > 
> > This bug was revealed by generic/019 in fstests.  'fsstress' in this test
> > may create the above situation and the test then fails all requests
> > including writes, so the buffer write which is supposed to cover the
> > hole (without the hole extent) couldn't make it on disk.  Running fsck
> > against such btrfs ends up with detecting file extent holes.
> > 
> > Things could be more serious, some stale data would be exposed to
> > userspace if files with this kind of hole are truncated to a position of
> > the hole, because the on-disk inode size is beyond the last extent in the
> > file.
> > 
> > This fixes the bug by comparing the end position against the EOF.
> 
> Is the test reliable? As I read it, it should be possible to craft the
> file extents and trigger the bug. And verify the fix.

It's not, running generic/019 by 10 times could produces a btrfsck
error once on my box.

Actually I think we don't need this patch any more since we're going
to remove hole extents.

I made a mistake when writing the above 'more serious' part, in fact
truncating to a hole doesn't end up stale data as we don't have any
file extent that points to the hole and reading the hole part should
get all zero, thus there's no serious problem.

It was another bug about setting disk isize to zero accidentally and
data was lost, which has been fixed.

Thanks,

-liubo

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-03-28 18:41 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-01  1:04 [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors Liu Bo
2017-03-01  1:04 ` [PATCH 2/2] Btrfs: remove start_pos Liu Bo
2017-03-01  8:48   ` Qu Wenruo
2017-03-06 21:09     ` Liu Bo
2017-03-01  2:44 ` [PATCH 1/2] Btrfs: fix unexpected file hole after disk errors Qu Wenruo
2017-03-06 19:27   ` Liu Bo
2017-03-01  5:22 ` Qu Wenruo
2017-03-06 20:23 ` [PATCH v2] " Liu Bo
2017-03-07  0:28   ` Qu Wenruo
2017-03-28 12:50   ` David Sterba
2017-03-28 18:40     ` Liu Bo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.