All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Btrfs: fix data loss after truncate when using the no-holes feature
@ 2017-02-14 20:35 fdmanana
  2017-02-14 22:06 ` Liu Bo
  2017-02-15 18:42 ` [PATCH v2] " fdmanana
  0 siblings, 2 replies; 3+ messages in thread
From: fdmanana @ 2017-02-14 20:35 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

If we have a file with an implicit hole (NO_HOLES feature enabled) that
has an extent following the hole, delayed writes against regions of the
file behind the hole happened before but were not yet flushed and then
we truncate the file to a smaller size that lies inside the hole, we
end up persisting a wrong disk_i_size value for our inode that leads to
data loss after umounting and mounting again the filesystem or after
the inode is evicted and loaded again.

This happens because at inode.c:btrfs_truncate_inode_items() we end up
setting last_size to the offset of the extent that we deleted and that
followed the hole. We then pass that value to btrfs_ordered_update_i_size()
which updates the inode's disk_i_size to a value smaller then the offset
of the buffered (delayed) writes.

Example reproducer:

 $ mkfs.btrfs -f /dev/sdb
 $ mount /dev/sdb /mnt

 $ xfs_io -f -c "pwrite -S 0x01 0K 32K" /mnt/foo
 $ xfs_io -d -c "pwrite -S 0x02 -b 32K 64K 32K" /mnt/foo
 $ xfs_io -c "truncate 60K" /mnt/foo
   --> inode's disk_i_size updated to 0

 $ md5sum /mnt/foo
 3c5ca3c3ab42f4b04d7e7eb0b0d4d806  /mnt/foo

 $ umount /dev/sdb
 $ mount /dev/sdb /mnt

 $ md5sum /mnt/foo
 d41d8cd98f00b204e9800998ecf8427e  /mnt/foo
   --> Empty file, all data lost!

Cc: <stable@vger.kernel.org>  # 3.14+
Fixes: 16e7549f045d ("Btrfs: incompatible format change to remove hole extents")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/inode.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8eb6703..a0a38ce 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4703,8 +4703,12 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 			btrfs_abort_transaction(trans, ret);
 	}
 error:
-	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID)
+	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
+		if (last_size > new_size &&
+		    btrfs_fs_incompat(fs_info, NO_HOLES))
+			last_size = new_size;
 		btrfs_ordered_update_i_size(inode, last_size, NULL);
+	}
 
 	btrfs_free_path(path);
 
-- 
2.7.0.rc3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] Btrfs: fix data loss after truncate when using the no-holes feature
  2017-02-14 20:35 [PATCH] Btrfs: fix data loss after truncate when using the no-holes feature fdmanana
@ 2017-02-14 22:06 ` Liu Bo
  2017-02-15 18:42 ` [PATCH v2] " fdmanana
  1 sibling, 0 replies; 3+ messages in thread
From: Liu Bo @ 2017-02-14 22:06 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Tue, Feb 14, 2017 at 08:35:26PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> If we have a file with an implicit hole (NO_HOLES feature enabled) that
> has an extent following the hole, delayed writes against regions of the
> file behind the hole happened before but were not yet flushed and then
> we truncate the file to a smaller size that lies inside the hole, we
> end up persisting a wrong disk_i_size value for our inode that leads to
> data loss after umounting and mounting again the filesystem or after
> the inode is evicted and loaded again.
> 
> This happens because at inode.c:btrfs_truncate_inode_items() we end up
> setting last_size to the offset of the extent that we deleted and that
> followed the hole. We then pass that value to btrfs_ordered_update_i_size()
> which updates the inode's disk_i_size to a value smaller then the offset
> of the buffered (delayed) writes.
> 
> Example reproducer:
> 
>  $ mkfs.btrfs -f /dev/sdb
>  $ mount /dev/sdb /mnt
> 
>  $ xfs_io -f -c "pwrite -S 0x01 0K 32K" /mnt/foo
>  $ xfs_io -d -c "pwrite -S 0x02 -b 32K 64K 32K" /mnt/foo
>  $ xfs_io -c "truncate 60K" /mnt/foo
>    --> inode's disk_i_size updated to 0
> 
>  $ md5sum /mnt/foo
>  3c5ca3c3ab42f4b04d7e7eb0b0d4d806  /mnt/foo
> 
>  $ umount /dev/sdb
>  $ mount /dev/sdb /mnt
> 
>  $ md5sum /mnt/foo
>  d41d8cd98f00b204e9800998ecf8427e  /mnt/foo
>    --> Empty file, all data lost!
> 
> Cc: <stable@vger.kernel.org>  # 3.14+
> Fixes: 16e7549f045d ("Btrfs: incompatible format change to remove hole extents")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>  fs/btrfs/inode.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 8eb6703..a0a38ce 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -4703,8 +4703,12 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
>  			btrfs_abort_transaction(trans, ret);
>  	}
>  error:
> -	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID)
> +	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
> +		if (last_size > new_size &&
> +		    btrfs_fs_incompat(fs_info, NO_HOLES))
> +			last_size = new_size;
>  		btrfs_ordered_update_i_size(inode, last_size, NULL);
> +	}
>

Looks good to me.

Reviewed-by: Liu Bo <bo.li.liu@oracle.com>

We can also remove the code that sets last_size = new_size in the case of
NO_HOLES inside the while loop, as I think this patch covers that, too.

Because we've held the inode mutex and wait for the dirty pages beyond the
current i_size, in fact, I couldn't think of a case that with NO_HOLE last_size
< new_size, maybe we could use new_size directly and add a ASSERT just in case?

Thanks,

-liubo
>  	btrfs_free_path(path);
>  
> -- 
> 2.7.0.rc3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v2] Btrfs: fix data loss after truncate when using the no-holes feature
  2017-02-14 20:35 [PATCH] Btrfs: fix data loss after truncate when using the no-holes feature fdmanana
  2017-02-14 22:06 ` Liu Bo
@ 2017-02-15 18:42 ` fdmanana
  1 sibling, 0 replies; 3+ messages in thread
From: fdmanana @ 2017-02-15 18:42 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

If we have a file with an implicit hole (NO_HOLES feature enabled) that
has an extent following the hole, delayed writes against regions of the
file behind the hole happened before but were not yet flushed and then
we truncate the file to a smaller size that lies inside the hole, we
end up persisting a wrong disk_i_size value for our inode that leads to
data loss after umounting and mounting again the filesystem or after
the inode is evicted and loaded again.

This happens because at inode.c:btrfs_truncate_inode_items() we end up
setting last_size to the offset of the extent that we deleted and that
followed the hole. We then pass that value to btrfs_ordered_update_i_size()
which updates the inode's disk_i_size to a value smaller then the offset
of the buffered (delayed) writes.

Example reproducer:

 $ mkfs.btrfs -f /dev/sdb
 $ mount /dev/sdb /mnt

 $ xfs_io -f -c "pwrite -S 0x01 0K 32K" /mnt/foo
 $ xfs_io -d -c "pwrite -S 0x02 -b 32K 64K 32K" /mnt/foo
 $ xfs_io -c "truncate 60K" /mnt/foo
   --> inode's disk_i_size updated to 0

 $ md5sum /mnt/foo
 3c5ca3c3ab42f4b04d7e7eb0b0d4d806  /mnt/foo

 $ umount /dev/sdb
 $ mount /dev/sdb /mnt

 $ md5sum /mnt/foo
 d41d8cd98f00b204e9800998ecf8427e  /mnt/foo
   --> Empty file, all data lost!

Cc: <stable@vger.kernel.org>  # 3.14+
Fixes: 16e7549f045d ("Btrfs: incompatible format change to remove hole extents")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---

V2: Do the update of last_size only if no error happened, which is specially
    important when we're called by inode eviction and err is -eagain.
    Added an assertion and removed so no longer needed code from the loop.

 fs/btrfs/inode.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8eb6703..a9db345 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4508,19 +4508,8 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 		if (found_type > min_type) {
 			del_item = 1;
 		} else {
-			if (item_end < new_size) {
-				/*
-				 * With NO_HOLES mode, for the following mapping
-				 *
-				 * [0-4k][hole][8k-12k]
-				 *
-				 * if truncating isize down to 6k, it ends up
-				 * isize being 8k.
-				 */
-				if (btrfs_fs_incompat(root->fs_info, NO_HOLES))
-					last_size = new_size;
+			if (item_end < new_size)
 				break;
-			}
 			if (found_key.offset >= new_size)
 				del_item = 1;
 			else
@@ -4703,8 +4692,12 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 			btrfs_abort_transaction(trans, ret);
 	}
 error:
-	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID)
+	if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
+		ASSERT(last_size >= new_size);
+		if (!err && last_size > new_size)
+			last_size = new_size;
 		btrfs_ordered_update_i_size(inode, last_size, NULL);
+	}
 
 	btrfs_free_path(path);
 
-- 
2.7.0.rc3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-02-15 18:42 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-14 20:35 [PATCH] Btrfs: fix data loss after truncate when using the no-holes feature fdmanana
2017-02-14 22:06 ` Liu Bo
2017-02-15 18:42 ` [PATCH v2] " fdmanana

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.