* [PATCH 6/7] Btrfs: fix corrupted metadata in the snapshot
@ 2012-08-29  4:13 Miao Xie
  2012-09-05 16:32 ` David Sterba
  0 siblings, 1 reply; 3+ messages in thread
From: Miao Xie @ 2012-08-29  4:13 UTC (permalink / raw)
  To: Linux Btrfs; +Cc: David Sterba

When we delete an inode, we remove all of its delayed items, including the
delayed inode update, and then truncate all the related metadata. If there is
a lot of metadata, we end the current transaction and start a new one to
truncate the remaining metadata. This leaves an inode item whose link count
is > 0, and possibly some directory index items, in the fs/file tree after the
current transaction ends. In other words, the metadata in this fs/file tree is
inconsistent. If we create a snapshot of this tree now, the new snapshot will
contain an inode with corrupted metadata, and we won't continue to drop the
remaining metadata there, because its link count is not 0.

We fix this problem by updating the inode item before the current transaction ends.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/inode.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index cae4c32..02eeecb 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3736,7 +3736,7 @@ void btrfs_evict_inode(struct inode *inode)
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_block_rsv *rsv, *global_rsv;
-	u64 min_size = btrfs_calc_trunc_metadata_size(root, 1);
+	u64 min_size = btrfs_calc_trunc_metadata_size(root, 2);
 	unsigned long nr;
 	int ret;
 
@@ -3818,6 +3818,9 @@ void btrfs_evict_inode(struct inode *inode)
 		if (ret != -EAGAIN)
 			break;
 
+		ret = btrfs_update_inode(trans, root, inode);
+		BUG_ON(ret);
+
 		nr = trans->blocks_used;
 		btrfs_end_transaction(trans, root);
 		trans = NULL;
-- 
1.7.6.5
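
The failure mode described in the changelog above can be illustrated with a toy model. This is only a sketch under stated assumptions: every name below (`toy_inode`, `toy_update_inode`, `toy_truncate_batch`, `toy_snapshot_nlink_after_first_txn`) is hypothetical and none of it is kernel code. It assumes the in-memory link count is already 0 while the inode item in the tree still records 1 (the delayed inode update was dropped), and that a snapshot taken at a transaction boundary sees whatever is in the tree at that moment.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not kernel APIs. */
struct toy_inode {
	unsigned mem_nlink;   /* in-memory link count (0 once unlinked) */
	unsigned disk_nlink;  /* link count recorded in the tree's inode item */
	unsigned items_left;  /* metadata items still waiting to be truncated */
};

/* Copy the in-memory link count into the inode item stored in the tree. */
static void toy_update_inode(struct toy_inode *ino)
{
	ino->disk_nlink = ino->mem_nlink;
}

/*
 * Truncate up to `batch` items within one transaction.
 * Returns true if items remain, which stands in for the -EAGAIN case
 * where the eviction loop has to end the transaction and start another.
 */
static bool toy_truncate_batch(struct toy_inode *ino, unsigned batch)
{
	assert(batch > 0);
	unsigned n = ino->items_left < batch ? ino->items_left : batch;

	ino->items_left -= n;
	return ino->items_left > 0;
}

/*
 * Run the first transaction of an eviction and return the link count a
 * snapshot taken right at the transaction boundary would record.
 */
static unsigned toy_snapshot_nlink_after_first_txn(bool update_before_end)
{
	struct toy_inode ino = {
		.mem_nlink = 0, .disk_nlink = 1, .items_left = 10,
	};

	/* More work remains: optionally sync the inode item before ending. */
	if (toy_truncate_batch(&ino, 4) && update_before_end)
		toy_update_inode(&ino);

	/* Transaction ends here; the snapshot copies the tree's state. */
	return ino.disk_nlink;
}
```

In this model, `toy_snapshot_nlink_after_first_txn(false)` yields a link count of 1 for a half-truncated inode, which is the inconsistent state the changelog describes, while `toy_snapshot_nlink_after_first_txn(true)` yields 0, so the remaining items in the snapshot would still be recognised as belonging to a deleted inode.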


* Re: [PATCH 6/7] Btrfs: fix corrupted metadata in the snapshot
  2012-08-29  4:13 [PATCH 6/7] Btrfs: fix corrupted metadata in the snapshot Miao Xie
@ 2012-09-05 16:32 ` David Sterba
  2012-09-05 17:58   ` Josef Bacik
  0 siblings, 1 reply; 3+ messages in thread
From: David Sterba @ 2012-09-05 16:32 UTC (permalink / raw)
  To: Miao Xie; +Cc: Linux Btrfs, David Sterba

On Wed, Aug 29, 2012 at 12:13:16PM +0800, Miao Xie wrote:
> When we delete an inode, we remove all of its delayed items, including the
> delayed inode update, and then truncate all the related metadata. If there is
> a lot of metadata, we end the current transaction and start a new one to
> truncate the remaining metadata. This leaves an inode item whose link count
> is > 0, and possibly some directory index items, in the fs/file tree after the
> current transaction ends. In other words, the metadata in this fs/file tree is
> inconsistent. If we create a snapshot of this tree now, the new snapshot will
> contain an inode with corrupted metadata, and we won't continue to drop the
> remaining metadata there, because its link count is not 0.
> 
> We fix this problem by updating the inode item before the current transaction ends.

A comment before the while() says

	/*
	 * This is a bit simpler than btrfs_truncate since
	 *
	 * 1) We've already reserved our space for our orphan item in the
	 *    unlink.
	 * 2) We're going to delete the inode item, so we don't need to update
	 *    it at all.
	 *
	 * So we just need to reserve some slack space in case we add bytes when
	 * doing the truncate.
	 */

Point 2 states that the inode update is not needed, but as you write in the
changelog, skipping it can lead to inconsistent metadata. I can't say either
way and would rather hear Josef's opinion on that, as the comment and related
code come from
4289a667a0d7c6b134898cac7bfbe950267c305c
(Btrfs: fix how we reserve space for deleting inodes)

> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>  fs/btrfs/inode.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index cae4c32..02eeecb 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -3736,7 +3736,7 @@ void btrfs_evict_inode(struct inode *inode)
>  	struct btrfs_trans_handle *trans;
>  	struct btrfs_root *root = BTRFS_I(inode)->root;
>  	struct btrfs_block_rsv *rsv, *global_rsv;
> -	u64 min_size = btrfs_calc_trunc_metadata_size(root, 1);
> +	u64 min_size = btrfs_calc_trunc_metadata_size(root, 2);
>  	unsigned long nr;
>  	int ret;
>  
> @@ -3818,6 +3818,9 @@ void btrfs_evict_inode(struct inode *inode)
>  		if (ret != -EAGAIN)
>  			break;
>  
> +		ret = btrfs_update_inode(trans, root, inode);
> +		BUG_ON(ret);
> +
>  		nr = trans->blocks_used;
>  		btrfs_end_transaction(trans, root);
>  		trans = NULL;


* Re: [PATCH 6/7] Btrfs: fix corrupted metadata in the snapshot
  2012-09-05 16:32 ` David Sterba
@ 2012-09-05 17:58   ` Josef Bacik
  0 siblings, 0 replies; 3+ messages in thread
From: Josef Bacik @ 2012-09-05 17:58 UTC (permalink / raw)
  To: David Sterba; +Cc: Miao Xie, Linux Btrfs

On Wed, Sep 05, 2012 at 10:32:05AM -0600, David Sterba wrote:
> On Wed, Aug 29, 2012 at 12:13:16PM +0800, Miao Xie wrote:
> > When we delete an inode, we remove all of its delayed items, including the
> > delayed inode update, and then truncate all the related metadata. If there is
> > a lot of metadata, we end the current transaction and start a new one to
> > truncate the remaining metadata. This leaves an inode item whose link count
> > is > 0, and possibly some directory index items, in the fs/file tree after the
> > current transaction ends. In other words, the metadata in this fs/file tree is
> > inconsistent. If we create a snapshot of this tree now, the new snapshot will
> > contain an inode with corrupted metadata, and we won't continue to drop the
> > remaining metadata there, because its link count is not 0.
> > 
> > We fix this problem by updating the inode item before the current transaction ends.
> 
> A comment before the while() says
> 
> 	/*
> 	 * This is a bit simpler than btrfs_truncate since
> 	 *
> 	 * 1) We've already reserved our space for our orphan item in the
> 	 *    unlink.
> 	 * 2) We're going to delete the inode item, so we don't need to update
> 	 *    it at all.
> 	 *
> 	 * So we just need to reserve some slack space in case we add bytes when
> 	 * doing the truncate.
> 	 */
> 
> Point 2 states that the inode update is not needed, but as you write in the
> changelog, skipping it can lead to inconsistent metadata. I can't say either
> way and would rather hear Josef's opinion on that, as the comment and related
> code come from
> 4289a667a0d7c6b134898cac7bfbe950267c305c
> (Btrfs: fix how we reserve space for deleting inodes)
> 

Yeah, I was wrong and Miao is right: we need to update the inode if we stop the
transaction, just for consistency's sake.  We're not quite doing the right thing
for ENOSPC here, but that's a problem for a later date.  Thanks,

Josef
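
For a sense of scale on the reservation side: the first hunk of the patch raises the item count passed to btrfs_calc_trunc_metadata_size() from 1 to 2, presumably to leave room for the added inode update. The sketch below is a rough model of what that means in bytes. It is an assumption, not the kernel's definition: `toy_trunc_metadata_size` is a hypothetical stand-in that charges one leaf plus one node per remaining tree level for each item, and the 4096-byte leaf and node sizes are picked only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Maximum tree height; BTRFS_MAX_LEVEL is 8 in the kernel. */
#define TOY_MAX_LEVEL 8

/*
 * Hypothetical model of a per-item truncate reservation: one leaf plus
 * one node for each remaining level, multiplied by the item count.
 * The linear scaling in num_items is an assumption of this sketch.
 */
static uint64_t toy_trunc_metadata_size(uint64_t leafsize, uint64_t nodesize,
					unsigned num_items)
{
	assert(num_items > 0);
	return (leafsize + nodesize * (TOY_MAX_LEVEL - 1)) * num_items;
}
```

Under these assumptions, one item reserves 32768 bytes and two items reserve 65536 bytes, so the change doubles the minimum slack space kept for each pass of the eviction loop.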
