* [PATCH] ext4: set csum seed in tmp inode while migrating to extents
From: Luís Henriques @ 2021-12-06 14:37 UTC
  To: Theodore Ts'o, Andreas Dilger
  Cc: linux-ext4, linux-kernel, Luís Henriques, Jeroen van Wolffelaar

When migrating to extents, the temporary inode will have its own checksum
seed.  This means that, when swapping the inodes' data, the inode checksums
will be incorrect.

This can be fixed either by recalculating the extent checksums or, more
simply, by copying the seed into the temporary inode.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
Reported-by: Jeroen van Wolffelaar <jeroen@wolffelaar.nl>
Signed-off-by: Luís Henriques <lhenriques@suse.de>
---
 fs/ext4/migrate.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index 7e0b4f81c6c0..dd4ece38fc83 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -413,7 +413,7 @@ int ext4_ext_migrate(struct inode *inode)
 	handle_t *handle;
 	int retval = 0, i;
 	__le32 *i_data;
-	struct ext4_inode_info *ei;
+	struct ext4_inode_info *ei, *tmp_ei;
 	struct inode *tmp_inode = NULL;
 	struct migrate_struct lb;
 	unsigned long max_entries;
@@ -503,6 +503,10 @@ int ext4_ext_migrate(struct inode *inode)
 	}
 
 	ei = EXT4_I(inode);
+	tmp_ei = EXT4_I(tmp_inode);
+	/* Use the right seed for checksumming */
+	tmp_ei->i_csum_seed = ei->i_csum_seed;
+
 	i_data = ei->i_data;
 	memset(&lb, 0, sizeof(lb));
 

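The effect described in the commit message can be reproduced outside the kernel. The following is only a userspace sketch, not ext4 code: `struct toy_inode`, `toy_init_seed()` and `toy_migrate_verifies()` are hypothetical names, the filesystem seed and payload are arbitrary, and the seed derivation (filesystem seed chained with inode number and generation) is an assumption about how the per-inode seed is built. It checksums one block through the temporary inode and then verifies it with the original inode's seed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), seeded by the caller; slow but dependency-free. */
static uint32_t crc32c(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Hypothetical stand-in for the in-memory inode; not the kernel's struct. */
struct toy_inode {
	uint32_t ino;
	uint32_t generation;
	uint32_t csum_seed;
};

/* Assumption: the per-inode seed chains fs seed, inode number and generation. */
static void toy_init_seed(struct toy_inode *ti, uint32_t fs_seed)
{
	uint32_t csum = crc32c(fs_seed, &ti->ino, sizeof(ti->ino));

	ti->csum_seed = crc32c(csum, &ti->generation, sizeof(ti->generation));
}

/*
 * Checksum one toy extent block through the temporary inode, then verify
 * it with the original inode's seed, as happens once the data is swapped.
 * Returns 1 if the stored checksum verifies, 0 otherwise.
 */
static int toy_migrate_verifies(int copy_seed)
{
	const uint8_t block[32] = { 0xf3, 0x0a };	/* arbitrary payload */
	struct toy_inode inode = { .ino = 12, .generation = 7 };
	struct toy_inode tmp = { .ino = 13, .generation = 7 };
	uint32_t stored;

	toy_init_seed(&inode, 0x5eedf00du);
	toy_init_seed(&tmp, 0x5eedf00du);
	/* Different inode numbers give different seeds. */
	assert(tmp.csum_seed != inode.csum_seed);

	if (copy_seed)
		tmp.csum_seed = inode.csum_seed;	/* the idea behind the patch */

	stored = crc32c(tmp.csum_seed, block, sizeof(block));
	return stored == crc32c(inode.csum_seed, block, sizeof(block));
}
```

In this model, `toy_migrate_verifies(0)` returns 0 (the checksum was computed with the wrong seed) and `toy_migrate_verifies(1)` returns 1.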

* Re: [PATCH] ext4: set csum seed in tmp inode while migrating to extents
From: Jan Kara @ 2021-12-14 12:03 UTC
  To: Luís Henriques
  Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, linux-kernel,
	Jeroen van Wolffelaar

On Mon 06-12-21 14:37:33, Luís Henriques wrote:
> When migrating to extents, the temporary inode will have its own checksum
> seed.  This means that, when swapping the inodes' data, the inode checksums
> will be incorrect.
> 
> This can be fixed either by recalculating the extent checksums or, more
> simply, by copying the seed into the temporary inode.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
> Reported-by: Jeroen van Wolffelaar <jeroen@wolffelaar.nl>
> Signed-off-by: Luís Henriques <lhenriques@suse.de>

Thanks for debugging this! Two comments below:

> diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> index 7e0b4f81c6c0..dd4ece38fc83 100644
> --- a/fs/ext4/migrate.c
> +++ b/fs/ext4/migrate.c
> @@ -413,7 +413,7 @@ int ext4_ext_migrate(struct inode *inode)
>  	handle_t *handle;
>  	int retval = 0, i;
>  	__le32 *i_data;
> -	struct ext4_inode_info *ei;
> +	struct ext4_inode_info *ei, *tmp_ei;

Probably no need for the new tmp_ei variable when you use it only once...

> @@ -503,6 +503,10 @@ int ext4_ext_migrate(struct inode *inode)
>  	}
>  
>  	ei = EXT4_I(inode);
> +	tmp_ei = EXT4_I(tmp_inode);
> +	/* Use the right seed for checksumming */
> +	tmp_ei->i_csum_seed = ei->i_csum_seed;
> +

I think this is subtly broken in another way: If we crash in the middle of
migration, tmp_inode (and possibly attached extent tree blocks) will have
wrong checksums (remember that i_csum_seed is computed from inode number)
and so orphan cleanup will fail. On the other hand in that case the orphan
cleanup will free blocks we have already managed to attach to the tmp_inode
although they are still properly attached to the old 'inode'. So the
recovery from a crash in the middle of the migration seems to be broken
anyway. So I guess what you do is an improvement. But can you perhaps:

1) Move i_csum_seed initialization to a bit earlier in ext4_ext_migrate(),
just after we have got the tmp_inode from ext4_new_inode()? That way all
inode writes will at least happen with the same csum.

2) Add a comment explaining that you are updating the csum seed so that
metadata blocks get a proper checksum for 'inode', and that recovery from a
crash in the middle of migration is currently broken.
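
To illustrate why the placement in 1) matters, here is a userspace sketch only, not kernel code and not the eventual v2: `toy_count_bad_blocks()`, `struct toy_inode`, `TOY_NBLOCKS`, the seed values and the block contents are all hypothetical. It writes a few toy metadata blocks through the temporary inode, copies the seed just before write number `copy_at`, and counts how many blocks later fail verification against the original inode's seed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), seeded by the caller. */
static uint32_t crc32c(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Hypothetical model: only the checksum seed of an inode is tracked. */
struct toy_inode {
	uint32_t csum_seed;
};

#define TOY_NBLOCKS 4

/*
 * Write TOY_NBLOCKS blocks through the temporary inode, copying the seed
 * from the original inode just before write number copy_at (0 means
 * "right after allocation"). Returns the number of blocks whose stored
 * checksum does not verify with the original inode's seed.
 */
static int toy_count_bad_blocks(int copy_at)
{
	struct toy_inode inode = { .csum_seed = 0x1234abcdu };	/* arbitrary */
	struct toy_inode tmp = { .csum_seed = 0x0badf00du };	/* arbitrary */
	uint8_t blocks[TOY_NBLOCKS][16] = { { 0 } };
	uint32_t stored[TOY_NBLOCKS];
	int bad = 0;

	assert(copy_at >= 0);

	for (int i = 0; i < TOY_NBLOCKS; i++) {
		if (i == copy_at)
			tmp.csum_seed = inode.csum_seed;
		blocks[i][0] = (uint8_t)i;	/* give each block distinct content */
		stored[i] = crc32c(tmp.csum_seed, blocks[i], sizeof(blocks[i]));
	}

	for (int i = 0; i < TOY_NBLOCKS; i++) {
		if (stored[i] != crc32c(inode.csum_seed, blocks[i], sizeof(blocks[i])))
			bad++;
	}
	return bad;
}
```

With the copy done up front (`copy_at == 0`) no block is bad; with `copy_at == 2` the two blocks written before the copy carry a checksum from the wrong seed.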

Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH] ext4: set csum seed in tmp inode while migrating to extents
From: Luís Henriques @ 2021-12-14 16:46 UTC
  To: Jan Kara
  Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, linux-kernel,
	Jeroen van Wolffelaar

On Tue, Dec 14, 2021 at 01:03:17PM +0100, Jan Kara wrote:
> On Mon 06-12-21 14:37:33, Luís Henriques wrote:
> > When migrating to extents, the temporary inode will have its own checksum
> > seed.  This means that, when swapping the inodes' data, the inode checksums
> > will be incorrect.
> > 
> > This can be fixed either by recalculating the extent checksums or, more
> > simply, by copying the seed into the temporary inode.
> > 
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
> > Reported-by: Jeroen van Wolffelaar <jeroen@wolffelaar.nl>
> > Signed-off-by: Luís Henriques <lhenriques@suse.de>
> 
> Thanks for debugging this! Two comments below:

And thanks for the review!

> > diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> > index 7e0b4f81c6c0..dd4ece38fc83 100644
> > --- a/fs/ext4/migrate.c
> > +++ b/fs/ext4/migrate.c
> > @@ -413,7 +413,7 @@ int ext4_ext_migrate(struct inode *inode)
> >  	handle_t *handle;
> >  	int retval = 0, i;
> >  	__le32 *i_data;
> > -	struct ext4_inode_info *ei;
> > +	struct ext4_inode_info *ei, *tmp_ei;
> 
> Probably no need for the new tmp_ei variable when you use it only once...

Sure, I'll drop that new variable in v2.

> > @@ -503,6 +503,10 @@ int ext4_ext_migrate(struct inode *inode)
> >  	}
> >  
> >  	ei = EXT4_I(inode);
> > +	tmp_ei = EXT4_I(tmp_inode);
> > +	/* Use the right seed for checksumming */
> > +	tmp_ei->i_csum_seed = ei->i_csum_seed;
> > +
> 
> I think this is subtly broken in another way: If we crash in the middle of
> migration, tmp_inode (and possibly attached extent tree blocks) will have
> wrong checksums (remember that i_csum_seed is computed from inode number)
> and so orphan cleanup will fail. On the other hand in that case the orphan
> cleanup will free blocks we have already managed to attach to the tmp_inode
> although they are still properly attached to the old 'inode'. So the
> recovery from a crash in the middle of the migration seems to be broken
> anyway. So I guess what you do is an improvement. But can you perhaps:
> 
> 1) Move i_csum_seed initialization to a bit earlier in ext4_ext_migrate(),
> just after we have got the tmp_inode from ext4_new_inode()? That way all
> inode writes will at least happen with the same csum.
> 
> 2) Add a comment explaining that you are updating the csum seed so that
> metadata blocks get a proper checksum for 'inode', and that recovery from a
> crash in the middle of migration is currently broken.

Obviously, I did not realize the recovery process was broken, and I
appreciate that you took the time to explain _how_ it is broken.  I'll add a
new item to (the bottom of) my to-do list and maybe one of these days I'll
get to look into it.

I'll send out v2 shortly, implementing your suggestions.

Cheers,
--
Luís

> 
> Thanks!
> 
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
