* [PATCH] ext4: don't use the orphan list when migrating an inode
@ 2022-01-06  5:05 Theodore Ts'o
  2022-01-06 10:57 ` Jan Kara
  2022-01-06 13:19 ` Lukas Czerner
  0 siblings, 2 replies; 3+ messages in thread
From: Theodore Ts'o @ 2022-01-06  5:05 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: lczerner, jack, Theodore Ts'o

We probably want to remove the indirect block to extents migration
feature after a deprecation window, but until then, let's fix a
potential data loss problem caused by the fact that we put the
tmp_inode on the orphan list.  In the unlikely case where we crash and
do a journal recovery, the data blocks belonging to the inode
being migrated are also represented in the tmp_inode on the orphan
list --- and so its data blocks will get marked unallocated, and
available for reuse.

Instead, we stop putting the tmp_inode on the orphan list.  So in the
case where we crash while migrating the inode, we'll leak an inode,
which is not a disaster.  It will be easily fixed the next time we run
fsck, and it's better than potentially having blocks getting claimed
by two different files, and losing data as a result.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/migrate.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index 36dfc88ce05b..ff8916e1d38e 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -437,12 +437,12 @@ int ext4_ext_migrate(struct inode *inode)
 	percpu_down_write(&sbi->s_writepages_rwsem);
 
 	/*
-	 * Worst case we can touch the allocation bitmaps, a bgd
-	 * block, and a block to link in the orphan list.  We do need
-	 * need to worry about credits for modifying the quota inode.
+	 * Worst case we can touch the allocation bitmaps and a block
+	 * group descriptor block.  We do need need to worry about
+	 * credits for modifying the quota inode.
 	 */
 	handle = ext4_journal_start(inode, EXT4_HT_MIGRATE,
-		4 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb));
+		3 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb));
 
 	if (IS_ERR(handle)) {
 		retval = PTR_ERR(handle);
@@ -463,10 +463,6 @@ int ext4_ext_migrate(struct inode *inode)
 	 * Use the correct seed for checksum (i.e. the seed from 'inode').  This
 	 * is so that the metadata blocks will have the correct checksum after
 	 * the migration.
-	 *
-	 * Note however that, if a crash occurs during the migration process,
-	 * the recovery process is broken because the tmp_inode checksums will
-	 * be wrong and the orphans cleanup will fail.
 	 */
 	ei = EXT4_I(inode);
 	EXT4_I(tmp_inode)->i_csum_seed = ei->i_csum_seed;
@@ -478,7 +474,6 @@ int ext4_ext_migrate(struct inode *inode)
 	clear_nlink(tmp_inode);
 
 	ext4_ext_tree_init(handle, tmp_inode);
-	ext4_orphan_add(handle, tmp_inode);
 	ext4_journal_stop(handle);
 
 	/*
@@ -503,12 +498,6 @@ int ext4_ext_migrate(struct inode *inode)
 
 	handle = ext4_journal_start(inode, EXT4_HT_MIGRATE, 1);
 	if (IS_ERR(handle)) {
-		/*
-		 * It is impossible to update on-disk structures without
-		 * a handle, so just rollback in-core changes and live other
-		 * work to orphan_list_cleanup()
-		 */
-		ext4_orphan_del(NULL, tmp_inode);
 		retval = PTR_ERR(handle);
 		goto out_tmp_inode;
 	}
-- 
2.31.0
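
The failure mode described in the commit message can be illustrated with
a small userspace model.  This is only a sketch under stated assumptions:
every name below (toy_inode, toy_orphan_cleanup, and so on) is
hypothetical, and nothing here is ext4 code or reflects ext4's on-disk
layout.  It models a block bitmap, an inode being migrated, and a
tmp_inode whose new extent tree maps the same data blocks, then simulates
a crash followed by recovery.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS          16
#define MAX_INODE_BLOCKS 4

/* Toy inode: just the list of data blocks it references (hypothetical). */
struct toy_inode {
	int blocks[MAX_INODE_BLOCKS];
	int nblocks;
};

/* Toy allocation bitmap: true means the block is allocated. */
static bool block_in_use[NBLOCKS];

/* Allocate a block in the bitmap and attach it to the inode. */
static void toy_alloc(struct toy_inode *ino, int blk)
{
	assert(blk >= 0 && blk < NBLOCKS);
	assert(ino->nblocks < MAX_INODE_BLOCKS);
	block_in_use[blk] = true;
	ino->blocks[ino->nblocks++] = blk;
}

/* Models orphan cleanup at recovery: free every block the orphan maps. */
static void toy_orphan_cleanup(struct toy_inode *orphan)
{
	for (int i = 0; i < orphan->nblocks; i++)
		block_in_use[orphan->blocks[i]] = false;
	orphan->nblocks = 0;
}

/* Count blocks the inode still references but the bitmap marks free. */
static int toy_count_dangling(const struct toy_inode *ino)
{
	int n = 0;

	for (int i = 0; i < ino->nblocks; i++)
		if (!block_in_use[ino->blocks[i]])
			n++;
	return n;
}

/*
 * Simulate a crash in the middle of a migration and the recovery that
 * follows.  Returns how many of the original inode's data blocks ended
 * up marked free, i.e. available for reuse by another file.
 */
int toy_crash_during_migration(bool tmp_on_orphan_list)
{
	struct toy_inode inode = {0}, tmp = {0};

	for (int i = 0; i < NBLOCKS; i++)
		block_in_use[i] = false;

	/* The file being migrated owns two data blocks. */
	toy_alloc(&inode, 3);
	toy_alloc(&inode, 7);

	/* Migration: tmp's new tree maps the same data blocks. */
	for (int i = 0; i < inode.nblocks; i++)
		tmp.blocks[tmp.nblocks++] = inode.blocks[i];

	/* Crash here.  Recovery processes the orphan list, if tmp is on it. */
	if (tmp_on_orphan_list)
		toy_orphan_cleanup(&tmp);

	return toy_count_dangling(&inode);
}
```

In this model, toy_crash_during_migration(true) returns 2: both data
blocks are marked free while the original inode still references them.
toy_crash_during_migration(false) returns 0: the tmp inode is simply left
behind (leaked), which is the trade-off the patch chooses.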


* Re: [PATCH] ext4: don't use the orphan list when migrating an inode
  2022-01-06  5:05 [PATCH] ext4: don't use the orphan list when migrating an inode Theodore Ts'o
@ 2022-01-06 10:57 ` Jan Kara
  2022-01-06 13:19 ` Lukas Czerner
  1 sibling, 0 replies; 3+ messages in thread
From: Jan Kara @ 2022-01-06 10:57 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, lczerner, jack

On Thu 06-01-22 00:05:05, Theodore Ts'o wrote:
> We probably want to remove the indirect block to extents migration
> feature after a deprecation window, but until then, let's fix a
> potential data loss problem caused by the fact that we put the
> tmp_inode on the orphan list.  In the unlikely case where we crash and
> do a journal recovery, the data blocks belonging to the inode
> being migrated are also represented in the tmp_inode on the orphan
> list --- and so its data blocks will get marked unallocated, and
> available for reuse.
> 
> Instead, we stop putting the tmp_inode on the orphan list.  So in the
> case where we crash while migrating the inode, we'll leak an inode,
> which is not a disaster.  It will be easily fixed the next time we run
> fsck, and it's better than potentially having blocks getting claimed
> by two different files, and losing data as a result.
> 
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/migrate.c | 19 ++++---------------
>  1 file changed, 4 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> index 36dfc88ce05b..ff8916e1d38e 100644
> --- a/fs/ext4/migrate.c
> +++ b/fs/ext4/migrate.c
> @@ -437,12 +437,12 @@ int ext4_ext_migrate(struct inode *inode)
>  	percpu_down_write(&sbi->s_writepages_rwsem);
>  
>  	/*
> -	 * Worst case we can touch the allocation bitmaps, a bgd
> -	 * block, and a block to link in the orphan list.  We do need
> -	 * need to worry about credits for modifying the quota inode.
> +	 * Worst case we can touch the allocation bitmaps and a block
> +	 * group descriptor block.  We do need need to worry about
> +	 * credits for modifying the quota inode.
>  	 */
>  	handle = ext4_journal_start(inode, EXT4_HT_MIGRATE,
> -		4 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb));
> +		3 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb));
>  
>  	if (IS_ERR(handle)) {
>  		retval = PTR_ERR(handle);
> @@ -463,10 +463,6 @@ int ext4_ext_migrate(struct inode *inode)
>  	 * Use the correct seed for checksum (i.e. the seed from 'inode').  This
>  	 * is so that the metadata blocks will have the correct checksum after
>  	 * the migration.
> -	 *
> -	 * Note however that, if a crash occurs during the migration process,
> -	 * the recovery process is broken because the tmp_inode checksums will
> -	 * be wrong and the orphans cleanup will fail.
>  	 */
>  	ei = EXT4_I(inode);
>  	EXT4_I(tmp_inode)->i_csum_seed = ei->i_csum_seed;
> @@ -478,7 +474,6 @@ int ext4_ext_migrate(struct inode *inode)
>  	clear_nlink(tmp_inode);
>  
>  	ext4_ext_tree_init(handle, tmp_inode);
> -	ext4_orphan_add(handle, tmp_inode);
>  	ext4_journal_stop(handle);
>  
>  	/*
> @@ -503,12 +498,6 @@ int ext4_ext_migrate(struct inode *inode)
>  
>  	handle = ext4_journal_start(inode, EXT4_HT_MIGRATE, 1);
>  	if (IS_ERR(handle)) {
> -		/*
> -		 * It is impossible to update on-disk structures without
> -		 * a handle, so just rollback in-core changes and live other
> -		 * work to orphan_list_cleanup()
> -		 */
> -		ext4_orphan_del(NULL, tmp_inode);
>  		retval = PTR_ERR(handle);
>  		goto out_tmp_inode;
>  	}
> -- 
> 2.31.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH] ext4: don't use the orphan list when migrating an inode
  2022-01-06  5:05 [PATCH] ext4: don't use the orphan list when migrating an inode Theodore Ts'o
  2022-01-06 10:57 ` Jan Kara
@ 2022-01-06 13:19 ` Lukas Czerner
  1 sibling, 0 replies; 3+ messages in thread
From: Lukas Czerner @ 2022-01-06 13:19 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, jack

On Thu, Jan 06, 2022 at 12:05:05AM -0500, Theodore Ts'o wrote:
> We probably want to remove the indirect block to extents migration
> feature after a deprecation window, but until then, let's fix a
> potential data loss problem caused by the fact that we put the
> tmp_inode on the orphan list.  In the unlikely case where we crash and
> do a journal recovery, the data blocks belonging to the inode
> being migrated are also represented in the tmp_inode on the orphan
> list --- and so its data blocks will get marked unallocated, and
> available for reuse.
> 
> Instead, we stop putting the tmp_inode on the orphan list.  So in the
> case where we crash while migrating the inode, we'll leak an inode,
> which is not a disaster.  It will be easily fixed the next time we run
> fsck, and it's better than potentially having blocks getting claimed
> by two different files, and losing data as a result.

Looks good to me. Thanks!

Reviewed-by: Lukas Czerner <lczerner@redhat.com>


> 
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
>  fs/ext4/migrate.c | 19 ++++---------------
>  1 file changed, 4 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> index 36dfc88ce05b..ff8916e1d38e 100644
> --- a/fs/ext4/migrate.c
> +++ b/fs/ext4/migrate.c
> @@ -437,12 +437,12 @@ int ext4_ext_migrate(struct inode *inode)
>  	percpu_down_write(&sbi->s_writepages_rwsem);
>  
>  	/*
> -	 * Worst case we can touch the allocation bitmaps, a bgd
> -	 * block, and a block to link in the orphan list.  We do need
> -	 * need to worry about credits for modifying the quota inode.
> +	 * Worst case we can touch the allocation bitmaps and a block
> +	 * group descriptor block.  We do need need to worry about
> +	 * credits for modifying the quota inode.
>  	 */
>  	handle = ext4_journal_start(inode, EXT4_HT_MIGRATE,
> -		4 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb));
> +		3 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb));
>  
>  	if (IS_ERR(handle)) {
>  		retval = PTR_ERR(handle);
> @@ -463,10 +463,6 @@ int ext4_ext_migrate(struct inode *inode)
>  	 * Use the correct seed for checksum (i.e. the seed from 'inode').  This
>  	 * is so that the metadata blocks will have the correct checksum after
>  	 * the migration.
> -	 *
> -	 * Note however that, if a crash occurs during the migration process,
> -	 * the recovery process is broken because the tmp_inode checksums will
> -	 * be wrong and the orphans cleanup will fail.
>  	 */
>  	ei = EXT4_I(inode);
>  	EXT4_I(tmp_inode)->i_csum_seed = ei->i_csum_seed;
> @@ -478,7 +474,6 @@ int ext4_ext_migrate(struct inode *inode)
>  	clear_nlink(tmp_inode);
>  
>  	ext4_ext_tree_init(handle, tmp_inode);
> -	ext4_orphan_add(handle, tmp_inode);
>  	ext4_journal_stop(handle);
>  
>  	/*
> @@ -503,12 +498,6 @@ int ext4_ext_migrate(struct inode *inode)
>  
>  	handle = ext4_journal_start(inode, EXT4_HT_MIGRATE, 1);
>  	if (IS_ERR(handle)) {
> -		/*
> -		 * It is impossible to update on-disk structures without
> -		 * a handle, so just rollback in-core changes and live other
> -		 * work to orphan_list_cleanup()
> -		 */
> -		ext4_orphan_del(NULL, tmp_inode);
>  		retval = PTR_ERR(handle);
>  		goto out_tmp_inode;
>  	}
> -- 
> 2.31.0
> 