* ext4 online resize -> EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
@ 2022-10-26 19:49 Unterwurzacher, Jakob
  2022-10-28  3:59 ` Theodore Ts'o
  0 siblings, 1 reply; 4+ messages in thread
From: Unterwurzacher, Jakob @ 2022-10-26 19:49 UTC (permalink / raw)
  To: linux-ext4; +Cc: linux-kernel, Schulz, Quentin

Hi,

it looks like I am hitting a similar issue as reported by Borislav Petkov
in April 2022 ( https://lore.kernel.org/lkml/YmqOqGKajOOx90ZY@zn.tnic/ ).

I'm on kernel 6.0.5 and see this on arm64 as well as x86_64.
I have a 100% reproducer using a loop mount, here it is:

	truncate -s 16g ext4.img
	mkfs.ext4 ext4.img 500m
	mkdir ext4.mnt
	mount ext4.img ext4.mnt
	resize2fs ext4.img

And these are the kernel messages it generates:

	[   33.774267] loop0: detected capacity change from 0 to 33554432
	[   33.796319] EXT4-fs (loop0): mounted filesystem with ordered data mode. Quota mode: none.
	[   33.796518] ext4 filesystem being mounted at /root/ext4.mnt supports timestamps until 2038 (0x7fffffff)
	[   33.799324] EXT4-fs (loop0): resizing filesystem from 512000 to 16777216 blocks
	[   33.933110] EXT4-fs (loop0): resized filesystem to 16777216
	[   33.965633] EXT4-fs (loop0): Invalid checksum for backup superblock 8193

	[   33.965675] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
	[   33.965884] EXT4-fs (loop0): Invalid checksum for backup superblock 24577

	[   33.965902] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
	[   33.966058] EXT4-fs (loop0): Invalid checksum for backup superblock 40961

	[   33.966075] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
	[   33.966225] EXT4-fs (loop0): Invalid checksum for backup superblock 57345

	[   33.966242] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
	[   33.966398] EXT4-fs (loop0): Invalid checksum for backup superblock 73729

	[   33.966415] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
	[   33.966557] EXT4-fs (loop0): Invalid checksum for backup superblock 204801

	[   33.966574] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
	[   33.966765] EXT4-fs (loop0): Invalid checksum for backup superblock 221185

	[   33.966784] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
	[   33.966946] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
	[   33.967074] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
	[   33.967237] EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
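The block numbers in the messages above are consistent with the sparse_super backup layout. The following is a minimal sketch, not kernel code: it assumes 1 KiB blocks (16 GiB / 16777216 blocks), 8192 blocks per group, a first data block of 1, and the sparse_super feature, none of which is stated explicitly in the report. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if n is 1 or an exact power of base (base >= 2). */
static bool is_power_of(uint64_t n, uint64_t base)
{
	while (n > 1 && n % base == 0)
		n /= base;
	return n == 1;
}

/*
 * With sparse_super, superblock copies live in groups 0 and 1 and in
 * groups whose number is a power of 3, 5, or 7.
 */
static bool group_has_backup_sb(uint64_t group)
{
	if (group <= 1)
		return true;
	return is_power_of(group, 3) || is_power_of(group, 5) ||
	       is_power_of(group, 7);
}

/*
 * Fill out[] with the block numbers of the backup superblocks
 * (groups >= 1) and return how many entries were written.
 * The layout parameters are assumptions made for illustration.
 */
static size_t backup_sb_blocks(uint64_t total_blocks,
			       uint64_t blocks_per_group,
			       uint64_t first_data_block,
			       uint64_t *out, size_t max_out)
{
	uint64_t groups, g;
	size_t n = 0;

	assert(blocks_per_group > 0);
	assert(total_blocks > first_data_block);

	/* Round up: a partial last group still counts as a group. */
	groups = (total_blocks - first_data_block + blocks_per_group - 1) /
		 blocks_per_group;

	for (g = 1; g < groups && n < max_out; g++)
		if (group_has_backup_sb(g))
			out[n++] = g * blocks_per_group + first_data_block;
	return n;
}
```

Calling backup_sb_blocks(16777216, 8192, 1, out, 16) yields a list that starts with 8193, 24577, 40961, 57345, 73729, 204801, 221185, which are the blocks named in the "Invalid checksum for backup superblock" messages.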

e2fsck seems mostly happy, should I be concerned?

	e2fsck ext4.img 

	e2fsck 1.46.2 (28-Feb-2021)
	ext4.img contains a file system with errors, check forced.
	Pass 1: Checking inodes, blocks, and sizes
	Pass 2: Checking directory structure
	Pass 3: Checking directory connectivity
	Pass 4: Checking reference counts
	Pass 5: Checking group summary information
	ext4.img: 11/4161536 files (0.0% non-contiguous), 536410/16777216 blocks

Thank you,
Jakob

* Re: ext4 online resize -> EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
  2022-10-26 19:49 ext4 online resize -> EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC Unterwurzacher, Jakob
@ 2022-10-28  3:59 ` Theodore Ts'o
  2022-10-28 11:49   ` Jakob Unterwurzacher
  2022-11-03 11:54   ` Quentin Schulz
  0 siblings, 2 replies; 4+ messages in thread
From: Theodore Ts'o @ 2022-10-28  3:59 UTC (permalink / raw)
  To: Unterwurzacher, Jakob; +Cc: linux-ext4, linux-kernel, Schulz, Quentin

On Wed, Oct 26, 2022 at 07:49:56PM +0000, Unterwurzacher, Jakob wrote:
> 
> it looks like I am hitting a similar issue as reported by Borislav Petkov
> in April 2022 ( https://lore.kernel.org/lkml/YmqOqGKajOOx90ZY@zn.tnic/ ).
> 
> I'm on kernel 6.0.5 and see this on arm64 as well as x86_64.
> I have a 100% reproducer using a loop mount, here it is:
> 
> 	truncate -s 16g ext4.img
> 	mkfs.ext4 ext4.img 500m
> 	mkdir ext4.mnt
> 	mount ext4.img ext4.mnt
> 	resize2fs ext4.img

Thanks for the reproducer!  The following patch should fix things.

       	       		      		- Ted

From 9a8c5b0d061554fedd7dbe894e63aa34d0bac7c4 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso@mit.edu>
Date: Thu, 27 Oct 2022 16:04:36 -0400
Subject: [PATCH] ext4: update the backup superblock's at the end of the online
 resize

When expanding a file system using online resize, various fields in
the superblock (e.g., s_blocks_count, s_inodes_count, etc.) change.
To update the backup superblocks, the online resize uses the function
update_backups() in fs/ext4/resize.c.  This function was not updating
the checksum field in the backup superblocks.  This wasn't a big deal
previously, because e2fsck didn't care about the checksum field in the
backup superblock.  (And indeed, update_backups() goes all the way
back to the ext3 days, well before we had support for metadata
checksums.)

However, there is an alternate, more general way of updating
superblock fields, ext4_update_primary_sb() in fs/ext4/ioctl.c.  This
function does check the checksum of the backup superblock, and if it
doesn't match will mark the file system as corrupted.  That was
clearly not the intent, so avoid aborting the resize when a bad
superblock is found.

In addition, teach update_backups() to properly update the checksum in
the backup superblocks.  We will eventually want to unify
update_backups() with the infrastructure in ext4_update_primary_sb(), but
that's for another day.

Note: The problem has been around for a while; it just didn't really
matter until ext4_update_primary_sb() was added by commit bbc605cdb1e1
("ext4: implement support for get/set fs label").  And it became
trivially easy to reproduce after commit 827891a38acc ("ext4: update
the s_overhead_clusters in the backup sb's when resizing") in v6.0.

Cc: stable@kernel.org # 5.17+
Fixes: bbc605cdb1e1 ("ext4: implement support for get/set fs label")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/ioctl.c  | 3 +--
 fs/ext4/resize.c | 5 +++++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 4d49c5cfb690..790d5ffe8559 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -145,9 +145,8 @@ static int ext4_update_backup_sb(struct super_block *sb,
 	if (ext4_has_metadata_csum(sb) &&
 	    es->s_checksum != ext4_superblock_csum(sb, es)) {
 		ext4_msg(sb, KERN_ERR, "Invalid checksum for backup "
-		"superblock %llu\n", sb_block);
+		"superblock %llu", sb_block);
 		unlock_buffer(bh);
-		err = -EFSBADCRC;
 		goto out_bh;
 	}
 	func(es, arg);
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 6dfe9ccae0c5..46b87ffeb304 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1158,6 +1158,7 @@ static void update_backups(struct super_block *sb, sector_t blk_off, char *data,
 	while (group < sbi->s_groups_count) {
 		struct buffer_head *bh;
 		ext4_fsblk_t backup_block;
+		struct ext4_super_block *es;
 
 		/* Out of journal space, and can't get more - abort - so sad */
 		err = ext4_resize_ensure_credits_batch(handle, 1);
@@ -1186,6 +1187,10 @@ static void update_backups(struct super_block *sb, sector_t blk_off, char *data,
 		memcpy(bh->b_data, data, size);
 		if (rest)
 			memset(bh->b_data + size, 0, rest);
+		es = (struct ext4_super_block *) bh->b_data;
+		es->s_block_group_nr = cpu_to_le16(group);
+		if (ext4_has_metadata_csum(sb))
+			es->s_checksum = ext4_superblock_csum(sb, es);
 		set_buffer_uptodate(bh);
 		unlock_buffer(bh);
 		err = ext4_handle_dirty_metadata(handle, NULL, bh);
-- 
2.31.0
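
The resize.c hunk above stamps the group number into each backup copy and then recomputes the checksum. The following is a minimal userspace sketch of that idea, not the kernel implementation. It assumes the on-disk layout documented for ext4: a 1024-byte superblock, s_block_group_nr as a little-endian 16-bit field at offset 0x5A, and s_checksum as the last four bytes (offset 0x3FC), holding a CRC32C seeded with ~0 over everything before it, with no final inversion. The helper names (crc32c_raw, sb_csum, sb_csum_ok, make_backup) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SB_SIZE            1024   /* on-disk superblock size (assumed) */
#define SB_GROUP_NR_OFFSET 0x5A   /* s_block_group_nr, __le16 (assumed) */
#define SB_CSUM_OFFSET     0x3FC  /* s_checksum, __le32 (assumed) */

/* Bitwise CRC32C (Castagnoli, reflected), no final inversion. */
static uint32_t crc32c_raw(uint32_t crc, const uint8_t *buf, size_t len)
{
	while (len--) {
		crc ^= *buf++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return crc;
}

static void put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Checksum over every byte that precedes the s_checksum field. */
static uint32_t sb_csum(const uint8_t sb[SB_SIZE])
{
	return crc32c_raw(0xFFFFFFFFu, sb, SB_CSUM_OFFSET);
}

/* Nonzero if the stored checksum matches the superblock contents. */
static int sb_csum_ok(const uint8_t sb[SB_SIZE])
{
	return get_le32(sb + SB_CSUM_OFFSET) == sb_csum(sb);
}

/*
 * Mirrors the patched step: copy the primary superblock, stamp the
 * backup's group number, then recompute the checksum last so that it
 * covers the final contents.
 */
static void make_backup(uint8_t dst[SB_SIZE], const uint8_t src[SB_SIZE],
			uint16_t group)
{
	memcpy(dst, src, SB_SIZE);
	put_le16(dst + SB_GROUP_NR_OFFSET, group);
	put_le32(dst + SB_CSUM_OFFSET, sb_csum(dst));
}
```

In this model, a plain memcpy() of a superblock whose fields changed after its checksum was last computed fails sb_csum_ok(), while a copy produced by make_backup() passes, because the checksum is written after every other field has reached its final value.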


* Re: ext4 online resize -> EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
  2022-10-28  3:59 ` Theodore Ts'o
@ 2022-10-28 11:49   ` Jakob Unterwurzacher
  2022-11-03 11:54   ` Quentin Schulz
  1 sibling, 0 replies; 4+ messages in thread
From: Jakob Unterwurzacher @ 2022-10-28 11:49 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4, linux-kernel, Schulz, Quentin

On 28.10.22 05:59, Theodore Ts'o wrote:
> 
> Thanks for the reproducer!  The following patch should fix things.
> 
>         	       		      		- Ted
> 
>  From 9a8c5b0d061554fedd7dbe894e63aa34d0bac7c4 Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Thu, 27 Oct 2022 16:04:36 -0400
> Subject: [PATCH] ext4: update the backup superblock's at the end of the online
>   resize

Hi Theodore,

I tested the patch on arm64 and it fixes the issue. Now the kernel 
messages are just this:

> [   14.769997] EXT4-fs (mmcblk2p1): resizing filesystem from 139771 to 3888507 blocks
> [   15.020593] EXT4-fs (mmcblk2p1): resized filesystem to 3888507

fsck after the resize is happy too.

Thank you!
Jakob

* Re: ext4 online resize -> EXT4-fs error (device loop0) in ext4_update_backup_sb:174: Filesystem failed CRC
  2022-10-28  3:59 ` Theodore Ts'o
  2022-10-28 11:49   ` Jakob Unterwurzacher
@ 2022-11-03 11:54   ` Quentin Schulz
  1 sibling, 0 replies; 4+ messages in thread
From: Quentin Schulz @ 2022-11-03 11:54 UTC (permalink / raw)
  To: Theodore Ts'o, Unterwurzacher, Jakob; +Cc: linux-ext4, linux-kernel

Hi Theodore,

On 10/28/22 05:59, Theodore Ts'o wrote:
> On Wed, Oct 26, 2022 at 07:49:56PM +0000, Unterwurzacher, Jakob wrote:
>>
>> it looks like I am hitting a similar issue as reported by Borislav Petkov
>> in April 2022 ( https://lore.kernel.org/lkml/YmqOqGKajOOx90ZY@zn.tnic/ ).
>>
>> I'm on kernel 6.0.5 and see this on arm64 as well as x86_64.
>> I have a 100% reproducer using a loop mount, here it is:
>>
>> 	truncate -s 16g ext4.img
>> 	mkfs.ext4 ext4.img 500m
>> 	mkdir ext4.mnt
>> 	mount ext4.img ext4.mnt
>> 	resize2fs ext4.img
> 
> Thanks for the reproducer!  The following patch should fix things.
> 
>         	       		      		- Ted
> 
>  From 9a8c5b0d061554fedd7dbe894e63aa34d0bac7c4 Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Thu, 27 Oct 2022 16:04:36 -0400
> Subject: [PATCH] ext4: update the backup superblock's at the end of the online
>   resize
> 
> When expanding a file system using online resize, various fields in
> the superblock (e.g., s_blocks_count, s_inodes_count, etc.) change.
> To update the backup superblocks, the online resize uses the function
> update_backups() in fs/ext4/resize.c.  This function was not updating
> the checksum field in the backup superblocks.  This wasn't a big deal
> previously, because e2fsck didn't care about the checksum field in the
> backup superblock.  (And indeed, update_backups() goes all the way
> back to the ext3 days, well before we had support for metadata
> checksums.)
> 
> However, there is an alternate, more general way of updating
> superblock fields, ext4_update_primary_sb() in fs/ext4/ioctl.c.  This
> function does check the checksum of the backup superblock, and if it
> doesn't match will mark the file system as corrupted.  That was
> clearly not the intent, so avoid aborting the resize when a bad
> superblock is found.
> 
> In addition, teach update_backups() to properly update the checksum in
> the backup superblocks.  We will eventually want to unify
> update_backups() with the infrastructure in ext4_update_primary_sb(), but
> that's for another day.
> 
> Note: The problem has been around for a while; it just didn't really
> matter until ext4_update_primary_sb() was added by commit bbc605cdb1e1
> ("ext4: implement support for get/set fs label").  And it became
> trivially easy to reproduce after commit 827891a38acc ("ext4: update
> the s_overhead_clusters in the backup sb's when resizing") in v6.0.
> 
> Cc: stable@kernel.org # 5.17+
> Fixes: bbc605cdb1e1 ("ext4: implement support for get/set fs label")
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>

I don't see a formal patch on the linux-ext4 mailing list yet though 
your previous mail was sent to the ML. Is there any plan to send a 
formal patch or is your mail enough? I also don't see it on 
https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git yet.

I'm asking because we only allow backporting patches that have been sent
to a mailing list. That makes it easier to follow any updates to a patch
that reviews/feedback warrant after we have backported it. (In short,
anything that can be fetched with b4 shazam can be backported.)

Let us know if there's anything we can do to help.

Thanks,
Quentin
