* [Bug 211971] New: Incorrect fix by e2fsck for blocks_count corruption
@ 2021-02-26 21:39 bugzilla-daemon
  2021-02-27  0:58 ` Amy Parker
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: bugzilla-daemon @ 2021-02-26 21:39 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=211971

            Bug ID: 211971
           Summary: Incorrect fix by e2fsck for blocks_count corruption
           Product: File System
           Version: 2.5
    Kernel Version: Linux 5.4.0-65-generic
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: tmahmud@iastate.edu
        Regression: No

Created attachment 295497
  --> https://bugzilla.kernel.org/attachment.cgi?id=295497&action=edit
log files from mke2fs, dumpe2fs and e2fsck

For an ext4 file system image with only one superblock, if the blocks_count
field in the superblock is corrupted, e2fsck fixes it incorrectly. In the fixed
image, the corrupted blocks_count is unchanged and other fields (e.g., free
blocks count) are changed to be consistent with it.
This issue also occurs in images with multiple superblocks. For example, for an
ext4 image with a primary superblock and backup superblocks (where the backups
are not in their default locations, e.g., one is located at block 513), if the
blocks_count field in the superblock is corrupted, e2fsck again fixes it
incorrectly: the corrupted blocks_count is unchanged and other fields
(e.g., free blocks count) are changed to be consistent with it.

e2fsprogs_version_used: e2fsprogs 1.45.6 (20-Mar-2020) 
The commands that I ran to recreate the scenario are:
For image with only one superblock:

dd if=/dev/zero bs=1024 count=8193 of=/home/hdd/image
mke2fs -b 1024 image 8193
debugfs -w image
debugfs:  ssv blocks_count 4000
debugfs:  q
e2fsck -yf image
e2fsck -yf image

# e2fsck fixes the blocks_count corruption incorrectly.
# In the clean image blocks_count was 8193; in the fixed image blocks_count
# is 4000.
# The second run of e2fsck is consistent with the first run: it doesn't fix
# anything, but blocks_count is still 4000.
# Expected: e2fsck would fix the blocks_count corruption instead of changing
# other fields (e.g., free blocks count).

For image with multiple superblocks:
dd if=/dev/zero bs=1024 count=8193 of=/home/hdd/image1
mke2fs -b 1024 -g 512 image1 8193
debugfs -w image1
debugfs:  ssv blocks_count 4000
debugfs:  q
e2fsck -yf image1
e2fsck -yf image1  

# e2fsck fixes the blocks_count corruption incorrectly.
# In the clean image blocks_count was 8193; in the fixed image blocks_count
# is 4000.
# The second run of e2fsck is consistent with the first run: it doesn't fix
# anything, but blocks_count is still 4000.
# There were 16 block groups in the clean image, but there are only 7 block
# groups in the fixed image.
# Expected: e2fsck would fix the blocks_count corruption instead of changing
# other fields (e.g., free blocks count) and removing block groups.

I attached the images and also the logs from mke2fs, dumpe2fs and e2fsck.
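
To see which superblock fields differ between the clean and the fixed image without going through dumpe2fs, the fields can be read straight from the image file. The following is a minimal sketch and not part of the original report. The helper names `read_le32` and `sb_summary` are hypothetical. It assumes the standard ext2/3/4 layout: the primary superblock starts at byte 1024, `s_blocks_count_lo` is at offset 4, `s_free_blocks_count_lo` at offset 12, and the 16-bit magic 0xEF53 at offset 56. Only the low 32 bits of each count are read.

```shell
#!/bin/sh
# Read a little-endian 32-bit value from FILE ($1) at byte OFFSET ($2).
read_le32() {
    # Word-split the four unsigned bytes printed by od into $1..$4.
    set -- $(dd if="$1" bs=1 skip="$2" count=4 2>/dev/null | od -An -tu1)
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Print blocks_count and the free blocks count from the primary superblock.
sb_summary() {
    img=$1
    # The magic is 16 bits wide; mask off the s_state field that follows it.
    magic=$(( $(read_le32 "$img" $((1024 + 56))) & 0xFFFF ))
    if [ "$magic" -ne $((0xEF53)) ]; then
        echo "$img: no ext2/3/4 magic in primary superblock" >&2
        return 1
    fi
    echo "blocks_count=$(read_le32 "$img" $((1024 + 4)))"
    echo "free_blocks_count=$(read_le32 "$img" $((1024 + 12)))"
}

# Usage (hypothetical): run once before and once after e2fsck, then compare.
#   sb_summary image
```

Running `sb_summary` on the image before `debugfs`, after `debugfs`, and after each `e2fsck` pass gives a compact record of which of the two fields moved at each step.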

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* Re: [Bug 211971] New: Incorrect fix by e2fsck for blocks_count corruption
  2021-02-26 21:39 [Bug 211971] New: Incorrect fix by e2fsck for blocks_count corruption bugzilla-daemon
@ 2021-02-27  0:58 ` Amy Parker
  2021-02-27  1:29   ` Theodore Ts'o
  2021-02-27  0:58 ` [Bug 211971] " bugzilla-daemon
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Amy Parker @ 2021-02-27  0:58 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: Ext4 Developers List

Can you replicate this on modern 5.4 from kernel.org? -generic kernels
are from Canonical and are sometimes broken compared to upstream. If
you can't replicate this on mainline, you'll need to contact
Canonical. We can't do anything if the problem only persists on
distribution kernels.




* Re: [Bug 211971] New: Incorrect fix by e2fsck for blocks_count corruption
  2021-02-27  0:58 ` Amy Parker
@ 2021-02-27  1:29   ` Theodore Ts'o
  0 siblings, 0 replies; 7+ messages in thread
From: Theodore Ts'o @ 2021-02-27  1:29 UTC (permalink / raw)
  To: Amy Parker; +Cc: bugzilla-daemon, Ext4 Developers List

On Fri, Feb 26, 2021 at 04:58:23PM -0800, Amy Parker wrote:
> Can you replicate this on modern 5.4 from kernel.org? -generic kernels
> are from Canonical and are sometimes broken compared to upstream. If
> you can't replicate this on mainline, you'll need to contact
> Canonical. We can't do anything if the problem only persists on
> distribution kernels.

This has nothing to do with the kernel.  What the user is complaining
about is that e2fsck trusts the blocks count field in the superblock
as a source of truth.  If that field is artificially changed to
be a smaller value, e2fsck will assume the file system size indicated
by that changed value.

That's an intentional design choice of e2fsck.  Given that with modern
ext4 file systems, we have metadata checksums, if the superblock has
been accidentally corrupted, the checksum will fail, and then e2fsck
will try using the backup superblock instead.

For older file systems that don't have metadata checksums enabled, we
could check to see if certain "fundamental constants" in the primary
superblock are different from the secondary superblock, but...

> > debugfs -w image
> > debugfs:  ssv blocks_count 4000
> > debugfs:  q

This will update the blocks_count in the primary and all secondary
backups.  So that's not going to really help the user.  Effectively,
the complaint is "I pointed the gun at my foot, and pulled the
trigger, and now my foot hurts!"

> > # Expected that e2fsck would fix the blocks count corruption instead of
> > changing other fields (e.g.,free blocks_count)

The problem is that e2fsck can't really determine that the blocks
count field has been corrupted.  We could warn the user if the
blocks_count is smaller than the reported size of the device,
but.... that's actually something that can happen in real life, and
it's not necessarily a file system "corruption", but rather an
intentional choice by the system administrator.  If we were to give a
warning, or worse, assume that blocks count should be adjusted to be
the size of the device, we'd be getting complaints from users who
deliberately chose to set the file system size to be something smaller
than the block device.

So this is a case of e2fsck working as intended.
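
The size comparison discussed above (blocks_count versus the size of the underlying device) can be sketched for an image file as follows. This is an illustration only and not how e2fsck is implemented. The function names `read_le32` and `fs_vs_container` are hypothetical. The sketch assumes a regular image file rather than a block device, reads only the low 32 bits of the block count, and uses the standard superblock offsets 4 (`s_blocks_count_lo`) and 24 (`s_log_block_size`) from byte 1024.

```shell
#!/bin/sh
# Read a little-endian 32-bit value from FILE ($1) at byte OFFSET ($2).
read_le32() {
    set -- $(dd if="$1" bs=1 skip="$2" count=4 2>/dev/null | od -An -tu1)
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Compare the size the superblock claims with the size of the image file.
fs_vs_container() {
    img=$1
    blocks=$(read_le32 "$img" $((1024 + 4)))     # s_blocks_count_lo
    log_bs=$(read_le32 "$img" $((1024 + 24)))    # s_log_block_size
    fs_bytes=$(( blocks * (1024 << log_bs) ))    # block size is 1024 << log_bs
    dev_bytes=$(( $(wc -c < "$img") ))
    echo "fs_bytes=$fs_bytes dev_bytes=$dev_bytes"
    if [ "$fs_bytes" -lt "$dev_bytes" ]; then
        echo "file system is smaller than its container"
    elif [ "$fs_bytes" -gt "$dev_bytes" ]; then
        echo "file system is larger than its container"
    else
        echo "sizes match"
    fi
}

# Usage (hypothetical):
#   fs_vs_container image
```

A "smaller than its container" result is exactly the ambiguous case: the comparison alone cannot tell a deliberately undersized file system from a damaged blocks_count.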

Cheers,

					- Ted
					



* [Bug 211971] Incorrect fix by e2fsck for blocks_count corruption
  2021-02-26 21:39 [Bug 211971] New: Incorrect fix by e2fsck for blocks_count corruption bugzilla-daemon
                   ` (2 preceding siblings ...)
  2021-02-27  1:29 ` bugzilla-daemon
@ 2021-03-03 17:14 ` bugzilla-daemon
  2021-03-03 17:18 ` bugzilla-daemon
  4 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2021-03-03 17:14 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=211971

--- Comment #3 from tmahmud@iastate.edu ---
Hello Ted,

Thank you very much for the detailed clarification! It mostly makes sense to
me. But I still have two questions regarding the debugfs/e2fsck behavior.


(1)
> > > debugfs -w image
> > > debugfs:  ssv blocks_count 4000
> > > debugfs:  q
> 
> This will update the blocks_count in the primary and all secondary
> backups.  

This is different from what I observed. In my experiment, “debugfs: ssv
blocks_count 4000” only updated the blocks_count (and the checksum) in the
primary superblock. All secondary backups were not updated (neither the
blocks_count nor the checksum). Does this imply that there is a potential bug
in debugfs (because it didn’t update all backups as you suggested)?  I’m
attaching two images before and after “debugfs: ssv blocks_count 4000” for
reference (“image1_before”, “image1_after”). I have verified backups are not
updated by dumping the backup superblocks information with dumpe2fs.
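
The same observation can be cross-checked without dumpe2fs by reading blocks_count from the primary superblock and from one backup copy directly. This is a minimal sketch under stated assumptions. The names `read_le32` and `compare_blocks_count` are hypothetical. The primary superblock is taken to start at byte 1024, a backup copy at the first byte of the given block, and `s_blocks_count_lo` at offset 4 within each copy.

```shell
#!/bin/sh
# Read a little-endian 32-bit value from FILE ($1) at byte OFFSET ($2).
read_le32() {
    set -- $(dd if="$1" bs=1 skip="$2" count=4 2>/dev/null | od -An -tu1)
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Print blocks_count from the primary superblock and from the backup copy
# that starts at BACKUP_BLOCK; succeed only if the two values agree.
compare_blocks_count() {  # image, backup_block, block_size
    primary=$(read_le32 "$1" $((1024 + 4)))
    backup=$(read_le32 "$1" $(( $2 * $3 + 4 )))
    echo "primary=$primary backup=$backup"
    [ "$primary" -eq "$backup" ]
}

# Usage (hypothetical), for the image created with -b 1024 -g 512:
#   compare_blocks_count image1_after 513 1024
```

dumpe2fs can report the same backup copy with `dumpe2fs -o superblock=513 -o blocksize=1024 -h image1_after`.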


(2)
> The problem is that e2fsck can't really determine that the blocks
> count field has been corrupted.  

In my experiment, I observed that e2fsck was able to fix the debugfs-modified
primary superblock using secondary superblocks when the secondary superblocks
are located in default locations (e.g., block 8193). However, in an image where
secondary superblocks are not in their default locations (e.g., block 513), I
found that e2fsck cannot fix the primary superblock using the secondary
superblocks. So e2fsck’s behavior is inconsistent depending on the location of
the secondary superblocks. Could you please comment on this?
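
The two block numbers mentioned in (2) follow from the block-group geometry. As a sketch, assuming the sparse_super feature is enabled (backup superblocks only in group 1 and in groups that are powers of 3, 5, or 7) and that group g starts at block first_data_block + g * blocks_per_group, the backup locations can be listed as below. The function names `is_power_of` and `backup_sb_blocks` are hypothetical, and the group count is passed in rather than derived.

```shell
#!/bin/sh
# Succeed if N ($1) equals BASE ($2) raised to some power k >= 0.
is_power_of() {
    n=$1
    while [ "$n" -gt 1 ] && [ $((n % $2)) -eq 0 ]; do
        n=$((n / $2))
    done
    [ "$n" -eq 1 ]
}

# List the first block of every group (other than group 0) that holds a
# backup superblock under sparse_super.
backup_sb_blocks() {  # first_data_block, blocks_per_group, group_count
    g=1
    while [ "$g" -lt "$3" ]; do
        if is_power_of "$g" 3 || is_power_of "$g" 5 || is_power_of "$g" 7; then
            echo $(( $1 + g * $2 ))
        fi
        g=$((g + 1))
    done
}

# 1 KiB blocks (first_data_block = 1), -g 512, 16 groups:
#   backup_sb_blocks 1 512 16     # prints 513 1537 2561 3585 4609, one per line
# 1 KiB blocks, default 8192 blocks per group, 2 groups:
#   backup_sb_blocks 1 8192 2     # prints 8193
```

With 1 KiB blocks, block 8193 is the first backup under the default 8192 blocks per group, while `-g 512` moves the first backup to block 513.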


* [Bug 211971] Incorrect fix by e2fsck for blocks_count corruption
  2021-02-26 21:39 [Bug 211971] New: Incorrect fix by e2fsck for blocks_count corruption bugzilla-daemon
                   ` (3 preceding siblings ...)
  2021-03-03 17:14 ` bugzilla-daemon
@ 2021-03-03 17:18 ` bugzilla-daemon
  4 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2021-03-03 17:18 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=211971

--- Comment #4 from tmahmud@iastate.edu ---
Created attachment 295609
  --> https://bugzilla.kernel.org/attachment.cgi?id=295609&action=edit
The before and after image after using debugfs
