From: Geert Uytterhoeven <geert@linux-m68k.org>
To: "Theodore Ts'o" <tytso@mit.edu>,
	Richard Weinberger <richard.weinberger@gmail.com>,
	Arthur Marsh <arthur.marsh@internode.on.net>,
	LKML <linux-kernel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: ext3/ext4 filesystem corruption under post 5.1.0 kernels
Date: Fri, 17 May 2019 11:23:31 +0200
Message-ID: <CAMuHMdWH4Q6YoE1yV8_KhW4ChK+8RMuAqW25o1pg47Yz5f9nYg@mail.gmail.com>
In-Reply-To: <20190511220659.GB8507@mit.edu>

Hi Ted,

On Sun, May 12, 2019 at 12:07 AM Theodore Ts'o <tytso@mit.edu> wrote:
> On Sat, May 11, 2019 at 02:43:16PM +0200, Richard Weinberger wrote:
> > [CC'in linux-ext4]
> >
> > On Sat, May 11, 2019 at 1:47 PM Arthur Marsh
> > <arthur.marsh@internode.on.net> wrote:
> > >
> > >
> > > The filesystem with the kernel source tree is the root file system, ext3, mounted as:
> > >
> > > /dev/sdb7 on / type ext3 (rw,relatime,errors=remount-ro)
> > >
> > > After the "Compressing objects" stage, the following appears in dmesg:
> > >
> > > [  848.968550] EXT4-fs error (device sdb7): ext4_get_branch:171: inode #8: block 30343695: comm jbd2/sdb7-8: invalid block
> > > [  849.077426] Aborting journal on device sdb7-8.
> > > [  849.100963] EXT4-fs (sdb7): Remounting filesystem read-only
> > > [  849.100976] jbd2_journal_bmap: journal block not found at offset 989 on sdb7-8
>
> This indicates that the extent tree blocks for the journal were found
> to be corrupt; so the journal couldn't be found.
>
> > > # fsck -yv
> > > fsck from util-linux 2.33.1
> > > e2fsck 1.45.0 (6-Mar-2019)
> > > /dev/sdb7: recovering journal
> > > /dev/sdb7 contains a file system with errors, check forced.
>
> But e2fsck had no problem finding the journal.
>
> > > Pass 1: Checking inodes, blocks, and sizes
> > > Pass 2: Checking directory structure
> > > Pass 3: Checking directory connectivity
> > > Pass 4: Checking reference counts
> > > Pass 5: Checking group summary information
> > > Free blocks count wrong (4619656, counted=4619444).
> > > Fix? yes
> > >
> > > Free inodes count wrong (15884075, counted=15884058).
> > > Fix? yes
>
> And no other significant problems were found.  (Ext4 never updates or
> relies on the summary number of free blocks and free inodes, since
> updating it is a scalability bottleneck and these values can be
> calculated from the per block group free block/inodes count.  So the
> fact that e2fsck needed to update them is not an issue.)
>
> So that implies that we got one set of values when we read the journal
> inode when attempting to mount the file system, and a *different* set
> of values when e2fsck was run.  Which means that we need to
> consider the possibility that the problem is below the file system
> layer (e.g., the block layer, device drivers, etc.).
>
>
> > > /dev/sdb7: ***** FILE SYSTEM WAS MODIFIED *****
> > >
> > > Other times, I have gotten:
> > >
> > > "Inodes that were part of a corrupted orphan linked list found."
> > > "Block bitmap differences:"
> > > "Free blocks count wrong for group"
> > >
>
> This variety of issues also implies that the issue may be in the data
> read by the file system, as opposed to an issue in the file system.
>
> Arthur, can you give us the full details of your hardware
> configuration and your kernel config file?  Also, what kernel git
> commit ID were you testing?

I'm seeing similar things running post v5.1 on ARAnyM (Atari emulator):

    EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
    ...
    EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980: block 27550: comm jbd2/sda1-1980: invalid block

and userspace hung somewhere during initial system startup, so I had to
kill the instance.

-----

    EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
    EXT4-fs (sda1): INFO: recovery required on readonly filesystem
    EXT4-fs (sda1): write access will be enabled during recovery
    EXT4-fs warning (device sda1): ext4_clear_journal_err:5078: Filesystem error recorded from previous mount: IO failure
    EXT4-fs warning (device sda1): ext4_clear_journal_err:5079: Marking fs in need of filesystem check.
    EXT4-fs (sda1): recovery complete
    EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
    VFS: Mounted root (ext3 filesystem) readonly on device 8:1.
    ...
    Run /sbin/init as init process
    random: fast init done
    EXT4-fs (sda1): re-mounted. Opts:
    random: crng init done
    EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
    EXT4-fs (sda1): error count since last fsck: 1
    EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
    EXT4-fs (sda1): last error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550

-----

    EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
    EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
    VFS: Mounted root (ext3 filesystem) readonly on device 8:1.
    ...
    Run /sbin/init as init process
    random: fast init done
    EXT4-fs (sda1): re-mounted. Opts:
    EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
    random: crng init done
    EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980: block 27550: comm jbd2/sda1-1980: invalid block
    Aborting journal on device sda1-1980.
    EXT4-fs (sda1): Remounting filesystem read-only
    jbd2_journal_bmap: journal block not found at offset 426 on sda1-1980
    EXT4-fs error (device sda1): ext4_journal_check_start:61: Detected aborted journal
    EXT4-fs (sda1): error count since last fsck: 3
    EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
    EXT4-fs (sda1): last error at time 1558083596: ext4_journal_check_start:61: inode 1980: block 27550
    EXT4-fs error (device sda1): ext4_remount:5328: Abort forced by user

-----

    EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
    EXT4-fs (sda1): INFO: recovery required on readonly filesystem
    EXT4-fs (sda1): write access will be enabled during recovery
    random: fast init done
    EXT4-fs warning (device sda1): ext4_clear_journal_err:5078: Filesystem error recorded from previous mount: IO failure
    EXT4-fs warning (device sda1): ext4_clear_journal_err:5079: Marking fs in need of filesystem check.
    EXT4-fs (sda1): recovery complete
    EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
    ...
    Run /sbin/init as init process
    random: crng init done
    EXT4-fs (sda1): re-mounted. Opts:
    EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
    EXT4-fs (sda1): error count since last fsck: 4
    EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
    EXT4-fs (sda1): last error at time 1558083665: ext4_remount:5328: inode 1980: block 27550
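
For reference, the "initial error" and "last error" times in the logs above
are seconds since the Unix epoch. The following is only a small Python sketch
for decoding them, not kernel or e2fsprogs output; the helper name
decode_epoch is made up for illustration.

```python
from datetime import datetime, timezone

def decode_epoch(seconds):
    """Render a Unix timestamp (as printed by ext4) as a UTC string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")

# Timestamps copied from the "initial error" / "last error" lines above.
for label, stamp in [
    ("initial error", 1557931133),
    ("last error (ext4_journal_check_start)", 1558083596),
    ("last error (ext4_remount)", 1558083665),
]:
    print(f"{label}: {decode_epoch(stamp)}")
```

So the first recorded error dates from 15 May, and the later ones from the
boots on the morning of 17 May.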

Notes:
  - It's always the same block,
  - Block device is an image file, accessed using
    arch/m68k/emu/nfblock.c, which did not receive any recent (bvec)
    updates.
  - There are no reported errors for the device containing the image
    file on the host,
  - Given that Arthur sees the issue on a different class of machine,
    the block device driver is unlikely to be the cause. The block
    layer itself could still be at fault, though,
  - Both Arthur and I are mounting an ext3 file system using the ext4
    subsystem.
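
To double-check the "always the same block" observation, the inode/block
pairs can be pulled out of the messages mechanically. This is a rough Python
sketch, assuming the log lines have been unwrapped to one message per line;
the function name error_sites and the regular expression are my own and not
taken from any existing tool.

```python
import re

# Matches both "inode #1980: block 27550" and "inode 1980: block 27550".
SITE_RE = re.compile(r"inode #?(\d+):\s+block (\d+)")

def error_sites(lines):
    """Return the set of (inode, block) pairs named in EXT4-fs messages."""
    sites = set()
    for line in lines:
        match = SITE_RE.search(line)
        if match:
            sites.add((int(match.group(1)), int(match.group(2))))
    return sites

# A few messages copied from the sda1 logs above (unwrapped).
sample = [
    "EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980: "
    "block 27550: comm jbd2/sda1-1980: invalid block",
    "EXT4-fs (sda1): initial error at time 1557931133: "
    "ext4_get_branch:171: inode 1980: block 27550",
    "EXT4-fs (sda1): last error at time 1558083665: "
    "ext4_remount:5328: inode 1980: block 27550",
    "EXT4-fs (sda1): recovery complete",
]

print(error_sites(sample))
```

A single pair in the resulting set means every error message points at the
same inode and block.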

Thanks!

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

