From: Geert Uytterhoeven <geert@linux-m68k.org>
To: "Theodore Ts'o" <tytso@mit.edu>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Arthur Marsh <arthur.marsh@internode.on.net>,
	Richard Weinberger <richard.weinberger@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: ext3/ext4 filesystem corruption under post 5.1.0 kernels
Date: Mon, 1 Jul 2019 16:08:13 +0200
Message-ID: <CAMuHMdVn9zMsas47CZpWdrFMTu0htn11Dhk459bosFxW7YZv_A@mail.gmail.com>
In-Reply-To: <20190701135607.GB6549@mit.edu>

Hi Ted,

On Mon, Jul 1, 2019 at 3:56 PM Theodore Ts'o <tytso@mit.edu> wrote:
> On Mon, Jul 01, 2019 at 02:43:14PM +0200, Geert Uytterhoeven wrote:
> > Despite this fix having been applied upstream, the kernel prints from
> > time to time:
> >
> >     EXT4-fs (sda1): error count since last fsck: 5
> >     EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
> >     EXT4-fs (sda1): last error at time 1558114349: ext4_get_branch:171: inode 1980: block 27550
> >
> > This happens even after a manual run of "e2fsck -f" (while it's mounted
> > RO), which reports a clean file system.
>
> What's happening is this.  When the kernel detects a corruption, newer
> kernels will set these superblock fields:
>
>         __le32  s_error_count;          /* number of fs errors */
>         __le32  s_first_error_time;     /* first time an error happened */
>         __le32  s_first_error_ino;      /* inode involved in first error */
>         __le64  s_first_error_block;    /* block involved of first error */
>         __u8    s_first_error_func[32] __nonstring;     /* function where the error happened */
>         __le32  s_first_error_line;     /* line number where error happened */
>         __le32  s_last_error_time;      /* most recent time of an error */
>         __le32  s_last_error_ino;       /* inode involved in last error */
>         __le32  s_last_error_line;      /* line number where error happened */
>         __le64  s_last_error_block;     /* block involved of last error */
>         __u8    s_last_error_func[32] __nonstring;      /* function where the error happened */
>
> When newer versions of e2fsck *fix* the corruption, they will clear
> these fields.  It's basically a safety check because *way* too many
> ext4 users run with errors=continue (aka, "don't worry, be happy"
> mode), and so this is a poke in the system logs that the file system
> is corrupted, and they, really, *REALLY* should fix it before they
> lose (more) data.

Thanks for the explanation, much appreciated!
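
For anyone who wants to poke at these fields outside the kernel, here is a
minimal Python sketch that decodes them from a raw byte buffer.  It assumes
the fields are packed little-endian, with no padding, in exactly the order
quoted above, and that the buffer starts at s_error_count; finding that
position inside the on-disk superblock is deliberately left out.  The names
ERROR_FIELDS, ErrorInfo and parse_error_fields are made up for illustration
and are not part of any e2fsprogs or kernel API.

```python
import struct
from collections import namedtuple

# Field order copied from the struct quoted above. "<" selects little-endian
# with no padding, which is an assumption of this sketch.
ERROR_FIELDS = struct.Struct("<III Q 32s I I I I Q 32s")

ErrorInfo = namedtuple("ErrorInfo", [
    "error_count",
    "first_error_time", "first_error_ino", "first_error_block",
    "first_error_func", "first_error_line",
    "last_error_time", "last_error_ino", "last_error_line",
    "last_error_block", "last_error_func",
])


def _func_name(raw):
    # The func arrays are marked __nonstring: they may fill all 32 bytes
    # without a terminating NUL, so cut at the first NUL only if one exists.
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")


def parse_error_fields(buf):
    """Decode the error-tracking fields from bytes starting at s_error_count."""
    info = ErrorInfo(*ERROR_FIELDS.unpack_from(buf))
    return info._replace(
        first_error_func=_func_name(info.first_error_func),
        last_error_func=_func_name(info.last_error_func),
    )


if __name__ == "__main__":
    # Sample buffer built from the values in the log messages quoted earlier.
    sample = ERROR_FIELDS.pack(
        5,
        1557931133, 1980, 27550, b"ext4_get_branch", 171,
        1558114349, 1980, 171, 27550, b"ext4_get_branch",
    )
    info = parse_error_fields(sample)
    print(f"error count since last fsck: {info.error_count}")
    print(f"last error: {info.last_error_func}:{info.last_error_line}: "
          f"inode {info.last_error_ino}: block {info.last_error_block}")
```

The sample buffer is synthetic; the sketch only shows how the quoted fields
map onto the kernel's log message, not how to read a real device.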

> > The inode and block numbers match the numbers printed due to the
> > previous bug.
>
> You can also see when the last file system error was detected via:
>
> % date -d @1558114349
> Fri 17 May 2019 01:32:29 PM EDT

Good. So no new errors detected after the fix.
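
The same conversion can be done for both timestamps from the log in a few
lines of Python.  This is only a sketch of the arithmetic behind "date -d @...";
the helper name error_time_utc is made up for illustration.

```python
from datetime import datetime, timezone


def error_time_utc(epoch_seconds):
    """Convert a superblock error timestamp (seconds since 1970) to UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


first = error_time_utc(1557931133)  # "initial error at time ..."
last = error_time_utc(1558114349)   # "last error at time ..."

print(first.isoformat())  # 2019-05-15T14:38:53+00:00
print(last.isoformat())   # 2019-05-17T17:32:29+00:00
```

17:32:29 UTC is 01:32:29 PM EDT, which matches the output of date above.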

> > Do you have an idea what's wrong?
> > Note that I run a very old version of e2fsck (from a decade ago).
>
> ... and that's the problem.  If you're going to be using newer
> versions of the kernel, you really should be using newer versions of
> e2fsprogs.
>
> There have been a lot of bug fixes in the last 10 years, and some of
> them can be data corruption bugs....

Yeah, one day I'll have to change the winning horse...

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

