All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH 07/12] ext4: Defer saving error info from atomic context
Date: Wed, 16 Dec 2020 00:31:09 -0500	[thread overview]
Message-ID: <X9mbnUqNFnJSN1S8@mit.edu> (raw)
In-Reply-To: <20201127113405.26867-8-jack@suse.cz>

On Fri, Nov 27, 2020 at 12:34:00PM +0100, Jan Kara wrote:
> When filesystem inconsistency is detected with group locked, we
> currently try to modify superblock to store error there without
> blocking. However this can cause superblock checksum failures (or
> DIF/DIX failure) when the superblock is just being written out.
> 
> Make error handling code just store error information in ext4_sb_info
> structure and copy it to on-disk superblock only in ext4_commit_super().
> In case of error happening with group locked, we just postpone the
> superblock flushing to a workqueue.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

So this patch does make a behavioral change, which is if a file system
contains errors when it is mounted, when the kernel trips over more
file system problems, the first error is overwritten.

Before, s_first_error_* used to mean the first error found since the
file system was checked.  With this patch, s_first_error_* now means
the first error found since the file system was mounted.

This distinction is critical, because there are some buggy distro's
(Ubuntu and Debian) out there where their cloud image does *not* run
fsck on boot.  So if a file system is corrupted, it does not get fixed
up, and file system can get more and more damaged.  In that case, it's
good to know when the file system was first damaged, even if it was
six months earlier and several reboots and remounts later.   :-/

We should be able to fix up this patch by making commit_super only
update the on-disk s_first_error* fields if s_first_error_time on disk
is 0.

					- Ted

P.S.  Bugs have been filed with both distro's.  Ubuntu has accepted it
is a bug, but we're still working on convincing the Debian cloud image
devs....

And it's not just the buggy cloud iamges; it's also happened on
occasion that sloppy embedded Linux developers or sysadmins have
misconfigured their system so that fsck never gets run, and it's nice
to be able to have the forensic data preserved.

  reply	other threads:[~2020-12-16  5:32 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-27 11:33 [PATCH 00/12] ext4: Various fixes of ext4 handling of fs errors Jan Kara
2020-11-27 11:33 ` [PATCH 01/12] ext4: Don't remount read-only with errors=continue on reboot Jan Kara
2020-11-29 21:33   ` Andreas Dilger
2020-12-16  4:59     ` Theodore Y. Ts'o
2020-11-27 11:33 ` [PATCH 02/12] ext4: Remove redundant sb checksum recomputation Jan Kara
2020-11-29 22:11   ` Andreas Dilger
2020-11-30 10:59     ` Jan Kara
2020-12-16  5:01   ` Theodore Y. Ts'o
2020-11-27 11:33 ` [PATCH 03/12] ext4: Standardize error message in ext4_protect_reserved_inode() Jan Kara
2020-11-29 21:39   ` Andreas Dilger
2020-12-16  5:04   ` Theodore Y. Ts'o
2020-11-27 11:33 ` [PATCH 04/12] ext4: Make ext4_abort() use __ext4_error() Jan Kara
2020-11-29 22:12   ` Andreas Dilger
2020-12-16  5:07   ` Theodore Y. Ts'o
2020-11-27 11:33 ` [PATCH 05/12] ext4: Move functions in super.c Jan Kara
2020-11-29 22:13   ` Andreas Dilger
2020-12-16  5:09   ` Theodore Y. Ts'o
2020-11-27 11:33 ` [PATCH 06/12] ext4: Simplify ext4 error translation Jan Kara
2020-11-29 21:57   ` Andreas Dilger
2020-12-16  5:11   ` Theodore Y. Ts'o
2020-11-27 11:34 ` [PATCH 07/12] ext4: Defer saving error info from atomic context Jan Kara
2020-12-16  5:31   ` Theodore Y. Ts'o [this message]
2020-12-16  5:40     ` Theodore Y. Ts'o
2020-12-16  9:56       ` Jan Kara
2020-11-27 11:34 ` [PATCH 08/12] ext4: Combine ext4_handle_error() and save_error_info() Jan Kara
2020-12-14 19:23   ` harshad shirwadkar
2020-12-16 10:11     ` Jan Kara
2020-12-16 10:24       ` Jan Kara
2020-11-27 11:34 ` [PATCH 09/12] ext4: Drop sync argument of ext4_commit_super() Jan Kara
2020-12-14 19:25   ` harshad shirwadkar
2020-11-27 11:34 ` [PATCH 10/12] ext4: Protect superblock modifications with a buffer lock Jan Kara
2020-11-27 11:34 ` [PATCH 11/12] ext4: Save error info to sb through journal if available Jan Kara
2020-11-27 11:34 ` [PATCH 12/12] ext4: Use sbi instead of EXT4_SB(sb) in ext4_update_super() Jan Kara
2020-12-14 19:07 ` [PATCH 00/12] ext4: Various fixes of ext4 handling of fs errors harshad shirwadkar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=X9mbnUqNFnJSN1S8@mit.edu \
    --to=tytso@mit.edu \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.