linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Gabriel Krisman Bertazi <krisman@collabora.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: jack@suse.com, viro@zeniv.linux.org.uk, amir73il@gmail.com,
	dhowells@redhat.com, david@fromorbit.com,
	darrick.wong@oracle.com, khazhy@google.com,
	linux-fsdevel@vger.kernel.org, kernel@collabora.com
Subject: Re: [RFC] Filesystem error notifications proposal
Date: Mon, 08 Feb 2021 13:49:41 -0500	[thread overview]
Message-ID: <87wnvi8ke2.fsf@collabora.com> (raw)
In-Reply-To: <YBnTekVOQipGKXQc@mit.edu> (Theodore Ts'o's message of "Tue, 2 Feb 2021 17:34:34 -0500")

"Theodore Ts'o" <tytso@mit.edu> writes:

> On Tue, Feb 02, 2021 at 03:26:35PM -0500, Gabriel Krisman Bertazi wrote:
>> 
>> Thanks for the explanation.  That makes sense to me.  For corruptions
>> where it is impossible to map to a mountpoint, I thought they could be
>> considered global filesystem errors, being exposed only to someone
>> watching the entire filesystem (like FAN_MARK_FILESYSTEM).
>
> At least for ext4, there are only 3 ext4_error_*() that we could map
> to a subtree without having to make changes to the call points:
>
> % grep -i ext4_error_file\( fs/ext4/*.c  | wc -l
> 3
> % grep -i ext4_error_inode\( fs/ext4/*.c  | wc -l
> 79
> % grep -i ext4_error\( fs/ext4/*.c  | wc -l
> 42
>
> So in practice, unless we want to make a lot of changes to ext4, most
> of them will be global file system errors....
>
>> But, as you mentioned regarding the google use case, the entire idea of
>> watching a subtree is a bit beyond the scope of my use-case, and was
>> only added given the feedback on the previous proposal of this feature.
>> While nice to have, I don't have the need to watch different mountpoints
>> for errors, only the entire filesystem.
>
> I suspect that for most use cases, the most interesting thing is the
> first error.  We already record this in the ext4 superblock, because
> unfortunately, I can't guarantee that system administrators have
> correctly configured their system logs, so when handling upstream bug
> reports, I can just ask them to run dumpe2fs -h on the file system:
>
> FS Error count:           2
> First error time:         Tue Feb  2 16:27:42 2021
> First error function:     ext4_lookup
> First error line #:       1704
> First error inode #:      12
> First error err:          EFSCORRUPTED
> Last error time:          Tue Feb  2 16:27:59 2021
> Last error function:      ext4_lookup
> Last error line #:        1704
> Last error inode #:       12
> Last error err:           EFSCORRUPTED
>
> So it's not just the Google case.  I'd argue for most system
> administrator, one of the most useful things is when the file system
> was first found to be corrupted, so they can try correlating file
> system corruptions, with, say, reports of I/O errors, or OOM kils,
> etc.  This can also be useful for correlating the start of file system
> problems with problems at the application layer --- say, MongoDB,
> MySQL, etc.
>
> The reason why a notification system useful is because if you are
> using database some kind of high-availability replication system, and
> if there are problems detected in the file system of the primary MySQL
> server, you'd want to have the system fail over to the secondary MySQL
> server.  Sure, you *could* do this by polling the superblock, but
> that's not the most efficient way to do things.

Hi Ted,

I think this closes a full circle back to my original proposal.  It
doesn't have the complexities of objects other than superblock
notifications, doesn't require allocations.  I sent an RFC for that a
while ago [1] which resulted in this discussion and the current
implementation.

For the sake of a having a proposal and a way to move forward, I'm not
sure what would be the next step here.  I could revive the previous
implementation, addressing some issues like avoiding the superblock
name, the way we refer to blocks and using CAP_SYS_ADMIN.  I think that
implementation solves the usecase you explained with more simplicity.
But I'm not sure Darrick and Dave (all in cc) will be convinced by this
approach of global pipe where we send messages for the entire
filesystem, as Dave described it in the previous implementation.

Are you familiar with that implementation?

[1] https://www.spinics.net/lists/linux-ext4/msg75742.html


-- 
Gabriel Krisman Bertazi

  reply	other threads:[~2021-02-08 18:51 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-20 20:13 [RFC] Filesystem error notifications proposal Gabriel Krisman Bertazi
2021-01-21  4:01 ` Viacheslav Dubeyko
2021-01-21 11:44 ` Jan Kara
2021-01-21 13:27   ` Amir Goldstein
2021-01-21 18:56   ` Gabriel Krisman Bertazi
2021-01-21 22:44 ` Theodore Ts'o
2021-01-22  0:44   ` Gabriel Krisman Bertazi
2021-01-22  7:36     ` Amir Goldstein
2021-02-02 20:51       ` Gabriel Krisman Bertazi
2021-01-28 22:28     ` Theodore Ts'o
2021-02-02 20:26       ` Gabriel Krisman Bertazi
2021-02-02 22:34         ` Theodore Ts'o
2021-02-08 18:49           ` Gabriel Krisman Bertazi [this message]
2021-02-08 22:19             ` Dave Chinner
2021-02-09  1:08               ` Theodore Ts'o
2021-02-09  5:12                 ` Khazhismel Kumykov
2021-02-09  8:55                 ` Dave Chinner
2021-02-09 17:57                   ` Theodore Ts'o
2021-02-10  0:52                     ` Darrick J. Wong
2021-02-10  2:21                       ` Theodore Ts'o
2021-02-10  2:32                         ` Darrick J. Wong
2021-02-09 17:35               ` Jan Kara
2021-02-10  0:22                 ` Darrick J. Wong
2021-02-10  7:46                 ` Dave Chinner
2021-02-10  0:49               ` Darrick J. Wong
2021-02-10  0:09 ` Darrick J. Wong
2021-02-10  7:23   ` Amir Goldstein
2021-02-10 23:29   ` Gabriel Krisman Bertazi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wnvi8ke2.fsf@collabora.com \
    --to=krisman@collabora.com \
    --cc=amir73il@gmail.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=dhowells@redhat.com \
    --cc=jack@suse.com \
    --cc=kernel@collabora.com \
    --cc=khazhy@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).