From: Ian Kent <raven@themaw.net>
To: Gabriel Krisman Bertazi <krisman@collabora.com>, amir73il@gmail.com
Cc: kernel@collabora.com, "Darrick J . Wong" <djwong@kernel.org>,
	Theodore Ts'o <tytso@mit.edu>, Dave Chinner <david@fromorbit.com>,
	jack@suse.com, dhowells@redhat.com, khazhy@google.com,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [PATCH 00/11] File system wide monitoring
Date: Mon, 24 May 2021 11:06:40 +0800
Message-ID: <cc89bd8edb24a9a5d8e632937010480111294484.camel@themaw.net>
In-Reply-To: <20210521024134.1032503-1-krisman@collabora.com>

On Thu, 2021-05-20 at 22:41 -0400, Gabriel Krisman Bertazi wrote:
> Hi,
> 
> This series follows up on my previous proposal [1] to support file
> system wide monitoring.  As suggested by Amir, this proposal drops
> the ring buffer in favor of a single slot associated with each mark.
> This simplifies the implementation a bit, as you can see in the code.

I get the need for simplification but I'm wondering where this
will end up.

I also know kernel space to user space error communication has
been a concern for quite a while now.

And, from that, there are a couple of things that occur to me.

One is that the standard errno is often too coarse to give
accurate error reports.

It seems to me that, in the long run, there needs to be a way
for sub-systems to register errors that they will use to report
events (with associated text description) so they can be more
informative. That's probably not as simple as it sounds due to
things like error number clashes, etc. OTOH that mechanism could
be used to avoid using text strings in notifications, provided
there was a matching user space library, thereby reducing
the size of the event report.
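
To make that a bit more concrete, here is a rough user space sketch
of the kind of table such a library might keep. Everything in it
(err_register(), err_lookup(), struct err_desc, ERR_REG_MAX) is a
made-up name for illustration, not an existing kernel or library
interface. Keying on the (sub-system, code) pair rather than the
bare number is one possible way around the clash problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: none of these names exist in the kernel or in
 * any library; they only illustrate the registration idea. */
#define ERR_REG_MAX 64

struct err_desc {
	const char *subsys;	/* sub-system name, e.g. a filesystem type */
	uint32_t code;		/* sub-system private error number */
	const char *text;	/* human readable description */
};

static struct err_desc err_table[ERR_REG_MAX];
static size_t err_count;

/* Return the description for (subsys, code), or NULL if unregistered. */
const char *err_lookup(const char *subsys, uint32_t code)
{
	assert(subsys != NULL);
	for (size_t i = 0; i < err_count; i++) {
		if (err_table[i].code == code &&
		    strcmp(err_table[i].subsys, subsys) == 0)
			return err_table[i].text;
	}
	return NULL;
}

/* Register an error. Returns 0 on success, -1 if the table is full or
 * the (subsys, code) pair is already taken. */
int err_register(const char *subsys, uint32_t code, const char *text)
{
	assert(text != NULL);
	if (err_count == ERR_REG_MAX || err_lookup(subsys, code) != NULL)
		return -1;
	err_table[err_count++] = (struct err_desc){ subsys, code, text };
	return 0;
}
```

With something like this on the user space side, a notification would
only need to carry the sub-system identity and the number, and the
text would be looked up at report time.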

Another aspect, also related to the limitations of error reporting
in general, is the way the information could be used. Again, not a
simple thing to do or grok, but would probably require some way of
grouping errors that are related in a stack like manner for user
space inference engines to analyse. Yes, this is very much out of
scope but is a big picture long term usefulness type of notion.

And I don't know how error storms occurring as a side effect of
some fairly serious problem could be handled ... 
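
A partial answer may already be in the cover letter quoted below:
keep only the first error since the last read and count the rest.
The following is just my own model of that behaviour, assuming a
slot that is emptied on read. The names (struct err_slot,
err_slot_report(), err_slot_read()) are hypothetical and this is
not the code from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a single-slot error record; not the code
 * from the patch series. */
struct err_slot {
	int32_t first_error;	/* first error since the last read */
	uint32_t count;		/* errors seen since the last read */
};

/* Record an error: keep the first code, only count later ones. */
void err_slot_report(struct err_slot *slot, int32_t error)
{
	assert(slot != NULL);
	if (slot->count == 0)
		slot->first_error = error;
	if (slot->count < UINT32_MAX)	/* saturate instead of wrapping */
		slot->count++;
}

/* Hand the current record to the reader and empty the slot. */
struct err_slot err_slot_read(struct err_slot *slot)
{
	assert(slot != NULL);
	struct err_slot out = *slot;

	slot->first_error = 0;
	slot->count = 0;
	return out;
}
```

That bounds the cost of a storm to one record per read, at the
price of losing the detail of every error but the first.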

So not really related to the current implementation but a comment
to try and get people's thoughts about where this is heading in
the long run.

Ian
> 
> As a reminder, this proposal is limited to an interface for
> administrators to monitor the health of a file system, instead of a
> generic interface for file errors.  Therefore, this doesn't solve the
> problem of writeback errors or the need to watch a specific subtree.
> 
> In comparison to the previous RFC, this implementation also drops the
> per-fs data and location, and leaves those as future extensions.
> 
> * Implementation
> 
> The feature is implemented on top of fanotify, as a new type of
> fanotify mark, FAN_ERROR, which a file system monitoring tool can
> register to receive error notifications.  When an error occurs, a new
> notification is generated, followed by this info field:
> 
>  - FS generic data: A file system agnostic structure that has a
>  generic error code and identifies the filesystem.  Basically, it
>  lets userspace know something happened on a monitored filesystem.
>  Since only the first error since the last read is recorded, this
>  also includes a counter of errors that happened since the last read.
> 
> * Testing
> 
> This was tested by watching notifications flowing from an
> intentionally corrupted filesystem in different places.  In addition,
> other events were watched in an attempt to detect regressions.
> 
> Is there a specific testsuite for fanotify I should be running?
> 
> * Patches
> 
> This patchset is divided as follows: patches 1 through 5 are
> refactoring of fsnotify/fanotify in preparation for
> FS_ERROR/FAN_ERROR; patches 6 and 7 implement the FS_ERROR API for
> filesystems to report errors; patch 8 adds support for FAN_ERROR in
> fanotify; patch 9 is an example implementation for ext4; patches 10
> and 11 provide sample userspace code and documentation.
> 
> I also pushed the full series to:
> 
>   https://gitlab.collabora.com/krisman/linux -b fanotify-notifications-single-slot
> 
> [1] https://lwn.net/Articles/854545/
> 
> Cc: Darrick J. Wong <djwong@kernel.org>
> Cc: Theodore Ts'o <tytso@mit.edu>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: jack@suse.com
> To: amir73il@gmail.com
> Cc: dhowells@redhat.com
> Cc: khazhy@google.com
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-ext4@vger.kernel.org
> 
> Gabriel Krisman Bertazi (11):
>   fanotify: Fold event size calculation to its own function
>   fanotify: Split fsid check from other fid mode checks
>   fanotify: Simplify directory sanity check in DFID_NAME mode
>   fanotify: Expose fanotify_mark
>   inotify: Don't force FS_IN_IGNORED
>   fsnotify: Support FS_ERROR event type
>   fsnotify: Introduce helpers to send error_events
>   fanotify: Introduce FAN_ERROR event
>   ext4: Send notifications on error
>   samples: Add fs error monitoring example
>   Documentation: Document the FAN_ERROR event
> 
>  .../admin-guide/filesystem-monitoring.rst     |  52 +++++
>  Documentation/admin-guide/index.rst           |   1 +
>  fs/ext4/super.c                               |   8 +
>  fs/notify/fanotify/fanotify.c                 |  80 ++++++-
>  fs/notify/fanotify/fanotify.h                 |  38 +++-
>  fs/notify/fanotify/fanotify_user.c            | 213 ++++++++++++++----
>  fs/notify/inotify/inotify_user.c              |   6 +-
>  include/linux/fanotify.h                      |   6 +-
>  include/linux/fsnotify.h                      |  13 ++
>  include/linux/fsnotify_backend.h              |  15 +-
>  include/uapi/linux/fanotify.h                 |  10 +
>  samples/Kconfig                               |   8 +
>  samples/Makefile                              |   1 +
>  samples/fanotify/Makefile                     |   3 +
>  samples/fanotify/fs-monitor.c                 |  91 ++++++++
>  15 files changed, 485 insertions(+), 60 deletions(-)
>  create mode 100644 Documentation/admin-guide/filesystem-monitoring.rst
>  create mode 100644 samples/fanotify/Makefile
>  create mode 100644 samples/fanotify/fs-monitor.c
> 



