[PATCH RFC 00/15] File system wide monitoring

* [PATCH RFC 00/15] File system wide monitoring
@ 2021-04-26 18:41 Gabriel Krisman Bertazi
  2021-04-26 18:41 ` [PATCH RFC 01/15] fanotify: Fold event size calculation to its own function Gabriel Krisman Bertazi
                   ` (15 more replies)
  0 siblings, 16 replies; 37+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-04-26 18:41 UTC (permalink / raw)
  To: amir73il, tytso, djwong
  Cc: david, jack, dhowells, khazhy, linux-fsdevel, linux-ext4,
	Gabriel Krisman Bertazi, kernel

Hi,

In an attempt to consolidate some of the feedback from the previous
proposals, I wrote a new attempt to solve the file system error reporting
problem.  Before I spend more time polishing it, I'd like to hear your
feedback if I'm going in the wrong direction, in particular with the
modifications to fsnotify.

This RFC follows up on my previous proposals which attempted to leverage
watch_queue[1] and fsnotify[2] to provide a mechanism for file systems
to push error notifications to user space.  This proposal starts by, as
suggested by Darrick, limiting the scope of what I'm trying to do to an
interface for administrators to monitor the health of a file system,
instead of a generic inteface for file errors.  Therefore, this doesn't
solve the problem of writeback errors or the need to watch a specific
subsystem.

* Format

The feature is implemented on top of fanotify, as a new type of fanotify
mark, FAN_ERROR, which a file system monitoring tool can register to
receive notifications.  A notification is split in three parts, and only
the first is guaranteed to exist for any given error event:

 - FS generic data: A file system agnostic structure that has a generic
 error code and identifies the filesystem.  Basically, it let's
 userspace know something happen on a monitored filesystem.

 - FS location data: Identifies where in the code the problem
 happened. (This is important for the use case of analysing frequent
 error points that we discussed earlier).

 - FS specific data: A detailed error report in a filesystem specific
 format that details what the error is.  Ideally, a capable monitoring
 tool can use the information here for error recovery.  For instance,
 xfs can put the xfs_scrub structures here, ext4 can send its error
 reports, etc.  An example of usage is done in the ext4 patch of this
 series.

More details on the information in each record can be found on the
documentation introduced in patch 15.

* Using fanotify

Using fanotify for this kind of thing is slightly tricky because we want
to guarantee delivery in some complicated conditions, for instance, the
file system might want to send an error while holding several locks.

Instead of working around file system constraints at the file system
level, this proposal tries to make the FAN_ERROR submission safe in
those contexts.  This is done with a new mode in fsnotify that
preallocates the memory at group creation to be used for the
notification submission.

This new mode in fsnotify introduces a ring buffer to queue
notifications, which eliminates the allocation path in fsnotify.  From
what I saw, the allocation is the only problem in fsnotify for
filesystems to submit errors in constrained situations.

* Visibility

Since the usecase is limited to a tool for whole file system monitoring,
errors are associated with the superblock and visible filesystem-wide.
It is assumed and required that userspace has CAP_SYS_ADMIN.

* Testing

This was tested with corrupted ext4 images in a few scenarios, which
caused errors to be triggered and monitored with the sample tool
provided in the next to final patch.

* patches

Patches 1-4 massage fanotify attempt to refactor fanotify a bit for
the patches to come.  Patch 5 introduce the ring buffer interface to
fsnotify, while patch 6 enable this support in fanotify.  Patch 7, 8 wire
the FS_ERROR event type, which will be used by filesystems.  In
sequennce, patches 9-12 implement the FAN_ERROR record types and create
the new event.  Patch 13 is an ext4 example implementation supporting
this feature.  Finally, patches 14 and 15 document and provide examples
of a userspace tool that uses this feature.

I also pushed the full series to:

  https://gitlab.collabora.com/krisman/linux -b fanotify-notifications

[1] https://lwn.net/Articles/839310/
[2] https://www.spinics.net/lists/linux-fsdevel/msg187075.html

Gabriel Krisman Bertazi (15):
  fanotify: Fold event size calculation to its own function
  fanotify: Split fsid check from other fid mode checks
  fsnotify: Wire flags field on group allocation
  fsnotify: Wire up group information on event initialization
  fsnotify: Support event submission through ring buffer
  fanotify: Support submission through ring buffer
  fsnotify: Support FS_ERROR event type
  fsnotify: Introduce helpers to send error_events
  fanotify: Introduce generic error record
  fanotify: Introduce code location record
  fanotify: Introduce filesystem specific data record
  fanotify: Introduce the FAN_ERROR mark
  ext4: Send notifications on error
  samples: Add fs error monitoring example
  Documentation: Document the FAN_ERROR framework

 .../admin-guide/filesystem-monitoring.rst     | 103 ++++++
 Documentation/admin-guide/index.rst           |   1 +
 fs/ext4/super.c                               |  60 +++-
 fs/notify/Makefile                            |   2 +-
 fs/notify/dnotify/dnotify.c                   |   2 +-
 fs/notify/fanotify/fanotify.c                 | 127 +++++--
 fs/notify/fanotify/fanotify.h                 |  35 +-
 fs/notify/fanotify/fanotify_user.c            | 319 ++++++++++++++----
 fs/notify/fsnotify.c                          |   2 +-
 fs/notify/group.c                             |  25 +-
 fs/notify/inotify/inotify_fsnotify.c          |   2 +-
 fs/notify/inotify/inotify_user.c              |   4 +-
 fs/notify/notification.c                      |  10 +
 fs/notify/ring.c                              | 199 +++++++++++
 include/linux/fanotify.h                      |  12 +-
 include/linux/fsnotify.h                      |  15 +
 include/linux/fsnotify_backend.h              |  63 +++-
 include/uapi/linux/ext4-notify.h              |  17 +
 include/uapi/linux/fanotify.h                 |  26 ++
 kernel/audit_fsnotify.c                       |   2 +-
 kernel/audit_tree.c                           |   2 +-
 kernel/audit_watch.c                          |   2 +-
 samples/Kconfig                               |   7 +
 samples/Makefile                              |   1 +
 samples/fanotify/Makefile                     |   3 +
 samples/fanotify/fs-monitor.c                 | 135 ++++++++
 26 files changed, 1034 insertions(+), 142 deletions(-)
 create mode 100644 Documentation/admin-guide/filesystem-monitoring.rst
 create mode 100644 fs/notify/ring.c
 create mode 100644 include/uapi/linux/ext4-notify.h
 create mode 100644 samples/fanotify/Makefile
 create mode 100644 samples/fanotify/fs-monitor.c

-- 
2.31.0

^ permalink raw reply	[flat|nested] 37+ messages in thread