All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Amir Goldstein <amir73il@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>, Jan Kara <jack@suse.cz>,
	Gabriel Krisman Bertazi <krisman@collabora.com>,
	Jan Kara <jack@suse.com>, Linux API <linux-api@vger.kernel.org>,
	Ext4 <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Khazhismel Kumykov <khazhy@google.com>,
	David Howells <dhowells@redhat.com>,
	Dave Chinner <david@fromorbit.com>, Theodore Tso <tytso@mit.edu>,
	Matthew Bobrowski <repnop@google.com>,
	kernel@collabora.com
Subject: Re: [PATCH v6 18/21] fanotify: Emit generic error info type for error event
Date: Wed, 18 Aug 2021 11:58:18 +0200	[thread overview]
Message-ID: <20210818095818.GA28119@quack2.suse.cz> (raw)
In-Reply-To: <CAOQ4uxhccRchiajjje3C20UOKwxQUapu=RYPsM1Y0uTnS81Vew@mail.gmail.com>

On Wed 18-08-21 06:24:26, Amir Goldstein wrote:
> [...]
> 
> > > Just keep in mind that the current scheme pre-allocates the single event slot
> > > on fanotify_mark() time and (I think) we agreed to pre-allocate
> > > sizeof(fsnotify_error_event) + MAX_HDNALE_SZ.
> > > If filesystems would want to store some variable length fs specific info,
> > > a future implementation will have to take that into account.
> >
> > <nod> I /think/ for the fs and AG metadata we could preallocate these,
> > so long as fsnotify doesn't free them out from under us.
> 
> fs won't get notified when the event is freed, so fsnotify must
> take ownership on the data structure.
> I was thinking more along the lines of limiting maximum size for fs
> specific info and pre-allocating that size for the event.

Agreed. If there's a sensible upperbound than preallocating this inside
fsnotify is likely the least problematic solution.

> > For inodes...
> > there are many more of those, so they'd have to be allocated
> > dynamically.
> 
> The current scheme is that the size of the queue for error events
> is one and the single slot is pre-allocated.
> The reason for pre-allocate is that the assumption is that fsnotify_error()
> could be called from contexts where memory allocation would be
> inconvenient.
> Therefore, we can store the encoded file handle of the first erroneous
> inode, but we do not store any more events until user read this
> one event.

Right. OTOH I can imagine allowing GFP_NOFS allocations in the error
context. At least for ext4 it would be workable (after all ext4 manages to
lock & modify superblock in its error handlers, GFP_NOFS allocation isn't
harder). But then if events are dynamically allocated there's still the
inconvenient question what are you going to do if you need to report fs
error and you hit ENOMEM. Just not sending the notification may have nasty
consequences and in the world of containerization and virtualization
tightly packed machines where ENOMEM happens aren't that unlikely. It is
just difficult to make assumptions about filesystems overall so we decided
to be better safe and preallocate the event.

Or, we could leave the allocation troubles for the filesystem and
fsnotify_sb_error() would be passed already allocated event (this way
attaching of fs-specific blobs to the event is handled as well) which it
would just queue. Plus we'd need to provide some helper to fill in generic
part of the event...

The disadvantage is that if there are filesystems / callsites needing
preallocated events, it would be painful for them. OTOH current two users -
ext4 & xfs - can handle allocation in the error path AFAIU.

Thinking about this some more, maybe we could have event preallocated (like
a "rescue event"). Normally we would dynamically allocate (or get passed
from fs) the event and only if the allocation fails, we would queue the
rescue event to indicate to listeners that something bad happened, there
was error but we could not fully report it.

But then, even if we'd go for dynamic event allocation by default, we need
to efficiently merge events since some fs failures (e.g. resulting in
journal abort in ext4) lead to basically all operations with the filesystem
to fail and that could easily swamp the notification system with useless
events. Current system with preallocated event nicely handles this
situation, it is questionable how to extend it for online fsck usecase
where we need to queue more than one event (but even there probably needs
to be some sensible upper-bound). I'll think about it...

> > Hmm.  For handling accumulated errors, can we still access the
> > fanotify_event_info_* object once we've handed it to fanotify?  If the
> > user hasn't picked up the event yet, it might be acceptable to set more
> > bits in the type mask and bump the error count.  In other words, every
> > time userspace actually reads the event, it'll get the latest error
> > state.  I /think/ that's where the design of this patchset is going,
> > right?
> 
> Sort of.
> fsnotify does have a concept of "merging" new event with an event
> already in queue.
> 
> With most fsnotify events, merge only happens if the info related
> to the new event (e.g. sb,inode) is the same as that off the queued
> event and the "merge" is only in the event mask
> (e.g. FS_OPEN|FS_CLOSE).
> 
> However, the current scheme for "merge" of an FS_ERROR event is only
> bumping err_count, even if the new reported error or inode do not
> match the error/inode in the queued event.
> 
> If we define error event subtypes (e.g. FS_ERROR_WRITEBACK,
> FS_ERROR_METADATA), then the error event could contain
> a field for subtype mask and user could read the subtype mask
> along with the accumulated error count, but this cannot be
> done by providing the filesystem access to modify an internal
> fsnotify event, so those have to be generic UAPI defined subtypes.
> 
> If you think that would be useful, then we may want to consider
> reserving the subtype mask field in fanotify_event_info_error in
> advance.

It depends on what exactly Darrick has in mind but I suspect we'd need a
fs-specific merge helper that would look at fs-specific blobs in the event
and decide whether events can be merged or not, possibly also handling the
merge by updating the blob. From the POV of fsnotify that would probably
mean merge callback in the event itself. But I guess this needs more
details from Darrick and maybe we don't need to decide this at this moment
since nobody is close to the point of having code needing to pass fs-blobs
with events.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2021-08-18  9:58 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-12 21:39 [PATCH v6 00/21] File system wide monitoring Gabriel Krisman Bertazi
2021-08-12 21:39 ` [PATCH v6 01/21] fsnotify: Don't insert unmergeable events in hashtable Gabriel Krisman Bertazi
2021-08-12 21:39 ` [PATCH v6 02/21] fanotify: Fold event size calculation to its own function Gabriel Krisman Bertazi
2021-08-12 21:39 ` [PATCH v6 03/21] fanotify: Split fsid check from other fid mode checks Gabriel Krisman Bertazi
2021-08-12 21:39 ` [PATCH v6 04/21] fsnotify: Reserve mark flag bits for backends Gabriel Krisman Bertazi
2021-08-13  7:28   ` Amir Goldstein
2021-08-16 13:15     ` Jan Kara
2021-08-23 14:36       ` Gabriel Krisman Bertazi
2021-08-12 21:39 ` [PATCH v6 05/21] fanotify: Split superblock marks out to a new cache Gabriel Krisman Bertazi
2021-08-16 13:18   ` Jan Kara
2021-08-12 21:39 ` [PATCH v6 06/21] inotify: Don't force FS_IN_IGNORED Gabriel Krisman Bertazi
2021-08-12 21:39 ` [PATCH v6 07/21] fsnotify: Add helper to detect overflow_event Gabriel Krisman Bertazi
2021-08-12 21:39 ` [PATCH v6 08/21] fsnotify: Add wrapper around fsnotify_add_event Gabriel Krisman Bertazi
2021-08-12 21:39 ` [PATCH v6 09/21] fsnotify: Allow events reported with an empty inode Gabriel Krisman Bertazi
2021-08-13  7:58   ` Amir Goldstein
2021-08-25 18:40     ` Gabriel Krisman Bertazi
2021-08-25 19:45       ` Amir Goldstein
2021-08-25 21:50         ` Gabriel Krisman Bertazi
2021-08-26 10:44           ` Amir Goldstein
2021-08-27  2:26             ` Paul Moore
2021-08-27  9:36               ` audit watch and kernfs Amir Goldstein
2021-08-27 10:22                 ` Al Viro
2021-08-12 21:39 ` [PATCH v6 10/21] fsnotify: Support FS_ERROR event type Gabriel Krisman Bertazi
2021-08-13  7:48   ` Amir Goldstein
2021-08-16 13:23   ` Jan Kara
2021-08-12 21:40 ` [PATCH v6 11/21] fanotify: Allow file handle encoding for unhashed events Gabriel Krisman Bertazi
2021-08-13  7:59   ` Amir Goldstein
2021-08-12 21:40 ` [PATCH v6 12/21] fanotify: Encode invalid file handle when no inode is provided Gabriel Krisman Bertazi
2021-08-13  8:27   ` Amir Goldstein
2021-08-16 14:06     ` Jan Kara
2021-08-16 15:54       ` Amir Goldstein
2021-08-16 16:11         ` Jan Kara
2021-08-12 21:40 ` [PATCH v6 13/21] fanotify: Require fid_mode for any non-fd event Gabriel Krisman Bertazi
2021-08-13  8:28   ` Amir Goldstein
2021-08-12 21:40 ` [PATCH v6 14/21] fanotify: Reserve UAPI bits for FAN_FS_ERROR Gabriel Krisman Bertazi
2021-08-13  8:29   ` Amir Goldstein
2021-08-12 21:40 ` [PATCH v6 15/21] fanotify: Preallocate per superblock mark error event Gabriel Krisman Bertazi
2021-08-13  8:40   ` Amir Goldstein
2021-08-16 15:57   ` Jan Kara
2021-08-27 18:18     ` Gabriel Krisman Bertazi
2021-09-02 21:24       ` Gabriel Krisman Bertazi
2021-09-03  4:16         ` Amir Goldstein
2021-09-15 10:31           ` Jan Kara
2021-08-12 21:40 ` [PATCH v6 16/21] fanotify: Handle FAN_FS_ERROR events Gabriel Krisman Bertazi
2021-08-13  9:35   ` Amir Goldstein
2021-08-12 21:40 ` [PATCH v6 17/21] fanotify: Report fid info for file related file system errors Gabriel Krisman Bertazi
2021-08-13  9:00   ` Amir Goldstein
2021-08-13  9:03     ` Amir Goldstein
2021-08-16 16:18   ` Jan Kara
2021-08-12 21:40 ` [PATCH v6 18/21] fanotify: Emit generic error info type for error event Gabriel Krisman Bertazi
2021-08-13  8:47   ` Amir Goldstein
2021-08-16 16:23   ` Jan Kara
2021-08-16 21:41   ` Darrick J. Wong
2021-08-17  9:05     ` Jan Kara
2021-08-17 10:08       ` Amir Goldstein
2021-08-18  0:16         ` Darrick J. Wong
2021-08-18  3:24           ` Amir Goldstein
2021-08-18  9:58             ` Jan Kara [this message]
2021-08-19  3:58               ` Darrick J. Wong
2021-08-18  0:10       ` Darrick J. Wong
2021-08-24 16:53       ` Gabriel Krisman Bertazi
2021-08-25  4:09         ` Darrick J. Wong
2021-08-12 21:40 ` [PATCH v6 19/21] ext4: Send notifications on error Gabriel Krisman Bertazi
2021-08-16 16:26   ` Jan Kara
2021-08-12 21:40 ` [PATCH v6 20/21] samples: Add fs error monitoring example Gabriel Krisman Bertazi
2021-08-18 13:02   ` Jan Kara
2021-08-23 14:49     ` Gabriel Krisman Bertazi
2021-08-12 21:40 ` [PATCH v6 21/21] docs: Document the FAN_FS_ERROR event Gabriel Krisman Bertazi
2021-08-16 16:40   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210818095818.GA28119@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=amir73il@gmail.com \
    --cc=david@fromorbit.com \
    --cc=dhowells@redhat.com \
    --cc=djwong@kernel.org \
    --cc=jack@suse.com \
    --cc=kernel@collabora.com \
    --cc=khazhy@google.com \
    --cc=krisman@collabora.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=repnop@google.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.