linux-api.vger.kernel.org archive mirror
From: Amir Goldstein <amir73il@gmail.com>
To: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Jan Kara <jack@suse.com>, Linux API <linux-api@vger.kernel.org>,
	Ext4 <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Khazhismel Kumykov <khazhy@google.com>,
	David Howells <dhowells@redhat.com>,
	Dave Chinner <david@fromorbit.com>, Theodore Tso <tytso@mit.edu>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Matthew Bobrowski <repnop@google.com>,
	kernel@collabora.com
Subject: Re: [PATCH v6 16/21] fanotify: Handle FAN_FS_ERROR events
Date: Fri, 13 Aug 2021 12:35:52 +0300
Message-ID: <CAOQ4uxjb8kpfaX2Jtq-h6Vai2My67PdqGRbVUP5+GspLCsq_+A@mail.gmail.com>
In-Reply-To: <20210812214010.3197279-17-krisman@collabora.com>

On Fri, Aug 13, 2021 at 12:41 AM Gabriel Krisman Bertazi
<krisman@collabora.com> wrote:
>
> Wire up FAN_FS_ERROR in the fanotify_mark syscall.  The event can only
> be requested for the entire filesystem, thus it requires the
> FAN_MARK_FILESYSTEM.

Please split the wire-up of the fanotify_mark syscall into a separate patch,
applied after the patches that implement reporting of the event info records.

>
> FAN_FS_ERROR has to be handled slightly differently from other events
> because it needs to be submitted in an atomic context, using
> preallocated memory.  This patch implements the submission path by only
> storing the first error event that happened in the slot (userspace
> resets the slot by reading the event).
>
> Extra error events happening when the slot is occupied are merged into the
> original report, and the only information kept for these extra errors is
> an accumulator counting the number of events, which is part of the
> record reported back to userspace.
>
> Reporting only the first event should be fine, since when a FS error
> happens, a cascade of errors usually follows, but the most meaningful
> information is (usually) in the first error.
>
> The event dequeueing is also a bit special to avoid losing events. Since
> event merging only happens while the event is queued, there is a window
> between when an error event is dequeued (notification_lock is dropped)
> until it is reset (.free_event()) where the slot is full, but no merges
> can happen.
>
> The proposed solution is to copy the event to the stack prior to
> dropping the lock.  This way, if a new event arrives between the time
> the event is dequeued and the time it is reset, the new errors will still
> be logged and merged in the recently freed slot.
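
For illustration only, the single-slot scheme described above can be
modelled in user space roughly as follows. This is a sketch, not the
kernel code: struct err_slot, slot_record() and slot_dequeue() are
hypothetical names, first_error is an assumed field that this patch does
not add, and the locking is only indicated in comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of the preallocated error slot. */
struct err_slot {
	int      first_error;	/* assumed field: code of the first error */
	uint32_t err_count;	/* 0 means the slot is free */
};

/*
 * Record one error.  Returns true if this call claimed the free slot
 * (the caller would then queue the event), false if the error was
 * merged into the report that is already pending.  In the kernel this
 * would run with the notification lock held.
 */
static bool slot_record(struct err_slot *slot, int error)
{
	if (slot->err_count++ != 0)
		return false;	/* merged: only the counter changes */

	slot->first_error = error;
	return true;
}

/*
 * Dequeue: copy the slot out and reset it in the same critical
 * section, so that an error arriving right after this call starts a
 * fresh report instead of being lost.
 */
static struct err_slot slot_dequeue(struct err_slot *slot)
{
	struct err_slot copy = *slot;

	slot->err_count = 0;
	slot->first_error = 0;
	return copy;
}
```

In this model, three errors recorded back to back yield one report that
carries the first error code and a count of 3, and the slot is free again
as soon as slot_dequeue() returns.
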
>
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
>
> ---
> Changes since v5:
>   - Copy to stack instead of replacing the fee slot (jan)
>   - Prepare error slot outside of the notification lock (jan)
> Changes since v4:
>   - Split parts to earlier patches (amir)
>   - Simplify fanotify entry replacement
>   - Update handle size prediction on overflow
> Changes since v3:
>   - Convert WARN_ON to pr_warn (amir)
>   - Remove unnecessary READ/WRITE_ONCE (amir)
>   - Alloc with GFP_KERNEL_ACCOUNT (amir)
>   - Simplify flags on mark allocation (amir)
>   - Avoid atomic set of error_count (amir)
>   - Simplify rules when merging error_event (amir)
>   - Allocate new error_event on get_one_event (amir)
>   - Report superblock error with invalid FH (amir,jan)
>
> Changes since v2:
>   - Support and require FID mode (amir)
>   - Goto error path instead of early return (amir)
>   - Simplify get_one_event (me)
>   - Base merging on error_count
>   - drop fanotify_queue_error_event
>
> Changes since v1:
>   - Pass dentry to fanotify_check_fsid (Amir)
>   - FANOTIFY_EVENT_TYPE_ERROR -> FANOTIFY_EVENT_TYPE_FS_ERROR
>   - Merge previous patch into it
>   - Use a single slot
>   - Move fanotify_mark.error_event definition to this commit
>   - Rename FAN_ERROR -> FAN_FS_ERROR
>   - Restrict FAN_FS_ERROR to FAN_MARK_FILESYSTEM
> ---
>  fs/notify/fanotify/fanotify.c      | 57 +++++++++++++++++++++++++++++-
>  fs/notify/fanotify/fanotify.h      | 21 +++++++++++
>  fs/notify/fanotify/fanotify_user.c | 39 ++++++++++++++++++--
>  include/linux/fanotify.h           |  6 +++-
>  4 files changed, 119 insertions(+), 4 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 3bf6fd85c634..0c7667d3f5d1 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -709,6 +709,55 @@ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
>         return fsid;
>  }
>
> +static void fanotify_insert_error_event(struct fsnotify_group *group,
> +                                       struct fsnotify_event *fsn_event)
> +
> +{
> +       struct fanotify_event *event = FANOTIFY_E(fsn_event);
> +
> +       if (!fanotify_is_error_event(event->mask))
> +               return;
> +
> +       /*
> +        * Prevent the mark from going away while an outstanding error
> +        * event is queued.  The reference is released by
> +        * fanotify_dequeue_first_event.
> +        */
> +       fsnotify_get_mark(&FANOTIFY_EE(event)->sb_mark->fsn_mark);
> +
> +}
> +
> +static int fanotify_handle_error_event(struct fsnotify_iter_info *iter_info,
> +                                      struct fsnotify_group *group,
> +                                      const struct fs_error_report *report)
> +{
> +       struct fanotify_sb_mark *sb_mark =
> +               FANOTIFY_SB_MARK(fsnotify_iter_sb_mark(iter_info));
> +       struct fanotify_error_event *fee = sb_mark->fee_slot;
> +
> +       spin_lock(&group->notification_lock);
> +       if (fee->err_count++) {
> +               spin_unlock(&group->notification_lock);
> +               return 0;
> +       }

Please add a comment explaining why this logic runs before merge()/insert().

> +       spin_unlock(&group->notification_lock);
> +
> +       fee->fae.type = FANOTIFY_EVENT_TYPE_FS_ERROR;
> +
> +       if (fsnotify_insert_event(group, &fee->fae.fse,
> +                                 NULL, fanotify_insert_error_event)) {
> +               /*
> +                *  Even if an error occurred, an overflow event is
> +                *  queued. Just reset the error count and succeed.
> +                */
> +               spin_lock(&group->notification_lock);
> +               fanotify_reset_error_slot(fee);
> +               spin_unlock(&group->notification_lock);

This feels racy.
I think that fanotify_reset_error_slot() should WARN when asked
to reset a queued error event, and here we need to check that
fee was not queued while we dropped the lock.

I am also not convinced that incrementing err_count while the
lock is dropped is correct.
I need to see the commentary first.
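
To make the concern concrete, here is a minimal user-space sketch of a
reset that refuses to touch a queued slot. It is an illustration under
assumptions, not a proposed implementation: struct err_slot, the queued
flag, slot_reset_checked() and slot_insert_failed() are hypothetical, and
the way the kernel would actually detect a queued event is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the error slot.  'queued' stands in for
 * "this event is currently on the notification queue".
 */
struct err_slot {
	uint32_t err_count;
	bool     queued;
};

/*
 * Reset the slot only if it is not queued.  Returns false, which is
 * where the kernel side might WARN, when asked to reset a queued slot.
 * The caller is assumed to hold the lock that protects the slot.
 */
static bool slot_reset_checked(struct err_slot *slot)
{
	if (slot->queued)
		return false;

	slot->err_count = 0;
	return true;
}

/*
 * Failure path of the insert.  The lock was dropped after err_count
 * was bumped, so the slot state is checked again before resetting.
 */
static void slot_insert_failed(struct err_slot *slot)
{
	/* the lock would be re-taken here */
	(void)slot_reset_checked(slot);
	/* the lock would be dropped here */
}
```

With this shape, a slot that was queued while the lock was dropped keeps
its count, and only a slot that is still unqueued goes back to the free
state.
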

> +       }
> +
> +       return 0;
> +}
> +
>  /*
>   * Add an event to hash table for faster merge.
>   */
> @@ -762,7 +811,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>         BUILD_BUG_ON(FAN_OPEN_EXEC_PERM != FS_OPEN_EXEC_PERM);
>         BUILD_BUG_ON(FAN_FS_ERROR != FS_ERROR);
>
> -       BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
> +       BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 20);
>
>         mask = fanotify_group_event_mask(group, iter_info, mask, data,
>                                          data_type, dir);
> @@ -787,6 +836,9 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>                         return 0;
>         }
>
> +       if (fanotify_is_error_event(mask))
> +               return fanotify_handle_error_event(iter_info, group, data);
> +
>         event = fanotify_alloc_event(group, mask, data, data_type, dir,
>                                      file_name, &fsid);
>         ret = -ENOMEM;
> @@ -857,10 +909,13 @@ static void fanotify_free_name_event(struct fanotify_event *event)
>
>  static void fanotify_free_error_event(struct fanotify_event *event)
>  {
> +       struct fanotify_error_event *fee = FANOTIFY_EE(event);
> +
>         /*
>          * The actual event is tied to a mark, and is released on mark
>          * removal
>          */
> +       fsnotify_put_mark(&fee->sb_mark->fsn_mark);
>  }
>
>  static void fanotify_free_event(struct fsnotify_event *fsn_event)
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index 3f03333df32f..eeb4a85af74e 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -220,6 +220,8 @@ FANOTIFY_NE(struct fanotify_event *event)
>
>  struct fanotify_error_event {
>         struct fanotify_event fae;
> +       u32 err_count; /* Suppressed errors count */
> +
>         struct fanotify_sb_mark *sb_mark; /* Back reference to the mark. */
>  };
>
> @@ -320,6 +322,11 @@ static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
>         return container_of(fse, struct fanotify_event, fse);
>  }
>
> +static inline bool fanotify_is_error_event(u32 mask)
> +{
> +       return mask & FAN_FS_ERROR;
> +}
> +
>  static inline bool fanotify_event_has_path(struct fanotify_event *event)
>  {
>         return event->type == FANOTIFY_EVENT_TYPE_PATH ||
> @@ -349,6 +356,7 @@ static inline struct path *fanotify_event_path(struct fanotify_event *event)
>  static inline bool fanotify_is_hashed_event(u32 mask)
>  {
>         return !(fanotify_is_perm_event(mask) ||
> +                fanotify_is_error_event(mask) ||
>                  fsnotify_is_overflow_event(mask));
>  }
>
> @@ -358,3 +366,16 @@ static inline unsigned int fanotify_event_hash_bucket(
>  {
>         return event->hash & FANOTIFY_HTABLE_MASK;
>  }
> +
> +/*
> + * Reset the FAN_FS_ERROR event slot
> + *
> + * This is used to restore the error event slot to a zeroed state,
> + * where it can be used for a new incoming error.  It does not
> + * initialize the event, but clear only the required data to free the
> + * slot.
> + */
> +static inline void fanotify_reset_error_slot(struct fanotify_error_event *fee)
> +{
> +       fee->err_count = 0;

Makes sense that it should also zero the error field. No?


> +}
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index b77030386d7f..3fff0c994dc8 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -167,6 +167,19 @@ static void fanotify_unhash_event(struct fsnotify_group *group,
>         hlist_del_init(&event->merge_list);
>  }
>
> +static struct fanotify_event *fanotify_dup_error_to_stack(
> +                               struct fanotify_error_event *fee,
> +                               struct fanotify_error_event *error_on_stack)
> +{
> +       fanotify_init_event(&error_on_stack->fae, 0, FS_ERROR);
> +
> +       error_on_stack->fae.type = FANOTIFY_EVENT_TYPE_FS_ERROR;
> +       error_on_stack->err_count = fee->err_count;
> +       error_on_stack->sb_mark = fee->sb_mark;
> +
> +       return &error_on_stack->fae;
> +}
> +
>  /*
>   * Get an fanotify notification event if one exists and is small
>   * enough to fit in "count". Return an error pointer if the count
> @@ -174,7 +187,9 @@ static void fanotify_unhash_event(struct fsnotify_group *group,
>   * updated accordingly.
>   */
>  static struct fanotify_event *get_one_event(struct fsnotify_group *group,
> -                                           size_t count)
> +                                   size_t count,
> +                                   struct fanotify_error_event *error_on_stack)
> +
>  {
>         size_t event_size;
>         struct fanotify_event *event = NULL;
> @@ -205,6 +220,16 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
>                 FANOTIFY_PERM(event)->state = FAN_EVENT_REPORTED;
>         if (fanotify_is_hashed_event(event->mask))
>                 fanotify_unhash_event(group, event);
> +
> +       if (fanotify_is_error_event(event->mask)) {
> +               /*
> +                * Error events are returned as a copy of the error
> +                * slot.  The actual error slot is reused.
> +                */
> +               fanotify_dup_error_to_stack(FANOTIFY_EE(event), error_on_stack);
> +               fanotify_reset_error_slot(FANOTIFY_EE(event));
> +               event = &error_on_stack->fae;
> +       }
>  out:
>         spin_unlock(&group->notification_lock);
>         return event;
> @@ -564,6 +589,7 @@ static __poll_t fanotify_poll(struct file *file, poll_table *wait)
>  static ssize_t fanotify_read(struct file *file, char __user *buf,
>                              size_t count, loff_t *pos)
>  {
> +       struct fanotify_error_event error_on_stack;
>         struct fsnotify_group *group;
>         struct fanotify_event *event;
>         char __user *start;
> @@ -582,7 +608,7 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
>                  * in case there are lots of available events.
>                  */
>                 cond_resched();
> -               event = get_one_event(group, count);
> +               event = get_one_event(group, count, &error_on_stack);
>                 if (IS_ERR(event)) {
>                         ret = PTR_ERR(event);
>                         break;
> @@ -1031,6 +1057,10 @@ static int fanotify_add_mark(struct fsnotify_group *group,
>                         fanotify_init_event(&fee->fae, 0, FS_ERROR);
>                         fee->sb_mark = sb_mark;
>                         sb_mark->fee_slot = fee;
> +
> +                       /* Mark the error slot ready to receive events. */
> +                       fanotify_reset_error_slot(fee);
> +
>                 }
>         }
>
> @@ -1459,6 +1489,11 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
>                 fsid = &__fsid;
>         }
>
> +       if (mask & FAN_FS_ERROR && mark_type != FAN_MARK_FILESYSTEM) {
> +               ret = -EINVAL;
> +               goto path_put_and_out;
> +       }
> +

Please split this out into the wire-up patch.
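
For reference, the rule enforced by the hunk above can be sketched as a
standalone check. This is only a model of the condition in the diff: the
DEMO_* constants and demo_check_fs_error_mark() are hypothetical, and
their numeric values are illustrative stand-ins rather than the UAPI
definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative stand-ins for the mark flags and the event bit. */
#define DEMO_MARK_ADD		0x001u
#define DEMO_MARK_INODE		0x000u
#define DEMO_MARK_MOUNT		0x010u
#define DEMO_MARK_FILESYSTEM	0x100u
#define DEMO_MARK_TYPE_BITS	(DEMO_MARK_INODE | DEMO_MARK_MOUNT | \
				 DEMO_MARK_FILESYSTEM)
#define DEMO_FS_ERROR		0x8000ull

/*
 * Mirror of the check in the diff: the error event may only be
 * requested together with a filesystem-wide mark.
 * Returns 0 if the combination is acceptable, -EINVAL otherwise.
 */
static int demo_check_fs_error_mark(unsigned int flags, uint64_t mask)
{
	unsigned int mark_type = flags & DEMO_MARK_TYPE_BITS;

	if ((mask & DEMO_FS_ERROR) && mark_type != DEMO_MARK_FILESYSTEM)
		return -EINVAL;

	return 0;
}
```

Only the filesystem mark type passes when the error bit is set; masks
without the error bit are not affected by this check.
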

>         /* inode held in place by reference to path; group by fget on fd */
>         if (mark_type == FAN_MARK_INODE)
>                 inode = path.dentry->d_inode;
> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> index c05d45bde8b8..c4d49308b2d0 100644
> --- a/include/linux/fanotify.h
> +++ b/include/linux/fanotify.h
> @@ -88,9 +88,13 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
>  #define FANOTIFY_INODE_EVENTS  (FANOTIFY_DIRENT_EVENTS | \
>                                  FAN_ATTRIB | FAN_MOVE_SELF | FAN_DELETE_SELF)
>
> +/* Events that can only be reported with data type FSNOTIFY_EVENT_ERROR */
> +#define FANOTIFY_ERROR_EVENTS  (FAN_FS_ERROR)
> +
>  /* Events that user can request to be notified on */
>  #define FANOTIFY_EVENTS                (FANOTIFY_PATH_EVENTS | \
> -                                FANOTIFY_INODE_EVENTS)
> +                                FANOTIFY_INODE_EVENTS | \
> +                                FANOTIFY_ERROR_EVENTS)
>

Please split this out into the wire-up patch as well.

Thanks,
Amir.
