linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Jan Kara <jack@suse.cz>, linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 5/7] fanotify: limit number of event merge attempts
Date: Wed, 15 Sep 2021 18:33:34 +0200	[thread overview]
Message-ID: <20210915163334.GD6166@quack2.suse.cz> (raw)
In-Reply-To: <CAOQ4uxi7MRV-6PxyVovaR83sLWX8mZpiOM9OjdUqOHvZM9h2Wg@mail.gmail.com>

On Wed 15-09-21 15:39:13, Amir Goldstein wrote:
> On Mon, Mar 1, 2021 at 3:08 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Sat 27-02-21 10:31:52, Amir Goldstein wrote:
> > > On Tue, Feb 2, 2021 at 6:20 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > Event merges are expensive when event queue size is large.
> > > > Limit the linear search to 128 merge tests.
> > > > In combination with 128 hash lists, there is a potential to
> > > > merge with up to 16K events in the hashed queue.
> > > >
> > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > ---
> > > >  fs/notify/fanotify/fanotify.c | 6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> > > > index 12df6957e4d8..6d3807012851 100644
> > > > --- a/fs/notify/fanotify/fanotify.c
> > > > +++ b/fs/notify/fanotify/fanotify.c
> > > > @@ -129,11 +129,15 @@ static bool fanotify_should_merge(struct fsnotify_event *old_fsn,
> > > >         return false;
> > > >  }
> > > >
> > > > +/* Limit event merges to limit CPU overhead per event */
> > > > +#define FANOTIFY_MAX_MERGE_EVENTS 128
> > > > +
> > > >  /* and the list better be locked by something too! */
> > > >  static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
> > > >  {
> > > >         struct fsnotify_event *test_event;
> > > >         struct fanotify_event *new;
> > > > +       int i = 0;
> > > >
> > > >         pr_debug("%s: list=%p event=%p\n", __func__, list, event);
> > > >         new = FANOTIFY_E(event);
> > > > @@ -147,6 +151,8 @@ static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
> > > >                 return 0;
> > > >
> > > >         list_for_each_entry_reverse(test_event, list, list) {
> > > > +               if (++i > FANOTIFY_MAX_MERGE_EVENTS)
> > > > +                       break;
> > > >                 if (fanotify_should_merge(test_event, event)) {
> > > >                         FANOTIFY_E(test_event)->mask |= new->mask;
> > > >                         return 1;
> > > > --
> > > > 2.25.1
> > > >
> > >
> > > Jan,
> > >
> > > I was thinking that this patch or a variant thereof should be applied to stable
> > > kernels, but not the entire series.
> > >
> > > OTOH, I am concerned about regressing existing workloads that depend on
> > > merging events on more than 128 inodes.
> >
> > Honestly, I don't think pushing anything to stable for this is really worth
> > it.
> >
> > 1) fanotify() is limited to CAP_SYS_ADMIN (in init namespace) so this is
> > hardly a security issue.
> >
> > 2) We have cond_resched() in the merge code now so the kernel doesn't
> > lockup anymore. So this is only about fanotify becoming slow if you have
> > lots of events.
> >
> > 3) I haven't heard any complaints since we've added the cond_resched()
> > patch so the performance issue seems to be really rare.
> >
> > If I get complaits from real users about this, we can easily reconsider, it
> > is not a big deal. But I just don't think preemptive action is warranted...
> >
> 
> Hi Jan,
> 
> I know you have some catching up to do, but applying this patch to stable
> has become a priority for me.
> It was a mistake on my part not to push harder 6 months ago, so I am trying
> to rectify this mistake now as soon as possible.
> 
> To answer your arguments against preemptive action:
> 1) My application has CAP_SYS_ADMIN, it is not malicious, but it cannot
>     do its job without taking up more CPU that it needs to, because bursts of
>     events will cause the event queue to grow to thousands of events and
>     fanotify_merge() will become a high CPU consumer
> 2) It's not only about fanotify becoming slow, it's about fanotify making the
>      entire system slow and as a result it takes a long time for the system
>      to recover from this condition
> 3) You haven't heard any complains because nobody was using sb mark
>     We have been using sb mark in production for a few years and carry
>     this patch in our kernel, so I can say for certain that sb mark on a fs with
>     heavy workload is disturbing the entire system without this patch.
> 
> I don't think that "regressing" the number of merged event is a big issue,
> as we never guaranteed any specific merge behavior and the behavior
> was hard enough to predict, so I don't think applications could have
> relied on it.
> 
> So, are you ok with me sending this patch to stable as is?

Sure, go ahead. I was not strongly against pushing this to stable, I just
didn't see good reason to do that but your arguments make sense - you count
as a user report I was waiting for ;).

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2021-09-15 16:33 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-02 16:20 [PATCH 0/7] Performance improvement for fanotify merge Amir Goldstein
2021-02-02 16:20 ` [PATCH 1/7] fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue Amir Goldstein
2021-02-02 16:20 ` [PATCH 2/7] fsnotify: support hashed notification queue Amir Goldstein
2021-02-16 15:02   ` Jan Kara
2021-02-17 12:33     ` Amir Goldstein
2021-02-17 13:48       ` Jan Kara
2021-02-17 15:42         ` Amir Goldstein
2021-02-17 16:49           ` Jan Kara
2021-02-18 10:52           ` Amir Goldstein
2021-02-02 16:20 ` [PATCH 3/7] fsnotify: read events from hashed notification queue by order of insertion Amir Goldstein
2021-02-16 15:10   ` Jan Kara
2021-02-02 16:20 ` [PATCH 4/7] fanotify: enable hashed notification queue for FAN_CLASS_NOTIF groups Amir Goldstein
2021-02-02 16:20 ` [PATCH 5/7] fanotify: limit number of event merge attempts Amir Goldstein
2021-02-27  8:31   ` Amir Goldstein
2021-03-01 13:08     ` Jan Kara
2021-03-01 13:58       ` Amir Goldstein
2021-09-15 12:39       ` Amir Goldstein
2021-09-15 16:33         ` Jan Kara [this message]
2021-02-02 16:20 ` [PATCH 6/7] fanotify: mix event info into merge key hash Amir Goldstein
2021-02-16 15:39   ` Jan Kara
2021-02-17 10:13     ` Amir Goldstein
2021-02-18 10:46       ` Amir Goldstein
2021-02-18 11:11         ` Jan Kara
2021-02-18 12:17           ` Amir Goldstein
2021-02-02 16:20 ` [PATCH 7/7] fsnotify: print some debug stats on hashed queue overflow Amir Goldstein
2021-02-16 16:02 ` [PATCH 0/7] Performance improvement for fanotify merge Jan Kara
2021-02-17 10:52   ` Amir Goldstein
2021-02-17 11:25     ` Jan Kara
2021-02-18 10:56       ` Amir Goldstein
2021-02-18 11:15         ` Jan Kara
2021-02-18 12:35           ` Amir Goldstein
2021-02-19 10:15             ` Jan Kara
2021-02-19 10:21               ` Jan Kara
2021-02-19 13:38                 ` Amir Goldstein
2021-02-21 12:53                   ` Amir Goldstein
2021-02-22  9:29                     ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210915163334.GD6166@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=amir73il@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).