All of lore.kernel.org
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>, linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [GIT PULL] Fsnotify cleanups
Date: Sat, 9 Jun 2018 11:00:17 +0300	[thread overview]
Message-ID: <CAOQ4uxiGom8qdMA+8Gy+LHhR8fF8fX5nHbYRHHxjDgou6BxYiA@mail.gmail.com> (raw)
In-Reply-To: <CAOQ4uxhKkJ=4x4UbcgZRci-qJ5XrY0NmNYfR6zyoRZH4UzQGaQ@mail.gmail.com>

On Sat, Jun 9, 2018 at 9:57 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, Jun 8, 2018 at 11:34 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Fri, Jun 8, 2018 at 6:27 AM Jan Kara <jack@suse.cz> wrote:
>>>
>>> The alignment could be fixed by having
>>>
>>> struct inode {
>>>         ...
>>>         struct fsnotify_obj i_fsnotify __attribute__ ((aligned(sizeof(void *))));
>>>         ...
>>> }
>>>
>>> but that's too ugly even for my twisted taste. And before someone asks,
>>> adding aligned attribute to the definition of fsnotify_obj structure makes
>>> it grow the padding at the end again.
>>
>> Yeah, I don't think there's any way to say "this can't be an array
>> member, don't pad the size out to the alignment".
>>
>> I do wonder if perhaps "struct fsnotify_obj" could actually have a
>> mask and a 32-bit object ID instead, which would (a) avoid the
>> alignment issue and (b) actually shrink the structure onm 64-bit.
>>
>> Obviously you'd then have to look up the pointer from the object ID
>> (presumably using a hash, but maybe it would use the radix tree of
>> idr_alloc() or something).
>>
>> I haven't looked at how often those pointers actually get looked up.
>> If the 'mask' makes it fairly rare, then the extra indirection might
>> be ok. I suspect it's not. But I thought I'd mention it as a possible
>> approach.
>
> As a matter of fact, the pointer is checked to be non zero before
> accessing the mask:
>
> 7c49b8616460 fs/notify: optimize inotify/fsnotify code for unwatched files
>
> So the mask is not really needed to optimize away unwatched inodes.
> I guess the mask is needed to optimize away use cases like frequent
> fsnotify_access() calls on inodes that are watched for modification only,
> but not sure that is worth optimizing.
>
>>
>> The other approach is - as you say - to move 'mask' to be behind the
>> pointer. But if that causes a lot of extra pointer chasing that mask
>> would normally avoid, that could be really expensive too. We see that
>> with the security objects: the cost of looking up whatever is behind
>> i_security is really nasty, because every inode gets allocated a
>> private security structure, so it's absolutely horrendous for cache
>> effects (no sharing).
>
> Besides pointer chasing, there is also srcu_read_lock.
>
>>
>> But I suspect that the behavior of i_fsnotify_marks is very different
>> from i_security, in that most inodes probably don't have marks at all.
>> No? Yes? So you might not have the nasty case of "almost all inode
>> access ends up following that pointer and it's horrible in the cache".
>>
>
> Yes and No :-)
> Yes, most inodes don't have marks at all,
> but, every fsnotify() call may need to follow mnt->mnt_fsnotify.marks
> if mount is watched and there is no mnt->mnt_fsnotify.mask in fsnotify_obj.
>
> Sure, we don't need to worry about 32bits in struct mount, so we could
> have a different attach method for inodes and mounts, but the whole
> purpose of this cleanup is to make the code more generic in handing
> and attaching of marks to inode/mounts.
>
> How about if we move mask into connector, but also leave a shadow mask,
> in an "inherited" struct i.e.:
>
> /* struct to embed in objects, which marks can be attached to */
> struct fsnotify_obj {
>        struct fsnotify_mark_connector __rcu *marks;
> };
>
> /* struct to embed in objects, which marks can be attached to with
> shadow mask */
> struct fsnotify_obj_mask {
>        struct fsnotify_obj obj;
>        /* all events this object cares about */
>        __u32 mask;
> };
>
> We embed fsnotify_obj in struct inode and fsnotify_obj_mask in struct mount.
>
> The only code that needs to test the shadow mask before following into
> connector is the optimization code in start of fsnotify() which was not
> generalized anyway and where the mask optimization is more important
> for mount watches.
>

No. That's not true, there is also fsnotify_inode_watches_children() and
__fsnotify_parent(). But I think the only bits in the mask that really matter
for optimizing inode watches are FS_ACCESS and FS_EVENT_ON_CHILD
and those could be encoded in lower bits of connector pointer.
connector is pointer is always dereferenced via fsnotify_grab_connector(),
so it's easy to mask out those bits.

> The only code that sets the mask is __fsnotify_recalc_mask(), needs to
> set the shadow mask, if conn->type != FSNOTIFY_OBJ_TYPE_INODE.
> We already special case FSNOTIFY_OBJ_TYPE_INODE for pinning inode
> in several places, so there is nothing new here.
>

So instead, __fsnotify_recalc_mask() needs to set the bits on connector
pointer in FSNOTIFY_OBJ_TYPE_INODE case.

I'll try to sketch some patches.

Thanks,
Amir.

  reply	other threads:[~2018-06-09  8:00 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-07 15:02 [GIT PULL] Fsnotify cleanups Jan Kara
2018-06-07 16:34 ` Linus Torvalds
2018-06-08 13:27   ` Jan Kara
2018-06-08 20:34     ` Linus Torvalds
2018-06-09  6:57       ` Amir Goldstein
2018-06-09  8:00         ` Amir Goldstein [this message]
2018-06-11 16:12           ` Jan Kara
2018-06-11 16:31             ` Amir Goldstein
2018-06-09 17:30         ` Linus Torvalds
2018-06-09 18:46           ` Amir Goldstein
2018-06-10 17:49             ` Amir Goldstein
2018-06-11 13:36               ` Jan Kara
2018-06-11 13:58                 ` Amir Goldstein
2018-06-11 16:03                   ` Jan Kara
2018-06-11 16:38                     ` Amir Goldstein
2018-06-11 19:51                       ` Amir Goldstein
2018-06-13 13:21                         ` Jan Kara
2018-06-13 13:56                           ` Amir Goldstein
2018-06-13 22:17                             ` Amir Goldstein
2018-06-22 16:44                               ` Jan Kara
2018-06-23  7:42                                 ` Amir Goldstein
2018-06-11 11:08       ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOQ4uxiGom8qdMA+8Gy+LHhR8fF8fX5nHbYRHHxjDgou6BxYiA@mail.gmail.com \
    --to=amir73il@gmail.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.