linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>, linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [GIT PULL] Fsnotify cleanups
Date: Sat, 9 Jun 2018 11:00:17 +0300	[thread overview]
Message-ID: <CAOQ4uxiGom8qdMA+8Gy+LHhR8fF8fX5nHbYRHHxjDgou6BxYiA@mail.gmail.com> (raw)
In-Reply-To: <CAOQ4uxhKkJ=4x4UbcgZRci-qJ5XrY0NmNYfR6zyoRZH4UzQGaQ@mail.gmail.com>

On Sat, Jun 9, 2018 at 9:57 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Fri, Jun 8, 2018 at 11:34 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Fri, Jun 8, 2018 at 6:27 AM Jan Kara <jack@suse.cz> wrote:
>>>
>>> The alignment could be fixed by having
>>>
>>> struct inode {
>>>         ...
>>>         struct fsnotify_obj i_fsnotify __attribute__ ((aligned(sizeof(void *))));
>>>         ...
>>> }
>>>
>>> but that's too ugly even for my twisted taste. And before someone asks,
>>> adding aligned attribute to the definition of fsnotify_obj structure makes
>>> it grow the padding at the end again.
>>
>> Yeah, I don't think there's any way to say "this can't be an array
>> member, don't pad the size out to the alignment".
>>
>> I do wonder if perhaps "struct fsnotify_obj" could actually have a
>> mask and a 32-bit object ID instead, which would (a) avoid the
>> alignment issue and (b) actually shrink the structure onm 64-bit.
>>
>> Obviously you'd then have to look up the pointer from the object ID
>> (presumably using a hash, but maybe it would use the radix tree of
>> idr_alloc() or something).
>>
>> I haven't looked at how often those pointers actually get looked up.
>> If the 'mask' makes it fairly rare, then the extra indirection might
>> be ok. I suspect it's not. But I thought I'd mention it as a possible
>> approach.
>
> As a matter of fact, the pointer is checked to be non zero before
> accessing the mask:
>
> 7c49b8616460 fs/notify: optimize inotify/fsnotify code for unwatched files
>
> So the mask is not really needed to optimize away unwatched inodes.
> I guess the mask is needed to optimize away use cases like frequent
> fsnotify_access() calls on inodes that are watched for modification only,
> but not sure that is worth optimizing.
>
>>
>> The other approach is - as you say - to move 'mask' to be behind the
>> pointer. But if that causes a lot of extra pointer chasing that mask
>> would normally avoid, that could be really expensive too. We see that
>> with the security objects: the cost of looking up whatever is behind
>> i_security is really nasty, because every inode gets allocated a
>> private security structure, so it's absolutely horrendous for cache
>> effects (no sharing).
>
> Besides pointer chasing, there is also srcu_read_lock.
>
>>
>> But I suspect that the behavior of i_fsnotify_marks is very different
>> from i_security, in that most inodes probably don't have marks at all.
>> No? Yes? So you might not have the nasty case of "almost all inode
>> access ends up following that pointer and it's horrible in the cache".
>>
>
> Yes and No :-)
> Yes, most inodes don't have marks at all,
> but, every fsnotify() call may need to follow mnt->mnt_fsnotify.marks
> if mount is watched and there is no mnt->mnt_fsnotify.mask in fsnotify_obj.
>
> Sure, we don't need to worry about 32bits in struct mount, so we could
> have a different attach method for inodes and mounts, but the whole
> purpose of this cleanup is to make the code more generic in handing
> and attaching of marks to inode/mounts.
>
> How about if we move mask into connector, but also leave a shadow mask,
> in an "inherited" struct i.e.:
>
> /* struct to embed in objects, which marks can be attached to */
> struct fsnotify_obj {
>        struct fsnotify_mark_connector __rcu *marks;
> };
>
> /* struct to embed in objects, which marks can be attached to with
> shadow mask */
> struct fsnotify_obj_mask {
>        struct fsnotify_obj obj;
>        /* all events this object cares about */
>        __u32 mask;
> };
>
> We embed fsnotify_obj in struct inode and fsnotify_obj_mask in struct mount.
>
> The only code that needs to test the shadow mask before following into
> connector is the optimization code in start of fsnotify() which was not
> generalized anyway and where the mask optimization is more important
> for mount watches.
>

No. That's not true, there is also fsnotify_inode_watches_children() and
__fsnotify_parent(). But I think the only bits in the mask that really matter
for optimizing inode watches are FS_ACCESS and FS_EVENT_ON_CHILD
and those could be encoded in lower bits of connector pointer.
connector is pointer is always dereferenced via fsnotify_grab_connector(),
so it's easy to mask out those bits.

> The only code that sets the mask is __fsnotify_recalc_mask(), needs to
> set the shadow mask, if conn->type != FSNOTIFY_OBJ_TYPE_INODE.
> We already special case FSNOTIFY_OBJ_TYPE_INODE for pinning inode
> in several places, so there is nothing new here.
>

So instead, __fsnotify_recalc_mask() needs to set the bits on connector
pointer in FSNOTIFY_OBJ_TYPE_INODE case.

I'll try to sketch some patches.

Thanks,
Amir.

  reply	other threads:[~2018-06-09  8:00 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-07 15:02 [GIT PULL] Fsnotify cleanups Jan Kara
2018-06-07 16:34 ` Linus Torvalds
2018-06-08 13:27   ` Jan Kara
2018-06-08 20:34     ` Linus Torvalds
2018-06-09  6:57       ` Amir Goldstein
2018-06-09  8:00         ` Amir Goldstein [this message]
2018-06-11 16:12           ` Jan Kara
2018-06-11 16:31             ` Amir Goldstein
2018-06-09 17:30         ` Linus Torvalds
2018-06-09 18:46           ` Amir Goldstein
2018-06-10 17:49             ` Amir Goldstein
2018-06-11 13:36               ` Jan Kara
2018-06-11 13:58                 ` Amir Goldstein
2018-06-11 16:03                   ` Jan Kara
2018-06-11 16:38                     ` Amir Goldstein
2018-06-11 19:51                       ` Amir Goldstein
2018-06-13 13:21                         ` Jan Kara
2018-06-13 13:56                           ` Amir Goldstein
2018-06-13 22:17                             ` Amir Goldstein
2018-06-22 16:44                               ` Jan Kara
2018-06-23  7:42                                 ` Amir Goldstein
2018-06-11 11:08       ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOQ4uxiGom8qdMA+8Gy+LHhR8fF8fX5nHbYRHHxjDgou6BxYiA@mail.gmail.com \
    --to=amir73il@gmail.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).