From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yw0-f195.google.com ([209.85.161.195]:42589 "EHLO mail-yw0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753107AbeFIIAS (ORCPT ); Sat, 9 Jun 2018 04:00:18 -0400 Received: by mail-yw0-f195.google.com with SMTP id q7-v6so4843791ywd.9 for ; Sat, 09 Jun 2018 01:00:18 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: References: <20180607150217.jq757ncqopuimbkd@quack2.suse.cz> <20180608132737.etbkpqg77yz3vhp7@quack2.suse.cz> From: Amir Goldstein Date: Sat, 9 Jun 2018 11:00:17 +0300 Message-ID: Subject: Re: [GIT PULL] Fsnotify cleanups To: Linus Torvalds Cc: Jan Kara , linux-fsdevel Content-Type: text/plain; charset="UTF-8" Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Sat, Jun 9, 2018 at 9:57 AM, Amir Goldstein wrote: > On Fri, Jun 8, 2018 at 11:34 PM, Linus Torvalds > wrote: >> On Fri, Jun 8, 2018 at 6:27 AM Jan Kara wrote: >>> >>> The alignment could be fixed by having >>> >>> struct inode { >>> ... >>> struct fsnotify_obj i_fsnotify __attribute__ ((aligned(sizeof(void *)))); >>> ... >>> } >>> >>> but that's too ugly even for my twisted taste. And before someone asks, >>> adding aligned attribute to the definition of fsnotify_obj structure makes >>> it grow the padding at the end again. >> >> Yeah, I don't think there's any way to say "this can't be an array >> member, don't pad the size out to the alignment". >> >> I do wonder if perhaps "struct fsnotify_obj" could actually have a >> mask and a 32-bit object ID instead, which would (a) avoid the >> alignment issue and (b) actually shrink the structure onm 64-bit. >> >> Obviously you'd then have to look up the pointer from the object ID >> (presumably using a hash, but maybe it would use the radix tree of >> idr_alloc() or something). >> >> I haven't looked at how often those pointers actually get looked up. >> If the 'mask' makes it fairly rare, then the extra indirection might >> be ok. I suspect it's not. But I thought I'd mention it as a possible >> approach. > > As a matter of fact, the pointer is checked to be non zero before > accessing the mask: > > 7c49b8616460 fs/notify: optimize inotify/fsnotify code for unwatched files > > So the mask is not really needed to optimize away unwatched inodes. > I guess the mask is needed to optimize away use cases like frequent > fsnotify_access() calls on inodes that are watched for modification only, > but not sure that is worth optimizing. > >> >> The other approach is - as you say - to move 'mask' to be behind the >> pointer. But if that causes a lot of extra pointer chasing that mask >> would normally avoid, that could be really expensive too. We see that >> with the security objects: the cost of looking up whatever is behind >> i_security is really nasty, because every inode gets allocated a >> private security structure, so it's absolutely horrendous for cache >> effects (no sharing). > > Besides pointer chasing, there is also srcu_read_lock. > >> >> But I suspect that the behavior of i_fsnotify_marks is very different >> from i_security, in that most inodes probably don't have marks at all. >> No? Yes? So you might not have the nasty case of "almost all inode >> access ends up following that pointer and it's horrible in the cache". >> > > Yes and No :-) > Yes, most inodes don't have marks at all, > but, every fsnotify() call may need to follow mnt->mnt_fsnotify.marks > if mount is watched and there is no mnt->mnt_fsnotify.mask in fsnotify_obj. > > Sure, we don't need to worry about 32bits in struct mount, so we could > have a different attach method for inodes and mounts, but the whole > purpose of this cleanup is to make the code more generic in handing > and attaching of marks to inode/mounts. > > How about if we move mask into connector, but also leave a shadow mask, > in an "inherited" struct i.e.: > > /* struct to embed in objects, which marks can be attached to */ > struct fsnotify_obj { > struct fsnotify_mark_connector __rcu *marks; > }; > > /* struct to embed in objects, which marks can be attached to with > shadow mask */ > struct fsnotify_obj_mask { > struct fsnotify_obj obj; > /* all events this object cares about */ > __u32 mask; > }; > > We embed fsnotify_obj in struct inode and fsnotify_obj_mask in struct mount. > > The only code that needs to test the shadow mask before following into > connector is the optimization code in start of fsnotify() which was not > generalized anyway and where the mask optimization is more important > for mount watches. > No. That's not true, there is also fsnotify_inode_watches_children() and __fsnotify_parent(). But I think the only bits in the mask that really matter for optimizing inode watches are FS_ACCESS and FS_EVENT_ON_CHILD and those could be encoded in lower bits of connector pointer. connector is pointer is always dereferenced via fsnotify_grab_connector(), so it's easy to mask out those bits. > The only code that sets the mask is __fsnotify_recalc_mask(), needs to > set the shadow mask, if conn->type != FSNOTIFY_OBJ_TYPE_INODE. > We already special case FSNOTIFY_OBJ_TYPE_INODE for pinning inode > in several places, so there is nothing new here. > So instead, __fsnotify_recalc_mask() needs to set the bits on connector pointer in FSNOTIFY_OBJ_TYPE_INODE case. I'll try to sketch some patches. Thanks, Amir.