linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: NeilBrown <neilb@suse.de>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 14/20] VFS/namei: add 'inode' arg to put_link().
Date: Sat, 18 Apr 2015 09:09:16 +0100	[thread overview]
Message-ID: <20150418080915.GG889@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20150417190910.GD889@ZenIV.linux.org.uk>

On Fri, Apr 17, 2015 at 08:09:10PM +0100, Al Viro wrote:
> On Fri, Apr 17, 2015 at 05:25:36PM +0100, Al Viro wrote:
> > On Mon, Mar 23, 2015 at 01:37:40PM +1100, NeilBrown wrote:
> > > @@ -1669,13 +1669,14 @@ static inline int nested_symlink(struct path *path, struct nameidata *nd)
> > >  
> > >  	do {
> > >  		struct path link = *path;
> > > +		struct inode *inode = link.dentry->d_inode;
> > >  		void *cookie;
> > >  
> > >  		res = follow_link(&link, nd, &cookie);
> > >  		if (res)
> > >  			break;
> > >  		res = walk_component(nd, path, LOOKUP_FOLLOW);
> > > -		put_link(nd, &link, cookie);
> > > +		put_link(nd, &link, inode, cookie);
> > >  	} while (res > 0);
> > 
> > That's really unpleasant - it means increased stack footprint in the
> > recursion.
> > 
> > Damn, maybe it's time to bite the bullet and kill the recursion completely...
> > 
> > What do we really need to save across the recursive call?
> > 	* how far did we get in the previous pathname
> > 	* data needed for put_link:
> > 		cookie
> > 		link body
> > 		dentry of link
> > 		vfsmount (to pin containing fs; non-RCU) or inode (RCU)
> > 
> > We are already saving link body in nameidata, so we could fatten that array.
> > It would allow flattening link_path_walk() completely - instead of
> > recursive call we would just save what needed saving and jump to the beginning
> > and on exits we'd check the depth and either return or restore the saved state
> > and jump back to just past the place where recursive call used to be.
> > It would even save quite a bit of space in the worst case.  However, it would
> > blow the stack footprint in normal cases *and* blow it even worse for the
> > things that need two struct nameidata instances at once (rename(), basically).
> > 5 pointers instead of 1 pointer per level - extra 32 words on stack, i.e.
> > extra 256 bytes on 64bit.  Extra 0.5Kb of stack footprint on rename() is
> > probably too much, especially since this "saved" stuff from its two nameidata
> > instances will never be used at the same time...
> > 
> > Alternatively, we could just allocate about a page worth of an array when
> > the depth of nesting goes beyond 2 and put this saved stuff there - at
> > 5 pointers per level it would completely dispose of the depth of nesting
> > limit, giving us uniform "can't traverse more than 40 symlinks per pathname
> > resolution".  40 * 5 * sizeof(pointer) is what, at most 1600 bytes?  So
> > even half a page would suffice for that quite comfortably...
> > 
> > The question is whether we'll be able to avoid blowing the I-cache footprint
> > of link_path_walk() to hell while doing that; it feels like we should be,
> > but we'll have to see how well does that work in reality...
> > 
> > I'll try to implement that (with your #3..#7 as the first steps) and see
> > how well does it work; it's obviously the next cycle fodder, but hopefully
> > in testable shape by -rc2...
> 
> 	Hmm...  Actually, right now we have 192 bytes of stack footprint per
> nesting level (amd64 allmodconfig).  Which means that simply making the
> array fatter would give a clean benefit at the 3rd level of recursion (symlink
> encountered while traversing a symlink) for everything other than rename()...
> allnoconfig+64bit gives 160 bytes per level, with the same breakeven point.
> 
> 	Interesting...  It might even make sense to separate that array from
> struct nameidata and solve the rename() problem that way (current->nameidata
> would be replaced with pointer to that sucker in such variant, of course, and
> ->depth would move there).  In that variant we do not get rid of nesting limit
> completely, but it would probably be simpler than the one above...

	OK, right now in my tree recursion is gone, it seems to survive the
testing *and* stack footprint of link_path_walk() is 432 bytes.  Less than the
mainline with two nested symlinks, and I definitely see how to trim that down
by another 64 bytes, which would put us within a spitting distance from what
the mainline gets with a single symlink encountered in the middle of a
pathname.  It still needs more massage (link_path_walk() is ugly as hell
right now), but I see how to clean it up, and porting the rest of Neil's
RCU follow_link series on top of that shouldn't be hard.  Obviously next cycle
fodder, but if everything works out, we'll get serious stack footprint
reduction *and* not falling out of lazy pathwalk whenever we run into a symlink.

	I've dumped the current branch in vfs.git#link_path_walk; more after
I get some sleep...

  reply	other threads:[~2015-04-18  8:09 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-23  2:37 [PATCH 00/20] Support follow_link in RCU-walk - V3 NeilBrown
2015-03-23  2:37 ` [PATCH 02/20] STAGING/lustre: limit follow_link recursion using stack space NeilBrown
2015-04-18  3:01   ` Al Viro
2015-04-19 20:57     ` Andreas Dilger
2015-04-19 21:33       ` Al Viro
2015-04-20  2:29         ` Al Viro
2015-03-23  2:37 ` [PATCH 03/20] VFS: replace {, total_}link_count in task_struct with pointer to nameidata NeilBrown
2015-03-23  2:37 ` [PATCH 01/20] Documentation: remove outdated information from automount-support.txt NeilBrown
2015-03-23  2:37 ` [PATCH 10/20] security: make inode_follow_link RCU-walk aware NeilBrown
2015-03-23  2:37 ` [PATCH 04/20] ovl: rearrange ovl_follow_link to it doesn't need to call ->put_link NeilBrown
2015-03-23  2:37 ` [PATCH 07/20] VFS: remove nameidata args from ->follow_link NeilBrown
2015-03-23  2:37 ` [PATCH 06/20] SECURITY: remove nameidata arg from inode_follow_link NeilBrown
2015-03-23  2:37 ` [PATCH 05/20] VFS: replace nameidata arg to ->put_link with a char* NeilBrown
2015-03-23  2:37 ` [PATCH 09/20] security/selinux: pass 'flags' arg to avc_audit() and avc_has_perm_flags() NeilBrown
2015-03-23  2:37 ` [PATCH 11/20] VFS/namei: use terminate_walk when symlink lookup fails NeilBrown
2015-03-23  2:37 ` [PATCH 08/20] VFS: make all ->follow_link handlers aware for LOOKUP_RCU NeilBrown
2015-03-23  2:37 ` [PATCH 13/20] VFS/namei: abort RCU-walk on symlink if atime needs updating NeilBrown
2015-03-23  2:37 ` [PATCH 14/20] VFS/namei: add 'inode' arg to put_link() NeilBrown
2015-04-17 16:25   ` Al Viro
2015-04-17 19:09     ` Al Viro
2015-04-18  8:09       ` Al Viro [this message]
2015-03-23  2:37 ` [PATCH 16/20] VFS/namei: enable RCU-walk when following symlinks NeilBrown
2015-03-23  2:37 ` [PATCH 19/20] XFS: allow follow_link to often succeed in RCU-walk NeilBrown
2015-03-23  2:37 ` [PATCH 20/20] NFS: support LOOKUP_RCU in nfs_follow_link NeilBrown
2015-03-23  2:37 ` [PATCH 18/20] xfs: use RCU to free 'struct xfs_mount' NeilBrown
2015-03-23  2:37 ` [PATCH 17/20] VFS/namei: handle LOOKUP_RCU in page_follow_link_light NeilBrown
2015-03-23  2:37 ` [PATCH 12/20] VFS/namei: new flag to support RCU symlinks: LOOKUP_LINK_RCU NeilBrown
2015-03-23  2:37 ` [PATCH 15/20] VFS/namei: enhance follow_link to support RCU-walk NeilBrown
2015-03-25 23:23 ` [PATCH 00/20] Support follow_link in RCU-walk - V3 NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150418080915.GG889@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).