All of lore.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Ian Kent <raven@themaw.net>
Subject: Re: [git pull] vfs fixes
Date: Mon, 03 Apr 2017 01:00:56 -0500	[thread overview]
Message-ID: <87inmmvxlz.fsf@xmission.com> (raw)
In-Reply-To: <20170403022059.GM29622@ZenIV.linux.org.uk> (Al Viro's message of "Mon, 3 Apr 2017 03:21:00 +0100")

Al Viro <viro@ZenIV.linux.org.uk> writes:

> On Sun, Apr 02, 2017 at 05:58:41PM -0700, Linus Torvalds wrote:
>
>> I had to go and double-check that "DCACHE_DIRECTORY_TYPE" is what
>> d_can_lookup() actually checks, so _that_ part is perhaps a bit
>> subtle, and might be worth noting in that comment that you edited.
>> 
>> So the real "rule" ends up being that we only ever look up things from
>> dentries of type DCACHE_DIRECTORY_TYPE set, and those had better have
>> DCACHE_RCUACCESS bit set.
>> 
>> And the only reason path_init() only checks it for that case is that
>> nd->root and nd->pwd both have to be of type d_can_lookup().
>> 
>> Do we check that when we set it? I hope/assume we do.
>
> For chdir()/chroot()/pivot_root() it's done by LOOKUP_DIRECTORY in lookup
> flags; fchdir() is slightly different - there we check S_ISDIR of inode
> of opened file.  Which is almost the same thing, except for
> kinda-sorta directories that have no ->lookup() - we have them for
> NFS referral points.  It should be impossible to end up with
> one of those opened - not even with O_PATH; follow_managed() will be called
> and we'll either fail or cross into whatever ends up overmounting them.
> Still, it might be cleaner to turn that check into
> 	d_can_lookup(f.file->f_path.dentry)
> simply for consistency sake.
>
> The thing I really don't like is mntns_install().  With sufficiently
> nasty nfsroot setup it might be possible to end up with referral point
> as one's root/pwd; getting out of such state would be interesting...
> Smells like that place should be a solitary follow_down(), not a loop
> of follow_down_one(), but I want Eric's opinion on that one; userns stuff
> is weird.

If I read the conversation correctly the concern is that we might
initialize a pwd or root with something that is almost but not quite a
directory in mntns_install.

Refereshing my memory.  d_automount mounts things and is what
is used for nfs referrals.  d_manage blocks waiting for
an automounts to complete or expire.  follow_down just calls d_manage,
follow_manage calls both d_manage and d_automount as appropriate.

If the concern is nfs referral points calling follow_down is wrong and
what is wanted is follow_managed.

The only thing that follow_down prevents is changing onto directories
that are only half mounted, and not really directories yet.  Which
is certainly part of the invarient we want to preserve.



The intent of the logic in mntns_install is to just pick a reasonable
looking place somewhere in that mount namespace to use as a root
directory.  I arbitrarily picked the top of the mount stack on "/".  Which
is typically used as the root directory.  If people really care where
their root is they save a directory file descriptor off somewhere and
call chroot.  So there is a little wiggle room exactly what the code
does.

There is a secondary use of mntns_install which is to give you a way to
access what is under "/" if you are so foolish as to umount "/".  I keep
thinking setns to your own mount namespace would be a handy way to get
back to the rootfs and to use it for something during system shutdown.
I don't know if anyone has actually used setns to your own mount
namespace for that.

The worst case I can see from the proposed change is we would
not be able to umount all of the way down to rootfs.  That
would be a self inflicted wound so I don't care.

I can't imagine anyone mounting an automount point deliberately on /
except as way to confuse the vfs.  Though I can almost imagine getting
there by accident if an automount expires.

So yes please let's change the follow_down_one loop to follow_managed
to preserve the invariant that we always have a directory that
supports d_can_lookup to pass to set_fs_pwd and set_fs_root.

Eric

> diff --git a/fs/dcache.c b/fs/dcache.c
> index 95d71eda8142..05550139a8a6 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1757,7 +1757,13 @@ static unsigned d_flags_for_inode(struct inode *inode)
>  		return DCACHE_MISS_TYPE;
>  
>  	if (S_ISDIR(inode->i_mode)) {
> -		add_flags = DCACHE_DIRECTORY_TYPE;
> +		/*
> +		 * Any potential starting point of lookup should have
> +		 * DCACHE_RCUACCESS; currently directory dentries
> +		 * come from d_alloc() anyway, but it costs us nothing
> +		 * to enforce it here.
> +		 */
> +		add_flags = DCACHE_DIRECTORY_TYPE | DCACHE_RCUACCESS;
>  		if (unlikely(!(inode->i_opflags & IOP_LOOKUP))) {
>  			if (unlikely(!inode->i_op->lookup))
>  				add_flags = DCACHE_AUTODIR_TYPE;
> diff --git a/fs/namei.c b/fs/namei.c
> index d41fab78798b..19dcf62133cc 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2145,6 +2145,9 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
>  	int retval = 0;
>  	const char *s = nd->name->name;
>  
> +	if (!*s)
> +		flags &= ~LOOKUP_RCU;
> +
>  	nd->last_type = LAST_ROOT; /* if there are only slashes... */
>  	nd->flags = flags | LOOKUP_JUMPED | LOOKUP_PARENT;
>  	nd->depth = 0;
> diff --git a/fs/namespace.c b/fs/namespace.c
> index cc1375eff88c..31ec9a79d2d4 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -3467,6 +3467,7 @@ static int mntns_install(struct nsproxy *nsproxy, struct ns_common *ns)
>  	struct fs_struct *fs = current->fs;
>  	struct mnt_namespace *mnt_ns = to_mnt_ns(ns);
>  	struct path root;
> +	int err;
>  
>  	if (!ns_capable(mnt_ns->user_ns, CAP_SYS_ADMIN) ||
>  	    !ns_capable(current_user_ns(), CAP_SYS_CHROOT) ||
> @@ -3484,15 +3485,14 @@ static int mntns_install(struct nsproxy *nsproxy, struct ns_common *ns)
>  	root.mnt    = &mnt_ns->root->mnt;
>  	root.dentry = mnt_ns->root->mnt.mnt_root;
>  	path_get(&root);
> -	while(d_mountpoint(root.dentry) && follow_down_one(&root))
> -		;
> -
> -	/* Update the pwd and root */
> -	set_fs_pwd(fs, &root);
> -	set_fs_root(fs, &root);
> -
> +	err = follow_down(&root);
> +	if (likely(!err)) {
> +		/* Update the pwd and root */
> +		set_fs_pwd(fs, &root);
> +		set_fs_root(fs, &root);
> +	}
>  	path_put(&root);
> -	return 0;
> +	return err;
>  }
>  
>  static struct user_namespace *mntns_owner(struct ns_common *ns)
> diff --git a/fs/open.c b/fs/open.c
> index 949cef29c3bb..217b5db588c8 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -459,20 +459,17 @@ SYSCALL_DEFINE1(chdir, const char __user *, filename)
>  SYSCALL_DEFINE1(fchdir, unsigned int, fd)
>  {
>  	struct fd f = fdget_raw(fd);
> -	struct inode *inode;
>  	int error = -EBADF;
>  
>  	error = -EBADF;
>  	if (!f.file)
>  		goto out;
>  
> -	inode = file_inode(f.file);
> -
>  	error = -ENOTDIR;
> -	if (!S_ISDIR(inode->i_mode))
> +	if (!d_can_lookup(f.file->f_path.dentry))
>  		goto out_putf;
>  
> -	error = inode_permission(inode, MAY_EXEC | MAY_CHDIR);
> +	error = inode_permission(file_inode(f.file), MAY_EXEC | MAY_CHDIR);
>  	if (!error)
>  		set_fs_pwd(current->fs, &f.file->f_path);
>  out_putf:

  reply	other threads:[~2017-04-03  6:06 UTC|newest]

Thread overview: 117+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-02 17:01 [git pull] vfs fixes Al Viro
2017-04-02 23:59 ` Linus Torvalds
2017-04-03  0:10   ` Linus Torvalds
2017-04-03  0:30     ` Al Viro
2017-04-03  0:43       ` Al Viro
2017-04-03  0:58         ` Linus Torvalds
2017-04-03  2:21           ` Al Viro
2017-04-03  6:00             ` Eric W. Biederman [this message]
2017-04-03  7:46               ` Al Viro
2017-04-04  0:22               ` Ian Kent
2017-04-04  0:47               ` Ian Kent
2017-04-03  0:20   ` Al Viro
  -- strict thread matches above, loose matches on Subject: below --
2024-04-26 14:59 [GIT PULL] " Christian Brauner
2024-04-26 18:09 ` pr-tracker-bot
2024-04-05 11:22 Christian Brauner
2024-04-05 17:09 ` pr-tracker-bot
2024-03-18 12:19 Christian Brauner
2024-03-18 16:48 ` pr-tracker-bot
2024-03-18 19:14 ` Linus Torvalds
2024-03-18 19:41   ` Linus Torvalds
2024-03-19  6:58     ` Christian Brauner
2024-03-20 10:21       ` Christian Brauner
2024-03-06 15:45 Christian Brauner
2024-03-06 16:33 ` pr-tracker-bot
2024-03-01 12:45 Christian Brauner
2024-03-01 20:37 ` pr-tracker-bot
2024-02-22 14:03 Christian Brauner
2024-02-22 18:18 ` pr-tracker-bot
2024-02-12 13:00 Christian Brauner
2024-02-12 17:03 ` pr-tracker-bot
2024-01-13 12:31 Christian Brauner
2024-01-17 20:03 ` pr-tracker-bot
2023-11-24 10:27 Christian Brauner
2023-11-24 18:25 ` Linus Torvalds
2023-11-24 18:52   ` Linus Torvalds
2023-11-24 20:12     ` Linus Torvalds
2023-11-25 13:05       ` Christian Brauner
2023-11-25 13:10   ` Christian Brauner
2023-11-25 13:28     ` Omar Sandoval
2023-11-25 14:04       ` Christian Brauner
2023-11-24 18:26 ` pr-tracker-bot
2023-10-19 10:07 Christian Brauner
2023-10-19 16:37 ` Linus Torvalds
2023-10-20 11:14   ` Christian Brauner
2023-10-19 18:36 ` pr-tracker-bot
2023-09-26 10:39 Christian Brauner
2023-09-26 16:14 ` pr-tracker-bot
2023-07-06 11:52 Christian Brauner
2023-07-07  2:27 ` pr-tracker-bot
2023-07-02 11:28 Christian Brauner
2023-07-02 18:53 ` pr-tracker-bot
2023-05-25 12:22 Christian Brauner
2023-05-25 18:18 ` pr-tracker-bot
2023-05-12 15:31 Christian Brauner
2023-05-12 22:14 ` pr-tracker-bot
2023-04-03 11:04 Christian Brauner
2023-04-03 16:51 ` pr-tracker-bot
2023-03-12 12:18 Christian Brauner
2023-03-12 16:20 ` pr-tracker-bot
2020-09-22 21:29 [git pull] " Al Viro
2020-09-22 22:15 ` pr-tracker-bot
     [not found] <CAHk-=wgdsv1UA+QtgiJM8KQAG7N7_9iK_edchnzZYyj+nxmfLA@mail.gmail.com>
     [not found] ` <20200113195448.GT8904@ZenIV.linux.org.uk>
     [not found]   ` <CAHk-=whn5qk-e-KnYr6HNe5hp45v+XyDbsA2+szXvK3gC06A2w@mail.gmail.com>
2020-01-15  6:41     ` Al Viro
2020-01-15 19:35       ` pr-tracker-bot
2018-07-01 12:31 Al Viro
2018-07-01 19:36 ` Linus Torvalds
2018-07-01 20:05   ` Al Viro
2018-07-01 20:25     ` Linus Torvalds
2018-04-20 15:58 Al Viro
2018-04-20 18:29 ` Andrew Morton
2018-04-20 19:09   ` Al Viro
2018-04-20 19:57     ` Andrew Morton
2017-06-17  2:56 Al Viro
2017-04-09  5:40 Al Viro
2017-04-11  6:10 ` Linus Torvalds
2017-04-11  6:48   ` Al Viro
2017-04-11 21:02     ` Andreas Dilger
2017-04-12  7:00       ` Linus Torvalds
2017-04-15  6:41 ` Vegard Nossum
2017-04-15 16:51   ` Linus Torvalds
2017-04-15 17:08     ` Al Viro
2016-06-17 20:50 Q. hlist_bl_add_head_rcu() in d_alloc_parallel() J. R. Okajima
2016-06-17 22:16 ` Al Viro
2016-06-19  5:24   ` J. R. Okajima
2016-06-19 16:55     ` Al Viro
2016-06-20  4:34       ` J. R. Okajima
2016-06-20  5:35         ` Al Viro
2016-06-20 14:51           ` Al Viro
2016-06-20 17:14             ` [git pull] vfs fixes Al Viro
2016-06-08  2:12 Al Viro
2016-05-28  0:10 Al Viro
2016-02-28  1:09 Al Viro
2014-09-14 19:47 Al Viro
2014-09-26 20:38 ` Joachim Eastwood
2014-09-26 20:46 ` Joachim Eastwood
2014-09-26 20:58   ` Al Viro
2014-09-26 21:28     ` Joachim Eastwood
2014-09-26 21:52       ` Joachim Eastwood
2014-03-24 22:58 Imre Deak
2014-03-25  7:21 ` Sedat Dilek
2014-03-23  7:16 Al Viro
2014-03-23 10:57 ` Sedat Dilek
2014-03-23 15:35   ` Al Viro
2014-03-23 16:56     ` Al Viro
2014-03-23 16:36 ` Linus Torvalds
2014-03-23 16:45   ` Al Viro
2014-03-23 17:01     ` Linus Torvalds
2014-03-24  8:52       ` Sedat Dilek
2014-03-25  0:46         ` Linus Torvalds
2014-03-26 16:36           ` Sedat Dilek
2014-03-26 20:55             ` Linus Torvalds
2014-03-27  6:14               ` Sedat Dilek
2014-03-30 20:33               ` Al Viro
2014-03-30 20:55                 ` Al Viro
2014-03-30 22:39                   ` Linus Torvalds
2014-03-30 23:21                     ` Al Viro
2013-06-22  7:16 Al Viro
2013-03-27  0:36 Al Viro
2012-03-10 21:30 Al Viro
2012-03-10 21:49 ` Linus Torvalds
2012-03-10 22:14   ` Al Viro
2010-01-29  2:39 Al Viro
2010-01-17  7:57 Al Viro
2008-08-25  5:25 Al Viro
2008-08-25  5:29 ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87inmmvxlz.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=raven@themaw.net \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.