linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Oleg Drokin <green@linuxhacker.ru>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@poochiereds.net>,
	linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM
Date: Fri, 8 Jul 2016 12:16:14 -0400	[thread overview]
Message-ID: <8FC42FC1-AB95-4AF7-8493-EF0A34138B4A@linuxhacker.ru> (raw)
In-Reply-To: <20160708160426.GB7395@fieldses.org>


On Jul 8, 2016, at 12:04 PM, J. Bruce Fields wrote:

> On Fri, Jul 08, 2016 at 11:53:28AM -0400, Jeff Layton wrote:
>> On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
>>> On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
>>> 
>>>> On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
>>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
>>>>> with permission denied if the parent dir is not writeable.
>>>>> Need to make sure the name does not exist first, because we need to
>>>>> return EEXIST in that case.
>>>>> 
>>>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>>>> ---
>>>>> A very similar problem exists with symlinks, but the patch is more
>>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>>>>  fs/nfsd/nfs4proc.c |  6 +++++-
>>>>>  fs/nfsd/vfs.c      | 11 ++++++++++-
>>>>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>>>> 
>>>> 
>>>> 
>>>> nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
>>>> always used is that EPERM is "permanent". IOW, changing permissions
>>>> won't ever allow the user to do something. For instance, unprivileged
>>>> users can never chown a file, so they should get back EPERM there. When
>>>> a directory isn't writeable on a create they should get EACCES since
>>>> they could do the create if the directory were writeable.
>>> 
>>> Hm, I see, thanks.
>>> Confusing that you get "Permission denied" from perror ;)
>>> 
>> 
>> Yes indeed. It's a subtle and confusing distinction.
>> 
>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>> index de1ff1d..0067520 100644
>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>  
>>>>>  	fh_init(&resfh, NFS4_FHSIZE);
>>>>>  
>>>>> +	/*
>>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
>>>>> +	 * own access permission checks
>>>>> +	 */
>>>>>  	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>>>>> -			   NFSD_MAY_CREATE);
>>>>> +			   NFSD_MAY_EXEC);
>>>>>  	if (status)
>>>>>  		return status;
>>>>>  
>>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>>>> index 6fbd81e..6a45ec6 100644
>>>>> --- a/fs/nfsd/vfs.c
>>>>> +++ b/fs/nfsd/vfs.c
>>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>  	if (isdotent(fname, flen))
>>>>>  		goto out;
>>>>>  
>>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>>>>> +	/*
>>>>> +	 * Even though it is a create, first we see if we are even allowed
>>>>> +	 * to peek inside the parent
>>>>> +	 */
>>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>>>>  	if (err)
>>>>>  		goto out;
>>>>>  
>>>>> @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>  		goto out; 
>>>>>  	}
>>>>>  
>>>>> +	/* Now let's see if we actually have permissions to create */
>>>>> +	err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
>>>>> +	if (err)
>>>>> +		goto out;
>>>>> +
>>>>>  	if (!(iap->ia_valid & ATTR_MODE))
>>>>>  		iap->ia_mode = 0;
>>>>>  	iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
>>>> 
>>>> 
>>>> Ouch. This means two nfsd_permission calls per create operation. If
>>>> it's necessary for correctness then so be it, but is it actually
>>>> documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
>>>> EACCES in this situation?
>>> 
>>> Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
>>> newer version is here:
>>> http://pubs.opengroup.org/onlinepubs/9699919799/
>>> 
>>> They tell us that we absolutely must fail with EEXIST if the name is a symlink
>>> (so we need to lookup it anyway), and also that EEXIST is the failure code
>>> if the path exists.
>>> 
>> 
>> I'm not sure that that verbiage supersedes the fact that you don't have
>> write permissions on the directory. Does it?
>> 
>> ISTM that it's perfectly valid to shortcut looking up the dentry if the
>> user doesn't have write permissions on the directory, even when the
>> target is a symlink.
>> 
>> IOW, I'm not sure I see a bug here.
> 
> If this is causing real programs to behave incorrectly, then that may
> matter more than the letter of the spec.  But I'm a little curious why
> we'd be hearing about that just now--did the client or server's behavior
> change recently?

We, on the Lustre side, have been hearing about this since 2010, (this optimization
was enabled in 2009).

I suspect some people just complain in places that not everybody monitors.
I tried 3.10 and it has the same problem here.
I just tried on RHEL6 (2.6.32) and the problem is also apparent there.

Also it's confusing how you get different errors depending on if the cache is hot or not:
[green@centos6-16 racer]$ mkdir test
mkdir: cannot create directory `test': Permission denied
[green@centos6-16 racer]$ ls -ld test
drwxr-xr-x 2 root root 4096 Jul  8 12:12 test
[green@centos6-16 racer]$ mkdir test
mkdir: cannot create directory `test': File exists


>>> Are double permission checks really as bad for nfs? it looked like it would
>>> call mostly into VFS so even if first call would be expensive, second call should
>>> be really cheap?
>>> 
>> 
>> It depends on the underlying fs. In most cases, you're right, but you
>> can export things that overload the ->permission op, and those can be
>> as expensive as they like (within reason of course).
> 
> Weird if the expense of a second permission call is significant compared
> to following the mkdir and sync.  But, what do I know.
> 
> --b.

  reply	other threads:[~2016-07-08 16:16 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-08  1:47 [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM Oleg Drokin
2016-07-08 11:02 ` Jeff Layton
2016-07-08 15:14   ` Oleg Drokin
2016-07-08 15:53     ` Jeff Layton
2016-07-08 15:59       ` Oleg Drokin
2016-07-08 16:17         ` Jeff Layton
2016-07-08 16:28           ` Oleg Drokin
2016-07-09  2:52         ` Al Viro
2016-07-09  2:58           ` Oleg Drokin
2016-07-09  3:13             ` Al Viro
2016-07-08 16:04       ` J. Bruce Fields
2016-07-08 16:16         ` Oleg Drokin [this message]
2016-07-08 20:49           ` J. Bruce Fields
2016-07-08 21:47             ` Oleg Drokin
2016-07-09  3:10               ` Al Viro
2016-07-09  3:41                 ` Oleg Drokin
2016-07-13 19:00                   ` J. Bruce Fields
2016-07-08 20:54 ` J. Bruce Fields
2016-07-08 21:53   ` Oleg Drokin
2016-07-21 20:34     ` J. Bruce Fields
2016-07-21 20:37       ` Oleg Drokin
2016-07-22  1:57         ` J. Bruce Fields
2016-07-22  6:35           ` Oleg Drokin
2016-07-22 10:55             ` J. Bruce Fields
2016-07-22 15:13               ` Oleg Drokin
2016-07-22 17:48                 ` J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 1/7] nfsd: Make creates return EEXIST instead of EACCES J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 2/7] nfsd: remove redundant zero-length check from create J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 3/7] nfsd: remove redundant i_lookup check J. Bruce Fields
2016-07-24  0:22                     ` Al Viro
2016-07-24 12:10                       ` J. Bruce Fields
2016-07-24 14:23                         ` Al Viro
2016-07-24 20:21                           ` J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 4/7] nfsd: reorganize nfsd_create J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 5/7] nfsd: remove unnecessary positive-dentry check J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 6/7] nfsd: clean up bad-type check in nfsd_create_locked J. Bruce Fields
2016-07-22 17:48                   ` [PATCH 7/7] nfsd: drop unnecessary MAY_EXEC check from create J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8FC42FC1-AB95-4AF7-8493-EF0A34138B4A@linuxhacker.ru \
    --to=green@linuxhacker.ru \
    --cc=bfields@fieldses.org \
    --cc=jlayton@poochiereds.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).